In my experience it's better to keep the number of column families low. When
flushes occur, they affect all column families in a table, so when the memstore
fills you'll create an HFile per family. I haven't seen any performance impact
from having two column families, though.
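With that advice in mind, a two-family table can be created in the HBase shell; the table and family names below are hypothetical stand-ins:

```
# hypothetical layout: family 'd' for document content, 'm' for metadata
create 'xmldocs', 'd', 'm'
```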
As for the number
Mich,
Splice Machine (open source) can do this on top of HBase, and we have an example
running a TPC-C benchmark. Might be worth a look.
Regards,
John
> On Nov 28, 2016, at 4:36 PM, Ted Yu wrote:
>
> Not sure if Transactions (beta) | Apache Phoenix is up to date.
>
Thanks Ted.
How does Phoenix provide transaction support?
I have read some docs, but it sounds problematic. I need to be sure there
is a full commit and rollback if things go wrong!
Also it appears that Phoenix transactional support is in beta phase.
Cheers
Dr Mich Talebzadeh
LinkedIn *
Thanks Richard.
How would one decide on the number of column families and columns?
Is there a ballpark approach?
Cheers
Dr Mich Talebzadeh
LinkedIn *
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
Responses inlined.
On Mon, Nov 28, 2016 at 12:45 PM, Stack wrote:
> On Sun, Nov 27, 2016 at 6:53 PM, Timothy Brown
> wrote:
>
> > Hi Everyone,
> >
> > I apologize for starting an additional thread about this but I wasn't
> > subscribed to the users
On Sun, Nov 27, 2016 at 6:53 PM, Timothy Brown wrote:
> Hi Everyone,
>
> I apologize for starting an additional thread about this but I wasn't
> subscribed to the users mailing list when I sent the original and can't
> figure out how to respond to the original :(
>
>
Hi Mich,
If you want to store the file whole, you'll need to enforce a 10MB limit on the
file size; otherwise you will flush too often (each time the memstore fills up),
which will slow down writes.
Maybe you could deconstruct the XML by extracting columns from it using
XPath?
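A minimal sketch of that XPath idea using only the JDK's built-in `javax.xml` APIs; the `<trade>` document and its fields are hypothetical stand-ins for the real XML:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class XmlToColumns {
    // Evaluate one XPath expression against an XML string; each expression's
    // result would become one HBase column value, instead of storing the
    // whole XML blob in a single cell.
    static String extract(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical small document standing in for one of the real files.
        String xml = "<trade><id>42</id><symbol>IBM</symbol><price>123.45</price></trade>";
        System.out.println(extract(xml, "/trade/id") + ","
                + extract(xml, "/trade/symbol") + ","
                + extract(xml, "/trade/price"));
    }
}
```

Each extracted value can then be written as a separate cell, keeping individual cells small and queryable.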
If the
Hi Mich,
How many files are you looking to store? How often do you need to read
them? What's the total size of all the files you need to serve?
Cheers,
Dima
On Mon, Nov 28, 2016 at 7:04 AM Mich Talebzadeh
wrote:
> Hi,
>
> Storing XML file in Big Data. Are there any
If you truly have no way to predict anything about the distribution of your
data across the row key space, then you are correct that there is no way to
presplit your regions in an effective way. Either you need to make some
starting guess, such as a small number of uniform splits, or wait until
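HBase ships a RegionSplitter utility for generating such uniform split points; as a self-contained sketch of the underlying arithmetic (simplified here to a one-byte key prefix, which is an assumption, not what RegionSplitter actually emits):

```java
public class UniformSplits {
    // Compute (n-1) single-byte split points dividing the key space into n
    // roughly equal regions -- a "starting guess" when nothing is known
    // about the real key distribution.
    static byte[][] splits(int n) {
        byte[][] keys = new byte[n - 1][];
        for (int i = 1; i < n; i++) {
            keys[i - 1] = new byte[] { (byte) (256 * i / n) };
        }
        return keys;
    }

    public static void main(String[] args) {
        for (byte[] k : splits(4)) {
            System.out.printf("%02x%n", k[0] & 0xff); // 40, 80, c0
        }
    }
}
```

The resulting `byte[][]` would then be passed as the split-keys argument when creating the table (e.g. via `Admin.createTable`); regions can still be split or merged later as the real distribution emerges.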
Hi,
I am storing XML files in big data. Are there any strategies to create multiple
column families, or just one column family, and in that case how many columns
would be optimal?
thanks
Dr Mich Talebzadeh
LinkedIn *
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
Hi Ted,
The region server hosting hbase:meta only has the meta region on it so it
has 1 region while other region servers can have more than 100 regions on
them.
I didn't notice anything interesting in the logs. Is there
anything in particular I should watch out for?
The hbase:meta
Does the region server hosting hbase:meta have roughly the same number of
regions as the other servers?
Did you find anything interesting in the server log (where hbase:meta is
hosted)?
Have you tried major compacting the hbase:meta table ?
In 1.2, DEFAULT_HBASE_META_VERSIONS is still 10. See
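On that last suggestion, the meta table can be major compacted from the HBase shell; the command is asynchronous, so check the region server UI afterwards to confirm it ran:

```
major_compact 'hbase:meta'
```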
Hi Jerry,
we hit a similar issue to HBASE-12949. I guess the compaction thread is in
an endless loop, because the CPU load is much higher than usual.
However, I cannot understand why KeyValueHeap.generalizedSeek() has an
endless loop; from the code:
boolean seekResult;
if
Hi,
I was going through a pre-splitting a table article [0], and it is mentioned
that it is generally best practice to presplit your table. But don't we
need to know the data in advance in order to presplit it?
Question: What should be the best practice when we don't know what data is
going to be