Re: Storing XML file in Hbase

2016-11-28 Thread Richard Startin
In my experience it's better to keep the number of column families low. When flushes occur, they affect all column families in a table, so when the memstore fills you'll create an HFile per family. I haven't seen any performance impact in having two column families though. As for the number

Re: Using Hbase as a transactional table

2016-11-28 Thread John Leach
Mich, Splice Machine (Open Source) can do this on top of Hbase and we have an example running a TPC-C benchmark. Might be worth a look. Regards, John > On Nov 28, 2016, at 4:36 PM, Ted Yu wrote: > > Not sure if Transactions (beta) | Apache Phoenix is up to date. >

Re: Using Hbase as a transactional table

2016-11-28 Thread Mich Talebzadeh
Thanks Ted. How does Phoenix provide transaction support? I have read some docs but it sounds problematic. I need to be sure there is full commit and rollback if things go wrong! Also, it appears that Phoenix transactional support is in beta phase. Cheers Dr Mich Talebzadeh LinkedIn *

Re: Storing XML file in Hbase

2016-11-28 Thread Mich Talebzadeh
Thanks Richard. How would one decide on the number of column families and columns? Is there a ballpark approach? Cheers Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: High CPU Utilization by meta region

2016-11-28 Thread Timothy Brown
Responses inlined. On Mon, Nov 28, 2016 at 12:45 PM, Stack wrote: > On Sun, Nov 27, 2016 at 6:53 PM, Timothy Brown > wrote: > > > Hi Everyone, > > > > I apologize for starting an additional thread about this but I wasn't > > subscribed to the users

Re: High CPU Utilization by meta region

2016-11-28 Thread Stack
On Sun, Nov 27, 2016 at 6:53 PM, Timothy Brown wrote: > Hi Everyone, > > I apologize for starting an additional thread about this but I wasn't > subscribed to the users mailing list when I sent the original and can't > figure out how to respond to the original :( > >

Re: Storing XML file in Hbase

2016-11-28 Thread Richard Startin
Hi Mich, If you want to store the file whole, you'll need to enforce a 10MB limit on the file size, otherwise you will flush too often (each time the memstore fills up), which will slow down writes. Maybe you could deconstruct the XML by extracting columns from it using XPath? If the
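One way to picture the XPath suggestion above is the minimal Python sketch below. It only shows how an XML document could be broken into qualifier/value pairs before being written as columns; it does not talk to HBase. The element names, the `COLUMN_XPATHS` mapping and the `xml_to_columns` helper are hypothetical and not taken from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: column qualifier -> XPath expression (illustrative only).
COLUMN_XPATHS = {
    "title": "./title",
    "author": "./author/name",
    "price": "./price",
}

# Hypothetical sample document used only to exercise the helper.
SAMPLE = (
    "<book><title>HBase</title>"
    "<author><name>Ann</name></author>"
    "<price>10</price></book>"
)


def xml_to_columns(xml_text, column_xpaths=COLUMN_XPATHS):
    """Return a dict of qualifier -> text for every XPath that matches."""
    root = ET.fromstring(xml_text)
    columns = {}
    for qualifier, path in column_xpaths.items():
        node = root.find(path)  # ElementTree supports a limited XPath subset
        if node is not None and node.text is not None:
            columns[qualifier] = node.text.strip()
    return columns


if __name__ == "__main__":
    # Each pair could become one cell in a single column family.
    for qualifier, value in xml_to_columns(SAMPLE).items():
        print(qualifier, value)
```

Paths that do not match are simply skipped, so rows built this way may have different sets of columns, which HBase allows.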

Re: Storing XML file in Hbase

2016-11-28 Thread Dima Spivak
Hi Mich, How many files are you looking to store? How often do you need to read them? What's the total size of all the files you need to serve? Cheers, Dima On Mon, Nov 28, 2016 at 7:04 AM Mich Talebzadeh wrote: > Hi, > > Storing XML file in Big Data. Are there any

Re: Creating HBase table with presplits

2016-11-28 Thread Dave Latham
If you truly have no way to predict anything about the distribution of your data across the row key space, then you are correct that there is no way to presplit your regions in an effective way. Either you need to make some starting guess, such as a small number of uniform splits, or wait until
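As a rough illustration of the "small number of uniform splits" starting guess, the Python sketch below computes evenly spaced split points over a fixed-width byte prefix. It assumes row keys are spread roughly evenly over that prefix (for example, hashed or salted keys), which the thread does not state; the function name `uniform_split_points` and its parameters are hypothetical.

```python
def uniform_split_points(num_regions, key_bytes=1):
    """Return num_regions - 1 evenly spaced split keys over a byte prefix.

    Assumption: row keys are roughly uniform over the first `key_bytes` bytes.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    space = 256 ** key_bytes
    if num_regions > space:
        raise ValueError("more regions than distinct prefixes")
    return [
        (i * space // num_regions).to_bytes(key_bytes, "big")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Four regions over a one-byte prefix give three split keys.
    for key in uniform_split_points(4):
        print(key.hex())
```

The resulting keys would then be passed as the split points when the table is created; if the real key distribution turns out to be skewed, these splits will not balance the load.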

Storing XML file in Hbase

2016-11-28 Thread Mich Talebzadeh
Hi, Storing XML file in Big Data. Are there any strategies to create multiple column families or just one column family and in that case how many columns would be optimal? Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: High CPU Utilization by meta region

2016-11-28 Thread Timothy Brown
Hi Ted, The region server hosting hbase:meta only has the meta region on it so it has 1 region while other region servers can have more than 100 regions on them. I didn't notice anything interesting in the logs in my opinion. Is there anything in particular I should watch out for? The hbase:meta

Re: High CPU Utilization by meta region

2016-11-28 Thread Ted Yu
Does the region server hosting hbase:meta have roughly the same number of regions as the other servers? Did you find anything interesting in the server log (where hbase:meta is hosted)? Have you tried major compacting the hbase:meta table? In 1.2, DEFAULT_HBASE_META_VERSIONS is still 10. See

About HBASE-12949 - Scanner can be stuck in infinite loop if the HFile is corrupted

2016-11-28 Thread Chang Chen
Hi Jerry, we hit an issue similar to HBASE-12949. I guess the compaction thread is in an endless loop, because the CPU load is much higher than usual. However, I cannot understand why KeyValueHeap.generalizedSeek() has an endless loop. From the code: boolean seekResult; if

Creating HBase table with presplits

2016-11-28 Thread Sachin Jain
Hi, I was going through an article on pre-splitting a table [0] and it mentions that it is generally best practice to presplit your table. But don't we need to know the data in advance in order to presplit it? Question: What should be the best practice when we don't know what data is going to be