Re: Newbie question: Rowkey design

2013-12-16 Thread Tao Xiao
Sometimes row key design is a trade-off between load balance and query efficiency: if you design the row key so that you can query it quickly and conveniently, the records may not be spread evenly across the nodes; if you design the row key so that the records are spread evenly across the nodes, maybe i
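One common way to illustrate the trade-off described above is key salting. The following is a minimal sketch only, not anything proposed in the thread: the class name SaltedRowKey, the bucket count of 16, and the sample keys are all hypothetical. It uses only the Java standard library and no HBase API.

```java
import java.nio.charset.StandardCharsets;

class SaltedRowKey {
    // Hypothetical bucket count; in practice it would be chosen relative to the number of regions.
    static final int BUCKETS = 16;

    // Derive a stable bucket in [0, BUCKETS) from the natural key.
    static int bucketOf(String naturalKey) {
        return Math.floorMod(naturalKey.hashCode(), BUCKETS);
    }

    // Prefix the natural key with a one-byte salt so that keys are grouped by bucket
    // rather than by their natural sort order.
    static byte[] salted(String naturalKey) {
        byte[] key = naturalKey.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[key.length + 1];
        out[0] = (byte) bucketOf(naturalKey);
        System.arraycopy(key, 0, out, 1, key.length);
        return out;
    }

    public static void main(String[] args) {
        // Sample keys are made up for illustration.
        for (String k : new String[] {"20131216-0001", "20131216-0002", "20131216-0003"}) {
            System.out.println(k + " -> bucket " + bucketOf(k));
        }
    }
}
```

The salt can spread writes over several key ranges, but a range scan over the natural key then has to be issued once per bucket, which is the query-side cost of the better balance.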

Re: Why so many unexpected files like partitions_xxxx are created?

2013-12-16 Thread Ted Yu
Should the bulk load task clean up the partitions_ files upon completion? Cheers On Mon, Dec 16, 2013 at 6:53 PM, Bijieshan wrote: > > I think I should delete these files immediately after I have finished > bulk loading data into HBase since they are useless at that time, right? > > Ya. I think so. T

RE: Why so many unexpected files like partitions_xxxx are created?

2013-12-16 Thread Bijieshan
> I think I should delete these files immediately after I have finished bulk > loading data into HBase since they are useless at that time, right? Ya. I think so. They are useless once the bulk load task has finished. Jieshan. -Original Message- From: Tao Xiao [mailto:xiaotao.cs@gmail.com]

RE: Bulk load moving HFiles to the wrong region

2013-12-16 Thread Bijieshan
In the first step, the files are read correctly and regionGroups is created as it should be. Did you notice the number of reducers? Was it equal to 2000 (before your extended HFileOutputFormat)? >>> RegionServer logs in the RegionServer that the files are moved to >>> indeed shows that all

Re: Why so many unexpected files like partitions_xxxx are created?

2013-12-16 Thread Tao Xiao
Indeed these files are produced by org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles in the directory returned by job.getWorkingDirectory(), and I think I should delete these files immediately after I have finished bulk loading data into HBase since they are useless at that tim
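The cleanup discussed above could look roughly like the sketch below. It is an illustration under stated assumptions, not the thread's solution: it uses java.nio.file on a local directory as a stand-in for the job working directory, and the class name PartitionsCleanup and the method deletePartitionFiles are hypothetical. On HDFS the same idea would go through Hadoop's FileSystem API instead, which is not shown here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

class PartitionsCleanup {
    // Delete regular files named partitions_* directly under workingDir.
    // Returns how many files were removed; other files are left untouched.
    static int deletePartitionFiles(Path workingDir) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(workingDir, "partitions_*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    Files.delete(p);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory that mimics the leftover files.
        Path dir = Files.createTempDirectory("bulkload-demo");
        Files.createFile(dir.resolve("partitions_" + UUID.randomUUID()));
        Files.createFile(dir.resolve("keep-me.txt"));
        System.out.println("deleted " + deletePartitionFiles(dir) + " file(s)");
    }
}
```

Matching on the partitions_ prefix keeps the cleanup narrow, so unrelated files in the same directory survive. It should only run once no bulk load job still needs its partition file.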

Re: Guava 15

2013-12-16 Thread Pradeep Gollakota
This is kinda tangential, but for very very common dependencies such as guava, jackson, etc. would it make sense to use a shaded jar so as not to affect user dependencies? On Mon, Dec 16, 2013 at 7:47 PM, Ted Yu wrote: > Please try out patch v2 from HBASE-10174 > > Thanks > > On Dec 16, 2013, a

Re: Guava 15

2013-12-16 Thread Ted Yu
Please try out patch v2 from HBASE-10174 Thanks On Dec 16, 2013, at 11:42 AM, Kristoffer Sjögren wrote: > Oh thank you very much Ted! :-) > > I'll give it a try tomorrow. > > Cheers! > > > On Mon, Dec 16, 2013 at 6:05 PM, Ted Yu wrote: > >> I created HBASE-10174 and attached a patch there.

Re: Guava 15

2013-12-16 Thread Kristoffer Sjögren
Oh thank you very much Ted! :-) I'll give it a try tomorrow. Cheers! On Mon, Dec 16, 2013 at 6:05 PM, Ted Yu wrote: > I created HBASE-10174 and attached a patch there. > > Running 0.94 test suite now. > > > On Mon, Dec 16, 2013 at 7:05 AM, Nicolas Liochon > wrote: > > > That means more or les

ANNOUNCE: hbase-0.96.1 available for download

2013-12-16 Thread Stack
hbase-0.96.1 is now available for download. Get it at your nearest Apache mirror [1] or maven repository. 157 bug fixes and performance fixes [2] have been made since 0.96.0. Please use this release rather than 0.96.0 going forward. Yours, Your HBase Team 1. http://www.apache.org/dyn/closer.cg

Re: 3-Hour Periodic Network/CPU/Disk/Latency Spikes

2013-12-16 Thread Patrick Schless
Thanks for the tips. I'll play around with this this week and try to get a script that won't affect our performance too badly. I imagine most people do this at off-peak times, but we don't have that so we'll have to figure out how to spread out the load as much as possible. On Fri, Dec 13, 2013 at

Re: Guava 15

2013-12-16 Thread Ted Yu
I created HBASE-10174 and attached a patch there. Running 0.94 test suite now. On Mon, Dec 16, 2013 at 7:05 AM, Nicolas Liochon wrote: > That means more or less backporting the patch to the 0.94, no? > It should work imho. > > > > > On Mon, Dec 16, 2013 at 3:16 PM, Kristoffer Sjögren >wrote:

Re: Bulk load moving HFiles to the wrong region

2013-12-16 Thread Amit Sela
I've managed to isolate the problem. I implemented an extension of HFileOutputFormat - because each bulk load will import data to the newly created regions only, I pass the prefix (MMdd) to MyHFileOutputFormat.configureIncrementalLoad() so that getRegionStartKeys returns only the corresponding

Newbie question: Rowkey design

2013-12-16 Thread Wilm Schumacher
Hi, I'm a newbie to HBase and have a question on rowkey design, and I hope this question isn't too newbie-like for this list. I have a question which cannot be answered by knowledge of the code but by experience with large databases, thus this mail. For the sake of explanation I create a small exam

Re: Guava 15

2013-12-16 Thread Nicolas Liochon
That means more or less backporting the patch to the 0.94, no? It should work imho. On Mon, Dec 16, 2013 at 3:16 PM, Kristoffer Sjögren wrote: > Thanks! But we can't really upgrade to HBase 0.96 right now, but we need to > go to Guava 15 :-( > > I was thinking of overriding the classes fixed in

Re: Bulk load moving HFiles to the wrong region

2013-12-16 Thread Amit Sela
Loaded regions are listed in .META. table and the ENCODED field in the table points to an existing directory. But all family directories in this region are empty... On Mon, Dec 16, 2013 at 4:29 PM, Amit Sela wrote: > I ran the hbck tool, and while I do have some inconsistencies they are not > i

Re: Bulk load moving HFiles to the wrong region

2013-12-16 Thread Amit Sela
I ran the hbck tool, and while I do have some inconsistencies they are not in the table that has the bulk load issues. On Mon, Dec 16, 2013 at 4:22 PM, Amit Sela wrote: > RegionServer logs in the RegionServer that the files are moved to indeed > shows that all files are moved to that region (w

Re: Bulk load moving HFiles to the wrong region

2013-12-16 Thread Amit Sela
RegionServer logs in the RegionServer that the files are moved to indeed show that all files are moved to that region (when it doesn't happen they show only 1 file per family moved to a RegionServer) On Mon, Dec 16, 2013 at 4:21 PM, Amit Sela wrote: > In the first step, the files are read corre

Re: Bulk load moving HFiles to the wrong region

2013-12-16 Thread Amit Sela
In the first step, the files are read correctly and regionGroups is created as it should be. When debugging, in LoadIncrementalHFiles.tryAtomicRegionLoad() I notice that ServerCallable's regionName returned from server is the wrong region (the pre-split last region). The previous last region is not su

Re: Guava 15

2013-12-16 Thread Kristoffer Sjögren
Thanks! But we can't really upgrade to HBase 0.96 right now, but we need to go to Guava 15 :-( I was thinking of overriding the classes fixed in the patch in our test environment. Could this work maybe? On Mon, Dec 16, 2013 at 11:01 AM, Kristoffer Sjögren wrote: > Hi > > At the moment HFileWrit

RE: Bulk load moving HFiles to the wrong region

2013-12-16 Thread Bijieshan
As we know, bulk load has two steps: 1. Create HFiles by MapReduce. 2. Load HFiles into HBase. I wonder whether it read the right partition information during the first step. Have you run the hbck tool to check the cluster's health? You mentioned you see the new regions in the webapp. The files were

RE: Why so many unexpected files like partitions_xxxx are created?

2013-12-16 Thread Bijieshan
The reduce partition information is stored in this partitions_ file. See the code below: HFileOutputFormat#configureIncrementalLoad: . Path partitionsPath = new Path(job.getWorkingDirectory(), "partitions_" + UUID.randomUUID())

Re: Guava 15

2013-12-16 Thread Nicolas Liochon
Hi, It's fixed in HBase 0.96 (by HBASE-9667). Cheers, Nicolas On Mon, Dec 16, 2013 at 11:01 AM, Kristoffer Sjögren wrote: > Hi > > At the moment HFileWriterV2.close breaks at startup when using Guava 15. > This is not a client problem - it happens because we start a master node to > do integr

Why so many unexpected files like partitions_xxxx are created?

2013-12-16 Thread Tao Xiao
I imported data into HBase by bulk load, but after that I found that many unexpected files had been created in the HDFS directory /user/root/, such as these: /user/root/partitions_fd74866b-6588-468d-8463-474e202db070 /user/root/partitions_fd867cd2-d9c9-48f5-9eec-185b2e57788d /user/r

Guava 15

2013-12-16 Thread Kristoffer Sjögren
Hi At the moment HFileWriterV2.close breaks at startup when using Guava 15. This is not a client problem - it happens because we start a master node to do integration tests. It's a bit precarious, and I wonder if there are any plans to support Guava 15, or if there is a clever way around this? Cheers, -K