@J-D: Thanks, this sounds very likely. One more thing, from the logs of one slave, I can see the following:

2013-03-21 22:27:15,041 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 9 file(s) in f of rc_nise,$,1363860406830.5689430f7a27cc511f99dcb62001edc6. into 5418126f3d154ef3aca8027e04512279, size=8.3g; total size for store is 8.3g
[...]
2013-03-21 23:34:31,836 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 5 file(s) in f of rc_nise,$,1363860406830.5689430f7a27cc511f99dcb62001edc6. into 3bdeb58c57af4ee1a92d22865e707416, size=8.3g; total size for store is 8.3g
Aren't those a sign that a major compaction also occurred? And if so, what could have triggered it?

On Thu, Mar 21, 2013 at 8:06 PM, Nicolas Seyvet <[email protected]> wrote:

> @Ram: You are entirely correct, I made the exact same mistake of mixing
> up major and minor compactions. Looking closely, what I see is that at
> around 200 HFiles per region it starts minor compacting files in groups of
> 10 HFiles. The "problem" is that this minor compacting never stops, even
> when there are only about 20 HFiles left. It just keeps on going, taking
> more and more time (I guess because the files being compacted are getting
> bigger).
>
> Of course, in parallel we keep on adding more and more data.
>
> @J-D: "It seems to me that it would be better if you were able to do a
> single load for all your files." Yes, I agree, but that is not what we
> are testing; our use case is to use 1-minute batch files.
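For what it's worth, two stock-HBase triggers come to mind: a time-based major compaction (by default `hbase.hregion.majorcompaction` schedules one roughly every 24 hours per region), and the promotion rule by which a minor compaction that happens to select all of a store's files is upgraded to a major one. A minimal `hbase-site.xml` sketch of the knobs involved, with illustrative values only (not recommendations), assuming 0.94-era property names:

```xml
<configuration>
  <!-- Interval (ms) between automatic major compactions per region.
       Default is 86400000 (24h); set to 0 to disable time-based majors
       and trigger them manually instead. -->
  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value>
  </property>

  <!-- Minimum number of StoreFiles before a minor compaction is considered. -->
  <property>
    <name>hbase.hstore.compactionThreshold</name>
    <value>3</value>
  </property>

  <!-- Maximum number of StoreFiles compacted in one minor compaction;
       the default of 10 matches the "groups of 10 HFiles" observed above. -->
  <property>
    <name>hbase.hstore.compaction.max</name>
    <value>10</value>
  </property>
</configuration>
```

With time-based majors disabled, a major compaction can still be requested on demand from the hbase shell (`major_compact 'rc_nise'`), which some setups prefer so the heavy I/O runs off-peak.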
