Michael, I am not sure; I recommend it as a solid middle ground so that you have room to scale in your cluster. Once you get to 20GB+, from what I understand, there are some adverse performance issues. It is the same as recommending 2GB for HFilev1: it is a good middle ground, or 4GB max. That being said, we have customers running 10GB region sizes on 0.90 successfully, but there are known kinks. So it is still just a matter of what works for you!
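For reference, the region size being debated here is the cluster-wide default hbase.hregion.max.filesize; a minimal hbase-site.xml sketch of the 10GB setting (plus the memstore flush size Kevin asks about in the thread) might look like the following. The values are illustrative middle grounds, not a one-size-fits-all recommendation:

```xml
<!-- Sketch only: illustrative values, tune for your own cluster -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- 10 GB in bytes; a region splits once its largest store exceeds this -->
  <value>10737418240</value>
</property>
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <!-- 128 MB in bytes; a memstore is flushed to disk once it passes this size -->
  <value>134217728</value>
</property>
```

Per-table overrides (e.g. MAX_FILESIZE in the shell's alter command) take precedence over these cluster defaults.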
On Thu, Nov 1, 2012 at 9:50 AM, Michael Segel <[email protected]> wrote:

> Just out of curiosity...
>
> What's the impact of having regions of 10GB or larger?
>
> What does that do to your footprint in memory and the time it takes to
> split or compact a region?
>
> -Mike
>
> On Nov 1, 2012, at 8:35 AM, Kevin O'dell <[email protected]> wrote:
>
> > A couple of thoughts (it is still early here, so bear with me):
> >
> > Did you presplit your table?
> >
> > You are on .92, so you might as well take advantage of HFilev2 and use
> > 10GB region sizes.
> >
> > Loading over MR, I am assuming puts? Did you tune your memstore and
> > HLog size?
> >
> > You aren't using a different client version or something strange like
> > that, are you?
> >
> > Your "can't close hlog" messages seem to indicate an inability to talk
> > to HDFS. Did you have connection issues there?
> >
> > On Thu, Nov 1, 2012 at 5:20 AM, ramkrishna vasudevan <[email protected]> wrote:
> >
> >> Can you try restarting the cluster, I mean the master and RS.
> >> Also, if this persists, try to clear the zk data and restart.
> >>
> >> Regards
> >> Ram
> >>
> >> On Thu, Nov 1, 2012 at 2:46 PM, Cheng Su <[email protected]> wrote:
> >>
> >>> Sorry, my mistake. Ignore the "max store size of a single CF" part,
> >>> please.
> >>>
> >>> m(_ _)m
> >>>
> >>> On Thu, Nov 1, 2012 at 4:43 PM, Ameya Kantikar <[email protected]> wrote:
> >>>> Thanks Cheng. I'll try increasing my max region size limit.
> >>>>
> >>>> However, I am not clear on this math:
> >>>>
> >>>> "Since you set the max file size to 2G, you can only store 2XN G data
> >>>> into a single CF."
> >>>>
> >>>> Why is that? My assumption is that even though a single region can
> >>>> only be 2 GB, I can still have hundreds of regions, and hence can
> >>>> store 200GB+ data in a single CF on my 10 machine cluster.
> >>>>
> >>>> Ameya
> >>>>
> >>>> On Thu, Nov 1, 2012 at 1:19 AM, Cheng Su <[email protected]> wrote:
> >>>>
> >>>>> I met the same problem these days.
> >>>>> I'm not very sure the error log is exactly the same, but I do have
> >>>>> the same exception:
> >>>>>
> >>>>> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> >>>>> Failed 1 action: NotServingRegionException: 1 time, servers with
> >>>>> issues: smartdeals-hbase8-snc1.snc1:60020,
> >>>>>
> >>>>> and the table is also neither enabled nor disabled, thus I can't
> >>>>> drop it.
> >>>>>
> >>>>> I guess the problem is the total store size.
> >>>>> How many region servers do you have?
> >>>>> Since you set the max file size to 2G, you can only store 2XN G data
> >>>>> into a single CF.
> >>>>> (N is the number of your region servers.)
> >>>>>
> >>>>> You might want to increase the max file size or the number of region
> >>>>> servers.
> >>>>>
> >>>>> On Thu, Nov 1, 2012 at 3:29 PM, Ameya Kantikar <[email protected]> wrote:
> >>>>>> One more thing: the HBase table in question is neither enabled nor
> >>>>>> disabled:
> >>>>>>
> >>>>>> hbase(main):006:0> is_disabled 'userTable1'
> >>>>>> false
> >>>>>> 0 row(s) in 0.0040 seconds
> >>>>>>
> >>>>>> hbase(main):007:0> is_enabled 'userTable1'
> >>>>>> false
> >>>>>> 0 row(s) in 0.0040 seconds
> >>>>>>
> >>>>>> Ameya
> >>>>>>
> >>>>>> On Thu, Nov 1, 2012 at 12:02 AM, Ameya Kantikar <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I am trying to load a lot of data (around 1.5 TB) into a single
> >>>>>>> HBase table. I have set the region size to 2 GB. I also set
> >>>>>>> hbase.regionserver.handler.count to 30.
> >>>>>>>
> >>>>>>> When I start loading data via MR, after a while tasks start
> >>>>>>> failing with the following error:
> >>>>>>>
> >>>>>>> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: NotServingRegionException: 1 time, servers with issues: smartdeals-hbase8-snc1.snc1:60020,
> >>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1641)
> >>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
> >>>>>>>     at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:943)
> >>>>>>>     at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:820)
> >>>>>>>     at org.apache.hadoop.hbase.client.HTable.put(HTable.java:795)
> >>>>>>>     at com..mr.hbase.LoadUserCacheInHbase$TokenizerMapper.map(LoadUserCacheInHbase.java:83)
> >>>>>>>     at com..mr.hbase.LoadUserCacheInHbase$TokenizerMapper.map(LoadUserCacheInHbase.java:33)
> >>>>>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> >>>>>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
> >>>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.j
> >>>>>>>
> >>>>>>> On the hbase8 machine I see the following in the logs:
> >>>>>>>
> >>>>>>> ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Error while syncing, requesting close of hlog
> >>>>>>> java.io.IOException: Reflection
> >>>>>>>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:230)
> >>>>>>>     at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1109)
> >>>>>>>     at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1213)
> >>>>>>>     at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1071)
> >>>>>>>     at java.lang.Thread.run(Thread.java:662)
> >>>>>>> Caused by: java.lang.reflect.InvocationTargetException
> >>>>>>>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>>>>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:228)
> >>>>>>>     ... 4 more
> >>>>>>>
> >>>>>>> I only have 15 map tasks each on a 10 machine cluster (150 map
> >>>>>>> tasks in total entering data into the HBase table).
> >>>>>>>
> >>>>>>> Further, I see 2-3 regions perpetually under "Regions in
> >>>>>>> Transition" in the HBase master web console, as follows:
> >>>>>>>
> >>>>>>> 8dcb3edee4e43faa3dbeac2db4f12274 userTable1,[email protected],1351728961461.8dcb3edee4e43faa3dbeac2db4f12274. state=PENDING_OPEN, ts=Thu Nov 01 06:39:57 UTC 2012 (409s ago), server=smartdeals-hbase1-snc1.snc1,60020,1351751785514
> >>>>>>> bb91fd0c855e60dd4159e0ad3fd52cda userTable1,[email protected],1351728968936.bb91fd0c855e60dd4159e0ad3fd52cda. state=PENDING_OPEN, ts=Thu Nov 01 06:42:17 UTC 2012 (269s ago), server=smartdeals-hbase3-snc1.snc1,60020,1351747466016
> >>>>>>> bd44334a11464baf85013c97d673e600 userTable1,[email protected],1351728952308.bd44334a11464baf85013c97d673e600. state=PENDING_OPEN, ts=Thu Nov 01 06:42:17 UTC 2012 (269s ago), server=smartdeals-hbase1-snc1.snc1,60020,1351751785514
> >>>>>>> ed1f7e7908fc232f10d78dd1e796a5d7 userTable1,[email protected],1351728971232.ed1f7e7908fc232f10d78dd1e796a5d7. state=PENDING_OPEN, ts=Thu Nov 01 06:37:37 UTC 2012 (549s ago), server=smartdeals-hbase3-snc1.snc1,60020,1351747466016
> >>>>>>>
> >>>>>>> Note these are not going away even after 30 minutes.
> >>>>>>>
> >>>>>>> Further, after running hbase hbck -summary I get the following:
> >>>>>>>
> >>>>>>> Summary:
> >>>>>>>   -ROOT- is okay.
> >>>>>>>     Number of regions: 1
> >>>>>>>     Deployed on: smartdeals-hbase7-snc1.snc1,60020,1351747458782
> >>>>>>>   .META. is okay.
> >>>>>>>     Number of regions: 1
> >>>>>>>     Deployed on: smartdeals-hbase7-snc1.snc1,60020,1351747458782
> >>>>>>>   test1 is okay.
> >>>>>>>     Number of regions: 1
> >>>>>>>     Deployed on: smartdeals-hbase2-snc1.snc1,60020,1351747457308
> >>>>>>>   userTable1 is okay.
> >>>>>>>     Number of regions: 32
> >>>>>>>     Deployed on: smartdeals-hbase10-snc1.snc1,60020,1351747456776
> >>>>>>>       smartdeals-hbase2-snc1.snc1,60020,1351747457308
> >>>>>>>       smartdeals-hbase4-snc1.snc1,60020,1351747455571
> >>>>>>>       smartdeals-hbase5-snc1.snc1,60020,1351747458579
> >>>>>>>       smartdeals-hbase6-snc1.snc1,60020,1351747458186
> >>>>>>>       smartdeals-hbase7-snc1.snc1,60020,1351747458782
> >>>>>>>       smartdeals-hbase8-snc1.snc1,60020,1351747459112
> >>>>>>>       smartdeals-hbase9-snc1.snc1,60020,1351747455106
> >>>>>>> 24 inconsistencies detected.
> >>>>>>> Status: INCONSISTENT
> >>>>>>>
> >>>>>>> In the master logs I am seeing the following error:
> >>>>>>>
> >>>>>>> ERROR org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment in: smartdeals-hbase3-snc1.snc1,60020,1351747466016 due to
> >>>>>>> org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: Received:OPEN for the region:userTable1,[email protected],1351728968936.bb91fd0c855e60dd4159e0ad3fd52cda., which we are already trying to OPEN.
> >>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkIfRegionInTransition(HRegionServer.java:2499)
> >>>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2457)
> >>>>>>>     at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>>>>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
> >>>>>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1336)
> >>>>>>>
> >>>>>>> Am I missing something? How do I recover from this? How do I load
> >>>>>>> a lot of data via MR into HBase tables?
> >>>>>>>
> >>>>>>> I am running the following setup:
> >>>>>>>
> >>>>>>> hadoop: 2.0.0-cdh4.0.1
> >>>>>>> hbase: 0.92.1-cdh4.0.1, r
> >>>>>>>
> >>>>>>> Would greatly appreciate any help.
> >>>>>>>
> >>>>>>> Ameya
> >>>>>
> >>>>> --
> >>>>> Regards,
> >>>>> Cheng Su
> >>>
> >>> --
> >>> Regards,
> >>> Cheng Su
> >
> > --
> > Kevin O'Dell
> > Customer Operations Engineer, Cloudera

--
Kevin O'Dell
Customer Operations Engineer, Cloudera
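A footnote on the sizing math in the thread: as Ameya suspected, a 2 GB max region size caps individual regions, not the column family, because the table keeps splitting into new regions as it grows. What the setting really controls is how many regions the load fans out into. A rough sketch of that arithmetic, using the thread's own numbers (~1.5 TB of data, 10 region servers; treat the figures as illustrative):

```python
# Back-of-the-envelope region math for the load discussed in this thread.
# Assumptions: ~1.5 TB of data and 10 region servers, both taken from the emails.

def regions_needed(total_gb, region_size_gb):
    # Ceiling division: a partially filled last region is still a region.
    return -(-total_gb // region_size_gb)

data_gb = 1536   # ~1.5 TB
servers = 10

for region_gb in (2, 10):
    regions = regions_needed(data_gb, region_gb)
    print(f"{region_gb:>2} GB regions -> {regions} regions, "
          f"~{regions // servers} per server")
```

At 2 GB per region the fully loaded table ends up with hundreds of regions spread over the cluster, which is exactly Ameya's point: the cap limits region size, not total CF size.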

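Kevin's first question, about presplitting, is also worth a sketch: a freshly created table is a single region, so a heavy MR load funnels into one region server until enough splits happen. One crude way to pick split points when row keys are email addresses (as userTable1's appear to be) is to spread one-letter boundaries across the alphabet. The function and boundaries below are hypothetical; production split points should come from sampling real keys:

```python
import string

def split_points(num_regions):
    """Return num_regions - 1 one-letter split keys spread over a-z.

    A crude heuristic for email-address row keys; sample your actual
    key distribution for anything production-grade.
    """
    letters = string.ascii_lowercase
    step = len(letters) / num_regions
    return [letters[int(i * step)] for i in range(1, num_regions)]

# For example, 4 regions need 3 split points:
print(split_points(4))  # -> ['g', 'n', 't']
```

In the HBase shell these feed straight into a pre-split create, e.g. create 'userTable1', 'cf1', {SPLITS => ['g', 'n', 't']} (the family name 'cf1' is assumed; the thread never names it).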