Hi,

Please file a JIRA so that we can discuss and resolve it. Thanks.

Regards,
Ram
> -----Original Message-----
> From: NNever [mailto:[email protected]]
> Sent: Thursday, June 07, 2012 8:26 AM
> To: [email protected]
> Subject: Re: Region autoSplit when not reach 'hbase.hregion.max.filesize' ?
>
> On 0.94.0, in class RegionSplitPolicy, I saw that
> IncreasingToUpperBoundRegionSplitPolicy is used as
> DEFAULT_SPLIT_POLICY_CLASS, but the javadoc says the default policy is
> ConstantSizeRegionSplitPolicy.
>
> So is DEFAULT_SPLIT_POLICY_CLASS wrong, or has the javadoc simply not
> been updated yet?
>
> Yours,
> NN
>
> 2012/6/7 NNever <[email protected]>
>
> > So IncreasingToUpperBoundRegionSplitPolicy splits when the size
> > reaches (number of regions of the table)^2 * flush size, until that
> > reaches the max file size.
> > We didn't configure a split policy; will HBase 0.94 use
> > IncreasingToUpperBoundRegionSplitPolicy as the default?
> >
> > 2012/6/7 NNever <[email protected]>
> >
> >> Finally I changed the log4j configuration and tried again, and the
> >> split log shows up:
> >>
> >> 2012-06-07 10:30:52,161 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.0m/134221272, currentsize=1.5m/1617744 for region FileStructIndex,,1339032525500.7b229abcd0785408251a579e9bdf49c8. in 3201ms, sequenceid=176387980, compaction requested=false
> >> 2012-06-07 10:30:52,161 DEBUG org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy: ShouldSplit because info size=138657416, sizeToCheck=134217728, regionsWithCommonTable=1
> >> 2012-06-07 10:30:52,161 DEBUG org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy: ShouldSplit because info size=138657416, sizeToCheck=134217728, regionsWithCommonTable=1
> >> 2012-06-07 10:30:52,240 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for FileStructIndex,,1339032525500.7b229abcd0785408251a579e9bdf49c8.. compaction_queue=(0:0), split_queue=0
> >> 2012-06-07 10:30:52,265 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region FileStructIndex,,1339032525500.7b229abcd0785408251a579e9bdf49c8.
> >> 2012-06-07 10:30:52,265 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction: regionserver:60020-0x137c4929efe0001 Creating ephemeral node for 7b229abcd0785408251a579e9bdf49c8 in SPLITTING state
> >> 2012-06-07 10:30:52,368 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x137c4929efe0001 Attempting to transition node 7b229abcd0785408251a579e9bdf49c8 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING
> >> 2012-06-07 10:30:52,382 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x137c4929efe0001 Successfully transitioned node 7b229abcd0785408251a579e9bdf49c8 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING
> >> 2012-06-07 10:30:52,410 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing FileStructIndex,,1339032525500.7b229abcd0785408251a579e9bdf49c8.: disabling compactions & flushes
> >> 2012-06-07 10:30:52,410 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; FileStructIndex,,1339032525500.7b229abcd0785408251a579e9bdf49c8. is closing
> >> 2012-06-07 10:30:52,411 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; FileStructIndex,,1339032525500.7b229abcd0785408251a579e9bdf49c8. is closing
> >>
> >> Best regards,
> >> NN
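The DEBUG lines above show where the threshold comes from: with one region of the table on the server and a 128 MB memstore flush size, sizeToCheck is 134217728 bytes. Below is a minimal Java sketch of that calculation, based only on the formula stated in this thread ((regions of the table on this server) squared times hbase.hregion.memstore.flush.size, capped at hbase.hregion.max.filesize); it is not the actual HBase source, and the 128 MB / 100 GB figures are just the values used in this setup.

    // Sketch of the split-trigger size as described in this thread; not HBase code.
    public class SplitSizeSketch {

        // regionCount: regions of the same table hosted on this region server
        // flushSizeBytes: hbase.hregion.memstore.flush.size
        // maxFileSizeBytes: hbase.hregion.max.filesize
        static long sizeToCheck(int regionCount, long flushSizeBytes, long maxFileSizeBytes) {
            long increasing = (long) regionCount * regionCount * flushSizeBytes;
            return Math.min(maxFileSizeBytes, increasing);
        }

        public static void main(String[] args) {
            long flush = 128L * 1024 * 1024;        // 128 MB flush size (default)
            long max   = 100L * 1024 * 1024 * 1024; // 100 GB, as set in this thread
            // One region: threshold is a single flush size, matching the
            // "sizeToCheck=134217728" in the DEBUG log above.
            System.out.println(sizeToCheck(1, flush, max)); // 134217728
            System.out.println(sizeToCheck(2, flush, max)); // 536870912 (512 MB)
        }
    }

This is why a table with only one region splits at roughly one flush size (~128 MB) even though hbase.hregion.max.filesize is 100 GB.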
> >>
> >> 2012/6/7 NNever <[email protected]>
> >>
> >>> We use HBase 0.94.0, running on a single machine.
> >>> The hbase-site.xml:
> >>>
> >>>> <configuration>
> >>>>   <property>
> >>>>     <name>hbase.rootdir</name>
> >>>>     <value>hdfs://xxxxx/hbase</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.cluster.distributed</name>
> >>>>     <value>true</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.quorum</name>
> >>>>     <value>xxxxx</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.zookeeper.property.dataDir</name>
> >>>>     <value>/mybk/zookeeper</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.hregion.max.filesize</name>
> >>>>     <value>107374182400</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>zookeeper.session.timeout</name>
> >>>>     <value>60000</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.regionserver.handler.count</name>
> >>>>     <value>4000</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.client.write.buffer</name>
> >>>>     <value>1048576</value>
> >>>>   </property>
> >>>>   <property>
> >>>>     <name>hbase.client.scanner.caching</name>
> >>>>     <value>10</value>
> >>>>   </property>
> >>>> </configuration>
> >>>
> >>> 2012/6/7 NNever <[email protected]>
> >>>
> >>>> The rowkey is just like a UUID, with no ordering. And there is a
> >>>> coprocessor that writes data into another 2 index tables on each put...
> >>>>
> >>>> Thanks, yours
> >>>> NN
> >>>>
> >>>> 2012/6/7 Michael Segel <[email protected]>
> >>>>
> >>>>> Just out of curiosity, can you describe the data? Is it sorted?
> >>>>> The more we know, the easier it is to help... Also, can you recheck
> >>>>> your math?
> >>>>>
> >>>>> Sent from my iPhone
> >>>>>
> >>>>> On Jun 6, 2012, at 6:17 PM, "NNever" <[email protected]> wrote:
> >>>>>
> >>>>> > It happened again. I truncated the table and put about 10 million
> >>>>> > rows into it last night.
> >>>>> > The table auto-split into 4 regions, each with about 3 GB
> >>>>> > storefileUncompressedSize.
> >>>>> >
> >>>>> > I grepped the .log and .out files but found nothing about the split.
> >>>>> >
> >>>>> > The logs are as below:
> >>>>> > 2012-06-06 19:31:15,402 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":10296,"call":"next(1511657428305700194, 1), rpc version=1, client version=29, methodsFingerPrint=-1508511443","client":"192.168.1.145:46456","starttimems":1338982265104,"queuetimems":0,"class":"HRegionServer","responsesize":6,"method":"next"}
> >>>>> > 2012-06-06 19:31:15,606 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":10842,"call":"next(-2954106234340837837, 1), rpc version=1, client version=29, methodsFingerPrint=-1508511443","client":"192.168.1.145:46456","starttimems":1338982264763,"queuetimems":1,"class":"HRegionServer","responsesize":6,"method":"next"}
> >>>>> > 2012-06-06 19:31:29,795 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":10668,"call":"next(2455689470981850756, 1), rpc version=1, client version=29, methodsFingerPrint=-1508511443","client":"192.168.1.145:46456","starttimems":1338982279126,"queuetimems":0,"class":"HRegionServer","responsesize":6,"method":"next"}
> >>>>> > 2012-06-06 20:24:54,157 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":2920400,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@6b39de40), rpc version=1, client version=29, methodsFingerPrint=-1508511443","client":"192.168.1.145:46456","starttimems":1338982573756,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}
> >>>>> > 2012-06-06 20:24:54,251 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@6b39de40), rpc version=1, client version=29, methodsFingerPrint=-1508511443 from 192.168.1.145:46456: output error
> >>>>> > 2012-06-06 20:24:54,294 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2159 on 60020 caught a ClosedChannelException, this means that the server was processing a request but the client went away. The error message was: null
> >>>>> > 2012-06-06 20:25:00,868 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":2927114,"call":"multi(org.apache.hado
> >>>>> >
> >>>>> > You can see that at 19:31:29 the log stops for about 1 hour; it may
> >>>>> > have been doing the split there (this is the regionserver .log file).
> >>>>> > In the regionserver .out file I cannot find any information about the
> >>>>> > split either, only lots of org.apache.hadoop.hbase.NotServingRegionException
> >>>>> > during the split. There is no log saying a split started, or why.
> >>>>> >
> >>>>> > The logs are too large to upload anywhere.
> >>>>> >
> >>>>> > I'll dig into it... it really confuses me...
> >>>>> >
> >>>>> > Thanks, yours
> >>>>> > NN
> >>>>> >
> >>>>> > 2012/6/6 NNever <[email protected]>
> >>>>> >
> >>>>> >> I will. I changed the log level.
> >>>>> >> Putting data in and waiting for the strange split now :)...
> >>>>> >>
> >>>>> >> Yours,
> >>>>> >> NN
> >>>>> >>
> >>>>> >> 2012/6/6 dong.yajun <[email protected]>
> >>>>> >>
> >>>>> >>> Hi NNever,
> >>>>> >>>
> >>>>> >>> If you find any issues, please let us know, thanks.
> >>>>> >>>
> >>>>> >>> On Wed, Jun 6, 2012 at 5:09 PM, NNever <[email protected]> wrote:
> >>>>> >>>
> >>>>> >>>> I'm sorry, the log4j level is currently WARN, not INFO.
> >>>>> >>>>
> >>>>> >>>> 2012/6/6 NNever <[email protected]>
> >>>>> >>>>
> >>>>> >>>>> We currently run in INFO mode.
> >>>>> >>>>> It actually did split, but I cannot find any logs about that split.
> >>>>> >>>>> I will change the log4j level to DEBUG; if I get anything valuable,
> >>>>> >>>>> I will paste it here...
> >>>>> >>>>>
> >>>>> >>>>> Thanks Ram,
> >>>>> >>>>> NN
> >>>>> >>>>>
> >>>>> >>>>> 2012/6/6 Ramkrishna.S.Vasudevan <[email protected]>
> >>>>> >>>>>
> >>>>> >>>>>> Do you have any logs corresponding to this?
> >>>>> >>>>>>
> >>>>> >>>>>> Regards
> >>>>> >>>>>> Ram
> >>>>> >>>>>>
> >>>>> >>>>>>> -----Original Message-----
> >>>>> >>>>>>> From: NNever [mailto:[email protected]]
> >>>>> >>>>>>> Sent: Wednesday, June 06, 2012 2:12 PM
> >>>>> >>>>>>> To: [email protected]
> >>>>> >>>>>>> Subject: Region autoSplit when not reach 'hbase.hregion.max.filesize' ?
> >>>>> >>>>>>>
> >>>>> >>>>>>> 'hbase.hregion.max.filesize' is set to 100G (the value recommended
> >>>>> >>>>>>> for effectively turning auto-split off). There is a table that we
> >>>>> >>>>>>> keep putting data into. When its storefileUncompressedSizeMB
> >>>>> >>>>>>> reached about 1 GB, the region auto-split into 2.
> >>>>> >>>>>>> I don't understand how that happened: 1 GB is far less than the
> >>>>> >>>>>>> 100 GB max.filesize. Is there any scenario in which
> >>>>> >>>>>>> hbase.hregion.max.filesize is ignored and a split is done anyway?
> >>>>> >>>>>>>
> >>>>> >>>>>>> How can I completely turn off auto-split?
> >>>>> >>>>>>>
> >>>>> >>>>>>> -----------------
> >>>>> >>>>>>> Best regards,
> >>>>> >>>>>>> NN
> >>>>> >>>
> >>>>> >>> --
> >>>>> >>> *Ric Dong*
> >>>>> >>> Newegg Ecommerce, MIS department
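To address the original question at the bottom of the thread (how to keep regions from splitting long before hbase.hregion.max.filesize on 0.94): the usual approach is to keep max.filesize large and pin the split policy back to ConstantSizeRegionSplitPolicy, either cluster-wide via the hbase.regionserver.region.split.policy property in hbase-site.xml or per table on the table descriptor. The sketch below is only an illustration under those assumptions; the per-table "SPLIT_POLICY" attribute name and the "cf" column family are assumptions for the example, not taken from this thread, so verify against your exact 0.94 build before relying on it.

    // Hedged sketch: two ways to pin the split policy back to
    // ConstantSizeRegionSplitPolicy on an assumed 0.94 setup.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class DisableAutoSplitSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();

            // Cluster-wide default: this key is normally set in hbase-site.xml on
            // the region servers; setting it on the client, as here, only
            // documents the property name and value.
            conf.set("hbase.regionserver.region.split.policy",
                     "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy");

            // Per table: pin the policy on the table descriptor at creation time.
            // "SPLIT_POLICY" is the attribute name assumed to be read by the 0.94
            // RegionSplitPolicy; "FileStructIndex" is the table from this thread
            // and "cf" is a placeholder column family.
            HTableDescriptor htd = new HTableDescriptor("FileStructIndex");
            htd.addFamily(new HColumnDescriptor("cf"));
            htd.setValue("SPLIT_POLICY",
                         "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy");

            HBaseAdmin admin = new HBaseAdmin(conf);
            admin.createTable(htd);
            admin.close();
        }
    }

With ConstantSizeRegionSplitPolicy in effect, only hbase.hregion.max.filesize (100 GB here) drives splits, which is the pre-0.94 behaviour the original poster expected.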
