Timo:
I went through the namenode log and didn't find much of a clue.

Cheers
On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe <[email protected]> wrote:

> Hey Ted Yu,
>
> I have been digging through the namenode log and so far I've found nothing
> special. No exceptions, FATAL or ERROR messages, nor any other
> peculiarities. I only see a lot of messages like this:
>
> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing
> lease on
> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
> from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR*
> completeFile:
> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
> is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
>
> But maybe that is normal. If you wanna have a look, you can find the log
> snippet at
> https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
>
> Thanks,
>
> Timo
>
> On 14.12.2013 at 09:12, Ted Yu <[email protected]> wrote:
>
> > Timo:
> > Other than two occurrences of 'Took too long to split the files'
> > @ 13:54:20,194 and 13:55:10,533, I don't find much of a clue in the
> > posted log.
> >
> > If you have time, mind checking the namenode log for the 1-minute
> > interval leading up to 13:54:20,194 and 13:55:10,533, respectively?
> >
> > Thanks
> >
> > On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <[email protected]> wrote:
> >
> >> Hey,
> >>
> >> @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At
> >> the moment (the import is actually running), and after I split the
> >> specific regions manually, we do not have growing regions anymore.
> >>
> >> hbase hbck says everything is fine:
> >> 0 inconsistencies detected.
> >> Status: OK
> >>
> >> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
> >> The relevant table name is data_1091.
> >>
> >> Thanks for your time.
> >>
> >> Timo
> >>
> >> On 13.12.2013 at 20:18, Ted Yu <[email protected]> wrote:
> >>
> >>> Timo:
> >>> Can you pastebin the regionserver log around 2013-12-12 13:54:20 so
> >>> that we can see what happened?
> >>>
> >>> Thanks
> >>>
> >>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
> >>> [email protected]> wrote:
> >>>
> >>>> Try to increase hbase.regionserver.fileSplitTimeout, but put it back
> >>>> to its default value afterwards.
> >>>>
> >>>> The default value is 30 seconds. I don't think it's normal for a
> >>>> split to take longer than that.
> >>>>
> >>>> What is your hardware configuration?
> >>>>
> >>>> Have you run hbck to see if everything is correct?
> >>>>
> >>>> JM
> >>>>
> >>>> 2013/12/13 Timo Schaepe <[email protected]>
> >>>>
> >>>>> Hello again,
> >>>>>
> >>>>> Digging in the logs of the specific regionserver shows me this:
> >>>>>
> >>>>> 2013-12-12 13:54:20,194 INFO
> >>>>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running
> >>>>> rollback/cleanup of failed split of
> >>>>> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
> >>>>> Took too long to split the files and create the references,
> >>>>> aborting split
> >>>>>
> >>>>> This message appears two times, so it seems that HBase tried to
> >>>>> split the region but failed. I don't know why. How does HBase
> >>>>> behave if a region split fails? Does it retry splitting the region?
> >>>>> I didn't find any new attempts in the log. Now I have split the big
> >>>>> regions manually and this works. It also seems that HBase splits
> >>>>> the new regions again to crunch them down to the given limit.
> >>>>>
> >>>>> But it is also a mystery to me why Hannibal shows me a split size
> >>>>> of 10 GB while I put 2 GB in hbase-site.xml…
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Timo
> >>>>>
> >>>>> On 13.12.2013 at 10:22, Timo Schaepe <[email protected]> wrote:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> During the loading of data into our cluster I noticed some strange
> >>>>>> behavior of some regions that I don't understand.
> >>>>>>
> >>>>>> Scenario:
> >>>>>> We convert data from a MySQL database to HBase. The data is
> >>>>>> inserted with a put to the specific HBase table. The row key is a
> >>>>>> timestamp. I know the problem with timestamp keys, but for our
> >>>>>> requirements it works quite well. The problem now is that some
> >>>>>> regions keep growing and growing.
> >>>>>>
> >>>>>> For example, the table in the picture [1]. First, all data was
> >>>>>> distributed over regions and nodes. And now the data is written
> >>>>>> into only one region, which keeps growing, and I can see no
> >>>>>> splitting at all. Currently the size of the big region is nearly
> >>>>>> 60 GB.
> >>>>>>
> >>>>>> The HBase version is 0.94.11. I cannot understand why the
> >>>>>> splitting is not happening. In hbase-site.xml I limited
> >>>>>> hbase.hregion.max.filesize to 2 GB and HBase accepted this value:
> >>>>>>
> >>>>>> <property>
> >>>>>>   <!--Loaded from hbase-site.xml-->
> >>>>>>   <name>hbase.hregion.max.filesize</name>
> >>>>>>   <value>2147483648</value>
> >>>>>> </property>
> >>>>>>
> >>>>>> First mystery: Hannibal shows me the split size is 10 GB (see
> >>>>>> screenshot).
> >>>>>> Second mystery: HBase is not splitting some regions, neither at
> >>>>>> 2 GB nor at 10 GB.
> >>>>>>
> >>>>>> Any ideas? Could the timestamp row key be causing this problem?
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Timo
> >>>>>>
> >>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
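
For reference, JM's hbase.regionserver.fileSplitTimeout suggestion maps to an
hbase-site.xml fragment like the one below, mirroring the property block Timo
posted. The 120000 ms value is illustrative, not from the thread; the idea is
to give the split daughters time to be created, then revert to the default.

    <property>
      <!-- Illustrative value: 120000 ms = 2 minutes. The default is the
           30 seconds JM mentions; revert once the oversized regions
           have finished splitting. -->
      <name>hbase.regionserver.fileSplitTimeout</name>
      <value>120000</value>
    </property>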
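
The manual-split workaround Timo describes can be driven from the HBase shell.
The split point below is a placeholder, not a key from his table:

    split 'data_1091'                     # request a split of every splittable region of the table
    split 'data_1091', 'PLACEHOLDER_KEY'  # or split one region at an explicit row key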
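
On the timestamp row key question: monotonically increasing keys direct every
new write to the region holding the highest keys, which is consistent with the
single growing region Timo observed. A minimal Java sketch of the usual
mitigation, salting the key with a small bucket prefix; the class name, bucket
count, and column names are illustrative assumptions, not from the thread:

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SaltedTimestampKey {
        // Assumption: the table is pre-split into 16 salt buckets.
        static final int BUCKETS = 16;

        // Prefix the timestamp with a deterministic one-byte salt so
        // consecutive timestamps land in different regions instead of
        // all hitting the region with the highest keys.
        static byte[] saltedKey(long timestampMillis) {
            byte salt = (byte) (timestampMillis % BUCKETS);
            return Bytes.add(new byte[] { salt }, Bytes.toBytes(timestampMillis));
        }

        static Put buildPut(long timestampMillis) {
            Put put = new Put(saltedKey(timestampMillis));
            // 0.94-era API; placeholder family/qualifier/value.
            put.add(Bytes.toBytes("l"), Bytes.toBytes("q"), Bytes.toBytes("v"));
            return put;
        }
    }

The trade-off is on the read side: a time-range scan must then fan out across
all BUCKETS prefixes, which is the price paid for spreading the write load.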
