@Ted Yu: Yep, thanks a lot nevertheless!
On 18.12.2013 at 10:03, Ted Yu <[email protected]> wrote:

> Timo:
> I went through the namenode log and didn't find much of a clue.
>
> Cheers
>
>
> On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe <[email protected]> wrote:
>
>> Hey Ted Yu,
>>
>> I have been digging through the namenode log and so far I've found
>> nothing special: no exceptions, no FATAL or ERROR messages, nor any
>> other peculiarities. I only see a lot of messages like this:
>>
>> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7 from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
>> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7 is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
>>
>> But maybe that is normal. If you want to have a look, you can find the
>> log snippet at
>> https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
>>
>> Thanks,
>>
>> Timo
>>
>>
>> On 14.12.2013 at 09:12, Ted Yu <[email protected]> wrote:
>>
>>> Timo:
>>> Other than two occurrences of 'Took too long to split the files'
>>> at 13:54:20,194 and 13:55:10,533, I don't find much of a clue in the
>>> posted log.
>>>
>>> If you have time, would you mind checking the namenode log for the
>>> one-minute interval leading up to 13:54:20,194 and 13:55:10,533,
>>> respectively?
>>>
>>> Thanks
>>>
>>>
>>> On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <[email protected]> wrote:
>>>
>>>> Hey,
>>>>
>>>> @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout.
>>>> At the moment (the import is actually working), and after I split the
>>>> specific regions manually, we do not have growing regions anymore.
>>>>
>>>> hbase hbck says all is fine:
>>>> 0 inconsistencies detected.
>>>> Status: OK
>>>>
>>>> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
>>>> The relevant table name is data_1091.
>>>>
>>>> Thanks for your time.
>>>>
>>>> Timo
>>>>
>>>>
>>>> On 13.12.2013 at 20:18, Ted Yu <[email protected]> wrote:
>>>>
>>>>> Timo:
>>>>> Can you pastebin the regionserver log around 2013-12-12 13:54:20 so
>>>>> that we can see what happened?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <[email protected]> wrote:
>>>>>
>>>>>> Try increasing hbase.regionserver.fileSplitTimeout, but put it back
>>>>>> to its default value afterwards.
>>>>>>
>>>>>> The default value is 30 seconds. I think it's not normal for a split
>>>>>> to take more than that.
>>>>>>
>>>>>> What is your hardware configuration?
>>>>>>
>>>>>> Have you run hbck to see if everything is correct?
>>>>>>
>>>>>> JM
>>>>>>
>>>>>>
>>>>>> 2013/12/13 Timo Schaepe <[email protected]>
>>>>>>
>>>>>>> Hello again,
>>>>>>>
>>>>>>> Digging in the logs of the specific regionserver shows me this:
>>>>>>>
>>>>>>> 2013-12-12 13:54:20,194 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.; Took too long to split the files and create the references, aborting split
>>>>>>>
>>>>>>> This message appears two times, so it seems that HBase tried to
>>>>>>> split the region but failed. I don't know why. How does HBase
>>>>>>> behave if a region split fails? Does it retry the split? I didn't
>>>>>>> find any new attempts in the log. Now I have split the big regions
>>>>>>> manually, and this works. It also seems that HBase splits the new
>>>>>>> regions again to crunch them down to the given limit.
>>>>>>>
>>>>>>> But it is also a mystery to me why Hannibal shows a split size of
>>>>>>> 10 GB while in hbase-site.xml I set 2 GB…
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Timo
>>>>>>>
>>>>>>>
>>>>>>> On 13.12.2013 at 10:22, Timo Schaepe <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> During the loading of data into our cluster I noticed some strange
>>>>>>>> behavior of some regions that I don't understand.
>>>>>>>>
>>>>>>>> Scenario:
>>>>>>>> We convert data from a MySQL database to HBase. The data is
>>>>>>>> inserted with a put to the specific HBase table. The row key is a
>>>>>>>> timestamp. I know the problem with timestamp keys, but for our
>>>>>>>> requirements it works quite well. The problem now is that there
>>>>>>>> are some regions which keep growing and growing.
>>>>>>>>
>>>>>>>> For example, the table in the picture [1]. At first, all data was
>>>>>>>> distributed over regions and nodes. Now the data is written into
>>>>>>>> only one region, which keeps growing, and I can see no splitting
>>>>>>>> at all. Currently the size of the big region is nearly 60 GB.
>>>>>>>>
>>>>>>>> The HBase version is 0.94.11. I cannot understand why the
>>>>>>>> splitting is not happening. In hbase-site.xml I limit
>>>>>>>> hbase.hregion.max.filesize to 2 GB, and HBase accepted this value:
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>   <!--Loaded from hbase-site.xml-->
>>>>>>>>   <name>hbase.hregion.max.filesize</name>
>>>>>>>>   <value>2147483648</value>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>> First mystery: Hannibal shows me the split size is 10 GB (see
>>>>>>>> screenshot).
>>>>>>>> Second mystery: HBase is not splitting some regions, neither at
>>>>>>>> 2 GB nor at 10 GB.
>>>>>>>>
>>>>>>>> Any ideas? Could the timestamp rowkey be causing this problem?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Timo
>>>>>>>>
>>>>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
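On the timestamp-rowkey question raised in the thread: a monotonically increasing key funnels every write into the region holding the highest keys, which matches the symptom of one region growing to 60 GB while the rest stay idle. A common mitigation is to prefix the key with a salt bucket so writes spread over several regions. The sketch below is not from the thread; the bucket count and key layout are illustrative choices.

```java
// Sketch: prefix a monotonically increasing timestamp row key with a
// salt bucket so puts are spread over several regions instead of always
// hitting the region with the highest keys.
// BUCKETS = 8 and the "<bucket>-<timestamp>" layout are illustrative.
public class SaltedKey {
    static final int BUCKETS = 8;

    // Deterministic salt: the same timestamp always maps to the same bucket,
    // so the key can be reconstructed for point reads.
    static String saltedRowKey(long timestampMillis) {
        int bucket = (int) (timestampMillis % BUCKETS);
        return bucket + "-" + timestampMillis;
    }

    public static void main(String[] args) {
        // e.g. the region-name timestamp from the split log line above
        System.out.println(saltedRowKey(1386851456415L)); // prints 7-1386851456415
    }
}
```

The trade-off is on the read side: a time-range scan must then fan out over all buckets and merge the results, so this only pays off when write hotspotting is the dominant problem, as it appears to be here.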