Hey, @JM: Thanks for the hint about hbase.regionserver.fileSplitTimeout. At the moment the import is running, and after I split the specific regions manually, we no longer have growing regions.
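For reference, this is roughly how it looked on our side. The timeout was raised temporarily in hbase-site.xml (120000 ms is just an example value, not necessarily what we settled on; the default is 30000 ms):

  <property>
    <name>hbase.regionserver.fileSplitTimeout</name>
    <!-- Raised temporarily for the import; revert to the
         30000 ms default afterwards, as JM suggested. -->
    <value>120000</value>
  </property>

The manual splits were done from the HBase shell along these lines (the explicit split key below is only a placeholder, not a real key from our cluster):

  # Let HBase pick the split point itself:
  split 'data_1091'

  # Or split at an explicit row key (placeholder value):
  split 'data_1091', '1386851456415'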
hbase hbck says everything is fine: 0 inconsistencies detected. Status: OK

@Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU The relevant table name is data_1091. Thanks for your time.

Timo

On 13.12.2013, at 20:18, Ted Yu <[email protected]> wrote:

> Timo:
> Can you pastebin the regionserver log around 2013-12-12 13:54:20 so that we
> can see what happened?
>
> Thanks
>
>
> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
> [email protected]> wrote:
>
>> Try increasing hbase.regionserver.fileSplitTimeout, but put it back to its
>> default value afterwards.
>>
>> The default value is 30 seconds. I think it's not normal for a split to
>> take longer than that.
>>
>> What is your hardware configuration?
>>
>> Have you run hbck to see if everything is correct?
>>
>> JM
>>
>>
>> 2013/12/13 Timo Schaepe <[email protected]>
>>
>>> Hello again,
>>>
>>> Digging in the logs of the specific regionserver shows me this:
>>>
>>> 2013-12-12 13:54:20,194 INFO
>>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup
>>> of failed split of
>>> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
>>> Took too long to split the files and create the references, aborting split
>>>
>>> This message appears twice, so it seems that HBase tried to split the
>>> region but failed. I don't know why. How does HBase behave if a region
>>> split fails? Does it retry splitting the region? I didn't find any
>>> retries in the log. I have now split the big regions manually, and that
>>> works. It also seems that HBase splits the new regions again to bring
>>> them down to the configured limit.
>>>
>>> It is also a mystery to me why the split size in Hannibal shows 10 GB
>>> while in hbase-site.xml I set 2 GB…
>>>
>>> Thanks,
>>>
>>> Timo
>>>
>>>
>>> On 13.12.2013, at 10:22, Timo Schaepe <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> While loading data into our cluster I noticed some strange behavior of
>>>> some regions that I don't understand.
>>>>
>>>> Scenario:
>>>> We convert data from a MySQL database to HBase. The data is inserted
>>>> with a Put into the specific HBase table. The row key is a timestamp. I
>>>> know the problem with timestamp keys, but for our requirements it works
>>>> quite well. The problem now is that there are some regions which keep
>>>> growing and growing.
>>>>
>>>> Take, for example, the table in the picture [1]. At first, all data was
>>>> distributed over regions and nodes. Now the data is written into only
>>>> one region, which keeps growing, and I can see no splitting at all.
>>>> Currently the size of the big region is nearly 60 GB.
>>>>
>>>> The HBase version is 0.94.11. I cannot understand why the splitting is
>>>> not happening. In hbase-site.xml I limit hbase.hregion.max.filesize to
>>>> 2 GB, and HBase accepted this value:
>>>>
>>>> <property>
>>>>   <!--Loaded from hbase-site.xml-->
>>>>   <name>hbase.hregion.max.filesize</name>
>>>>   <value>2147483648</value>
>>>> </property>
>>>>
>>>> First mystery: Hannibal shows me the split size is 10 GB (see
>>>> screenshot).
>>>> Second mystery: HBase is not splitting some regions, neither at 2 GB
>>>> nor at 10 GB.
>>>>
>>>> Any ideas? Could the timestamp row key cause this problem?
>>>>
>>>> Thanks,
>>>>
>>>> Timo
>>>>
>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
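PS: Regarding the first mystery, one thing I still want to rule out is a per-table MAX_FILESIZE in the table descriptor, since a value set there overrides the 2 GB from hbase-site.xml (and 10 GB would match what Hannibal shows). Roughly, from the HBase shell (just a sketch, assuming the table is data_1091):

  # Check whether the table descriptor carries its own MAX_FILESIZE:
  describe 'data_1091'

  # If it does, bring it in line with the 2 GB site-wide limit:
  disable 'data_1091'
  alter 'data_1091', MAX_FILESIZE => '2147483648'
  enable 'data_1091'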
