[ https://issues.apache.org/jira/browse/HBASE-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15391939#comment-15391939 ]
wangyongqiang commented on HBASE-16282:
---------------------------------------

{quote}
2016-07-25 08:25:11,257 INFO [regionserver60020-splits-1466239518933] regionserver.SplitRequest: Running rollback/cleanup of failed split of ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.; Took too long to split the files and create the references, aborting split
{quote}

1. Check whether there are many HFiles in region crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.
2. This problem was fixed in 0.98.14 by HBASE-13959.

> java.io.IOException: Took too long to split the files and create the references, aborting split
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16282
>                 URL: https://issues.apache.org/jira/browse/HBASE-16282
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 0.98.8
>            Reporter: dcswinner
>
> Recently, I found some exceptions in my HBase cluster when some regions are splitting. In the regionserver node logs, the exception log looks like this:
> 2016-07-25 08:24:30,502 INFO [regionserver60020-splits-1466239518933] regionserver.SplitTransaction: Starting split of region ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.
> 2016-07-25 08:24:30,938 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Started memstore flush for ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa., current region memstore size 28.0 K
> 2016-07-25 08:24:36,137 INFO [regionserver60020-splits-1466239518933] regionserver.DefaultStoreFlusher: Flushed, sequenceid=15546530, memsize=28.0 K, hasBloomFilter=true, into tmp file hdfs://suninghadoop2/hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.tmp/6ee8bb3e4c0a4af591f94a163b272f5f
> 2016-07-25 08:24:36,590 INFO [regionserver60020-splits-1466239518933] regionserver.HStore: Added hdfs://suninghadoop2/hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/noverison/6ee8bb3e4c0a4af591f94a163b272f5f, entries=24, sequenceid=15546530, filesize=25.9 K
> 2016-07-25 08:24:36,591 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Finished memstore flush of ~28.0 K/28624, currentsize=0/0 for region ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa. in 5652ms, sequenceid=15546530, compaction requested=true
> 2016-07-25 08:24:36,647 INFO [StoreCloserThread-ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.-1] regionserver.HStore: Closed noverison
> 2016-07-25 08:24:36,647 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Closed ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.
> 2016-07-25 08:24:43,264 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
> 2016-07-25 08:24:47,842 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
> 2016-07-25 08:24:47,842 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
> 2016-07-25 08:24:55,334 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
> 2016-07-25 08:25:11,257 INFO [regionserver60020-splits-1466239518933] regionserver.SplitRequest: Running rollback/cleanup of failed split of ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.; Took too long to split the files and create the references, aborting split
> java.io.IOException: Took too long to split the files and create the references, aborting split
>         at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:825)
>         at org.apache.hadoop.hbase.regionserver.SplitTransaction.stepsBeforePONR(SplitTransaction.java:429)
>         at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:303)
>         at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:655)
>         at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> =========================================================
> And in the master node logs, the exception log looks like this:
> 2016-07-25 08:24:30,504 INFO [AM.ZK.Worker-pool2-t501] master.RegionStates: Transition null to {e142341c56805aed68d3f99bae3e14f3 state=SPLITTING_NEW, ts=1469406270504, server=slave77-prd3.cnsuning.com,60020,1466236968700}
> 2016-07-25 08:24:30,504 INFO [AM.ZK.Worker-pool2-t501] master.RegionStates: Transition null to {fb78cda7e8fbf0cb12e9c0407626f7a6 state=SPLITTING_NEW, ts=1469406270504, server=slave77-prd3.cnsuning.com,60020,1466236968700}
> 2016-07-25 08:24:30,504 INFO [AM.ZK.Worker-pool2-t501] master.RegionStates: Transition {b318fc37c2aac4705007200cc454e7fa state=OPEN, ts=1469320084942, server=slave77-prd3.cnsuning.com,60020,1466236968700} to {b318fc37c2aac4705007200cc454e7fa state=SPLITTING, ts=1469406270504, server=slave77-prd3.cnsuning.com,60020,1466236968700}
> 2016-07-25 08:25:17,728 INFO [AM.ZK.Worker-pool2-t503] master.RegionStates: Transition {b318fc37c2aac4705007200cc454e7fa state=SPLITTING, ts=1469406270506, server=slave77-prd3.cnsuning.com,60020,1466236968700} to {b318fc37c2aac4705007200cc454e7fa state=OPEN, ts=1469406317728, server=slave77-prd3.cnsuning.com,60020,1466236968700}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
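
The timestamps in the regionserver log quoted above give a rough timeline of the failed split. The following is a minimal sketch, not part of the original report, that computes the elapsed time between a few of those log events. The event labels and the helper names (`EVENTS`, `parse`, `elapsed_seconds`) are illustrative assumptions; only the timestamps come from the log.

```python
from datetime import datetime

# Timestamps copied from the regionserver log quoted in this message.
# The labels are informal descriptions, not HBase terminology.
EVENTS = [
    ("split started", "2016-07-25 08:24:30,502"),
    ("memstore flush started", "2016-07-25 08:24:30,938"),
    ("region closed", "2016-07-25 08:24:36,647"),
    ("first 'Could not complete' retry", "2016-07-25 08:24:43,264"),
    ("rollback of failed split", "2016-07-25 08:25:11,257"),
]

# Log timestamps use a comma before the milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse(ts: str) -> datetime:
    """Parse a log timestamp such as '2016-07-25 08:24:30,502'."""
    return datetime.strptime(ts, LOG_TIME_FORMAT)


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    return (parse(end) - parse(start)).total_seconds()


if __name__ == "__main__":
    split_start = EVENTS[0][1]
    for label, ts in EVENTS:
        # Offset of each event from the start of the split attempt.
        print(f"{elapsed_seconds(split_start, ts):8.3f}s  {label}")
```

Run against these timestamps, most of the elapsed time falls after the region was closed, in the phase where the StoreFileSplitter thread logs the DFSClient retries. As far as I recall, that phase is bounded by the `hbase.regionserver.fileSplitTimeout` setting, but treat that as an assumption to verify against your HBase version.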