[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
[ https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053391#comment-16053391 ]

dcswinner commented on HBASE-7404:
----------------------------------

mark

> Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
> ----------------------------------------------------------------------
>
>                 Key: HBASE-7404
>                 URL: https://issues.apache.org/jira/browse/HBASE-7404
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.94.3
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.95.0
>
>         Attachments: 7404-0.94-fixed-lines.txt, 7404-trunk-v10.patch, 7404-trunk-v11.patch, 7404-trunk-v12.patch, 7404-trunk-v13.patch, 7404-trunk-v13.txt, 7404-trunk-v14.patch, BucketCache.pdf, hbase-7404-94v2.patch, HBASE-7404-backport-0.94.patch, hbase-7404-trunkv2.patch, hbase-7404-trunkv9.patch, Introduction of Bucket Cache.pdf
>
> First, thanks to @neil from Fusion-IO for sharing the source code.
> Usage:
> 1. Use bucket cache as the main memory cache, configured as follows:
>    - "hbase.bucketcache.ioengine" "heap" (or "offheap" if using off-heap memory to cache blocks)
>    - "hbase.bucketcache.size" 0.4 (size of the bucket cache; 0.4 is a fraction of the max heap size)
> 2. Use bucket cache as a secondary cache, configured as follows:
>    - "hbase.bucketcache.ioengine" "file:/disk1/hbase/cache.data" (the file path where block data is stored)
>    - "hbase.bucketcache.size" 1024 (size of the bucket cache in MB, so 1024 means 1 GB)
>    - "hbase.bucketcache.combinedcache.enabled" false (default value is true)
> See more configuration options in org.apache.hadoop.hbase.io.hfile.CacheConfig and org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.
> What is Bucket Cache?
> It can greatly reduce CMS pauses and heap fragmentation caused by GC, and it supports a large cache space for high read performance by using high-speed disks such as Fusion-io.
> 1. An implementation of a block cache, like LruBlockCache.
> 2. Manages block storage positions itself through the Bucket Allocator.
> 3. Cached blocks can be stored in memory or on the file system.
> 4. Bucket Cache can be used as the main block cache (see CombinedBlockCache), combined with LruBlockCache, to reduce CMS pauses and fragmentation caused by GC.
> 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io to store blocks) to enlarge the cache space.
> How about SlabCache?
> We studied and tested SlabCache first, but the results were poor, because:
> 1. SlabCache uses SingleSizeCache, so its memory utilization is low when block sizes vary, especially with DataBlockEncoding.
> 2. SlabCache is used in DoubleBlockCache: a block is cached in both SlabCache and LruBlockCache, and is put back into LruBlockCache on a SlabCache hit, so CMS and heap fragmentation do not improve.
> 3. Off-heap (direct memory) performance is not as good as on-heap and may cause OOM, so we recommend the "heap" engine.
> See more in the attachment and in the patch.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
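To make the two setups in the quoted usage section concrete, here is a minimal sketch that sets the same properties programmatically via the Hadoop Configuration API. In practice these settings belong in hbase-site.xml on the region servers; the class name and the idea of setting them on a client-side Configuration are purely illustrative, while the property names and values come from the issue text above.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Illustrative sketch of the two bucket-cache setups described in HBASE-7404.
public class BucketCacheConfigSketch {

    // Setup 1: bucket cache as the main memory cache.
    public static Configuration mainMemoryCache() {
        Configuration conf = HBaseConfiguration.create();
        // Use on-heap memory for the bucket cache ("offheap" for direct memory).
        conf.set("hbase.bucketcache.ioengine", "heap");
        // Values below 1 are a fraction of the max heap size (here 40%).
        conf.setFloat("hbase.bucketcache.size", 0.4f);
        return conf;
    }

    // Setup 2: bucket cache as a secondary cache backed by a fast disk.
    public static Configuration secondaryFileCache() {
        Configuration conf = HBaseConfiguration.create();
        // File path from the example in the issue text.
        conf.set("hbase.bucketcache.ioengine", "file:/disk1/hbase/cache.data");
        // Values of 1 or more are interpreted as megabytes, so 1024 means 1 GB.
        conf.setFloat("hbase.bucketcache.size", 1024f);
        // Disable the combined cache so BucketCache acts as a secondary cache.
        conf.setBoolean("hbase.bucketcache.combinedcache.enabled", false);
        return conf;
    }
}
{code}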
[jira] [Commented] (HBASE-3269) HBase table truncate semantics seems broken as "disable" table is now async by default.
[ https://issues.apache.org/jira/browse/HBASE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500493#comment-15500493 ]

dcswinner commented on HBASE-3269:
----------------------------------

On version 0.98.8, I run truncate_preserve at 04:00 AM every day, and it frequently fails with the error below. The error log is:

truncate_preserve 'ns_ztbd:tb_pis_cshopprice'
Truncating 'ns_ztbd:tb_pis_cshopprice' table (it may take a while):
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bigdata/software/hbase-0.98.8-hadoop2.4/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/software/hadoop-2.4.0.4/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
- Disabling table...
- Truncating table...
- Dropping table...
ERROR: org.apache.hadoop.hbase.TableNotDisabledException: ns_ztbd:tb_pis_cshopprice
    at org.apache.hadoop.hbase.master.HMaster.checkTableModifiable(HMaster.java:2077)
    at org.apache.hadoop.hbase.master.handler.TableEventHandler.prepare(TableEventHandler.java:83)
    at org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:1822)
    at org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:1832)
    at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:41471)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
    at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Here is some help for this command:
Disables, drops and recreates the specified table while still maintaining the previous region boundaries.

> HBase table truncate semantics seems broken as "disable" table is now async by default.
> ----------------------------------------------------------------------------------------
>
>                 Key: HBASE-3269
>                 URL: https://issues.apache.org/jira/browse/HBASE-3269
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>         Environment: RHEL5 x86_64
>            Reporter: Suraj Varma
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.90.0, 0.92.0
>
> The new async design for disabling a table seems to have caused a side effect on the truncate command. (IRC chat with jdcryans)
> Apparent cause:
> "Disable" is now async by default. When truncate is called, the disable operation returns immediately, and when the drop is called the disable operation has still not completed. This results in HMaster.checkTableModifiable() throwing a TableNotDisabledException.
> With earlier versions, disable returned only after the table was disabled.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
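The failure mode quoted above is a drop being issued while the disable is still in flight. A client-side mitigation is to disable the table explicitly and wait until the master actually reports it as disabled before running the destructive step. The sketch below assumes the 0.98-era HBaseAdmin API; the class name, polling interval, and the claim that this helps around truncate_preserve's internal race are assumptions, not part of the issue.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

// Hypothetical helper: disable the table and poll until the master reports it
// disabled, so a subsequent drop/recreate does not race an in-flight disable.
public class DisableThenTruncate {
    public static void disableAndWait(String tableName) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            if (!admin.isTableDisabled(tableName)) {
                admin.disableTable(tableName);
            }
            // Poll until the disable has fully completed on the master side.
            while (!admin.isTableDisabled(tableName)) {
                Thread.sleep(1000);
            }
            // At this point it should be safe to drop and recreate the table
            // (or to run truncate_preserve from the shell).
        } finally {
            admin.close();
        }
    }
}
{code}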
[jira] [Updated] (HBASE-16282) java.io.IOException: Took too long to split the files and create the references, aborting split
[ https://issues.apache.org/jira/browse/HBASE-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dcswinner updated HBASE-16282:
------------------------------
    Description:

Recently I found some exceptions in my HBase cluster when some regions are splitting. In the region server logs, the exception looks like this:

2016-07-25 08:24:30,502 INFO [regionserver60020-splits-1466239518933] regionserver.SplitTransaction: Starting split of region ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.
2016-07-25 08:24:30,938 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Started memstore flush for ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa., current region memstore size 28.0 K
2016-07-25 08:24:36,137 INFO [regionserver60020-splits-1466239518933] regionserver.DefaultStoreFlusher: Flushed, sequenceid=15546530, memsize=28.0 K, hasBloomFilter=true, into tmp file hdfs://suninghadoop2/hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.tmp/6ee8bb3e4c0a4af591f94a163b272f5f
2016-07-25 08:24:36,590 INFO [regionserver60020-splits-1466239518933] regionserver.HStore: Added hdfs://suninghadoop2/hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/noverison/6ee8bb3e4c0a4af591f94a163b272f5f, entries=24, sequenceid=15546530, filesize=25.9 K
2016-07-25 08:24:36,591 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Finished memstore flush of ~28.0 K/28624, currentsize=0/0 for region ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa. in 5652ms, sequenceid=15546530, compaction requested=true
2016-07-25 08:24:36,647 INFO [StoreCloserThread-ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.-1] regionserver.HStore: Closed noverison
2016-07-25 08:24:36,647 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Closed ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.
2016-07-25 08:24:43,264 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
2016-07-25 08:24:47,842 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
2016-07-25 08:24:47,842 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
2016-07-25 08:24:55,334 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
2016-07-25 08:25:11,257 INFO [regionserver60020-splits-1466239518933] regionserver.SplitRequest: Running rollback/cleanup of failed split of ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.; Took too long to split the files and create the references, aborting split
java.io.IOException: Took too long to split the files and create the references, aborting split
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:825)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.stepsBeforePONR(SplitTransaction.java:429)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:303)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:655)
    at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

And in the master node logs, the exception looks like this:

2016-07-25 08:24:30,504 INFO [AM.ZK.Worker-pool2-t501] master.RegionStates: Transition null to {e142341c56805aed68d3f99bae3e14f3 state=SPLITTING_NEW, ts=1469406270504, server=slave77-prd3.cnsuning.com,60020,1466236968700}
2016-07-25 08:24:30,504 INFO
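For context on the stack trace above: the StoreFileSplitter tasks are still retrying against HDFS when splitStoreFiles gives up. The following is a rough Java sketch of that pattern, not the actual HBase source; the class and method names are illustrative. It submits one reference-file task per store file to a bounded thread pool, waits up to a timeout for the pool to drain, and otherwise throws the same "Took too long" IOException so the split is rolled back.

{code:java}
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch (not the HBase source) of a bounded split:
// submit the reference-file work, then wait a limited time for it to finish.
public class BoundedSplitSketch {
    static void splitStoreFiles(List<Callable<Void>> referenceFileTasks,
                                long fileSplitTimeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (Callable<Void> task : referenceFileTasks) {
            pool.submit(task); // e.g. write one daughter-region reference file
        }
        pool.shutdown();
        // This is the wait that expires in the reported logs: slow HDFS
        // "Could not complete ... retrying" cycles keep the tasks running
        // past the timeout, so the split is aborted and rolled back.
        if (!pool.awaitTermination(fileSplitTimeoutMs, TimeUnit.MILLISECONDS)) {
            pool.shutdownNow();
            throw new IOException(
                "Took too long to split the files and create the references, aborting split");
        }
    }
}
{code}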
[jira] [Updated] (HBASE-16282) java.io.IOException: Took too long to split the files and create the references, aborting split
[ https://issues.apache.org/jira/browse/HBASE-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dcswinner updated HBASE-16282:
------------------------------
    Description: HBase region split took too long to split the files and create the references, so the split was aborted.

> java.io.IOException: Took too long to split the files and create the references, aborting split
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16282
>                 URL: https://issues.apache.org/jira/browse/HBASE-16282
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 0.98.8
>            Reporter: dcswinner
>
> HBase region split took too long to split the files and create the references, so the split was aborted.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HBASE-16282) java.io.IOException: Took too long to split the files and create the references, aborting split
[ https://issues.apache.org/jira/browse/HBASE-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391525#comment-15391525 ]

dcswinner commented on HBASE-16282:
-----------------------------------

Recently I found that my HBase cluster throws some exceptions during region splits. The region server log looks like this:

2016-07-25 08:24:30,502 INFO [regionserver60020-splits-1466239518933] regionserver.SplitTransaction: Starting split of region ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.
2016-07-25 08:24:30,938 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Started memstore flush for ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa., current region memstore size 28.0 K
2016-07-25 08:24:36,137 INFO [regionserver60020-splits-1466239518933] regionserver.DefaultStoreFlusher: Flushed, sequenceid=15546530, memsize=28.0 K, hasBloomFilter=true, into tmp file hdfs://suninghadoop2/hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.tmp/6ee8bb3e4c0a4af591f94a163b272f5f
2016-07-25 08:24:36,590 INFO [regionserver60020-splits-1466239518933] regionserver.HStore: Added hdfs://suninghadoop2/hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/noverison/6ee8bb3e4c0a4af591f94a163b272f5f, entries=24, sequenceid=15546530, filesize=25.9 K
2016-07-25 08:24:36,591 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Finished memstore flush of ~28.0 K/28624, currentsize=0/0 for region ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa. in 5652ms, sequenceid=15546530, compaction requested=true
2016-07-25 08:24:36,647 INFO [StoreCloserThread-ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.-1] regionserver.HStore: Closed noverison
2016-07-25 08:24:36,647 INFO [regionserver60020-splits-1466239518933] regionserver.HRegion: Closed ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.
2016-07-25 08:24:43,264 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
2016-07-25 08:24:47,842 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
2016-07-25 08:24:47,842 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
2016-07-25 08:24:55,334 INFO [StoreFileSplitter-0] hdfs.DFSClient: Could not complete /hbase/data/ns_spider/crawl_task_exception_detail/b318fc37c2aac4705007200cc454e7fa/.splits/e142341c56805aed68d3f99bae3e14f3/noverison/14126e7af90e4d4cbcbdc45d98e130d0.b318fc37c2aac4705007200cc454e7fa retrying...
2016-07-25 08:25:11,257 INFO [regionserver60020-splits-1466239518933] regionserver.SplitRequest: Running rollback/cleanup of failed split of ns_spider:crawl_task_exception_detail,8\xFF\xE3\x0D\x00\x00\x00\x00,1463915449300.b318fc37c2aac4705007200cc454e7fa.; Took too long to split the files and create the references, aborting split
java.io.IOException: Took too long to split the files and create the references, aborting split
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:825)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.stepsBeforePONR(SplitTransaction.java:429)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:303)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:655)
    at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-07-25 08:25:12,436 INFO [StoreOpener-b318fc37c2aac4705007200cc454e7fa-1] compactions.CompactionConfiguration: size [134217728, 9223372036854775807); files [3, 10); ratio 1.20; off-peak ratio 5.00; throttle point 2684354560; major period 60480, major jitter 0.50
2016-07-25 08:25:16,461 INFO
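If the underlying HDFS slowness is only transient, one possible mitigation is to give the split more time to write its reference files before it aborts. If I recall the 0.98 SplitTransaction code correctly, that wait is governed by "hbase.regionserver.fileSplitTimeout" with a 30000 ms default, but treat the property name and default here as assumptions to verify against your exact version. A minimal sketch of reading and raising it follows; in a real deployment the setting belongs in hbase-site.xml on every region server, not in client code.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Sketch only: the property name and the 30000 ms default are assumptions to
// verify against the SplitTransaction source of your HBase version.
public class SplitTimeoutSketch {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        long current = conf.getLong("hbase.regionserver.fileSplitTimeout", 30000L);
        System.out.println("fileSplitTimeout (ms): " + current);
        // Example: allow up to 2 minutes for reference-file creation during a split.
        conf.setLong("hbase.regionserver.fileSplitTimeout", 120000L);
    }
}
{code}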
[jira] [Created] (HBASE-16282) java.io.IOException: Took too long to split the files and create the references, aborting split
dcswinner created HBASE-16282:
----------------------------------

             Summary: java.io.IOException: Took too long to split the files and create the references, aborting split
                 Key: HBASE-16282
                 URL: https://issues.apache.org/jira/browse/HBASE-16282
             Project: HBase
          Issue Type: Bug
          Components: master, regionserver
    Affects Versions: 0.98.8
            Reporter: dcswinner

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)