[jira] [Commented] (HBASE-7398) [0.94 UNIT TESTS] TestAssignmentManager fails frequently on CentOS 5
[ https://issues.apache.org/jira/browse/HBASE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536879#comment-13536879 ]

nkeywal commented on HBASE-7398:
--------------------------------

The patch replaces only one of the numerous

{code}
while (!ZKAssign.verifyRegionState(this.watcher, REGIONINFO, EventType.M_ZK_REGION_OFFLINE)) {
  Threads.sleep(1);
}
{code}

Is that the intent?

> [0.94 UNIT TESTS] TestAssignmentManager fails frequently on CentOS 5
> --------------------------------------------------------------------
>
>                 Key: HBASE-7398
>                 URL: https://issues.apache.org/jira/browse/HBASE-7398
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment, test
>    Affects Versions: 0.94.4
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: hbase-7398_v1.patch
>
> TestAssignmentManager#testBalance() fails pretty frequently on CentOS 5 for 0.94. The root cause is that ClosedRegionHandler is executed by an executor, and before it finishes, the region transition is done for OPENING and OPENED. This seems to be a test-only problem, not an actual bug, since the region server won't open the region unless it gets it from the assign call in ClosedRegionHandler.process(). I've seen that HBASE-6109 already has a fix for this; I'll just backport those changes. This is 0.94 only.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
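The unbounded busy-wait quoted above is what makes the test hang instead of fail when a transition never happens. A minimal sketch of the alternative, assuming nothing about the actual HBASE-7398 patch: a generic bounded-wait helper (the name `BoundedWait` and its parameters are illustrative, not HBase API) that polls a condition with a deadline, so a wedged region transition fails the test quickly rather than spinning forever.

```java
import java.util.function.BooleanSupplier;

public class BoundedWait {
  /**
   * Poll a condition until it becomes true or the deadline passes.
   * Returns true if the condition was observed true in time.
   * Hypothetical test helper, not the actual HBase fix.
   */
  public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (condition.getAsBoolean()) {
        return true;
      }
      Thread.sleep(pollMs);
    }
    // One last check at the deadline, in case the condition flipped late.
    return condition.getAsBoolean();
  }
}
```

A test using such a helper would assert on the returned boolean (and fail with a message) instead of relying on the surefire timeout to kill a hung fork.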
[jira] [Created] (HBASE-7402) java.io.IOException: Got error in response to OP_READ_BLOCK
samar created HBASE-7402:
-----------------------------

             Summary: java.io.IOException: Got error in response to OP_READ_BLOCK
                 Key: HBASE-7402
                 URL: https://issues.apache.org/jira/browse/HBASE-7402
             Project: HBase
          Issue Type: Bug
          Components: HFile
    Affects Versions: 0.94.0, 0.90.4
            Reporter: samar

Getting this error on our HBase version 0.90.4-cdh3u3:

2012-12-18 02:35:39,082 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /x.x.x.x:x for file /hbase/table_x/37bea13d03ed9fa611941cc4aad6e8c2/scores/7355825801969613604 for block 3174705353677971357:java.io.IOException: Got error in response to OP_READ_BLOCK self=/x.x.x.x, remote=/x.x.x.x: for file /hbase/table_x/37bea13d03ed9fa611941cc4aad6e8c2/scores/7355825801969613604 for block 3174705353677971357_1028665
        at org.apache.hadoop.hdfs.DFSClient$RemoteBlockReader.newBlockReader(DFSClient.java:1673)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.getBlockReader(DFSClient.java:2383)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:2272)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2438)
        at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46)
        at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:101)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:141)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1094)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:1036)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.loadBlock(HFile.java:1446)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1303)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:136)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:96)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:77)
        at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1405)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.init(HRegion.java:2467)
        at org.apache.hadoop.hbase.regionserver.HRegion.instantiateInternalScanner(HRegion.java:1192)
        at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1184)
        at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1168)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:3215)

This causes the HBase RS to hang and hence stop responding. In the NameNode, the block had been deleted earlier (as per the timestamps):

2012-12-18 02:25:19,027 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask x.x.x.x:x to delete blk_3174705353677971357_1028665 blk_-9072685530813588257_1028824
2012-12-18 02:25:19,027 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask x.x.x.x:x to delete blk_5651962510569886604_1028711
2012-12-18 02:25:22,027 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask x.x.x.x:x to delete blk_3174705353677971357_1028665

Looks like org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream is caching the block location and causing this issue.
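The reporter's hypothesis is that a cached block handle went stale after the NameNode deleted the block. A generic mitigation for that class of problem (a sketch only, and not the HBase code path; the class name `ReopeningReader` and its shape are illustrative) is to retry the read through a callback that reopens the file on each attempt, so any cached locations are refreshed:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class ReopeningReader {
  /**
   * Rerun the supplied read up to `attempts` times. The callable is expected
   * to reopen the underlying file on each call, discarding any stale cached
   * block locations. Illustrative pattern, not the HBASE-7402 fix.
   */
  public static <T> T readWithRetry(Callable<T> reopenAndRead, int attempts) throws Exception {
    IOException last = null;
    for (int i = 0; i < attempts; i++) {
      try {
        return reopenAndRead.call();
      } catch (IOException e) {
        last = e; // possibly a stale block handle; reopen and try again
      }
    }
    throw last;
  }
}
```

Whether a retry like this would actually help here depends on whether the re-read fetches fresh locations, which is exactly the caching behavior the report questions.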
[jira] [Commented] (HBASE-7397) HTable.coprocessorService() should allow null values in returned Map
[ https://issues.apache.org/jira/browse/HBASE-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536995#comment-13536995 ]

Hudson commented on HBASE-7397:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #305 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/305/])
HBASE-7397 HTable.coprocessorService() should allow null values in returned Map (Revision 1424282)

Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java

> HTable.coprocessorService() should allow null values in returned Map
> --------------------------------------------------------------------
>
>                 Key: HBASE-7397
>                 URL: https://issues.apache.org/jira/browse/HBASE-7397
>             Project: HBase
>          Issue Type: Task
>          Components: Coprocessors
>    Affects Versions: 0.96.0
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>             Fix For: 0.96.0
>         Attachments: HBASE-7397.patch
>
> The conversion of coprocessor endpoints to PB services (HBASE-5448) changed the semantics of {{HTable.coprocessorService()}} (the version that returns {{Map<byte[], R>}}) to disallow {{null}} values. This should be fixed to allow {{null}} values for the map entries.
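The semantics the issue asks to restore can be shown with a plain `java.util.HashMap`, which distinguishes "key present with null value" from "key absent". A minimal sketch, assuming illustrative names (`resultsWithNull`, the region keys) rather than the real HTable API:

```java
import java.util.HashMap;
import java.util.Map;

public class NullValueMap {
  /**
   * Sketch of the desired semantics: a region that returned no result still
   * appears as a key in the map, with a null value, so callers can tell
   * "ran, returned nothing" apart from "not run at all".
   */
  public static Map<String, String> resultsWithNull() {
    Map<String, String> results = new HashMap<>();
    results.put("region-a", "value");
    results.put("region-b", null); // endpoint returned nothing for this region
    return results;
  }
}
```

The caller-side check is `containsKey` plus `get`, since `get` alone returns null for both cases.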
[jira] [Commented] (HBASE-7392) Disable failing example unit tests TestZooKeeperScanPolicyObserver and TestBulkDeleteProtocol
[ https://issues.apache.org/jira/browse/HBASE-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536996#comment-13536996 ]

Anoop Sam John commented on HBASE-7392:
---------------------------------------

Andrew.. No, I am using the same old account.
Stack, I will have a look at it... Will check with JDK7.

> Disable failing example unit tests TestZooKeeperScanPolicyObserver and TestBulkDeleteProtocol
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-7392
>                 URL: https://issues.apache.org/jira/browse/HBASE-7392
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>         Attachments: 7392.txt
>
> Jenkins run https://builds.apache.org/job/HBase-TRUNK/3638/ turned up two broken example tests. They pass on a jdk6 machine locally but not on my jdk7 laptop. Something's up. My guess is that these failures have been there a while but only surfaced because we got further than we normally do on a jenkins run. Tests have no output on jenkins. If I run w/
> $ MAVEN_OPTS="-Xmx3g" mvn test -PlocalTests -Dtest=TestBulkDeleteProtocol -Dtest.output.tofile=false
> I get:
> {code}
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Running org.apache.hadoop.hbase.coprocessor.example.TestBulkDeleteProtocol
> 2012-12-19 09:34:36,340 INFO [main] hbase.HBaseTestingUtility(713): Starting up minicluster with 1 master(s) and 2 regionserver(s) and 2 datanode(s)
> 2012-12-19 09:34:46,362 INFO [main] hbase.HBaseTestingUtility(343): Created new mini-cluster data directory: /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/dfscluster_1c4634ed-2333-48ee-807d-c56f8c4ff20f
> 2012-12-19 09:34:46,362 INFO [main] hbase.HBaseTestingUtility(553): Setting test.cache.data to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/cache_data in system properties and HBase conf
> 2012-12-19 09:34:46,363 INFO [main] hbase.HBaseTestingUtility(553): Setting hadoop.tmp.dir to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/hadoop_tmp in system properties and HBase conf
> 2012-12-19 09:34:46,363 INFO [main] hbase.HBaseTestingUtility(553): Setting hadoop.log.dir to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/hadoop_logs in system properties and HBase conf
> 2012-12-19 09:34:46,364 INFO [main] hbase.HBaseTestingUtility(553): Setting mapred.local.dir to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/mapred_local in system properties and HBase conf
> 2012-12-19 09:34:46,364 INFO [main] hbase.HBaseTestingUtility(553): Setting mapred.temp.dir to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/mapred_temp in system properties and HBase conf
> 2012-12-19 09:34:46,365 INFO [main] hbase.HBaseTestingUtility(536): read short circuit is ON for user stack
> 2012-12-19 09:34:46.438 java[16837:1703] Unable to load realm info from SCDynamicStore
> 2012-12-19 09:34:56,540 DEBUG [main] fs.HFileSystem(199): Starting addLocationsOrderInterceptor with class class org.apache.hadoop.hbase.fs.HFileSystem$ReorderWALBlocks
> 2012-12-19 09:34:56,541 WARN [main] fs.HFileSystem(215): The file system is not a DistributedFileSystem. Not adding block location reordering
> 2012-12-19 09:34:56,669 WARN [main] namenode.FSNamesystem(564): The dfs.support.append option is in your configuration, however append is not supported. This configuration option is no longer required to enable sync.
> 2012-12-19 09:35:06,962 WARN [main] namenode.FSNamesystem(564): The dfs.support.append option is in your configuration, however append is not supported. This configuration option is no longer required to enable sync.
> 2012-12-19 09:35:07,076 INFO [main] log.Slf4jLog(67): Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2012-12-19 09:35:07,151 INFO [main] log.Slf4jLog(67): jetty-6.1.26
> 2012-12-19 09:35:07,184 INFO [main] log.Slf4jLog(67): Extract jar:file:/Users/stack/.m2/repository/org/apache/hadoop/hadoop-core/1.1.1/hadoop-core-1.1.1.jar!/webapps/hdfs to /var/folders/bp/2z1cykc92rs6j24251cg__phgp/T/Jetty_localhost_57924_hdfsg9mqyr/webapp
> 2012-12-19 09:35:07,402 INFO [main] log.Slf4jLog(67): Started SelectChannelConnector@localhost:57924
> Starting DataNode 0 with dfs.data.dir:
[jira] [Created] (HBASE-7403) Online Merge
chunhui shen created HBASE-7403:
-----------------------------------

             Summary: Online Merge
                 Key: HBASE-7403
                 URL: https://issues.apache.org/jira/browse/HBASE-7403
             Project: HBase
          Issue Type: New Feature
    Affects Versions: 0.94.3
            Reporter: chunhui shen
            Assignee: chunhui shen
             Fix For: 0.96.0, 0.94.5

We need merge in the following cases:
1. Region hole or region overlap that can't be fixed by hbck
2. Regions become empty because of TTL and unreasonable rowkey design
3. Regions stay empty or very small because of presplit at table creation
4. Too many empty or small regions reduce system performance (e.g. MSLAB)

The current merge tools only work offline and cannot redo the merge if an exception is thrown partway through, leaving dirty data. For an online system, we need an online merge.

The implementation logic of this patch for Online Merge is (for example, merging regionA and regionB into regionC):
1. Offline the two regions A and B
2. Merge the two regions in HDFS (create regionC's directory, move regionA's and regionB's files into regionC's directory, delete regionA's and regionB's directories)
3. Add the merged regionC to .META.
4. Assign the merged regionC

By the design of this patch, once the merge work in HDFS has started, it can be redone until successful if an exception is thrown, the process aborts, or the server restarts; it cannot, however, be rolled back. The patch:
- Uses ZooKeeper to record the transaction journal state, making redo easier
- Uses ZooKeeper to send/receive merge requests
- Executes the merge transaction on the master
- Supports calling a merge request through the API or a shell tool

About the merge process, please see the attachment and the patch.
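The four numbered steps above, together with the journal-and-redo design, can be sketched as a toy state machine. This is illustrative only: the state names and the in-memory journal stand in for what the description says the real patch keeps in ZooKeeper, and none of this is the actual patch code.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeTransactionSketch {
  /** The four steps of the merge, in order; the journal records progress. */
  public enum State { OFFLINED, MERGED_IN_FS, META_UPDATED, ASSIGNED }

  private final List<State> journal = new ArrayList<>();

  /**
   * Run (or redo) the transaction: skip steps already journaled, execute the
   * rest. Because steps are journaled once complete, a crash mid-way can be
   * recovered by simply calling run() again -- redo, not rollback.
   */
  public List<State> run() {
    for (State step : State.values()) {
      if (!journal.contains(step)) {
        // ... perform the real work for this step here ...
        journal.add(step); // record completion so a retry skips it
      }
    }
    return journal;
  }
}
```

Calling `run()` a second time is a no-op, which mirrors the "redo until successful" property the description claims.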
[jira] [Updated] (HBASE-7403) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7403:
--------------------------------

    Attachment: hbase-7403-94v1.patch
                hbase-7403-trunkv1.patch
                merge region.pdf
[jira] [Created] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
chunhui shen created HBASE-7404:
-----------------------------------

             Summary: Bucket Cache: A solution about CMS, Heap Fragment and Big Cache on HBASE
                 Key: HBASE-7404
                 URL: https://issues.apache.org/jira/browse/HBASE-7404
             Project: HBase
          Issue Type: New Feature
    Affects Versions: 0.94.3
            Reporter: chunhui shen
            Assignee: chunhui shen
             Fix For: 0.96.0, 0.94.5

First, thanks to @neil from Fusion-IO for sharing the source code.

What's Bucket Cache?
It can greatly decrease CMS pauses and heap fragmentation caused by GC.
It supports a large cache space for high read performance by using high-speed disks like Fusion-io.

1. An implementation of block cache, like LruBlockCache
2. Manages blocks' storage positions itself through the Bucket Allocator
3. Cached blocks can be stored in memory or in the file system
4. Bucket Cache can be used as the main block cache (see CombinedBlockCache), combined with LruBlockCache, to decrease CMS and fragmentation caused by GC
5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io to store blocks) to enlarge the cache space

See more in the attachment and in the patch.
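The core idea behind point 2 above, self-managed block placement, can be sketched with a toy fixed-size bucket allocator: carve a flat byte region into equal buckets and hand out offsets, so blocks never move under the GC. This is a deliberately minimal sketch; the real BucketCache supports multiple bucket sizes and a far more involved allocator, and every name here is illustrative.

```java
public class BucketAllocatorSketch {
  private final int bucketSize;
  private final boolean[] used; // one flag per bucket

  public BucketAllocatorSketch(long capacity, int bucketSize) {
    this.bucketSize = bucketSize;
    this.used = new boolean[(int) (capacity / bucketSize)];
  }

  /** Returns the byte offset of a free bucket, or -1 if the cache is full. */
  public long allocate() {
    for (int i = 0; i < used.length; i++) {
      if (!used[i]) {
        used[i] = true;
        return (long) i * bucketSize;
      }
    }
    return -1;
  }

  /** Frees the bucket containing the given offset for reuse. */
  public void free(long offset) {
    used[(int) (offset / bucketSize)] = false;
  }
}
```

Because allocation and reuse happen inside one pre-sized region (on-heap, off-heap, or a file), the JVM heap never fragments around cached blocks, which is the CMS/fragmentation benefit the description claims.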
[jira] [Updated] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
[ https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7404:
--------------------------------

    Attachment: BucketCache.pdf
                hbase-7404-trunkv1.patch
                hbase-7404-0.94v1.patch
[jira] [Commented] (HBASE-7392) Disable failing example unit tests TestZooKeeperScanPolicyObserver and TestBulkDeleteProtocol
[ https://issues.apache.org/jira/browse/HBASE-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537021#comment-13537021 ]

Anoop Sam John commented on HBASE-7392:
---------------------------------------

TestBulkDeleteProtocol and TestRowCountEndpoint involve Endpoint execution. TestZooKeeperScanPolicyObserver involves normal table ops.
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537032#comment-13537032 ]

Jean-Daniel Cryans commented on HBASE-5778:
-------------------------------------------

I have a long flight today; I'll try to repro, but it passes all the time for me.

> Fix HLog compression's incompatibilities
> ----------------------------------------
>
>                 Key: HBASE-5778
>                 URL: https://issues.apache.org/jira/browse/HBASE-5778
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.96.0, 0.94.4
>         Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778-0.94-v5.patch, HBASE-5778-0.94-v6.patch, HBASE-5778-0.94-v7.patch, HBASE-5778.patch, HBASE-5778-trunk-v6.patch, HBASE-5778-trunk-v7.patch
>
> I ran some tests to verify whether WAL compression should be turned on by default. For a use case where it's not very useful (values two orders of magnitude bigger than the keys), the insert time wasn't different and CPU usage was 15% higher (150% CPU usage vs 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement in insert run time, and CPU usage was 33% higher (600% CPU usage vs 450%). I'm not sure WAL compression accounts for all the additional CPU usage; it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs.
[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
[ https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537043#comment-13537043 ]

Ted Yu commented on HBASE-7404:
-------------------------------

In the slide titled 'Test Results of First Usage', TPS is write requests per second and QPS is read requests per second.
[jira] [Commented] (HBASE-7403) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537060#comment-13537060 ]

Ted Yu commented on HBASE-7403:
-------------------------------

Nice work, Chunhui. This is related to HBASE-5487: Generic framework for Master-coordinated tasks.

Slides 12 to 14 give the flow chart. It would be nice if the true/false conditions were labeled for each branch node.

'hbase.master.thread.merge': would 'hbase.master.merge.threads' be a better name for the config?

For the trunk patch, the new classes should be annotated for audience and stability.

Will take a closer look at the patch.
[jira] [Updated] (HBASE-5534) HBase shell's return value is almost always 0
[ https://issues.apache.org/jira/browse/HBASE-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-5534:
----------------------------------

    Component/s: shell

> HBase shell's return value is almost always 0
> ---------------------------------------------
>
>                 Key: HBASE-5534
>                 URL: https://issues.apache.org/jira/browse/HBASE-5534
>             Project: HBase
>          Issue Type: Improvement
>          Components: shell
>            Reporter: Alex Newman
>
> So I was trying to write some simple scripts to verify client connections to HBase using the shell, and I noticed that the HBase shell always returns 0, even when it can't connect to an HBase server. I'm not sure if this is the best option. What would be neat is if you had some capability to run commands like
> hbase shell --command='disable table;\ndrop table;'
> and it would error out if any of the commands fail to succeed.
> echo disable table | hbase shell
> could continue to work as it does now.
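The fail-fast exit-status behavior requested above reduces to a simple rule: return 0 only if every command succeeded, otherwise propagate the first failing status. A minimal sketch of that rule (a hypothetical helper, not the hbase shell implementation, which is JRuby; the name `ShellExitStatus` and modeling commands as their integer statuses are both assumptions for illustration):

```java
import java.util.List;

public class ShellExitStatus {
  /**
   * Returns 0 if every command status is 0, else the first nonzero status.
   * This is the exit-code aggregation the issue asks the shell to perform.
   */
  public static int runAll(List<Integer> commandStatuses) {
    for (int status : commandStatuses) {
      if (status != 0) {
        return status; // propagate the first failure, stop running further commands
      }
    }
    return 0;
  }
}
```

A wrapper script could then rely on the process exit code (`$?`) instead of parsing shell output to detect failures.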
[jira] [Commented] (HBASE-7403) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537086#comment-13537086 ]

ramkrishna.s.vasudevan commented on HBASE-7403:
-----------------------------------------------

Chunhui, nice work. Will surely go through the patch to understand your implementation.
[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
[ https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537087#comment-13537087 ]

ramkrishna.s.vasudevan commented on HBASE-7404:
-----------------------------------------------

@Chunhui So you are back with a bang. :)
Some terms in that test result: YGC, YGCT? Sorry if I am ignorant. Do you mean Young Generation here?
[jira] [Commented] (HBASE-7403) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537101#comment-13537101 ] chunhui shen commented on HBASE-7403: - Thanks ted and ram. I have uploaded the patch to the review board: https://reviews.apache.org/r/8716/ Online Merge Key: HBASE-7403 URL: https://issues.apache.org/jira/browse/HBASE-7403 Project: HBase Issue Type: New Feature Affects Versions: 0.94.3 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0, 0.94.5 Attachments: hbase-7403-94v1.patch, hbase-7403-trunkv1.patch, merge region.pdf We need merge in the following cases: 1. Region holes or region overlaps that can't be fixed by hbck 2. Regions that become empty because of TTL and an unreasonable rowkey design 3. Regions that are always empty or very small because of pre-splitting at table creation 4. Too many empty or small regions, which reduce system performance (e.g. mslab) Current merge tools only work offline and cannot redo the operation if an exception is thrown in the process of merging, leaving dirty data. For an online system, we need an online merge. The implementation logic of this patch for online merge is, for example, merging regionA and regionB into regionC: 1. Offline the two regions A and B 2. Merge the two regions in HDFS (create regionC's directory, move regionA's and regionB's files to regionC's directory, delete regionA's and regionB's directories) 3. Add the merged regionC to .META. 4. Assign the merged regionC By the design of this patch, once we start the merge work in HDFS, we can redo it until it succeeds if it throws an exception, aborts, or the server restarts, but it cannot be rolled back. It relies on: using ZooKeeper to record the transaction journal state, making redo easier; using ZooKeeper to send/receive merge requests; executing the merge transaction on the master; supporting merge requests through the API or the shell tool. About the merge process, please see the attachment and patch. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
[ https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537103#comment-13537103 ] chunhui shen commented on HBASE-7404: - [~ram_krish] Hoho, I'm here all the same!! Yes, YGC = Young Generation GC Count, YGCT = Young Generation GC Total Time. Review board: https://reviews.apache.org/r/8717/ Waiting for your comments~~~ Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE -- Key: HBASE-7404 URL: https://issues.apache.org/jira/browse/HBASE-7404 Project: HBase Issue Type: New Feature Affects Versions: 0.94.3 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0, 0.94.5 Attachments: BucketCache.pdf, hbase-7404-0.94v1.patch, hbase-7404-trunkv1.patch First, thanks to @neil from Fusion-IO for sharing the source code. What is BucketCache? It can greatly decrease the CMS pauses and heap fragmentation caused by GC, and it supports a large cache space for high read performance by using high-speed disks like Fusion-io. 1. An implementation of block cache, like LruBlockCache 2. Manages blocks' storage positions itself, through the Bucket Allocator 3. Cached blocks can be stored in memory or on the file system 4. BucketCache can be used as the main block cache (see CombinedBlockCache), combined with LruBlockCache, to decrease CMS and fragmentation caused by GC 5. BucketCache can also be used as a secondary cache (e.g. using Fusion-io to store blocks) to enlarge the cache space See more in the attachment and in the patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7351) Periodic health check chore
[ https://issues.apache.org/jira/browse/HBASE-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7351: - Fix Version/s: 0.94.4 Since it is off by default, that should be fine actually. Do you think that you could include a rudimentary health checker script (one that you find useful) as an example? Periodic health check chore --- Key: HBASE-7351 URL: https://issues.apache.org/jira/browse/HBASE-7351 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0 Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Fix For: 0.94.4 Attachments: HBASE-7331_94_1.patch, HBASE-7331_94_2.patch, HBASE-7331_94_3.patch, HBASE-7331_94.patch, HBASE-7331_trunk.patch, HBASE-7351_trunk_2.patch Similar to MAPREDUCE-211, the region server should also have a mechanism to check the health of the node. It should run the health check script periodically, and if there are any errors, it should stop itself gracefully. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537184#comment-13537184 ] Lars Hofhansl commented on HBASE-5778: -- If I revert this change, TestReplication always passes locally in 0.94, whereas with this patch it never passed (so far it passed only once in many runs, even with the increased SLEEP_TIME). I would be more comfortable if the patch was reverted from 0.94. I know this is frustrating, but I would like to spin 0.94.4 soon (hopefully by tomorrow). We can put this back into 0.94.5. Fix HLog compression's incompatibilities Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.96.0, 0.94.4 Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778-0.94-v5.patch, HBASE-5778-0.94-v6.patch, HBASE-5778-0.94-v7.patch, HBASE-5778.patch, HBASE-5778-trunk-v6.patch, HBASE-5778-trunk-v7.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two orders of magnitude bigger than the keys), the insert time wasn't different and the CPU usage was 15% higher (150% CPU usage vs 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement in the insert run time, and CPU usage was 33% higher (600% CPU usage vs 450%). I'm not sure WAL compression accounts for all the additional CPU usage; it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-7351) Periodic health check chore
[ https://issues.apache.org/jira/browse/HBASE-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537182#comment-13537182 ] Lars Hofhansl edited comment on HBASE-7351 at 12/20/12 5:38 PM: Since it is default off, that should be fine actually. Do you think that you could include a rudimentary health checker script (one that you find useful) as an example? was (Author: lhofhansl): Since it is default off, that should be fine actually. Do you think that you include a rudimentary health checker script (one that you find useful) as an example? Periodic health check chore --- Key: HBASE-7351 URL: https://issues.apache.org/jira/browse/HBASE-7351 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0 Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Fix For: 0.94.4 Attachments: HBASE-7331_94_1.patch, HBASE-7331_94_2.patch, HBASE-7331_94_3.patch, HBASE-7331_94.patch, HBASE-7331_trunk.patch, HBASE-7351_trunk_2.patch Similar to MAPREDUCE-211, region server should also have a mechanism to check the health of the node. It should run the health check script periodically and if there is any errors, it should stop itself gracefully. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7390) Add extra test cases for assignment on the region server and fix the related issues
[ https://issues.apache.org/jira/browse/HBASE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537185#comment-13537185 ] nkeywal commented on HBASE-7390: [~saint@gmail.com] bq. Yes. In case the master went away on us, so the new master gets the SPLIT callback...and its handler does the SPLIT clean-up. What's strange to me is that it seems similar to a RS_ZK_REGION_CLOSED, and in this case we don't retickle at all: we start the close handler immediately. For this patch, I am waiting for a +1 before committing, as it touches a critical part. Add extra test cases for assignement on the region server and fix the related issues Key: HBASE-7390 URL: https://issues.apache.org/jira/browse/HBASE-7390 Project: HBase Issue Type: Bug Components: Region Assignment, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: 7390.v1.patch, 7390.v2.patch, 7390.v3.patch, 7390.v4.patch, assignment_zk_states.jpg We don't have a lot of tests on the region server itself. Here are some. Some of them are failing, feedback welcome. See as well the attached state diagram for the ZK nodes on assignment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7373) table should not be required in AccessControlService
[ https://issues.apache.org/jira/browse/HBASE-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7373: --- Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) table should not be required in AccessControlService Key: HBASE-7373 URL: https://issues.apache.org/jira/browse/HBASE-7373 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0 Attachments: trunk-7373.patch We should fix the proto file, add unit test for this case, and verify it works from hbase shell with table to be nil. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7405) Enforce PB ser/de for Aggregate protocol and associated ColumnInterpreter user code bits
Devaraj Das created HBASE-7405: -- Summary: Enforce PB ser/de for Aggregate protocol and associated ColumnInterpreter user code bits Key: HBASE-7405 URL: https://issues.apache.org/jira/browse/HBASE-7405 Project: HBase Issue Type: Bug Components: Coprocessors, Protobufs Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.96.0 Enforce PB ser/de for Aggregate protocol and associated ColumnInterpreter user code bits -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537197#comment-13537197 ] stack commented on HBASE-5448: -- [~ghelmling] One other question G. How to do exceptions in a coprocessor? I see where we set the exception on the controller if there is one, but should we then abandon further processing -- return? We need to call the RpcCallback done, though, right? Here is an example from the tail of an endpoint cp implementation: {code}
...
} catch (IOException e) {
  ResponseConverter.setControllerException(controller, e);
  // Set result to -1 to indicate error.
  sumResult = -1;
  LOG.info("Setting sum result to -1 to indicate error", e);
} finally {
  if (scanner != null) {
    try {
      scanner.close();
    } catch (IOException e) {
      ResponseConverter.setControllerException(controller, e);
      sumResult = -1;
      LOG.info("Setting sum result to -1 to indicate error", e);
    }
  }
}
done.run(SumResponse.newBuilder().setSum(sumResult).build());
}
{code} Is that how you'd do it? Thanks. Support for dynamic coprocessor endpoints with PB-based RPC --- Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: IPC/RPC, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448_4.patch, HBASE-5448.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7406) Example health checker script
Andrew Purtell created HBASE-7406: - Summary: Example health checker script Key: HBASE-7406 URL: https://issues.apache.org/jira/browse/HBASE-7406 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Priority: Trivial -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7405) Enforce PB ser/de for Aggregate protocol and associated ColumnInterpreter user code bits
[ https://issues.apache.org/jira/browse/HBASE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-7405: --- Attachment: 7405-1.patch Work-in-progress patch. Testing yet to be done. Enforce PB ser/de for Aggregate protocol and associated ColumnInterpreter user code bits Key: HBASE-7405 URL: https://issues.apache.org/jira/browse/HBASE-7405 Project: HBase Issue Type: Bug Components: Coprocessors, Protobufs Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.96.0 Attachments: 7405-1.patch Enforce PB ser/de for Aggregate protocol and associated ColumnInterpreter user code bits -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7351) Periodic health check chore
[ https://issues.apache.org/jira/browse/HBASE-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537203#comment-13537203 ] Andrew Purtell commented on HBASE-7351: --- bq. Do you think that you could include a rudimentary health checker script (one that you find useful) as an example? That's a good idea. I opened subtask HBASE-7406 for this; it would be a separate commit, as this is already on trunk. Periodic health check chore --- Key: HBASE-7351 URL: https://issues.apache.org/jira/browse/HBASE-7351 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0 Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Fix For: 0.94.4 Attachments: HBASE-7331_94_1.patch, HBASE-7331_94_2.patch, HBASE-7331_94_3.patch, HBASE-7331_94.patch, HBASE-7331_trunk.patch, HBASE-7351_trunk_2.patch Similar to MAPREDUCE-211, the region server should also have a mechanism to check the health of the node. It should run the health check script periodically, and if there are any errors, it should stop itself gracefully. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
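The health-check chore idea discussed in HBASE-7351 can be sketched as: run a configured script on a timer, parse its output for an error token, and stop the server gracefully if the node is unhealthy. The sketch below is illustrative only; the "ERROR" output convention and all names are assumptions, not the patch's actual contract.

```java
/** Sketch of the periodic health-check idea (hypothetical names; the
 *  ERROR-line convention is an assumption, not the patch's contract).
 *  In the real chore the output string would come from running the
 *  configured script via ProcessBuilder on a timer, and an unhealthy
 *  result would trigger a graceful region server stop. */
public class HealthCheckSketch {
  /** Parses the script output; a line starting with "ERROR" (assumed
   *  convention) marks the node unhealthy. */
  static boolean isHealthy(String scriptOutput) {
    for (String line : scriptOutput.split("\n")) {
      if (line.startsWith("ERROR")) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    String ok = "disk ok\nnetwork ok";
    String bad = "disk ok\nERROR: /data1 read-only";
    System.out.println(isHealthy(ok) + " " + isHealthy(bad));
  }
}
```

Keeping the parsing separate from the process-spawning makes the unhealthy/healthy decision trivially testable, which matters for a chore that can take a whole region server down.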
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537208#comment-13537208 ] ramkrishna.s.vasudevan commented on HBASE-5416: --- Ok, groked the patch. {code} if (!scan.getAllowLazyCfLoading() || this.filter == null || this.filter.isFamilyEssential(entry.getKey())) { {code} Move the this.filter == null check to be the first condition, because when you don't have filters the entire joinedHeap is not going to be used, right? {code} correct_row = this.joinedHeap.seek(KeyValue.createFirstOnRow(currentRow, offset, length)); {code} So here we move on to the KV just before the row we got in the current next() call? After this, suppose due to limits it sets joinedHeapHasMoreData = true; now when the next call comes {code} else if (joinedHeapHasMoreData) { joinedHeapHasMoreData = populateResult(this.joinedHeap, limit, currentRow, offset, length, metric); return true; {code} I think we should get the return value from populateResult, and if it returns false we may need to check whether we have reached the stopRow, right? Filters need not be checked anyway. One more thing: if I say in my Scan that I need lazy loading but my filter is NOT of the SCVF type or one of the ones that implement isFamilyEssential, then it goes through the normal flow. Maybe we need to document this clearly, as a user may think that setting that property is going to give him a better optimized scan. Regarding the TestHRegion test cases: they do not actually test the behaviour of the joined scanners, though their names suggest they do. Is that intended? I will leave it to the other scan experts to decide whether this can go in. Overall a very good improvement. Thanks to Max, Sergey and Ted. Improve performance of scans with some kind of filters. 
--- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: Filters, Performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Sergey Shelukhin Fix For: 0.96.0 Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, HBASE-5416-v9.patch When a scan is performed, the whole row is loaded into the result list, and after that the filter (if one exists) is applied to decide whether the row is needed. But when the scan covers several CFs and the filter checks only data from a subset of them, the data from the CFs not checked by the filter is not needed at the filter stage - only once we have decided to include the current row. In such a case we can significantly reduce the amount of IO performed by a scan by loading only the values actually checked by the filter. For example, we have two CFs: flags and snap. Flags is quite small (a bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and quite costly to scan. If we need only rows with some flag set, we use SingleColumnValueFilter to limit the result to a small subset of the region. But the current implementation loads both CFs to perform the scan when only a small subset is needed. The attached patch adds one routine to the Filter interface to allow a filter to specify which CFs are needed for its operation. In HRegion, we separate all scanners into two groups: those needed for the filter and the rest (joined). When a new row is considered, only the needed data is loaded and the filter applied; only if the filter accepts the row is the rest of the data loaded. On our data, this speeds up such scans 30-50 times. 
Also, this gives us a way to better normalize the data into separate columns by optimizing the scans performed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
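The two-group scan described in the issue (essential families feed the filter; joined families are fetched only for accepted rows) can be sketched in plain Java. This is a toy stand-in, not the HRegion implementation: rows are string pairs, the "flags" value plays the essential family, and a counter shows how many expensive "snap" loads are avoided.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy sketch of lazy column-family loading (not the HRegion code):
 *  filter on the small essential family first, and load the large
 *  joined family only for rows the filter accepts. */
public class LazyCfScanSketch {
  static int joinedLoads = 0; // counts expensive loads of the big family

  /** Stand-in for seeking the joined heap and reading the large family. */
  static String loadJoined(String rowKey) {
    joinedLoads++;
    return "big-value-for-" + rowKey;
  }

  /** Scans rows of {rowKey, flagsValue}; the filter (flags == "1") only
   *  ever touches the essential family, so unaccepted rows cost no IO
   *  on the large family. */
  static List<String> scan(String[][] rows) {
    List<String> results = new ArrayList<>();
    for (String[] r : rows) {
      if ("1".equals(r[1])) {          // filter on essential family only
        results.add(loadJoined(r[0])); // fetch big family for accepted rows
      }
    }
    return results;
  }

  public static void main(String[] args) {
    String[][] rows = { {"r1", "0"}, {"r2", "1"}, {"r3", "0"} };
    int n = scan(rows).size();
    System.out.println(n + " rows accepted, " + joinedLoads + " joined loads");
  }
}
```

The 30-50x speedup claimed in the issue comes exactly from this asymmetry: when most rows are rejected by the filter, almost none of the large family is ever read.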
[jira] [Created] (HBASE-7407) TestMasterFailover under-tests some cases and over-tests some others
nkeywal created HBASE-7407: -- Summary: TestMasterFailover under-tests some cases and over-tests some others Key: HBASE-7407 URL: https://issues.apache.org/jira/browse/HBASE-7407 Project: HBase Issue Type: Bug Components: master, test Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor The tests are done with these settings: conf.setInt("hbase.master.assignment.timeoutmonitor.period", 2000); conf.setInt("hbase.master.assignment.timeoutmonitor.timeout", 4000); As a result: 1) some tests seem to work, but in real life the recovery would take 5 minutes or more, as in production the timeouts are always higher, so we don't see the real issues. 2) The tests include specific cases that should not happen in production. They work because the timeout catches everything, but these scenarios do not need to be optimized, as they cannot happen. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services
[ https://issues.apache.org/jira/browse/HBASE-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6789: - Attachment: 6789v3.txt This patch is complete but in need of an edit. Will pass it by hadoopqa to see how well it does too (there may be an issue in test class loading). I converted 'GenericProtocol' by removing it. I don't think we can do generics with the new cp engine -- waiting on a response from Gary. Added more 'deprecateds' too. Convert test CoprocessorProtocol implementations to protocol buffer services Key: HBASE-6789 URL: https://issues.apache.org/jira/browse/HBASE-6789 Project: HBase Issue Type: Sub-task Components: Coprocessors Reporter: Gary Helmling Assignee: stack Fix For: 0.96.0 Attachments: 6789.txt, 6789v2.txt, 6789v3.txt With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services. Several CoprocessorProtocol implementations are defined for tests: * ColumnAggregationProtocol * GenericProtocol * TestServerCustomProtocol.PingProtocol These should either be converted to PB services or removed if they duplicate other tests/are no longer necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services
[ https://issues.apache.org/jira/browse/HBASE-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6789: - Status: Patch Available (was: Open) Convert test CoprocessorProtocol implementations to protocol buffer services Key: HBASE-6789 URL: https://issues.apache.org/jira/browse/HBASE-6789 Project: HBase Issue Type: Sub-task Components: Coprocessors Reporter: Gary Helmling Assignee: stack Fix For: 0.96.0 Attachments: 6789.txt, 6789v2.txt, 6789v3.txt With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services. Several CoprocessorProtocol implementations are defined for tests: * ColumnAggregationProtocol * GenericProtocol * TestServerCustomProtocol.PingProtocol These should either be converted to PB services or removed if they duplicate other tests/are no longer necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537230#comment-13537230 ] ramkrishna.s.vasudevan commented on HBASE-5416: --- @Sergey The split related failure has to be investigated. Will try looking into the possible reason for failure. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: Filters, Performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Sergey Shelukhin Fix For: 0.96.0 Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, HBASE-5416-v9.patch When the scan is performed, whole row is loaded into result list, after that filter (if exists) is applied to detect that row is needed. But when scan is performed on several CFs and filter checks only data from the subset of these CFs, data from CFs, not checked by a filter is not needed on a filter stage. Only when we decided to include current row. And in such case we can significantly reduce amount of IO performed by a scan, by loading only values, actually checked by a filter. For example, we have two CFs: flags and snap. Flags is quite small (bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we needed only rows with some flag specified, we use SingleColumnValueFilter to limit result to only small subset of region. But current implementation is loading both CFs to perform scan, when only small subset is needed. 
Attached patch adds one routine to Filter interface to allow filter to specify which CF is needed to it's operation. In HRegion, we separate all scanners into two groups: needed for filter and the rest (joined). When new row is considered, only needed data is loaded, filter applied, and only if filter accepts the row, rest of data is loaded. At our data, this speeds up such kind of scans 30-50 times. Also, this gives us the way to better normalize the data into separate columns by optimizing the scans performed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7390) Add extra test cases for assignment on the region server and fix the related issues
[ https://issues.apache.org/jira/browse/HBASE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537232#comment-13537232 ] Sergey Shelukhin commented on HBASE-7390: - +1 on v4 patch Add extra test cases for assignement on the region server and fix the related issues Key: HBASE-7390 URL: https://issues.apache.org/jira/browse/HBASE-7390 Project: HBase Issue Type: Bug Components: Region Assignment, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: 7390.v1.patch, 7390.v2.patch, 7390.v3.patch, 7390.v4.patch, assignment_zk_states.jpg We don't have a lot of tests on the region server itself. Here are some. Some of them are failing, feedback welcome. See as well the attached state diagram for the ZK nodes on assignment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services
[ https://issues.apache.org/jira/browse/HBASE-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537238#comment-13537238 ] Hadoop QA commented on HBASE-6789: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12561951/6789v3.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 29 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 28 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.protobuf.TestProtobufUtil {color:red}-1 core zombie tests{color}. There are zombie tests. See build logs for details. 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3632//console This message is automatically generated. Convert test CoprocessorProtocol implementations to protocol buffer services Key: HBASE-6789 URL: https://issues.apache.org/jira/browse/HBASE-6789 Project: HBase Issue Type: Sub-task Components: Coprocessors Reporter: Gary Helmling Assignee: stack Fix For: 0.96.0 Attachments: 6789.txt, 6789v2.txt, 6789v3.txt With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services. Several CoprocessorProtocol implementations are defined for tests: * ColumnAggregationProtocol * GenericProtocol * TestServerCustomProtocol.PingProtocol These should either be converted to PB services or removed if they duplicate other tests/are no longer necessary. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7398) [0.94 UNIT TESTS] TestAssignmentManager fails frequently on CentOS 5
[ https://issues.apache.org/jira/browse/HBASE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537246#comment-13537246 ] Enis Soztutar commented on HBASE-7398: -- Only the check in testBalance() failed for the last runs, I did not check the others. I'll convert the other places as well.
[jira] [Created] (HBASE-7408) Make compaction to use pread instead of sequential read
Rishir Shroff created HBASE-7408: Summary: Make compaction to use pread instead of sequential read Key: HBASE-7408 URL: https://issues.apache.org/jira/browse/HBASE-7408 Project: HBase Issue Type: Improvement Reporter: Rishir Shroff Priority: Minor As we discovered lately, HFile compactions use sequential reads to fetch blocks. It causes unwanted streaming of data from HDFS to region servers when there is a cache hit. Let's change to use preads to reduce iops on disks.
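The distinction the issue draws can be sketched with plain java.nio (HDFS exposes the same two access styles: a stream read that advances the shared position, and a positional pread such as FSDataInputStream.read(position, buf, off, len)). This is an illustrative sketch, not HBase or HDFS code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.nio.channels.FileChannel;

public class PreadSketch {
    // Positional read: fetch one "block" at a given offset WITHOUT moving the
    // channel's position -- the pread-style access the issue asks compactions
    // to use, so a cache hit doesn't trigger sequential streaming/readahead.
    public static byte[] pread(FileChannel ch, long offset, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        int n = ch.read(buf, offset);          // does NOT advance ch.position()
        if (n < len) throw new IOException("short read: " + n);
        return buf.array();
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("pread", ".dat");
        try {
            Files.write(f, "block0block1block2".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(f, StandardOpenOption.READ)) {
                // Sequential read: the position advances as we go.
                ByteBuffer seq = ByteBuffer.allocate(6);
                ch.read(seq);                   // consumes "block0", moves position
                // pread of the third block leaves the sequential position alone.
                String blk2 = new String(pread(ch, 12, 6), StandardCharsets.US_ASCII);
                System.out.println(blk2 + " pos=" + ch.position());
            }
        } finally {
            Files.delete(f);
        }
    }
}
```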
[jira] [Commented] (HBASE-7351) Periodic health check chore
[ https://issues.apache.org/jira/browse/HBASE-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537250#comment-13537250 ] Lars Hofhansl commented on HBASE-7351: -- Cool... Let's commit this (so it gets into 0.94.4). And add an example script later. (Or I'd be fine with committing this whole thing to 0.94.5, which just means it'll be about 6 weeks or so later) Periodic health check chore --- Key: HBASE-7351 URL: https://issues.apache.org/jira/browse/HBASE-7351 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0 Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Fix For: 0.94.4 Attachments: HBASE-7331_94_1.patch, HBASE-7331_94_2.patch, HBASE-7331_94_3.patch, HBASE-7331_94.patch, HBASE-7331_trunk.patch, HBASE-7351_trunk_2.patch Similar to MAPREDUCE-211, region server should also have a mechanism to check the health of the node. It should run the health check script periodically and if there are any errors, it should stop itself gracefully.
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537252#comment-13537252 ] Andrew Purtell commented on HBASE-5778: --- TestReplication was flapping on a private Jenkins at a previous employer over a span of 6 months. We triaged the problem by increasing SLEEP_TIME and by increasing the number of retries. The result still was not 100% effective. TestReplication sets up two miniclusters and runs replication between them. Whenever we change replication itself, the master, HTable/HConnection, etc., etc., the timing of various actions changes underneath it through complex interactions. Maybe we should move this out of LargeTests into an integration test instead? Fix HLog compression's incompatibilities Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.96.0, 0.94.4 Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778-0.94-v5.patch, HBASE-5778-0.94-v6.patch, HBASE-5778-0.94-v7.patch, HBASE-5778.patch, HBASE-5778-trunk-v6.patch, HBASE-5778-trunk-v7.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two orders of magnitude bigger than the keys), the insert time wasn't different and the CPU usage was 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values).
Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs.
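The win measured above when keys dominate comes from dictionary substitution: WAL entries repeat the same region, table, and family bytes over and over, so replacing a repeated byte string with a short index into a shared dictionary shrinks key-heavy entries dramatically, at some CPU cost. A toy sketch of the idea only, not the actual HLog compression code or its wire format:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy dictionary coder: the first time a string is seen it is emitted in full
// and assigned the next index; later occurrences are emitted as "#index".
// HLog compression applies the same idea to region/table/family byte arrays.
// Toy limitation: assumes literals never start with '#'.
public class DictCoder {
    private final Map<String, Integer> index = new HashMap<>();
    private final List<String> entries = new ArrayList<>();

    public String encode(String s) {
        Integer i = index.get(s);
        if (i != null) return "#" + i;           // repeat: short reference
        index.put(s, entries.size());
        entries.add(s);
        return s;                                 // first occurrence: literal
    }

    public String decode(String tok) {
        if (tok.startsWith("#")) return entries.get(Integer.parseInt(tok.substring(1)));
        // the decoder must grow its dictionary exactly as the encoder did
        index.put(tok, entries.size());
        entries.add(tok);
        return tok;
    }

    public static void main(String[] args) {
        DictCoder enc = new DictCoder();
        String region = "table_x,row123,1355800000000.37bea13d03ed9fa611941cc4aad6e8c2.";
        System.out.println(enc.encode(region));  // emitted once in full
        System.out.println(enc.encode(region));  // afterwards just a tiny token
    }
}
```

This also illustrates the compatibility risk the issue title refers to: encoder and decoder dictionaries must stay in lockstep, so any change to how entries are added breaks older readers.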
[jira] [Commented] (HBASE-7351) Periodic health check chore
[ https://issues.apache.org/jira/browse/HBASE-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537254#comment-13537254 ] Andrew Purtell commented on HBASE-7351: --- Sounds good to me either way [~lhofhansl]. I'll put up a trivial patch on HBASE-7399 today and commit it to trunk. Perhaps [~avandana] can contribute a script for HBASE-7406 before we cut 0.94.4?
[jira] [Commented] (HBASE-7351) Periodic health check chore
[ https://issues.apache.org/jira/browse/HBASE-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537262#comment-13537262 ] Vandana Ayyalasomayajula commented on HBASE-7351: - Thanks [~lhofhansl] and [~apurtell]. I will work on getting an example health check script.
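The shape of the chore described in this thread — run a health check periodically, and stop the region server gracefully after persistent failures — can be sketched as below. All names and the failure threshold are illustrative; the actual HBASE-7351 patch wires a configurable script runner into the region server's chore framework:

```java
import java.util.function.BooleanSupplier;

// Sketch of a periodic health-check chore. The BooleanSupplier stands in for
// running the operator-supplied health script; in the real patch the script's
// exit status / output decides health. Not the actual HBASE-7351 code.
public class HealthCheckChore {
    private final BooleanSupplier check;   // true = node healthy
    private final int failureThreshold;    // consecutive failures before stopping
    private int consecutiveFailures = 0;
    private boolean stopRequested = false;

    public HealthCheckChore(BooleanSupplier check, int failureThreshold) {
        this.check = check;
        this.failureThreshold = failureThreshold;
    }

    /** One chore tick; returns true while the server should keep running. */
    public boolean tick() {
        if (check.getAsBoolean()) {
            consecutiveFailures = 0;       // a healthy run resets the counter
        } else if (++consecutiveFailures >= failureThreshold) {
            stopRequested = true;          // ask for a graceful stop, per the issue
        }
        return !stopRequested;
    }

    public boolean isStopRequested() { return stopRequested; }

    public static void main(String[] args) {
        // Simulate two healthy checks followed by persistent failure.
        int[] runs = {0};
        HealthCheckChore chore = new HealthCheckChore(() -> runs[0]++ < 2, 3);
        while (chore.tick()) { }
        System.out.println("stop requested after " + runs[0] + " checks");
    }
}
```

Requiring several consecutive failures (rather than one) keeps a transient blip in the script from taking down an otherwise healthy server.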
[jira] [Commented] (HBASE-7329) remove flush-related records from WAL
[ https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537263#comment-13537263 ] Sergey Shelukhin commented on HBASE-7329: - TestDelayedRpc is flaky. TestLogRolling failure looks like it could be caused by this patch but I cannot reproduce it. Looking... remove flush-related records from WAL - Key: HBASE-7329 URL: https://issues.apache.org/jira/browse/HBASE-7329 Project: HBase Issue Type: Improvement Components: wal Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7329-v0.patch Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush records in WAL are not useful. If so, they should be removed.
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537266#comment-13537266 ] Ted Yu commented on HBASE-5778: --- I ran TestReplication locally and it failed on second run: {code} testVerifyRepJob(org.apache.hadoop.hbase.replication.TestReplication) Time elapsed: 16.781 sec FAILURE! java.lang.AssertionError: Waited too much time for truncate at org.junit.Assert.fail(Assert.java:93) at org.apache.hadoop.hbase.replication.TestReplication.setUp(TestReplication.java:180) ... queueFailover(org.apache.hadoop.hbase.replication.TestReplication) Time elapsed: 14.587 sec FAILURE! java.lang.AssertionError: Waited too much time for truncate at org.junit.Assert.fail(Assert.java:93) at org.apache.hadoop.hbase.replication.TestReplication.setUp(TestReplication.java:180) {code} If I remember correctly, testVerifyRepJob used to pass.
[jira] [Commented] (HBASE-7404) Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE
[ https://issues.apache.org/jira/browse/HBASE-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537271#comment-13537271 ] Andrew Purtell commented on HBASE-7404: --- Wow. So if using the heap engine this is an alternative or replacement for HBASE-4027 aka SlabCache? Second to last slide is results of heap engine tests, correct? Have you done any direct comparisons between this and the SlabCache? Bucket Cache:A solution about CMS,Heap Fragment and Big Cache on HBASE -- Key: HBASE-7404 URL: https://issues.apache.org/jira/browse/HBASE-7404 Project: HBase Issue Type: New Feature Affects Versions: 0.94.3 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0, 0.94.5 Attachments: BucketCache.pdf, hbase-7404-0.94v1.patch, hbase-7404-trunkv1.patch First, thanks @neil from Fusion-IO for sharing the source code. What's Bucket Cache? It could greatly decrease CMS pauses and heap fragmentation caused by GC, and it supports a large cache space for high read performance by using a high speed disk like Fusion-io. 1.An implementation of block cache like LruBlockCache 2.Self manage blocks' storage position through Bucket Allocator 3.The cached blocks could be stored in the memory or file system 4.Bucket Cache could be used as a main block cache (see CombinedBlockCache), combined with LruBlockCache to decrease CMS pauses and fragmentation caused by GC. 5.BucketCache also could be used as a secondary cache (e.g. using Fusion-io to store blocks) to enlarge cache space See more in the attachment and in the patch
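The core trick in points 1-2 above — avoiding heap fragmentation by carving a pre-allocated space into buckets of fixed-size slots — can be sketched as a free-list allocator. This illustrates the idea only, not the patch's actual BucketAllocator:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: one pre-carved region of `capacity` bytes split into equal slots of
// `slotSize`. Allocation hands out a fixed offset; freeing returns it to the
// free list. Because slots never change size there is no fragmentation, and
// the backing store can live off-heap or in a file, outside the GC'd heap --
// which is why this sidesteps CMS pauses.
public class Bucket {
    private final Deque<Long> freeSlots = new ArrayDeque<>();
    private final int slotSize;

    public Bucket(long baseOffset, long capacity, int slotSize) {
        this.slotSize = slotSize;
        for (long off = 0; off + slotSize <= capacity; off += slotSize) {
            freeSlots.push(baseOffset + off);
        }
    }

    /** Returns the offset of a free slot, or -1 if the bucket is full. */
    public long allocate() {
        Long off = freeSlots.poll();
        return off == null ? -1 : off;
    }

    public void free(long offset) { freeSlots.push(offset); }

    public int freeCount() { return freeSlots.size(); }

    public int slotSize() { return slotSize; }
}
```

A full allocator would keep one such bucket per block-size class (4 KB, 8 KB, 16 KB, ...) and route each cached block to the smallest class that fits it.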
[jira] [Updated] (HBASE-7254) Consider replacing AccessController ZK-mediated permissions cache
[ https://issues.apache.org/jira/browse/HBASE-7254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7254: -- Summary: Consider replacing AccessController ZK-mediated permissions cache (was: Use new Globally Barriered Procedure mechanism to replace AccessController ZK-mediated permissions cache ) Consider replacing AccessController ZK-mediated permissions cache - Key: HBASE-7254 URL: https://issues.apache.org/jira/browse/HBASE-7254 Project: HBase Issue Type: Task Components: Coprocessors, security Affects Versions: 0.96.0 Reporter: Andrew Purtell Assignee: Andrew Purtell After HBASE-7212 goes in, we could tighten up the permissions cache using a barrier for grant and revoke ops. We should consider replacing the current ZK watcher based permissions cache RPC via ZK with this Procedure mechanism that provides much the same, but with the added benefit that we can fail the grant or revoke op if one or more RSes fail to ack the update.
[jira] [Commented] (HBASE-7373) table should not be required in AccessControlService
[ https://issues.apache.org/jira/browse/HBASE-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537274#comment-13537274 ] Hudson commented on HBASE-7373: --- Integrated in HBase-TRUNK #3643 (See [https://builds.apache.org/job/HBase-TRUNK/3643/]) HBASE-7373 table should not be required in AccessControlService (Revision 1424604) Result = FAILURE jxiang : Files : * /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AccessControlProtos.java * /hbase/trunk/hbase-protocol/src/main/protobuf/AccessControl.proto * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/TableAuthManager.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java table should not be required in AccessControlService Key: HBASE-7373 URL: https://issues.apache.org/jira/browse/HBASE-7373 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0 Attachments: trunk-7373.patch We should fix the proto file, add a unit test for this case, and verify it works from hbase shell with table set to nil.
[jira] [Updated] (HBASE-7254) Consider replacing AccessController ZK-mediated permissions cache
[ https://issues.apache.org/jira/browse/HBASE-7254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7254: -- Description: After HBASE-5487 or HBASE-7212 goes in, we could replace the AccessController's permissions cache update via ZK using one of these more general frameworks, thus reducing functional duplication and code. (was: After HBASE-7212 goes in, we could tighten up the permissions cache using a barrier for grant and revoke ops. We should consider replacing the current ZK watcher based permissions cache RPC via ZK with this Procedure mechanism that provides much the same, but with the added benefit that we can fail the grant or revoke op if one or more RSes fail to ack the update.)
[jira] [Updated] (HBASE-7398) [0.94 UNIT TESTS] TestAssignmentManager fails frequently on CentOS 5
[ https://issues.apache.org/jira/browse/HBASE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-7398: - Attachment: hbase-7398_v2.patch v2 changes all instances to use Mocking.waitForRegionPendingOpenInRIT() as done in trunk TestAssignmentManager.
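The unbounded `while (...) { Threads.sleep(1); }` loops being replaced here are exactly the pattern that hangs a test run forever when the awaited state never arrives. A generic bounded-wait helper in the spirit of the Mocking.waitForRegionPendingOpenInRIT() change (names here are illustrative, not the 0.94 test code):

```java
import java.util.function.BooleanSupplier;

public class WaitUtil {
    /**
     * Polls `condition` every `intervalMs` until it holds or `timeoutMs`
     * elapses. Returning false instead of spinning forever lets the test fail
     * with a diagnosable assertion rather than a whole-build timeout.
     */
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) return false;
            Thread.sleep(intervalMs);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // A condition that becomes true after roughly 50ms.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 50, 2000, 1);
        System.out.println(ok);
    }
}
```

A test would then do something like `assertTrue("region never went OFFLINE", waitFor(..., 30000, 10))`, which fails loudly instead of stalling.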
[jira] [Commented] (HBASE-7403) Online Merge
[ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537276#comment-13537276 ] Andrew Purtell commented on HBASE-7403: --- bq. This is related to HBASE-5487: Generic framework for Master-coordinated tasks Agreed. We shouldn't have more ZooKeeper mediated frameworks like this than necessary. Right now I can think of three: one for snapshots, one for merging here, one for security policy updates (see HBASE-7254). Online Merge Key: HBASE-7403 URL: https://issues.apache.org/jira/browse/HBASE-7403 Project: HBase Issue Type: New Feature Affects Versions: 0.94.3 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0, 0.94.5 Attachments: hbase-7403-94v1.patch, hbase-7403-trunkv1.patch, merge region.pdf We need merge in the following cases: 1.Region hole or region overlap that can’t be fixed by hbck 2.Regions become empty because of TTL and unreasonable rowkey design 3.Regions are always empty or very small because of presplit when creating the table 4.Too many empty or small regions would reduce the system performance (e.g. mslab) Current merge tools only support offline mode and are not able to redo if an exception is thrown in the process of merging, causing dirty data. For an online system, we need an online merge. The implementation logic of this patch for Online Merge is: For example, merge regionA and regionB into regionC 1.Offline the two regions A and B 2.Merge the two regions in the HDFS (create regionC’s directory, move regionA’s and regionB’s files to regionC’s directory, delete regionA’s and regionB’s directories) 3.Add the merged regionC to .META. 4.Assign the merged regionC By design of this patch, once we do the merge work in the HDFS, we could redo it until successful if it throws an exception or aborts or the server restarts, but it couldn’t be rolled back.
It depends on ZooKeeper: ZooKeeper is used to record the transaction journal state (making redo easier) and to send/receive merge requests. The merge transaction is executed on the master, and merge requests can be issued through the API or a shell tool. About the merge process, please see the attachment and patch
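Step 2 of the process above — create the merged region's directory, move both parents' files into it, then delete the parent directories — is the part that is redoable but not rollback-able, because each action is idempotent when repeated. A filesystem-level sketch with java.nio.file standing in for the HDFS calls (illustrative only; it assumes a flat store-file layout, while real region directories contain per-family subdirectories):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class MergeDirs {
    /**
     * Merge step 2 from the issue: create regionC's directory, move the
     * parents' files into it, delete the emptied parent directories. Safe to
     * re-run after a crash: already-moved files and already-deleted parents
     * are simply skipped -- redo-until-successful, no rollback.
     */
    public static void mergeRegionDirs(Path regionC, Path... parents) throws IOException {
        Files.createDirectories(regionC);                 // no-op if it exists
        for (Path parent : parents) {
            if (!Files.exists(parent)) continue;          // redo: already handled
            try (DirectoryStream<Path> files = Files.newDirectoryStream(parent)) {
                for (Path f : files) {
                    Files.move(f, regionC.resolve(f.getFileName()),
                               StandardCopyOption.REPLACE_EXISTING);
                }
            }
            Files.delete(parent);                         // parent is now empty
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("merge-demo");
        Path a = Files.createDirectories(tmp.resolve("regionA"));
        Path b = Files.createDirectories(tmp.resolve("regionB"));
        Files.write(a.resolve("storefile-A"), new byte[8]);
        Files.write(b.resolve("storefile-B"), new byte[8]);
        mergeRegionDirs(tmp.resolve("regionC"), a, b);
        System.out.println(Files.exists(tmp.resolve("regionC").resolve("storefile-B")));
    }
}
```

Running the method a second time over the same arguments changes nothing, which is what lets the master retry after an abort or restart.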
[jira] [Resolved] (HBASE-7091) support custom GC options in hbase-env.sh
[ https://issues.apache.org/jira/browse/HBASE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates resolved HBASE-7091. Resolution: Fixed Committed to both 0.94 and 0.96 support custom GC options in hbase-env.sh - Key: HBASE-7091 URL: https://issues.apache.org/jira/browse/HBASE-7091 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.4 Reporter: Jesse Yates Assignee: Jesse Yates Labels: newbie Fix For: 0.96.0, 0.94.4 Attachments: hbase-7091-v1.patch When running things like bin/start-hbase and bin/hbase-daemon.sh start [master|regionserver|etc] we end up setting HBASE_OPTS property a couple times via calling hbase-env.sh. This is generally not a problem for most cases, but when you want to set your own GC log properties, one would think you should set HBASE_GC_OPTS, which gets added to HBASE_OPTS. NOPE! That would make too much sense. Running bin/hbase-daemons.sh will run bin/hbase-daemon.sh with the daemons it needs to start. Each time through hbase-daemon.sh we also call bin/hbase. This isn't a big deal except for each call to hbase-daemon.sh, we also source hbase-env.sh twice (once in the script and once in bin/hbase). This is important for my next point. Note that to turn on GC logging, you uncomment: {code} # export HBASE_OPTS=$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS {code} and then to log to a gc file for each server, you then uncomment: {code} # export HBASE_USE_GC_LOGFILE=true {code} in hbase-env.sh On the first pass through hbase-daemon.sh, HBASE_GC_OPTS isn't set, so HBASE_OPTS doesn't get anything funky, but we set HBASE_USE_GC_LOGFILE, which then sets HBASE_GC_OPTS to the log file (-Xloggc:...). Then in bin/hbase we again run hbase-env.sh, which now has HBASE_GC_OPTS set, adding the GC file. This isn't a general problem because HBASE_OPTS is set without prefixing the existing HBASE_OPTS (eg. HBASE_OPTS=$HBASE_OPTS ...), allowing easy updating. 
However, GC OPTS don't work the same and this is really odd behavior when you want to set your own GC opts, which can include turning on GC log rolling (yes, yes, they really are jvm opts, but they ought to support their own param, to help minimize clutter). The simple version of this patch will just add an idempotent GC option to hbase-env.sh and some comments that uncommenting {code} # export HBASE_USE_GC_LOGFILE=true {code} will lead to a custom gc log file per server (along with an example name), so you don't need to set -Xloggc. The more complex solution does the above and also solves the multiple calls to hbase-env.sh so we can be sane about how all this works. Note that to fix this, hbase-daemon.sh just needs to read in HBASE_USE_GC_LOGFILE after sourcing hbase-env.sh and then update HBASE_OPTS. Oh and also not source hbase-env.sh in bin/hbase. Even further, we might want to consider adding options just for cases where we don't need gc logging - i.e. the shell, the config reading tool, hbck, etc. This is the hardest version to handle since the first couple will willy-nilly apply the gc options.
[jira] [Updated] (HBASE-7091) support custom GC options in hbase-env.sh
[ https://issues.apache.org/jira/browse/HBASE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-7091: --- Fix Version/s: 0.96.0
[jira] [Commented] (HBASE-7390) Add extra test cases for assignment on the region server and fix the related issues
[ https://issues.apache.org/jira/browse/HBASE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537280#comment-13537280 ] stack commented on HBASE-7390: -- bq. What's strange to me is that it seems similar to a RS_ZK_REGION_CLOSED, and in this case we don't retickle at all: we start the close handler immediately. Agree we should do one or the other. The tickle does not seem necessary if, when a new master joins the cluster and it sees a SPLIT, it does the cleanup immediately -- queues the split handler as is done for CLOSED (?). Add extra test cases for assignment on the region server and fix the related issues Key: HBASE-7390 URL: https://issues.apache.org/jira/browse/HBASE-7390 Project: HBase Issue Type: Bug Components: Region Assignment, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: 7390.v1.patch, 7390.v2.patch, 7390.v3.patch, 7390.v4.patch, assignment_zk_states.jpg We don't have a lot of tests on the region server itself. Here are some. Some of them are failing, feedback welcome. See as well the attached state diagram for the ZK nodes on assignment.
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537282#comment-13537282 ] Devaraj Das commented on HBASE-5448: [~stack], yes, this is how I think it needs to be done. For examples, please have a look at the AggregateClient and AggregateImplementation classes. On the client side, you could do the following (to check/signal exceptions): {code} AggregateResponse response = rpcCallback.get(); if (controller.failedOnException()) { throw controller.getFailedOn(); } {code} [~ghelmling], please chime in if I missed anything. Support for dynamic coprocessor endpoints with PB-based RPC --- Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: IPC/RPC, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448_4.patch, HBASE-5448.patch
[jira] [Commented] (HBASE-7294) Check for snapshot file cleaners on start
[ https://issues.apache.org/jira/browse/HBASE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537288#comment-13537288 ] Jesse Yates commented on HBASE-7294: minor nit: we have a lot of snapshot tests that need the cleaners, maybe centralize to a single method to add the cleaners in a SnapshotTestingUtility (or something similar)? Could even be a setupCluster(Configuration) method so we can easily add on other things later (though this may all be excessive :)) +1 if tests are passing. Check for snapshot file cleaners on start - Key: HBASE-7294 URL: https://issues.apache.org/jira/browse/HBASE-7294 Project: HBase Issue Type: Sub-task Components: Client, master, regionserver, snapshots, Zookeeper Affects Versions: hbase-6055 Reporter: Jesse Yates Assignee: Matteo Bertozzi Fix For: hbase-6055, 0.96.0 Attachments: HBASE-7294-v1.patch, HBASE-7294-v2.patch, HBASE-7294-v3.patch, HBASE-7294-v4.patch Snapshots currently use the SnapshotHFileCleaner and SnapshotHLogCleaner to ensure that any hfiles or hlogs (respectively) that are currently part of a snapshot are not removed from their respective archive directories (.archive and .oldlogs). From Matteo Bertozzi: {quote} currently the snapshot cleaner is not in hbase-default.xml and there's no warning/exception on snapshot/restore operation, if not enabled. even if we add the cleaner to the hbase-default.xml how do we ensure that the user doesn't remove it? Do we want to hardcode the cleaner at master startup? Do we want to add a check in snapshot/restore that throws an exception if the cleaner is not enabled? {quote}
[jira] [Updated] (HBASE-7329) remove flush-related records from WAL
[ https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7329: Attachment: HBASE-7329-v0-tmp.patch try to nudge Hudson with some additional logging. Test might have a timing issue, don't think there's really a problem. remove flush-related records from WAL - Key: HBASE-7329 URL: https://issues.apache.org/jira/browse/HBASE-7329 Project: HBase Issue Type: Improvement Components: wal Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7329-v0.patch, HBASE-7329-v0-tmp.patch Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush records in WAL are not useful. If so, they should be removed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7210) Backport HBASE-6059 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7210: - Fix Version/s: (was: 0.94.4) 0.94.5 Let me push to 0.94.5 Backport HBASE-6059 to 0.94 --- Key: HBASE-7210 URL: https://issues.apache.org/jira/browse/HBASE-7210 Project: HBase Issue Type: Bug Affects Versions: 0.94.2 Reporter: ramkrishna.s.vasudevan Fix For: 0.94.5 Attachments: 6059-94.patch HBASE-6059 seems to be an important issue. Chunhui has already given a patch for 94. Need to rebase if it does not apply cleanly. Raising a new one as the old issue is already closed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7294) Check for snapshot file cleaners on start
[ https://issues.apache.org/jira/browse/HBASE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537293#comment-13537293 ] Jonathan Hsieh commented on HBASE-7294: --- Another consideration is to turn these cleaners on by default, or have a single is-snapshots-on? config var that does this work. This would be a follow-on jira, and not necessary here. Check for snapshot file cleaners on start - Key: HBASE-7294 URL: https://issues.apache.org/jira/browse/HBASE-7294 Project: HBase Issue Type: Sub-task Components: Client, master, regionserver, snapshots, Zookeeper Affects Versions: hbase-6055 Reporter: Jesse Yates Assignee: Matteo Bertozzi Fix For: hbase-6055, 0.96.0 Attachments: HBASE-7294-v1.patch, HBASE-7294-v2.patch, HBASE-7294-v3.patch, HBASE-7294-v4.patch Snapshots currently use the SnapshotHFileCleaner and SnapshotHLogCleaner to ensure that any hfiles or hlogs (respectively) that are currently part of a snapshot are not removed from their respective archive directories (.archive and .oldlogs). From Matteo Bertozzi: {quote} currently the snapshot cleaner is not in hbase-default.xml and there's no warning/exception on snapshot/restore operation, if not enabled. even if we add the cleaner to the hbase-default.xml how do we ensure that the user doesn't remove it? Do we want to hardcode the cleaner at master startup? Do we want to add a check in snapshot/restore that throws an exception if the cleaner is not enabled? {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537300#comment-13537300 ] Gary Helmling commented on HBASE-5448: -- [~saint@gmail.com] Comments below: bq. A thought. I think this pb-based way of doing dynamic cps elegant after looking at it a while. I know 'elegant' is not the first thing that comes to mind when you have pb and rpc in the mix, but hey, can't keep it to myself. I'm glad someone sees it that way :) It is at least consistent, if a little verbose. I think it could be more elegant if we did a custom PB compiler plugin and required service implementors to compile their endpoint definitions with our own script or build target. Then I think we could control the generated service method signatures. Maybe if I'm feeling especially crazy I'll check that out over the holidays. But I wouldn't consider actually shipping that unless it significantly simplified these cases and didn't require additional mass changes. bq. One thing I think we have lost though going to this new mechanism is the ability to do generics: i.e. GenericProtocol over in hbase-server/src/test can't be made work now. I believe this so because pb requires you specify a type: https://developers.google.com/protocol-buffers/docs/proto#simple Do you agree G? Yes, I agree, though in a way generics don't even apply with the use of protobufs. Services could do more dynamic interpretation of messages, but it would be up to them to implement that in a way that made sense for the specific case. I don't think there's anything we need to do to support this. bq. How to do exceptions in a coprocessor? I see where we set the exception on the controller if there is one, but should we then abandon further processing – return? We need to call the RpcCallback done, though, right? Yes, the exception should be set on the controller in order to be returned to the client. 
It seems to be good practice to always call RpcCallback.done(), but it's not strictly required for endpoint implementations and it should also be fine to pass a null argument in the case of an exception. Your implementation looks fine to me, assuming that sum is a required field in the proto message, otherwise you could skip setting the dummy value in the response on an exception. One additional idea would be to define a custom unchecked exception (EndpointException extends RuntimeException?) which we could watch for and use to set the exception in the controller, but either with this or the current ResponseConverter.setControllerException() we're relying on convention over a real contract, which doesn't seem great. Support for dynamic coprocessor endpoints with PB-based RPC --- Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: IPC/RPC, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448_4.patch, HBASE-5448.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
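The convention Gary describes — record the exception on the controller, then still complete the callback with a null response, and have the client check the controller after the call — can be sketched with a minimal self-contained model. The Controller and Callback classes below are simplified stand-ins (not the real ServerRpcController or com.google.protobuf.RpcCallback), so names and signatures here are illustrative only:

```java
import java.io.IOException;

public class EndpointExceptionDemo {

    /** Minimal stand-in for ServerRpcController's failure tracking. */
    static class Controller {
        private IOException failedOn;
        void setFailedOn(IOException e) { this.failedOn = e; }
        boolean failedOnException() { return failedOn != null; }
        IOException getFailedOn() { return failedOn; }
    }

    /** Minimal stand-in for com.google.protobuf.RpcCallback<R>. */
    interface Callback<R> { void run(R response); }

    /** Server side: on error, record it on the controller, then still
     *  complete the callback, passing null as the response. */
    static void endpointMethod(Controller controller, Callback<Long> done) {
        try {
            throw new IOException("scan failed"); // simulated endpoint failure
        } catch (IOException e) {
            controller.setFailedOn(e); // analogous to ResponseConverter.setControllerException
            done.run(null);            // done is still called, with a null response
        }
    }

    /** Client side: check the controller after the call, as in the thread above. */
    public static String clientSide() {
        Controller controller = new Controller();
        final Long[] result = new Long[1];
        endpointMethod(controller, r -> result[0] = r);
        if (controller.failedOnException()) {
            return "failed: " + controller.getFailedOn().getMessage();
        }
        return "ok: " + result[0];
    }

    public static void main(String[] args) {
        System.out.println(clientSide()); // prints "failed: scan failed"
    }
}
```

The point of the model is the contract, not the classes: the response value alone cannot signal failure, so both sides must agree that the controller carries the error.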
[jira] [Commented] (HBASE-7321) Simple Flush Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537301#comment-13537301 ] Jesse Yates commented on HBASE-7321: This strikes me as 'not really a snapshot' in that it's very hard to reason about what writes will or won't be included in the snapshot, from a client perspective. To me (and maybe this is a group of 1) snapshot means 'at a point in time', but this is a far more ragged form. Traditionally, I believe snapshots are (the current state of the DB) - (uncommitted transactions), which for our case would be mutations that haven't completed when the snapshot starts. Because we don't coordinate between regionservers, we can't give 'a point in time' as a reference, but rather just a 'fuzzy' approximation (as you mention in the review, it's an optimization on copytable), and therefore I don't feel that 'snapshot' is the best name for this operation as people could easily be confused by this. I'd be ok if you went with 'Fuzzy Snapshot' and a decent description (brings up a case where we need to add a section to the refguide about the different types of snapshots with good descriptions) Simple Flush Snapshot - Key: HBASE-7321 URL: https://issues.apache.org/jira/browse/HBASE-7321 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-7321.v2.patch, pre-hbase-7321.v2.patch This snapshot style just issues a region flush and then snapshots the region. This is a simple implementation that gives the equivalent of copytable consistency. While by most definitions of consistency if a client writes A and then writes B to different region servers, only neither, only A, or both A+B writes should be present, this one allows the only-B case. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7390) Add extra test cases for assignment on the region server and fix the related issues
[ https://issues.apache.org/jira/browse/HBASE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537311#comment-13537311 ] stack commented on HBASE-7390: -- On the patch, what Sergey said about different closeRegion returns. Could the one that throws an exception be private? Would putIfAbsent be easier to read if it took an enum rather than a boolean to say whether open or close? For example, what was there previously was opaque enough and I appreciate your collecting together tests into a single method but the below could be better: -if (this.regionsInTransitionInRS.containsKey(region.getEncodedNameAsBytes())) { - LOG.warn(Received close for region we are already opening or closing; + -region.getEncodedName()); +if (putIfAbsent(region, false)){ + // We're already closing this region. This below continue is right? + builder.addOpeningState(RegionOpeningState.OPENED); + continue; I see you do it again later if we continue... so above looks like it makes sense. Review comments: some errors, e.g. + * @param expectedVersion expecpted version og the znode It looks like klingon. I appreciate these changes: - String regionName) + String encodedRegionName) I always have to back up to find what is needed making these calls. I suppose we should make an encodedRegionName type one of these days. Do we need to do this? +zkw.sync(encoded); It's to be 'sure'... It's expensive though... Nice test. Can you say more on this @nkeywal? This code there is very very smart: if there is an open in progress, it changes the internal state to close, then raises an exception through the call to checkIfRegionInTransition. As we changed the state, we will have, if we were currently opening, a message saying that we were trying to close a region already closing. I'm not sure I follow. The patch looks great, cleaning up fuzzy state. Let me get Jimmy to take a looksee too. He LOVEs this stuff. [~jxiang] Any chance of your taking a look here boss? 
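Stack's enum-over-boolean suggestion for putIfAbsent can be sketched as below. This is an illustrative stand-in for the region-server transition map in the patch, not the actual patch code, and the class and method names are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RegionsInTransitionDemo {

    /** Self-documenting at the call site, unlike putIfAbsent(region, false). */
    enum TransitionOp { OPEN, CLOSE }

    private final ConcurrentMap<String, TransitionOp> regionsInTransitionInRS =
        new ConcurrentHashMap<>();

    /**
     * Records the transition; returns true if the region was already in
     * transition (mirroring the containsKey check in the original snippet,
     * but done atomically in a single map operation).
     */
    boolean putIfAbsent(String encodedRegionName, TransitionOp op) {
        return regionsInTransitionInRS.putIfAbsent(encodedRegionName, op) != null;
    }

    public static void main(String[] args) {
        RegionsInTransitionDemo rs = new RegionsInTransitionDemo();
        // Call sites now read as intent, not as an opaque boolean:
        System.out.println(rs.putIfAbsent("abc123", TransitionOp.CLOSE)); // false: newly recorded
        System.out.println(rs.putIfAbsent("abc123", TransitionOp.OPEN));  // true: already in transition
    }
}
```

Beyond readability, the single putIfAbsent call also closes the check-then-act race that a separate containsKey test leaves open.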
Add extra test cases for assignment on the region server and fix the related issues Key: HBASE-7390 URL: https://issues.apache.org/jira/browse/HBASE-7390 Project: HBase Issue Type: Bug Components: Region Assignment, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: 7390.v1.patch, 7390.v2.patch, 7390.v3.patch, 7390.v4.patch, assignment_zk_states.jpg We don't have a lot of tests on the region server itself. Here are some. Some of them are failing, feedback welcome. See as well the attached state diagram for the ZK nodes on assignment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7218) Rename Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537312#comment-13537312 ] Jesse Yates commented on HBASE-7218: {quote} 1) Junk on failure. This is unacceptable IMO. If rename crashes we should not leave any corrupted snapshots behind. I'm concerned about the original ss (not sure about this) if the recursive hdfs delete is not atomic. {quote} With the cleanup of the /hbase/.snapshot/.tmp directory (HBASE-7240) we remove any of the junk from a previous failure. As the directory rename is atomic WRT the namenode, we then have an atomic rename with 'rollback in case of failure' semantics (unless I'm missing something). Rename Snapshot --- Key: HBASE-7218 URL: https://issues.apache.org/jira/browse/HBASE-7218 Project: HBase Issue Type: Sub-task Components: snapshots Affects Versions: hbase-6055 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: hbase-6055 Attachments: HBASE-7218-v0.patch, HBASE-7218-v1.patch Add the ability to rename a snapshot. HBaseAdmin.renameSnapshot(oldName, newName) shell: snapshot_rename 'oldName', 'newName' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
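The 'build under .tmp, then atomically move into place' pattern being discussed can be sketched with java.nio on a local filesystem standing in for HDFS. Paths, names, and the single-file "snapshot" below are illustrative; the real snapshot layout is a directory tree:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SnapshotRenameDemo {

    /**
     * Copy-then-rename: stage the renamed snapshot under a .tmp directory,
     * then atomically move it into place. A crash before the final move
     * leaves only .tmp junk, which a startup cleanup pass (the HBASE-7240
     * analog) can safely remove — giving rollback-on-failure semantics.
     */
    static Path renameSnapshot(Path snapshotRoot, String oldName, String newName)
            throws IOException {
        Path tmp = snapshotRoot.resolve(".tmp").resolve(newName);
        Files.createDirectories(tmp.getParent());
        Files.copy(snapshotRoot.resolve(oldName), tmp); // non-atomic staging phase
        Path target = snapshotRoot.resolve(newName);
        // The move is atomic w.r.t. the filesystem (like an HDFS namenode
        // rename), so readers see either no new snapshot or a complete one.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        Files.delete(snapshotRoot.resolve(oldName));
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("snapshots");
        Files.writeString(root.resolve("old-snap"), "snapshot manifest");
        renameSnapshot(root, "old-snap", "new-snap");
        System.out.println(Files.exists(root.resolve("new-snap"))); // prints "true"
    }
}
```

The key property is that only the final move needs to be atomic; everything before it is disposable staging state.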
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537313#comment-13537313 ] Lars Hofhansl commented on HBASE-5778: -- Ok. Now it's failing locally again, even with patch reverted. Sigh. Let's not revert then. I triggered another 0.94 jenkins build. Fix HLog compression's incompatibilities Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.96.0, 0.94.4 Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778-0.94-v5.patch, HBASE-5778-0.94-v6.patch, HBASE-5778-0.94-v7.patch, HBASE-5778.patch, HBASE-5778-trunk-v6.patch, HBASE-5778-trunk-v7.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7218) Rename Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537317#comment-13537317 ] Jesse Yates commented on HBASE-7218: Since we are just doing copies for the rename, this should be fine except in cases of cloning/export, which should receive some thorough commenting to avoid contention. However, I don't think we need to have server-side exclusion of the actions - you're a bad admin if you try to export a snapshot and rename it at the same time. To the end of helping admins, maybe a follow-on jira to surface ALL running snapshot operations to the UI? Rename Snapshot --- Key: HBASE-7218 URL: https://issues.apache.org/jira/browse/HBASE-7218 Project: HBase Issue Type: Sub-task Components: snapshots Affects Versions: hbase-6055 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: hbase-6055 Attachments: HBASE-7218-v0.patch, HBASE-7218-v1.patch Add the ability to rename a snapshot. HBaseAdmin.renameSnapshot(oldName, newName) shell: snapshot_rename 'oldName', 'newName' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537316#comment-13537316 ] Gary Helmling commented on HBASE-5448: -- bq. if (controller.failedOnException()) { throw controller.getFailedOn(); } Since this was becoming such a common pattern, I recently added ServerRpcController.checkFailed() to do this in a single step. Support for dynamic coprocessor endpoints with PB-based RPC --- Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: IPC/RPC, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448_4.patch, HBASE-5448.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
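What that one-step helper collapses can be sketched with a simplified stand-in (the Controller class below models ServerRpcController's failure-tracking API only; it is not the real implementation):

```java
import java.io.IOException;

public class CheckFailedDemo {

    /** Simplified stand-in for ServerRpcController's failure tracking. */
    static class Controller {
        private IOException failedOn;
        void setFailedOn(IOException e) { failedOn = e; }
        boolean failedOnException() { return failedOn != null; }
        IOException getFailedOn() { return failedOn; }

        /** Replaces the two-line check-then-throw pattern with one call. */
        void checkFailed() throws IOException {
            if (failedOnException()) {
                throw getFailedOn();
            }
        }
    }

    public static String demo() {
        Controller controller = new Controller();
        controller.setFailedOn(new IOException("endpoint error"));
        try {
            // instead of: if (controller.failedOnException()) throw controller.getFailedOn();
            controller.checkFailed();
            return "no failure";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "endpoint error"
    }
}
```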
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537318#comment-13537318 ] Lars Hofhansl commented on HBASE-5778: -- I also do not see anything obvious in the patch that would cause new problems in the tests. Fix HLog compression's incompatibilities Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.96.0, 0.94.4 Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778-0.94-v5.patch, HBASE-5778-0.94-v6.patch, HBASE-5778-0.94-v7.patch, HBASE-5778.patch, HBASE-5778-trunk-v6.patch, HBASE-5778-trunk-v7.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5448) Support for dynamic coprocessor endpoints with PB-based RPC
[ https://issues.apache.org/jira/browse/HBASE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537332#comment-13537332 ] stack commented on HBASE-5448: -- bq. Maybe if I'm feeling especially crazy I'll check that out over the holidays. I'd say wait for some demand. Drink eggnog instead! Thanks for input on the exception handling (and you too DD). Support for dynamic coprocessor endpoints with PB-based RPC --- Key: HBASE-5448 URL: https://issues.apache.org/jira/browse/HBASE-5448 Project: HBase Issue Type: Sub-task Components: IPC/RPC, master, migration, regionserver Reporter: Todd Lipcon Assignee: Gary Helmling Fix For: 0.96.0 Attachments: HBASE-5448_2.patch, HBASE-5448_3.patch, HBASE-5448_4.patch, HBASE-5448.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7091) support custom GC options in hbase-env.sh
[ https://issues.apache.org/jira/browse/HBASE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537349#comment-13537349 ] Hudson commented on HBASE-7091: --- Integrated in HBase-0.94 #649 (See [https://builds.apache.org/job/HBase-0.94/649/]) HBASE-7091: Support custom GC options in hbase-env.sh (Revision 1424646) Result = FAILURE jyates : Files : * /hbase/branches/0.94/bin/hbase * /hbase/branches/0.94/conf/hbase-env.sh support custom GC options in hbase-env.sh - Key: HBASE-7091 URL: https://issues.apache.org/jira/browse/HBASE-7091 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.4 Reporter: Jesse Yates Assignee: Jesse Yates Labels: newbie Fix For: 0.96.0, 0.94.4 Attachments: hbase-7091-v1.patch When running things like bin/start-hbase and bin/hbase-daemon.sh start [master|regionserver|etc] we end up setting HBASE_OPTS property a couple times via calling hbase-env.sh. This is generally not a problem for most cases, but when you want to set your own GC log properties, one would think you should set HBASE_GC_OPTS, which get added to HBASE_OPTS. NOPE! That would make too much sense. Running bin/hbase-daemons.sh will run bin/hbase-daemon.sh with the daemons it needs to start. Each time through hbase-daemon.sh we also call bin/hbase. This isn't a big deal except for each call to hbase-daemon.sh, we also source hbase-env.sh twice (once in the script and once in bin/hbase). This is important for my next point. Note that to turn on GC logging, you uncomment: {code} # export HBASE_OPTS=$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS {code} and then to log to a gc file for each server, you then uncomment: {code} # export HBASE_USE_GC_LOGFILE=true {code} in hbase-env.sh On the first pass through hbase-daemon.sh, HBASE_GC_OPTS isn't set, so HBASE_OPTS doesn't get anything funky, but we set HBASE_USE_GC_LOGFILE, which then sets HBASE_GC_OPTS to the log file (-Xloggc:...). 
Then in bin/hbase we again run hbase-env.sh, which now has HBASE_GC_OPTS set, adding the GC file. This isn't a general problem because HBASE_OPTS is set by prefixing the existing HBASE_OPTS (eg. HBASE_OPTS=$HBASE_OPTS ...), allowing easy updating. However, GC OPTS don't work the same and this is really odd behavior when you want to set your own GC opts, which can include turning on GC log rolling (yes, yes, they really are jvm opts, but they ought to support their own param, to help minimize clutter). The simple version of this patch will just add an idempotent GC option to hbase-env.sh and some comments that uncommenting {code} # export HBASE_USE_GC_LOGFILE=true {code} will lead to a custom gc log file per server (along with an example name), so you don't need to set -Xloggc. The more complex solution does the above and also solves the multiple calls to hbase-env.sh so we can be sane about how all this works. Note that to fix this, hbase-daemon.sh just needs to read in HBASE_USE_GC_LOGFILE after sourcing hbase-env.sh and then update HBASE_OPTS. Oh and also not source hbase-env.sh in bin/hbase. Even further, we might want to consider adding options just for cases where we don't need gc logging - i.e. the shell, the config reading tool, hbck, etc. This is the hardest version to handle since the first couple will willy-nilly apply the gc options. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537352#comment-13537352 ] Lars Hofhansl commented on HBASE-5778: -- Fact is, since this patch we have not had a single jenkins run where these tests did not fail. So here's what I am going to do. I'll revert this from 0.94, to see whether the tests pass. If they don't, we're none the wiser. If they do, we can regroup for 0.94.5. Unless I hear objections I'll do that within the next hour or so. Fix HLog compression's incompatibilities Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.96.0, 0.94.4 Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778-0.94-v5.patch, HBASE-5778-0.94-v6.patch, HBASE-5778-0.94-v7.patch, HBASE-5778.patch, HBASE-5778-trunk-v6.patch, HBASE-5778-trunk-v7.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl reopened HBASE-5778: -- Fix HLog compression's incompatibilities Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.96.0, 0.94.4 Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778-0.94-v5.patch, HBASE-5778-0.94-v6.patch, HBASE-5778-0.94-v7.patch, HBASE-5778.patch, HBASE-5778-trunk-v6.patch, HBASE-5778-trunk-v7.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services
[ https://issues.apache.org/jira/browse/HBASE-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6789: - Attachment: 6789v4.txt Fixed failed test. Let me load to rb. Added more deprecations of CoprocessorProtocol mentions. Added new protos and generated classes necessary converting over PingProtocol and ColumnAggregateProtocol. Removed GenericProtocol and supporting classes and references in tests. Generics not supported in pb way of doing dynamic endpoints. Convert test CoprocessorProtocol implementations to protocol buffer services Key: HBASE-6789 URL: https://issues.apache.org/jira/browse/HBASE-6789 Project: HBase Issue Type: Sub-task Components: Coprocessors Reporter: Gary Helmling Assignee: stack Fix For: 0.96.0 Attachments: 6789.txt, 6789v2.txt, 6789v3.txt, 6789v4.txt With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services. Several CoprocessorProtocol implementations are defined for tests: * ColumnAggregationProtocol * GenericProtocol * TestServerCustomProtocol.PingProtocol These should either be converted to PB services or removed if they duplicate other tests/are no longer necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services
[ https://issues.apache.org/jira/browse/HBASE-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537364#comment-13537364 ] stack commented on HBASE-6789: -- I put up the patch at https://reviews.apache.org/r/8727/ Convert test CoprocessorProtocol implementations to protocol buffer services Key: HBASE-6789 URL: https://issues.apache.org/jira/browse/HBASE-6789 Project: HBase Issue Type: Sub-task Components: Coprocessors Reporter: Gary Helmling Assignee: stack Fix For: 0.96.0 Attachments: 6789.txt, 6789v2.txt, 6789v3.txt, 6789v4.txt With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services. Several CoprocessorProtocol implementations are defined for tests: * ColumnAggregationProtocol * GenericProtocol * TestServerCustomProtocol.PingProtocol These should either be converted to PB services or removed if they duplicate other tests/are no longer necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537369#comment-13537369 ] stack commented on HBASE-5778: -- +1 on trying anything to get a green test. +1 on test replication going over to integration tests. Even when it does fail, only J-D can make sense of it (smile). It has found issues in the past though Fix HLog compression's incompatibilities Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.96.0, 0.94.4 Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, HBASE-5778-0.94-v5.patch, HBASE-5778-0.94-v6.patch, HBASE-5778-0.94-v7.patch, HBASE-5778.patch, HBASE-5778-trunk-v6.patch, HBASE-5778-trunk-v7.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537374#comment-13537374 ] Lars Hofhansl commented on HBASE-5778: -- LOL... I revert but now @%#^#$ Jenkins is down. I can't win.
[jira] [Comment Edited] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537374#comment-13537374 ] Lars Hofhansl edited comment on HBASE-5778 at 12/20/12 9:01 PM: LOL... I reverted but now @%#^#$ Jenkins is down. I can't win. was (Author: lhofhansl): LOL... I revert but now @%#^#$ Jenkins is down. I can't win.
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537378#comment-13537378 ] Ted Yu commented on HBASE-5778: --- I think Jenkins is on vacation again, Lars :-)
[jira] [Updated] (HBASE-7236) add per-table/per-cf configuration via metadata
[ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7236: Attachment: HBASE-7236-v4.patch CR feedback (also updated review board); most changes come from the type change to String, and a rename add per-table/per-cf configuration via metadata --- Key: HBASE-7236 URL: https://issues.apache.org/jira/browse/HBASE-7236 Project: HBase Issue Type: New Feature Components: Compaction Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, HBASE-7236-v2.patch, HBASE-7236-v3.patch, HBASE-7236-v4.patch Regardless of the compaction policy, it makes sense to have separate compaction configuration for different tables and column families, as their access patterns and workloads can be different. In particular, for the tiered compactions being ported from the 0.89-fb branch, such per-table/per-CF configuration is necessary in order to use them properly. We might want to add support for compaction configuration via metadata on table/cf.
[jira] [Commented] (HBASE-7398) [0.94 UNIT TESTS] TestAssignmentManager fails frequently on CentOS 5
[ https://issues.apache.org/jira/browse/HBASE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537385#comment-13537385 ] Ted Yu commented on HBASE-7398: --- I looped TestAssignmentManager twice with patch v2 on MacBook and they passed. +1 on patch v2. [0.94 UNIT TESTS] TestAssignmentManager fails frequently on CentOS 5 Key: HBASE-7398 URL: https://issues.apache.org/jira/browse/HBASE-7398 Project: HBase Issue Type: Bug Components: Region Assignment, test Affects Versions: 0.94.4 Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: hbase-7398_v1.patch, hbase-7398_v2.patch TestAssignmentManager#testBalance() fails pretty frequently on CentOS 5 for 0.94. The root cause is that ClosedRegionHandler is executed by an executor, and before it finishes, the region transition is done for OPENING and OPENED. This seems to be just a test problem, not an actual bug, since the region server won't open the region unless it gets it from the assign call on ClosedRegionHandler.process(). I've seen that HBASE-6109 has a fix for this already, will just backport those changes. This is 0.94 only.
[jira] [Created] (HBASE-7409) [snapshots] Add documentation about online snapshots
Jonathan Hsieh created HBASE-7409: - Summary: [snapshots] Add documentation about online snapshots Key: HBASE-7409 URL: https://issues.apache.org/jira/browse/HBASE-7409 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh This will include additions to the ref guide about the different kinds of snapshots and their pros and cons with respect to consistency, point-in-time semantics, and availability concerns.
[jira] [Created] (HBASE-7410) [snapshots] add snapshot/clone/restore/export docs to ref guide
Jonathan Hsieh created HBASE-7410: - Summary: [snapshots] add snapshot/clone/restore/export docs to ref guide Key: HBASE-7410 URL: https://issues.apache.org/jira/browse/HBASE-7410 Project: HBase Issue Type: Sub-task Components: documentation, snapshots Affects Versions: hbase-6055 Reporter: Jonathan Hsieh Priority: Blocker This will include additions to the ref guide about the different operations provided and how to use them.
[jira] [Commented] (HBASE-7321) Simple Flush Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537405#comment-13537405 ] Jonathan Hsieh commented on HBASE-7321: --- In my opinion, the definition of point-in-time is a bit fuzzy as well. If you are talking about the timestamp snapshot, the way I currently understand it there are no guarantees between regionservers, so the timestamp snapshot is 'not really a snapshot' either because causality could potentially be violated (the limitation in the second paragraph of the description above). An improvement on copy table is a good improvement, and I'm fine with calling this a fuzzy snapshot or describing it as such in the docs. I'm hesitant to go and change all the names or add new commands (a fuzzy_snapshot command seems like overkill). I've filed docs jiras HBASE-7409, HBASE-7410 to put explanations there and marked them as blockers. Good enough? I don't plan on working through the other flavors until we have this, likely the simplest online snapshot, and all its tooling and plumbing robust and resolved. Simple Flush Snapshot - Key: HBASE-7321 URL: https://issues.apache.org/jira/browse/HBASE-7321 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-7321.v2.patch, pre-hbase-7321.v2.patch This snapshot style just issues a region flush and then snapshots the region. This is a simple implementation that gives the equivalent of copytable consistency. While by most definitions of consistency, if a client writes A and then writes B to different region servers, either neither, only A, or both A and B should be present, this one allows the 'only B' case.
[jira] [Resolved] (HBASE-7398) [0.94 UNIT TESTS] TestAssignmentManager fails frequently on CentOS 5
[ https://issues.apache.org/jira/browse/HBASE-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar resolved HBASE-7398. -- Resolution: Fixed Fix Version/s: 0.94.4 Hadoop Flags: Reviewed Committed this. Thank you guys for the review.
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537413#comment-13537413 ] Andrew Purtell commented on HBASE-5778: --- Given the above logic, I think the revert is the right call.
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537421#comment-13537421 ] Hudson commented on HBASE-5778: --- Integrated in HBase-0.94 #650 (See [https://builds.apache.org/job/HBase-0.94/650/]) HBASE-5778 Revert, to check on test failures potentially caused by this. (Revision 1424702) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationHLogReaderManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/FaultySequenceFileLogReader.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplitCompressed.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationWithCompression.java
[jira] [Commented] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services
[ https://issues.apache.org/jira/browse/HBASE-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537423#comment-13537423 ] Hadoop QA commented on HBASE-6789: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12561981/6789v4.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 32 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 28 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.TestDrainingServer {color:red}-1 core zombie tests{color}. There are zombie tests. See build logs for details. 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3635//console This message is automatically generated. Convert test CoprocessorProtocol implementations to protocol buffer services Key: HBASE-6789 URL: https://issues.apache.org/jira/browse/HBASE-6789 Project: HBase Issue Type: Sub-task Components: Coprocessors Reporter: Gary Helmling Assignee: stack Fix For: 0.96.0 Attachments: 6789.txt, 6789v2.txt, 6789v3.txt, 6789v4.txt With coprocessor endpoints now exposed as protobuf defined services, we should convert over all of our built-in endpoints to PB services. Several CoprocessorProtocol implementations are defined for tests: * ColumnAggregationProtocol * GenericProtocol * TestServerCustomProtocol.PingProtocol These should either be converted to PB services or removed if they duplicate other tests/are no longer necessary. 
[jira] [Commented] (HBASE-7236) add per-table/per-cf configuration via metadata
[ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537432#comment-13537432 ] Hadoop QA commented on HBASE-7236: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12561983/HBASE-7236-v4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 25 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 28 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication {color:red}-1 core zombie tests{color}. There are zombie tests. See build logs for details. 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3634//console This message is automatically generated.
[jira] [Commented] (HBASE-5778) Fix HLog compression's incompatibilities
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537436#comment-13537436 ] Lars Hofhansl commented on HBASE-5778: -- The latest run still failed in TestReplication.queueFailover, and that is despite the increased SLEEP_TIME that I did not revert. Hmm...
[jira] [Created] (HBASE-7411) Use Netflix's Curator zookeeper library
Enis Soztutar created HBASE-7411: Summary: Use Netflix's Curator zookeeper library Key: HBASE-7411 URL: https://issues.apache.org/jira/browse/HBASE-7411 Project: HBase Issue Type: New Feature Components: Zookeeper Affects Versions: 0.96.0 Reporter: Enis Soztutar Assignee: Enis Soztutar We have mentioned using the Curator library (https://github.com/Netflix/curator) elsewhere, but we can continue the discussion in this issue. The main advantage of the Curator lib over ours is the recipes. We have a very similar retrying mechanism, and we don't need much of the nice client-API layer. We also have a similar Listener interface, etc. I think we can decide on one of the following options: 1. Do not depend on Curator. We have some of the recipes, and some custom recipes (ZKAssign, leader election, etc. already working; locks in HBASE-5991, etc.). We can also copy / fork some code from there. 2. Replace all of our zk usage / connection management with Curator. We may keep the current set of APIs as a thin wrapper. 3. Use our own connection management / retry logic, and build a custom CuratorFramework implementation for the Curator recipes. This will keep the current zk logic/code intact, and allow us to use curator-recipes as we see fit. 4. Allow both Curator and our zk layer to manage the connection. We will still have 1 connection, but 2 abstraction layers sharing it. This is the easiest to implement, but a freak show? I have a patch for 4, and am now prototyping 2 or 3, whichever will be less painful. Related issues: HBASE-5547 HBASE-7305 HBASE-7212
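The "very similar retrying mechanism" mentioned above refers to the exponential-backoff retry loop that both HBase's ZooKeeper wrapper and Curator's ExponentialBackoffRetry implement. A minimal, self-contained sketch of that shared pattern (class and method names here are illustrative, not taken from either library; real code would retry only on recoverable ZooKeeper errors and add jitter) might look like:

```java
import java.util.concurrent.Callable;

// Illustrative exponential-backoff retry loop, the pattern shared by
// HBase's zk layer and Curator; names are hypothetical.
public final class BackoffSketch {
    // Sleep time doubles with each attempt: base, 2*base, 4*base, ...
    static long backoffMillis(long baseSleepMs, int retryNumber) {
        return baseSleepMs * (1L << retryNumber);
    }

    static <T> T callWithRetries(Callable<T> op, int maxRetries,
                                 long baseSleepMs) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // A real implementation would rethrow non-recoverable
                // errors immediately instead of retrying everything.
                last = e;
                Thread.sleep(backoffMillis(baseSleepMs, attempt));
            }
        }
        throw last;
    }
}
```

Option 3 above would keep a loop like this in HBase and expose it to Curator recipes through a custom CuratorFramework facade.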
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537444#comment-13537444 ] Sergey Shelukhin commented on HBASE-5416: - Hmm... I am trying to clean up the code a bit now and write comments. It seems that this patch shouldn't work with limits at all... in the big else clause, if we get false from populateResult on storeHeap, we'd go on to start getting stuff from joinedMap. Suppose stopRow is true, e.g. storeHeap.peek() now points at the stop row. Suppose now we hit the limit, set joinedHeapHasMoreData, and return true. On the next call, storeHeap is still pointing to the stop row, so we won't even reach the else if (joinedHeapHasMoreData) condition (well, and if we did we'd populate nothing because matchingRow will always return false). Can someone please sanity check me? I'll see how to fix it. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: Filters, Performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Sergey Shelukhin Fix For: 0.96.0 Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch, HBASE-5416-v10.patch, HBASE-5416-v11.patch, HBASE-5416-v7-rebased.patch, HBASE-5416-v8.patch, HBASE-5416-v9.patch When a scan is performed, the whole row is loaded into the result list, and after that the filter (if one exists) is applied to decide whether the row is needed. But when the scan is performed over several CFs and the filter checks only data from a subset of them, data from the CFs not checked by the filter is not needed at the filter stage, only once we have decided to include the current row.
In such cases we can significantly reduce the amount of IO performed by a scan by loading only the values actually checked by the filter. For example, we have two CFs: flags and snap. Flags is quite small (a bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and it is quite costly to scan it. If we need only rows with some flag specified, we use SingleColumnValueFilter to limit the result to only a small subset of the region. But the current implementation loads both CFs to perform the scan, when only a small subset is needed. The attached patch adds one routine to the Filter interface to allow a filter to specify which CFs are needed for its operation. In HRegion, we separate all scanners into two groups: those needed for the filter and the rest (joined). When a new row is considered, only the needed data is loaded and the filter applied; only if the filter accepts the row is the rest of the data loaded. On our data, this speeds up such scans 30-50 times. Also, this gives us a way to better normalize the data into separate columns by optimizing the scans performed.
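The two-phase mechanism the description outlines (read only the filter's essential column family, then fetch the remaining families just for accepted rows) can be sketched independently of the HBase internals. Everything below is an illustrative stand-in, not HBase code: plain maps model the two column families, and a predicate models the filter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative two-phase row scan: 'flags' is the small family the filter
// needs; 'snap' is the large family, loaded only for accepted rows.
public final class JoinedScanSketch {
    static List<Map<String, String>> scan(
            Map<String, Map<String, String>> flagsByRow,  // essential CF
            Map<String, Map<String, String>> snapByRow,   // expensive CF
            Predicate<Map<String, String>> filter) {
        List<Map<String, String>> results = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : flagsByRow.entrySet()) {
            if (!filter.test(e.getValue())) continue;  // cheap check first
            Map<String, String> row = new HashMap<>(e.getValue());
            // Only now pay the cost of reading the large family.
            row.putAll(snapByRow.getOrDefault(e.getKey(), Map.of()));
            results.add(row);
        }
        return results;
    }
}
```

Sergey's concern above is about the bookkeeping this split introduces in HRegion's real scanner: once the small-family heap has stopped at the stop row, the code may never return to drain the joined (large-family) side when a row limit was hit mid-row.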
[jira] [Commented] (HBASE-7321) Simple Flush Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537463#comment-13537463 ] Jesse Yates commented on HBASE-7321: Yeah, that works for me :) Thanks Jon. Simple Flush Snapshot - Key: HBASE-7321 URL: https://issues.apache.org/jira/browse/HBASE-7321 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-7321.v2.patch, pre-hbase-7321.v2.patch This snapshot style just issues a region flush and then snapshots the region. This is a simple implementation that gives the equivalent of copytable consistency. By most definitions of consistency, if a client writes A and then writes B to different region servers, then either neither write, only A, or both A+B should be present; this implementation also allows the only-B case. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7412) IncreasingToUpperBoundRegionSplitPolicy may not use the right region flush size
Jimmy Xiang created HBASE-7412: -- Summary: IncreasingToUpperBoundRegionSplitPolicy may not use the right region flush size Key: HBASE-7412 URL: https://issues.apache.org/jira/browse/HBASE-7412 Project: HBase Issue Type: Bug Components: regionserver Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0 If the region flush size is not set in the table, IncreasingToUpperBoundRegionSplitPolicy will most likely always use the default value: 128MB, even if the flush size is set to a different value in hbase-site.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7091) support custom GC options in hbase-env.sh
[ https://issues.apache.org/jira/browse/HBASE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537468#comment-13537468 ] Hudson commented on HBASE-7091: --- Integrated in HBase-TRUNK #3644 (See [https://builds.apache.org/job/HBase-TRUNK/3644/]) HBASE-7091: Support custom GC options in hbase-env.sh (Revision 1424640) Result = FAILURE jyates : Files : * /hbase/trunk/bin/hbase * /hbase/trunk/conf/hbase-env.sh support custom GC options in hbase-env.sh - Key: HBASE-7091 URL: https://issues.apache.org/jira/browse/HBASE-7091 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.4 Reporter: Jesse Yates Assignee: Jesse Yates Labels: newbie Fix For: 0.96.0, 0.94.4 Attachments: hbase-7091-v1.patch When running things like bin/start-hbase and bin/hbase-daemon.sh start [master|regionserver|etc] we end up setting HBASE_OPTS property a couple times via calling hbase-env.sh. This is generally not a problem for most cases, but when you want to set your own GC log properties, one would think you should set HBASE_GC_OPTS, which get added to HBASE_OPTS. NOPE! That would make too much sense. Running bin/hbase-daemons.sh will run bin/hbase-daemon.sh with the daemons it needs to start. Each time through hbase-daemon.sh we also call bin/hbase. This isn't a big deal except for each call to hbase-daemon.sh, we also source hbase-env.sh twice (once in the script and once in bin/hbase). This is important for my next point. Note that to turn on GC logging, you uncomment: {code} # export HBASE_OPTS=$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS {code} and then to log to a gc file for each server, you then uncomment: {code} # export HBASE_USE_GC_LOGFILE=true {code} in hbase-env.sh On the first pass through hbase-daemon.sh, HBASE_GC_OPTS isn't set, so HBASE_OPTS doesn't get anything funky, but we set HBASE_USE_GC_LOGFILE, which then sets HBASE_GC_OPTS to the log file (-Xloggc:...). 
Then in bin/hbase we again run hbase-env.sh, which now has HBASE_GC_OPTS set, adding the GC file. This isn't a general problem because HBASE_OPTS is set without clobbering the existing HBASE_OPTS (e.g. HBASE_OPTS=$HBASE_OPTS ...), allowing easy updating. However, GC opts don't work the same way, and this is really odd behavior when you want to set your own GC opts, which can include turning on GC log rolling (yes, yes, they really are jvm opts, but they ought to support their own param, to help minimize clutter). The simple version of this patch will just add an idempotent GC option to hbase-env.sh and some comments noting that uncommenting {code} # export HBASE_USE_GC_LOGFILE=true {code} will lead to a custom gc log file per server (along with an example name), so you don't need to set -Xloggc. The more complex solution does the above and also solves the multiple calls to hbase-env.sh so we can be sane about how all this works. Note that to fix this, hbase-daemon.sh just needs to read in HBASE_USE_GC_LOGFILE after sourcing hbase-env.sh and then update HBASE_OPTS. Oh, and also not source hbase-env.sh in bin/hbase. Even further, we might want to consider adding options just for cases where we don't need gc logging - i.e. the shell, the config reading tool, hbck, etc. This is the hardest version to handle since the first couple will willy-nilly apply the gc options. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
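The "idempotent GC option" idea above can be sketched in plain shell. This is an illustration of the guard pattern, not the actual hbase-env.sh patch: the HBASE_GC_OPTS_APPLIED guard variable and the log path are invented for the example.

```shell
#!/bin/sh
# Sketch: make GC-option setup idempotent, so sourcing this file twice
# (once from hbase-daemon.sh, once from bin/hbase) does not append
# -Xloggc to HBASE_OPTS twice.

HBASE_GC_OPTS="-verbose:gc -XX:+PrintGCDetails"
HBASE_USE_GC_LOGFILE=true

setup_gc_opts() {
  # Guard: only apply once, even if this file is sourced again.
  if [ -z "$HBASE_GC_OPTS_APPLIED" ]; then
    if [ "$HBASE_USE_GC_LOGFILE" = "true" ]; then
      # Example log path; the real script would use a per-server name.
      HBASE_GC_OPTS="$HBASE_GC_OPTS -Xloggc:/tmp/hbase-gc.log"
    fi
    HBASE_OPTS="$HBASE_OPTS $HBASE_GC_OPTS"
    HBASE_GC_OPTS_APPLIED=true
  fi
}

setup_gc_opts
setup_gc_opts   # simulates the second sourcing; this call is a no-op

echo "$HBASE_OPTS"   # combined opts with exactly one -Xloggc entry
```

Without the guard, the second call would append -Xloggc a second time, which is exactly the double-sourcing symptom described in the report.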
[jira] [Commented] (HBASE-7201) Convert HLog / HFile metadata content to PB
[ https://issues.apache.org/jira/browse/HBASE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537470#comment-13537470 ] stack commented on HBASE-7201: -- Yeah, looking at this again, hfile has the following: {code} // The below three methods take Writables. We'd like to undo Writables but undoing the below would be pretty // painful. Could take a byte [] or a Message but we want to be backward compatible around hfiles so would need // to map between Message and Writable or byte [] and current Writable serialization. This would be a bit of work // to little gain. Thats my thinking at moment. St.Ack 20121129 void appendMetaBlock(String bloomFilterMetaKey, Writable metaWriter); /** * Store general Bloom filter in the file. This does not deal with Bloom filter * internals but is necessary, since Bloom filters are stored differently * in HFile version 1 and version 2. */ void addGeneralBloomFilter(BloomFilterWriter bfw); /** * Store delete family Bloom filter in the file, which is only supported in * HFile V2. */ void addDeleteFamilyBloomFilter(BloomFilterWriter bfw) throws IOException; {code} Am thinking it's not the end of the world if hfile does Writables past 0.96. On the WAL side, ditto. I'd think that HLogKey and WALEdit need to evolve anyways -- i.e. write DataBlocks rather than just kvs. As part of that evolution we can write different formats. Meantime, WAL and HFile using Writables is not, I think, a blocker on 0.96. Knocking down the priority. Convert HLog / HFile metadata content to PB --- Key: HBASE-7201 URL: https://issues.apache.org/jira/browse/HBASE-7201 Project: HBase Issue Type: Sub-task Components: HFile, Protobufs, wal Reporter: Enis Soztutar Priority: Blocker Fix For: 0.96.0 Attachments: 7201.txt Some of the remaining discussions for PB conversions: - Convert the HFile/HLog metadata to PB. - WALEdit, HLogKey should be converted?
We don't want to repeat the PBMagic, and the PB overhead can be high, but this is needed for replication? - We said no to converting KV. These should not block 0.96. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
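The PBMagic concern above can be illustrated with a small sketch: prefix PB-serialized metadata with a magic marker so a reader can distinguish PB content from legacy Writable bytes, which is what makes a backward-compatible conversion possible. The class and the "PBUF" marker here are illustrative stand-ins, not a claim about HBase's actual serialization code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of magic-prefixed serialization: new-format (PB) bytes carry a
// marker, so a reader can fall back to the legacy Writable path when the
// marker is absent.
public class PbMagicSketch {
    static final byte[] PB_MAGIC = "PBUF".getBytes(StandardCharsets.UTF_8);

    // Prefix PB-serialized bytes with the magic marker before writing.
    static byte[] wrapWithMagic(byte[] pbBytes) {
        byte[] out = new byte[PB_MAGIC.length + pbBytes.length];
        System.arraycopy(PB_MAGIC, 0, out, 0, PB_MAGIC.length);
        System.arraycopy(pbBytes, 0, out, PB_MAGIC.length, pbBytes.length);
        return out;
    }

    // Reader-side check: does this metadata start with the PB marker?
    static boolean isPb(byte[] data) {
        return data.length >= PB_MAGIC.length
            && Arrays.equals(Arrays.copyOf(data, PB_MAGIC.length), PB_MAGIC);
    }
}
```

The cost being debated in the ticket is exactly this marker: repeating it on every small record (e.g. per WAL entry) adds overhead, which is why per-KV conversion was rejected.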
[jira] [Commented] (HBASE-7201) Convert HLog / HFile metadata content to PB
[ https://issues.apache.org/jira/browse/HBASE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537471#comment-13537471 ] stack commented on HBASE-7201: -- In fact, I'm going to close out this issue and make two smaller ones to replace. As is, it is intimidating. Convert HLog / HFile metadata content to PB --- Key: HBASE-7201 URL: https://issues.apache.org/jira/browse/HBASE-7201 Project: HBase Issue Type: Sub-task Components: HFile, Protobufs, wal Reporter: Enis Soztutar Priority: Blocker Fix For: 0.96.0 Attachments: 7201.txt Some of the remaining discussions for PB conversions: - Convert the HFile/HLog metadata to PB. - WALEdit, HLogKey should be converted? We don't want to repeat the PBMagic, and the PB overhead can be high, but this is needed for replication? - We said no to converting KV. These should not block 0.96. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7412) IncreasingToUpperBoundRegionSplitPolicy may not use the right region flush size
[ https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537473#comment-13537473 ] Jimmy Xiang commented on HBASE-7412: HTableDescriptor#getMaxFileSize and HTableDescriptor#getMemStoreFlushSize return defaults if not set. The caller checks whether the returned value equals the default and, if so, assumes the value was not set. That assumption may be wrong, because an explicitly configured value could happen to equal the default. IncreasingToUpperBoundRegionSplitPolicy may not use the right region flush size --- Key: HBASE-7412 URL: https://issues.apache.org/jira/browse/HBASE-7412 Project: HBase Issue Type: Bug Components: regionserver Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0 If the region flush size is not set in the table, IncreasingToUpperBoundRegionSplitPolicy will most likely always use the default value: 128MB, even if the flush size is set to a different value in hbase-site.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
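The ambiguity described in this ticket can be shown in a few lines. The class below is a simplified stand-in for HTableDescriptor, and the sentinel-based accessor is just one possible fix, not the committed patch:

```java
// Sketch of the default-value ambiguity: an accessor that returns the
// default when the value is unset makes "unset" and "explicitly set to
// the default" indistinguishable to callers like the split policy.
public class FlushSizeSketch {
    static final long DEFAULT_MEMSTORE_FLUSH_SIZE = 128L * 1024 * 1024;

    private Long memstoreFlushSize; // null means "not set on the table"

    // Old-style accessor: collapses "unset" and "default" into one value.
    long getMemStoreFlushSize() {
        return memstoreFlushSize == null
            ? DEFAULT_MEMSTORE_FLUSH_SIZE : memstoreFlushSize;
    }

    // One possible fix: a sentinel (-1) tells the caller the table has no
    // value, so it can fall back to hbase-site.xml's configured flush size
    // instead of the hardcoded 128MB default.
    long getMemStoreFlushSizeOrSentinel() {
        return memstoreFlushSize == null ? -1L : memstoreFlushSize;
    }

    void setMemStoreFlushSize(long size) { memstoreFlushSize = size; }
}
```

With the old accessor, a split policy comparing the return value against the default cannot tell the two cases apart, which is why it "most likely always" ends up using 128MB.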
[jira] [Created] (HBASE-7413) Convert WAL to pb
stack created HBASE-7413: Summary: Convert WAL to pb Key: HBASE-7413 URL: https://issues.apache.org/jira/browse/HBASE-7413 Project: HBase Issue Type: Task Reporter: stack From HBASE-7201 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7414) Convert HFile to pb
stack created HBASE-7414: Summary: Convert HFile to pb Key: HBASE-7414 URL: https://issues.apache.org/jira/browse/HBASE-7414 Project: HBase Issue Type: Task Reporter: stack See HBASE-7201 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size
[ https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7412: --- Summary: Fix how HTableDescriptor handles default max file size and flush size (was: IncreasingToUpperBoundRegionSplitPolicy may not use the right region flush size) Fix how HTableDescriptor handles default max file size and flush size - Key: HBASE-7412 URL: https://issues.apache.org/jira/browse/HBASE-7412 Project: HBase Issue Type: Bug Components: regionserver Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0 If the region flush size is not set in the table, IncreasingToUpperBoundRegionSplitPolicy will most likely always use the default value: 128MB, even if the flush size is set to a different value in hbase-site.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7413) Convert WAL to pb
[ https://issues.apache.org/jira/browse/HBASE-7413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-7413: - Component/s: wal Tags: noob Labels: noob (was: ) Conversion needs to work in a manner such that we can continue to read old WALs written in the old style with Writables. Convert WAL to pb - Key: HBASE-7413 URL: https://issues.apache.org/jira/browse/HBASE-7413 Project: HBase Issue Type: Task Components: wal Reporter: stack Labels: noob From HBASE-7201 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size
[ https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537476#comment-13537476 ] Jimmy Xiang commented on HBASE-7412: Changed the title. Fix how HTableDescriptor handles default max file size and flush size - Key: HBASE-7412 URL: https://issues.apache.org/jira/browse/HBASE-7412 Project: HBase Issue Type: Bug Components: regionserver Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0 If the region flush size is not set in the table, IncreasingToUpperBoundRegionSplitPolicy will most likely always use the default value: 128MB, even if the flush size is set to a different value in hbase-site.xml. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7414) Convert HFile to pb
[ https://issues.apache.org/jira/browse/HBASE-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-7414: - Component/s: HFile Description: See HBASE-7201 Conversion should be done in a manner that does not prevent us from reading old-style hfiles with Writable metadata. was:See HBASE-7201 Tags: noob Labels: noob (was: ) Convert HFile to pb --- Key: HBASE-7414 URL: https://issues.apache.org/jira/browse/HBASE-7414 Project: HBase Issue Type: Task Components: HFile Reporter: stack Labels: noob See HBASE-7201 Conversion should be done in a manner that does not prevent us from reading old-style hfiles with Writable metadata. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-7201) Convert HLog / HFile metadata content to PB
[ https://issues.apache.org/jira/browse/HBASE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-7201. -- Resolution: Later Closing as Later because the scope is broad. Made two smaller, non-blocker issues to take its place: HBASE-7413 and HBASE-7414 Convert HLog / HFile metadata content to PB --- Key: HBASE-7201 URL: https://issues.apache.org/jira/browse/HBASE-7201 Project: HBase Issue Type: Sub-task Components: HFile, Protobufs, wal Reporter: Enis Soztutar Priority: Blocker Fix For: 0.96.0 Attachments: 7201.txt Some of the remaining discussions for PB conversions: - Convert the HFile/HLog metadata to PB. - WALEdit, HLogKey should be converted? We don't want to repeat the PBMagic, and the PB overhead can be high, but this is needed for replication? - We said no to converting KV. These should not block 0.96. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira