[jira] [Commented] (HBASE-5355) Compressed RPC's for HBase
[ https://issues.apache.org/jira/browse/HBASE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205284#comment-13205284 ]

Andrew Purtell commented on HBASE-5355:
---------------------------------------

bq. A while back I saw someone who had proposed a compressed representation of KV that had 'natural prefix' compression. It took advantage of the fact that KVs are typically stored sorted, so one could have a 'this KV has the same row as the previous' flag, and ditto for columns, etc.

I think this was something I mentioned in some random JIRA.

Compressed RPC's for HBase
--------------------------

Key: HBASE-5355
URL: https://issues.apache.org/jira/browse/HBASE-5355
Project: HBase
Issue Type: Improvement
Components: ipc
Affects Versions: 0.89.20100924
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

Some applications need the ability to do large batched writes and reads from a remote MR cluster. These eventually get bottlenecked on the network, and the results are also often quite compressible. The aim here is to add the ability to make compressed calls to the server on both the send and receive paths.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
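The 'natural prefix' trick described above can be sketched in plain Java. This is a toy encoding that shows only the flag-byte idea; the Kv record, the single-byte field lengths, and the flag values are invented for illustration and are not HBase's actual KeyValue wire format:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Toy "natural prefix" encoder: because KVs arrive sorted, a one-byte flag
// can say "same row as previous" and/or "same column as previous", so
// repeated key components are written only once.
public class PrefixKvEncoder {
    static final int SAME_ROW = 0x01;
    static final int SAME_COL = 0x02;

    record Kv(String row, String col, String value) {}

    static byte[] encode(List<Kv> sorted) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String prevRow = null, prevCol = null;
        for (Kv kv : sorted) {
            int flags = 0;
            if (kv.row().equals(prevRow)) flags |= SAME_ROW;
            if (kv.col().equals(prevCol)) flags |= SAME_COL;
            out.write(flags);
            if ((flags & SAME_ROW) == 0) writeField(out, kv.row());
            if ((flags & SAME_COL) == 0) writeField(out, kv.col());
            writeField(out, kv.value());
            prevRow = kv.row();
            prevCol = kv.col();
        }
        return out.toByteArray();
    }

    static void writeField(ByteArrayOutputStream out, String s) throws IOException {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        out.write(b.length); // assumes fields shorter than 256 bytes, for brevity
        out.write(b);
    }

    public static void main(String[] args) throws IOException {
        List<Kv> kvs = List.of(
            new Kv("row1", "cf:a", "v1"),
            new Kv("row1", "cf:a", "v2"),  // same row and column: flag byte + value only
            new Kv("row1", "cf:b", "v3"),  // same row only
            new Kv("row2", "cf:b", "v4")); // same column only
        byte[] encoded = encode(kvs);
        // Naive form: 3 length bytes plus full row, column, and value per KV.
        int naive = kvs.stream()
            .mapToInt(kv -> 3 + kv.row().length() + kv.col().length() + kv.value().length())
            .sum();
        System.out.println(encoded.length + " vs naive " + naive);
    }
}
```

main prints the encoded size next to a naive per-KV encoding of the same four sorted KVs, showing the repeated components collapsing into flag bits.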
[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
[ https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205288#comment-13205288 ]

Hadoop QA commented on HBASE-5209:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12514085/HBASE-5209-v1.diff
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -136 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 156 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestActiveMasterManager
  org.apache.hadoop.hbase.master.TestMasterZKSessionRecovery
  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.TestZooKeeper

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/939//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/939//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/939//console

This message is automatically generated.

HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
------------------------------------------------------------------------------------------------------------------------

Key: HBASE-5209
URL: https://issues.apache.org/jira/browse/HBASE-5209
Project: HBase
Issue Type: Improvement
Components: master
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Aditya Acharya
Assignee: David S. Wang
Fix For: 0.94.0, 0.90.7, 0.92.1
Attachments: HBASE-5209-v0.diff, HBASE-5209-v1.diff

I have a multi-master HBase setup, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect.
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205318#comment-13205318 ]

Phabricator commented on HBASE-5074:
------------------------------------

dhruba has commented on the revision "[jira] [HBASE-5074] Support checksums in HBase block cache".

INLINE COMMENTS

src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1545
This is the initialization code in the constructor that assumes that we always verify hbase checksums. In the next line, it will be set to false if the minor version is an old one. Similarly, if there is an HFileSystem and the caller has voluntarily cleared hfs.useHBaseChecksum, then we respect the caller's wishes.

src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:1
I do not know of any performance penalty. For hbase code, this wrapper is traversed only once when an HFile is opened or an HLog is created. Since the number of times we open/create a file is minuscule compared to the number of reads/writes to those files, the overhead (if any) should not show up in any benchmark. I will validate this on my cluster and report if I see any.

src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:1
I do not yet see a package o.apache.hadoop.hbase.fs. Do you want me to create it? There is a pre-existing class o.a.h.h.utils.FSUtils; that's why I created HFileSystem inside that package.

src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:40
We would create a method HFileSystem.getLogFs(). The implementation of this method can open a new filesystem object (for storing transaction logs). Then, HRegionServer will pass HFileSystem.getLogFs() into the constructor of HLog().

src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:49
Currently, the only place HFileSystem is created is inside HRegionServer.

src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:107
You would see that readfs is the filesystem object that will be used to avoid checksum verification inside of hdfs.

src/main/java/org/apache/hadoop/hbase/util/HFileSystem.java:172
The hadoop code base recently introduced the method FileSystem.createNonRecursive, but whoever added it to FileSystem forgot to add it to FilterFileSystem. Apache hadoop trunk should roll out a patch for this one soon.

REVISION DETAIL
https://reviews.facebook.net/D1521

support checksums in HBase block cache
--------------------------------------

Key: HBASE-5074
URL: https://issues.apache.org/jira/browse/HBASE-5074
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Attachments: D1521.1.patch, D1521.2.patch, D1521.3.patch, D1521.4.patch, D1521.5.patch

The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers.
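The wrapper described above (two underlying filesystem objects, with the choice made once at open time so there is no per-read overhead) can be sketched without Hadoop on the classpath. The interface and class names below are stand-ins for illustration, not the real org.apache.hadoop.fs types or HFileSystem's actual fields:

```java
import java.util.Objects;

// Illustrative sketch of the HFileSystem idea: hold two delegates, one that
// lets hdfs verify checksums and one that skips hdfs verification (because
// hbase will verify its own), and pick between them once per file open.
interface Fs {
    String open(String path); // stand-in for returning an input stream
}

class ChecksumFs implements Fs {
    public String open(String path) { return "open(" + path + ") with hdfs checksum"; }
}

class NoChecksumFs implements Fs {
    public String open(String path) { return "open(" + path + ") skipping hdfs checksum"; }
}

public class HFileSystemSketch implements Fs {
    private final Fs fs;           // normal path: hdfs verifies checksums
    private final Fs noChecksumFs; // used when hbase-level checksums are on
    private final boolean useHBaseChecksum;

    HFileSystemSketch(Fs fs, Fs noChecksumFs, boolean useHBaseChecksum) {
        this.fs = Objects.requireNonNull(fs);
        this.noChecksumFs = Objects.requireNonNull(noChecksumFs);
        this.useHBaseChecksum = useHBaseChecksum;
    }

    // The indirection cost is paid here, once per HFile/HLog open,
    // not on every subsequent read or write to the returned stream.
    public String open(String path) {
        return (useHBaseChecksum ? noChecksumFs : fs).open(path);
    }

    public static void main(String[] args) {
        Fs wrapped = new HFileSystemSketch(new ChecksumFs(), new NoChecksumFs(), true);
        System.out.println(wrapped.open("/hbase/table/region/hfile"));
    }
}
```

This also shows why no benchmark-visible penalty is expected: the extra virtual dispatch happens only on the open path.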
[jira] [Commented] (HBASE-5330) TestCompactSelection - adding 2 test cases to testCompactionRatio
[ https://issues.apache.org/jira/browse/HBASE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205337#comment-13205337 ]

Hudson commented on HBASE-5330:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
hbase-5330. Update to TestCompactSelection unit test for selection SF assertions.

TestCompactSelection - adding 2 test cases to testCompactionRatio
-----------------------------------------------------------------

Key: HBASE-5330
URL: https://issues.apache.org/jira/browse/HBASE-5330
Project: HBase
Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
Attachments: TestCompactSelection_hbase_5330.java.patch, TestCompactSelection_hbase_5330_v2.java.patch

There were three existing assertions in TestCompactSelection testCompactionRatio that asserted on the max number of files...

{code}
assertEquals(maxFiles,
  store.compactSelection(sfCreate(7,6,5,4,3,2,1)).getFilesToCompact().size());
{code}

... and for references ...

{code}
assertEquals(maxFiles,
  store.compactSelection(sfCreate(true, 7,6,5,4,3,2,1)).getFilesToCompact().size());
{code}

... but they didn't assert against which StoreFiles got selected. While the number of StoreFiles is the same, the files selected are actually different, and I thought that there should be explicit assertions showing that.
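The gap these new test cases close can be shown with plain collections; the integer lists below are made-up stand-ins for selected StoreFiles, not HBase objects:

```java
import java.util.List;

// Two hypothetical compaction selections of the same size but different
// files. An assertEquals on getFilesToCompact().size() alone passes for
// both and misses the difference; asserting the lists themselves does not.
public class SizeOnlyAssertionGap {
    public static void main(String[] args) {
        List<Integer> selectedPlain = List.of(7, 6, 5); // e.g. largest/oldest files
        List<Integer> selectedRefs  = List.of(3, 2, 1); // e.g. smallest/newest files
        System.out.println(selectedPlain.size() == selectedRefs.size()); // sizes match
        System.out.println(selectedPlain.equals(selectedRefs));          // contents differ
    }
}
```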
[jira] [Commented] (HBASE-5221) bin/hbase script doesn't look for Hadoop jars in the right place in trunk layout
[ https://issues.apache.org/jira/browse/HBASE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205336#comment-13205336 ]

Hudson commented on HBASE-5221:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
HBASE-5221 bin/hbase script doesn't look for Hadoop jars in the right place in trunk layout -- REVERTED
HBASE-5221 bin/hbase script doesn't look for Hadoop jars in the right place in trunk layout

stack :
Files :
* /hbase/trunk/bin/hbase

stack :
Files :
* /hbase/trunk/bin/hbase

bin/hbase script doesn't look for Hadoop jars in the right place in trunk layout
--------------------------------------------------------------------------------

Key: HBASE-5221
URL: https://issues.apache.org/jira/browse/HBASE-5221
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Jimmy Xiang
Fix For: 0.94.0
Attachments: hbase-5221.txt

Running against an 0.24.0-SNAPSHOT hadoop:

ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-common*.jar: No such file or directory
ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-hdfs*.jar: No such file or directory
ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-mapred*.jar: No such file or directory

The jars are rooted deeper in the hierarchy.
[jira] [Commented] (HBASE-5267) Add a configuration to disable the slab cache by default
[ https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205338#comment-13205338 ]

Hudson commented on HBASE-5267:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
HBASE-5267 Add a configuration to disable the slab cache by default (Li Pi)

tedyu :
Files :
* /hbase/trunk/conf/hbase-env.sh
* /hbase/trunk/src/docbkx/upgrading.xml
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* /hbase/trunk/src/main/resources/hbase-default.xml

Add a configuration to disable the slab cache by default
--------------------------------------------------------

Key: HBASE-5267
URL: https://issues.apache.org/jira/browse/HBASE-5267
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Li Pi
Priority: Blocker
Fix For: 0.94.0, 0.92.1
Attachments: 5267.txt, 5267v2.txt, 5267v3.txt, 5267v4.txt

From what I commented at the tail of HBASE-4027:

{quote}
I changed the release note, the patch doesn't have a hbase.offheapcachesize configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize (which is actually a big problem when you consider this: http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak).
{quote}

We need to add hbase.offheapcachesize and set it to false by default. Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request.
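If the setting lands as described, the change amounts to a normal hbase-site.xml property. The property name below is taken from the comment above and should be treated as proposed, not final:

```xml
<!-- Proposed default (sketch): slab cache stays off even when
     -XX:MaxDirectMemorySize is set, unless explicitly enabled. -->
<property>
  <name>hbase.offheapcachesize</name>
  <value>false</value>
</property>
```

Operators who want the slab cache back would then override the value to true in addition to setting -XX:MaxDirectMemorySize, instead of it switching on implicitly.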
[jira] [Commented] (HBASE-5298) Add thrift metrics to thrift2
[ https://issues.apache.org/jira/browse/HBASE-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205339#comment-13205339 ]

Hudson commented on HBASE-5298:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
HBASE-5298 Add thrift metrics to thrift2

tedyu :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/ThriftMetrics.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/thrift/TestCallQueue.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java

Add thrift metrics to thrift2
-----------------------------

Key: HBASE-5298
URL: https://issues.apache.org/jira/browse/HBASE-5298
Project: HBase
Issue Type: Improvement
Components: metrics, thrift
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.94.0
Attachments: 5298-v3.txt, HBASE-5298.D1629.1.patch, HBASE-5298.D1629.2.patch, HBASE-5298.D1629.3.patch, HBASE-5298.D1629.4.patch

We have added thrift metrics collection in HBASE-5186. It will be good to have them in thrift2 as well.
[jira] [Commented] (HBASE-5229) Provide basic building blocks for multi-row local transactions.
[ https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205340#comment-13205340 ]

Hudson commented on HBASE-5229:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
HBASE-5229 Provide basic building blocks for 'multi-row' local transactions.

larsh :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/MultiRowMutationEndpoint.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/MultiRowMutationProtocol.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java

Provide basic building blocks for multi-row local transactions.
---------------------------------------------------------------

Key: HBASE-5229
URL: https://issues.apache.org/jira/browse/HBASE-5229
Project: HBase
Issue Type: New Feature
Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5229-endpoint.txt, 5229-final.txt, 5229-multiRow-v2.txt, 5229-multiRow.txt, 5229-seekto-v2.txt, 5229-seekto.txt, 5229.txt

In the final iteration, this issue provides a generalized, public mutateRowsWithLocks method on HRegion that can be used by coprocessors to implement atomic operations efficiently. Coprocessors are already region aware, which makes this a good pairing of APIs. This feature is by design not available to the client via the HTable API. It took a long time to arrive at this and I apologize for the public exposure of my (erratic in retrospect) thought processes.

Was: HBase should provide basic building blocks for multi-row local transactions. Local means that we do this by co-locating the data. Global (cross-region) transactions are not discussed here.

After a bit of discussion, two solutions have emerged:
1. Keep the row key for determining grouping and location and allow efficient intra-row scanning. A client application would then model tables as HBase rows.
2. Define a prefix length in HTableDescriptor that defines a grouping of rows. Regions will then never be split inside a grouping prefix.

#1 is true to the current storage paradigm of HBase. #2 is true to the current client-side API. I will explore these two with sample patches here.

Was: As discussed (at length) on the dev mailing list, with HBASE-3584 and HBASE-5203 committed, supporting atomic cross-row transactions within a region becomes simple. I am aware of the hesitation about the usefulness of this feature, but we have to start somewhere. Let's use this jira for discussion; I'll attach a patch (with tests) momentarily to make this concrete.
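Option #2 (a fixed prefix length that defines row grouping) can be sketched abstractly. The helper names and the split-point check below are illustrative only; HTableDescriptor has no such field at this point:

```java
import java.util.Arrays;

// Sketch of option #2: rows sharing the first prefixLength bytes of their
// key belong to one group, and a region split point must never fall inside
// a group, so all rows of a group stay co-located for local transactions.
public class PrefixGrouping {
    static byte[] groupKey(byte[] rowKey, int prefixLength) {
        return Arrays.copyOf(rowKey, Math.min(prefixLength, rowKey.length));
    }

    // A candidate split point is legal only if it starts a new group.
    static boolean legalSplitPoint(byte[] prevRow, byte[] splitRow, int prefixLength) {
        return !Arrays.equals(groupKey(prevRow, prefixLength),
                              groupKey(splitRow, prefixLength));
    }

    public static void main(String[] args) {
        byte[] a = "user100-order1".getBytes();
        byte[] b = "user100-order2".getBytes();
        byte[] c = "user200-order1".getBytes();
        System.out.println(legalSplitPoint(a, b, 7)); // inside the "user100" group
        System.out.println(legalSplitPoint(b, c, 7)); // "user200" starts a new group
    }
}
```

Option #1 needs no such check because the whole transaction lives in one HBase row, which is already the unit of atomicity and co-location.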
[jira] [Commented] (HBASE-5288) Security source code dirs missing from 0.92.0 release tarballs.
[ https://issues.apache.org/jira/browse/HBASE-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205344#comment-13205344 ]

Hudson commented on HBASE-5288:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
HBASE-5288 Security source code dirs missing from 0.92.0 release tarballs

jmhsieh :
Files :
* /hbase/trunk/src/assembly/all.xml

Security source code dirs missing from 0.92.0 release tarballs.
---------------------------------------------------------------

Key: HBASE-5288
URL: https://issues.apache.org/jira/browse/HBASE-5288
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.94.0, 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
Fix For: 0.94.0, 0.92.1
Attachments: hbase-5288.patch

The release tarballs have a compiled version of the hbase jars, and the security tarball seems to have the compiled security bits. However, the source code and resources for the security implementation are missing from the release tarballs in both distributions. They should be included in both.
[jira] [Commented] (HBASE-5365) [book] adding description of compaction file selection to refGuide in Arch/Regions/Store
[ https://issues.apache.org/jira/browse/HBASE-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205342#comment-13205342 ]

Hudson commented on HBASE-5365:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
hbase-5365. book - Arch/Region/Store adding description of compaction file selection

[book] adding description of compaction file selection to refGuide in Arch/Regions/Store
----------------------------------------------------------------------------------------

Key: HBASE-5365
URL: https://issues.apache.org/jira/browse/HBASE-5365
Project: HBase
Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Attachments: docbkx_hbase_5365.patch

book.xml
* adding description of compaction selection algorithm with examples (based on existing unit tests)
* also added a few links to the compaction section from other places in the book that already mention compaction.

configuration.xml
* added link to compaction section from the entry that discusses configuring major compaction interval.
[jira] [Commented] (HBASE-5367) [book] small formatting changes to compaction description in Arch/Regions/Store
[ https://issues.apache.org/jira/browse/HBASE-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205341#comment-13205341 ]

Hudson commented on HBASE-5367:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
hbase-5367 book.xml - this time, really fixing the default compaction.min.size
hbase-5367. book.xml - minor formatting in Arch/Region/Store compaction description

[book] small formatting changes to compaction description in Arch/Regions/Store
-------------------------------------------------------------------------------

Key: HBASE-5367
URL: https://issues.apache.org/jira/browse/HBASE-5367
Project: HBase
Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
Attachments: book_hbase_5367.xml.patch, book_hbase_5367_2.xml.patch

Fixing a few small-but-important things that came out of a post-commit comment in HBASE-5365:

book.xml
* corrected default region flush size (it's actually 64mb)
* removed trailing 'F' in a ratio discussion.
[jira] [Commented] (HBASE-5345) CheckAndPut doesn't work when value is empty byte[]
[ https://issues.apache.org/jira/browse/HBASE-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205343#comment-13205343 ]

Hudson commented on HBASE-5345:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
HBASE-5345 CheckAndPut doesn't work when value is empty byte[] (Evert Arckens)

tedyu :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java

CheckAndPut doesn't work when value is empty byte[]
---------------------------------------------------

Key: HBASE-5345
URL: https://issues.apache.org/jira/browse/HBASE-5345
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Evert Arckens
Assignee: Evert Arckens
Fix For: 0.94.0, 0.92.1
Attachments: 5345-v2.txt, 5345.txt, checkAndMutateEmpty-HBASE-5345.patch

When a value contains an empty byte[] and then a checkAndPut is performed with an empty byte[], the operation will fail. For example:

{code}
Put put = new Put(row1);
put.add(fam1, qf1, new byte[0]);
table.put(put);

put = new Put(row1);
put.add(fam1, qf1, val1);
table.checkAndPut(row1, fam1, qf1, new byte[0], put); // returns false
{code}

I think this is related to HBASE-3793 and HBASE-3468. Note that you will also get into this situation when first putting a null value (put.add(fam1, qf1, null)), as this value will then be regarded and returned as an empty byte[] upon a get.
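The expected behavior is easy to model outside HBase with a map standing in for the table: the expected-value comparison must treat empty-vs-empty as a match, and (per the note above) treat a value stored as null as empty. This is illustrative only, not the actual HRegion fix:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Toy check-and-put showing the desired semantics: an empty expected value
// matches a stored empty byte[], and a null store reads back as empty.
public class CheckAndPutSketch {
    static final Map<String, byte[]> store = new HashMap<>();

    static boolean checkAndPut(String key, byte[] expected, byte[] newValue) {
        byte[] current = store.getOrDefault(key, new byte[0]); // absent/null -> empty
        boolean matches = Arrays.equals(current, expected);    // empty == empty holds
        if (matches) store.put(key, newValue);
        return matches;
    }

    public static void main(String[] args) {
        store.put("row1", new byte[0]); // a put of an empty value, as in the report
        System.out.println(checkAndPut("row1", new byte[0], "val1".getBytes()));
        System.out.println(new String(store.get("row1")));
    }
}
```

Under the bug described in the report, the equivalent HBase call returned false instead of true for this sequence.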
[jira] [Commented] (HBASE-3850) Log more details when a scanner lease expires
[ https://issues.apache.org/jira/browse/HBASE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205345#comment-13205345 ]

Hudson commented on HBASE-3850:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
HBASE-3850 Log more details when a scanner lease expires (Darren Haas)

tedyu :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java

Log more details when a scanner lease expires
---------------------------------------------

Key: HBASE-3850
URL: https://issues.apache.org/jira/browse/HBASE-3850
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Benoit Sigoure
Assignee: Darren Haas
Priority: Critical
Fix For: 0.94.0
Attachments: 3850-v3.txt, HBASE-3850.trunk.v1.patch, HBASE-3850.trunk.v2.patch

The message logged by the RegionServer when a Scanner lease expires isn't as useful as it could be. {{Scanner 4765412385779771089 lease expired}} - most clients don't log their scanner ID, so it's really hard to figure out what was going on. I think it would be useful to at least log the name of the region on which the Scanner was open, and it would be great to have the ip:port of the client that had that lease too.
[jira] [Commented] (HBASE-5348) Constraint configuration loaded with bloat
[ https://issues.apache.org/jira/browse/HBASE-5348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205346#comment-13205346 ]

Hudson commented on HBASE-5348:
-------------------------------

Integrated in HBase-TRUNK #2656 (See [https://builds.apache.org/job/HBase-TRUNK/2656/])
HBASE-5348 Constraint configuration loaded with bloat

stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/constraint/CheckConfigurationConstraint.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/constraint/TestConstraints.java

Constraint configuration loaded with bloat
------------------------------------------

Key: HBASE-5348
URL: https://issues.apache.org/jira/browse/HBASE-5348
Project: HBase
Issue Type: Bug
Components: coprocessors, regionserver
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
Fix For: 0.94.0
Attachments: java_HBASE-5348.patch, java_HBASE-5348.patch

Constraints load the configuration but don't load the 'correct' configuration; instead they instantiate the default configuration (via new Configuration()). It should just be new Configuration(false).
[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
[ https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205433#comment-13205433 ]

Zhihong Yu commented on HBASE-5209:
-----------------------------------

@David: There are new test failures reported by Hadoop QA, can you double check your patch?

HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
------------------------------------------------------------------------------------------------------------------------

Key: HBASE-5209
URL: https://issues.apache.org/jira/browse/HBASE-5209
Project: HBase
Issue Type: Improvement
Components: master
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Aditya Acharya
Assignee: David S. Wang
Fix For: 0.94.0, 0.90.7, 0.92.1
Attachments: HBASE-5209-v0.diff, HBASE-5209-v1.diff

I have a multi-master HBase setup, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect.
[jira] [Updated] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
[ https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David S. Wang updated HBASE-5209: - Status: Open (was: Patch Available) I will address the unit test failures. HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup Key: HBASE-5209 URL: https://issues.apache.org/jira/browse/HBASE-5209 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0, 0.90.5, 0.94.0 Reporter: Aditya Acharya Assignee: David S. Wang Fix For: 0.94.0, 0.90.7, 0.92.1 Attachments: HBASE-5209-v0.diff, HBASE-5209-v1.diff I have a multi-master HBase set up, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3039) Stuck in regionsInTransition because rebalance came in at same time as a split
[ https://issues.apache.org/jira/browse/HBASE-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-3039: -- Assignee: stack Stuck in regionsInTransition because rebalance came in at same time as a split -- Key: HBASE-3039 URL: https://issues.apache.org/jira/browse/HBASE-3039 Project: HBase Issue Type: Bug Components: master Reporter: stack Assignee: stack Fix For: 0.90.0 Attachments: 3039.txt Saw this doing cluster tests: {code} 2010-09-25 21:31:48,212 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because regions in transition: {73781e505e452221c9cd0e03585eb5d1=usertable,user800184056, 128... {code} Here's the problem: {code} 2010-09-25 08:16:48,186 INFO org.apache.hadoop.hbase.master.HMaster: balance hri=usertable,user800184056,1285397376525.73781e505e452221c9cd0e03585eb5d1., src=su184,60020, 1285371621579, dest=sv2borg189,60020,1285371621577 2010-09-25 08:16:48,186 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region usertable,user800184056,1285397376525. 73781e505e452221c9cd0e03585eb5d1. (offlining) 2010-09-25 08:16:52,656 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: usertable,user800184056,1285397376525.73781e505e452221c9cd0e03585eb5d1.: Daughters; usertable,user800184056,1285402609029.c05825561e7ea3cc6507c70bfb21541a., usertable,user804024623,1285402609029.28f64903a7875bdafc1e7ee344b225b0. 2010-09-25 08:17:11,414 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: usertable,user800184056,1285397376525. 73781e505e452221c9cd0e03585eb5d1. state=PENDING_CLOSE, ts=1285402608186 {code} just as we were doing a balance, the region split. Over on RS, I see the split starting up and then in comes the balance 'close' message. By the time the close handler runs on regionserver the split is well underway and close handler actually doesn't find an online region to split. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5329) addRowLock() may allocate duplicate lock id, causing the client to be blocked
[ https://issues.apache.org/jira/browse/HBASE-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205502#comment-13205502 ] liaoxiangui commented on HBASE-5329: Sorry, I didn't describe it clearly. I think there is a possibility that Random.nextLong() generates the same number. To see what a duplicate number leads to, I changed the code as follows (hard-coding the lock id) and got the Leases$LeaseStillHeldException. {code} protected long addRowLock(Integer r, HRegion region) throws LeaseStillHeldException { long lockId = -1L; lockId = 99; String lockName = String.valueOf(lockId); rowlocks.put(lockName, r); this.leases.createLease(lockName, new RowLockListener(lockName, region)); return lockId; } {code} addRowLock() may allocate duplicate lock id, causing the client to be blocked - Key: HBASE-5329 URL: https://issues.apache.org/jira/browse/HBASE-5329 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.3 Environment: Red Hat Enterprise Linux Server release 5.4 Reporter: liaoxiangui Assignee: Zhihong Yu Priority: Critical {code} protected long addRowLock(Integer r, HRegion region) throws LeaseStillHeldException { long lockId = -1L; lockId = rand.nextLong(); // may generate a duplicate id - bug? String lockName = String.valueOf(lockId); rowlocks.put(lockName, r); this.leases.createLease(lockName, new RowLockListener(lockName, region)); return lockId; } {code} In addRowLock(), rand may generate a duplicate lock id, which can cause the regionserver to throw Leases$LeaseStillHeldException. The client will be blocked until the old rowlock is released. 
{code} 2012-02-03 15:21:50,084 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error obtaining row lock (fsOk: true) org.apache.hadoop.hbase.regionserver.Leases$LeaseStillHeldException at org.apache.hadoop.hbase.regionserver.Leases.createLease(Leases.java:150) at org.apache.hadoop.hbase.regionserver.HRegionServer.addRowLock(HRegionServer.java:1986) at org.apache.hadoop.hbase.regionserver.HRegionServer.lockRow(HRegionServer.java:1963) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
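For reference, one collision-safe shape for the allocation can be sketched in plain JDK Java (illustrative only; this is not the actual HBase patch, and the map below merely stands in for the regionserver's rowlocks map): keep drawing random ids until putIfAbsent claims an unused name, instead of letting a duplicate reach createLease and trigger LeaseStillHeldException.

```java
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch (not HBase code): retry on collision instead of
// failing. putIfAbsent is atomic, so two concurrent callers can never both
// claim the same lock name.
public class LockIdSketch {
    static final Random RAND = new Random();
    // stand-in for the regionserver's rowlocks map
    static final ConcurrentHashMap<String, Object> ROWLOCKS = new ConcurrentHashMap<>();

    static long addRowLock(Object r) {
        while (true) {
            long lockId = RAND.nextLong();
            String lockName = String.valueOf(lockId);
            if (ROWLOCKS.putIfAbsent(lockName, r) == null) {
                return lockId; // lease creation would follow here, collision-free
            }
            // duplicate id already in use: loop and draw another one
        }
    }

    public static void main(String[] args) {
        long a = addRowLock(new Object());
        long b = addRowLock(new Object()); // guaranteed distinct by the retry loop
        System.out.println(a != b && ROWLOCKS.size() == 2 ? "ok" : "collision");
    }
}
```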
[jira] [Updated] (HBASE-5329) addRowLock() may allocate duplicate lock id, causing the client to be blocked
[ https://issues.apache.org/jira/browse/HBASE-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liaoxiangui updated HBASE-5329: --- Priority: Minor (was: Critical) addRowLock() may allocate duplicate lock id, causing the client to be blocked - Key: HBASE-5329 URL: https://issues.apache.org/jira/browse/HBASE-5329 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.3 Environment: Red Hat Enterprise Linux Server release 5.4 Reporter: liaoxiangui Assignee: Zhihong Yu Priority: Minor {code} protected long addRowLock(Integer r, HRegion region) throws LeaseStillHeldException { long lockId = -1L; lockId = rand.nextLong(); // may generate a duplicate id - bug? String lockName = String.valueOf(lockId); rowlocks.put(lockName, r); this.leases.createLease(lockName, new RowLockListener(lockName, region)); return lockId; } {code} In addRowLock(), rand may generate a duplicate lock id, which can cause the regionserver to throw Leases$LeaseStillHeldException. The client will be blocked until the old rowlock is released. 
{code} 2012-02-03 15:21:50,084 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error obtaining row lock (fsOk: true) org.apache.hadoop.hbase.regionserver.Leases$LeaseStillHeldException at org.apache.hadoop.hbase.regionserver.Leases.createLease(Leases.java:150) at org.apache.hadoop.hbase.regionserver.HRegionServer.addRowLock(HRegionServer.java:1986) at org.apache.hadoop.hbase.regionserver.HRegionServer.lockRow(HRegionServer.java:1963) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205514#comment-13205514 ] Hadoop QA commented on HBASE-4720: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12512066/HBASE-4720.trunk.v7.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 156 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestHFileBlock org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/940//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/940//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/940//console This message is automatically generated. 
Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server Key: HBASE-4720 URL: https://issues.apache.org/jira/browse/HBASE-4720 Project: HBase Issue Type: Improvement Reporter: Daniel Lord Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.trunk.v7.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees I have a sentinel table that is updated atomically as users interact with the system. This works quite well for the regular hbase client but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
[ https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205527#comment-13205527 ] Jonathan Hsieh commented on HBASE-5209: --- Might be out of scope for this patch but the ServerName structure doesn't have the info (http) port of master or rs's (not sure about the history on this). It would be great if we could add some information in cluster stats that also provides the info port of the active master. HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup Key: HBASE-5209 URL: https://issues.apache.org/jira/browse/HBASE-5209 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0, 0.90.5, 0.92.0 Reporter: Aditya Acharya Assignee: David S. Wang Fix For: 0.94.0, 0.90.7, 0.92.1 Attachments: HBASE-5209-v0.diff, HBASE-5209-v1.diff I have a multi-master HBase set up, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
[ https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205575#comment-13205575 ] Zhihong Yu commented on HBASE-5209: --- For ServerName to provide info port, we can open a separate JIRA. @David: Can you explain the rationale behind the new parameters ? {code} + final boolean isMasterRunning, + final boolean isActiveMaster, {code} Looking at the current fields related to servers in ClusterStatus, they are all of Collection. I would expect a ServerName to represent the active master. Once design passes review, please use review board for further discussion. HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup Key: HBASE-5209 URL: https://issues.apache.org/jira/browse/HBASE-5209 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0, 0.90.5, 0.92.0 Reporter: Aditya Acharya Assignee: David S. Wang Fix For: 0.94.0, 0.90.7, 0.92.1 Attachments: HBASE-5209-v0.diff, HBASE-5209-v1.diff I have a multi-master HBase set up, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
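A minimal sketch of the shape being suggested here (hypothetical and self-contained; these are stand-ins for the real ClusterStatus and ServerName classes, not the committed HBase API): instead of carrying isMasterRunning/isActiveMaster booleans, ClusterStatus could expose the active master directly as a ServerName, matching its other server-related fields.

```java
// Hypothetical sketch only - names and fields are assumptions, not HBase's API.
public class ClusterStatusSketch {
    // minimal stand-in for org.apache.hadoop.hbase.ServerName
    static final class ServerName {
        final String hostname;
        final int port;
        ServerName(String hostname, int port) { this.hostname = hostname; this.port = port; }
        @Override public String toString() { return hostname + "," + port; }
    }

    private final ServerName master; // null could mean "no active master known"

    ClusterStatusSketch(ServerName master) { this.master = master; }

    ServerName getMaster() { return master; }

    public static void main(String[] args) {
        ClusterStatusSketch status =
            new ClusterStatusSketch(new ServerName("master1.example.com", 60000));
        // a client can now answer "which master won the race?" directly
        System.out.println(status.getMaster()); // prints: master1.example.com,60000
    }
}
```

A single ServerName field also leaves room for later additions such as the info port, which could hang off ServerName or ClusterStatus without more booleans.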
[jira] [Created] (HBASE-5378) [book] book.xml - added link to coprocessor blog entry
[book] book.xml - added link to coprocessor blog entry --- Key: HBASE-5378 URL: https://issues.apache.org/jira/browse/HBASE-5378 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5378.xml.patch book.xml * added section under Arch/RegionServer for Coprocessors, and a link to the blog entry on this subject. * updated the schema design chapter that mentioned coprocessors link to this new section. * minor update to compaction explanation in the 3rd example. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5378) [book] book.xml - added link to coprocessor blog entry
[ https://issues.apache.org/jira/browse/HBASE-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5378: - Attachment: book_hbase_5378.xml.patch [book] book.xml - added link to coprocessor blog entry --- Key: HBASE-5378 URL: https://issues.apache.org/jira/browse/HBASE-5378 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5378.xml.patch book.xml * added section under Arch/RegionServer for Coprocessors, and a link to the blog entry on this subject. * updated the schema design chapter that mentioned coprocessors link to this new section. * minor update to compaction explanation in the 3rd example. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5378) [book] book.xml - added link to coprocessor blog entry
[ https://issues.apache.org/jira/browse/HBASE-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5378: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] book.xml - added link to coprocessor blog entry --- Key: HBASE-5378 URL: https://issues.apache.org/jira/browse/HBASE-5378 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5378.xml.patch book.xml * added section under Arch/RegionServer for Coprocessors, and a link to the blog entry on this subject. * updated the schema design chapter that mentioned coprocessors link to this new section. * minor update to compaction explanation in the 3rd example. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5312) Closed parent region present in Hlog.lastSeqWritten
[ https://issues.apache.org/jira/browse/HBASE-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205580#comment-13205580 ] ramkrishna.s.vasudevan commented on HBASE-5312: --- +1 on adding logs, but we need to ensure we don't slow down the hlog append and flush flow. By the way Jimmy, do you have any suspects in this issue? Closed parent region present in Hlog.lastSeqWritten --- Key: HBASE-5312 URL: https://issues.apache.org/jira/browse/HBASE-5312 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: ramkrishna.s.vasudevan Fix For: 0.90.7 This is in reference to the mail sent to the dev mailing list, Closed parent region present in Hlog.lastSeqWritten. The scenario described is: We had a region that was split into two daughters. When the hlog roll tried to flush the region, there was an entry in HLog.lastSeqWritten that was not flushed or removed from lastSeqWritten during the parent close. Because this flush was not happening, subsequent flushes were getting blocked {code} 05:06:44,422 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=122, maxlogs=32; forcing flush of 1 regions(s): 2acaf8e3acfd2e8a5825a1f6f0aca4a8 05:06:44,422 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null, requester=null 05:10:48,666 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=123, maxlogs=32; forcing flush of 1 regions(s): 2acaf8e3acfd2e8a5825a1f6f0aca4a8 05:10:48,666 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null, requester=null 05:14:46,075 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=124, maxlogs=32; forcing flush of 1 regions(s): 2acaf8e3acfd2e8a5825a1f6f0aca4a8 05:14:46,075 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null, requester=null 05:15:41,584 
INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=125, maxlogs=32; forcing flush of 1 regions(s): 2acaf8e3acfd2e8a5825a1f6f0aca4a8 05:15:41,584 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null, {code} Let's see what happened for the region 2acaf8e3acfd2e8a5825a1f6f0aca4a8 {code} 2012-01-06 00:30:55,214 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/.tmp/1755862026714756815 to hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123 2012-01-06 00:30:58,946 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated Htable_UFDR_016,049790700093168-0456520,1325809837958.0ebe5bd7fcbc09ee074d5600b9d4e062. 2012-01-06 00:30:59,614 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123, entries=7537, sequenceid=20312223, memsize=4.2m, filesize=2.9m 2012-01-06 00:30:59,787 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores 2012-01-06 00:30:59,787 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~133.5m for region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. in 21816ms, sequenceid=20312223, compaction requested=true 2012-01-06 00:30:59,787 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. because regionserver20020.cacheFlusher; priority=0, compaction queue size=5840 {code} A user-triggered split has been issued to this region, which can be seen in the above logs. The flushing of this region has resulted in a seq id 20312223. 
The region has been split and the parent region has been closed {code} 00:31:12,607 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.: disabling compactions flushes 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. 00:31:13,718 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. 00:31:39,552 INFO
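The failure mode described in this report can be sketched in plain JDK Java (illustrative only, not HBase code; the map below merely stands in for HLog.lastSeqWritten): each region maps to the sequence id of its oldest unflushed edit, and a log roll picks the region holding the oldest edit to flush. If a closed parent's entry is never removed, that stale entry wins every roll, the flush can never run, and the hlog count climbs (logs=122, 123, 124, ... as in the logs above).

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of the reported failure mode (not HBase code).
public class LastSeqWrittenSketch {
    // returns the region holding the oldest unflushed sequence id
    static String pickRegionToFlush(Map<String, Long> lastSeqWritten) {
        return lastSeqWritten.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }

    public static void main(String[] args) {
        Map<String, Long> lastSeqWritten = new TreeMap<>();
        // stale entry: parent was closed but its seq id was left behind
        lastSeqWritten.put("parent-2acaf8e3", 20312223L);
        lastSeqWritten.put("daughter-c0582556", 20312900L);

        // every roll re-selects the closed parent, so nothing ever flushes
        // prints: forcing flush of: parent-2acaf8e3
        System.out.println("forcing flush of: " + pickRegionToFlush(lastSeqWritten));
    }
}
```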
[jira] [Created] (HBASE-5379) Backport HBASE-4287 to 0.90 - If region opening fails, try to transition region back to offline in ZK
Backport HBASE-4287 to 0.90 - If region opening fails, try to transition region back to offline in ZK --- Key: HBASE-5379 URL: https://issues.apache.org/jira/browse/HBASE-5379 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.7 This fix is needed in 0.90 also. Otherwise, if region assignment fails, we need to wait for 30 minutes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5364) Fix source files missing licenses in 0.92 and trunk
[ https://issues.apache.org/jira/browse/HBASE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh reassigned HBASE-5364: - Assignee: Elliott Clark Fix source files missing licenses in 0.92 and trunk --- Key: HBASE-5364 URL: https://issues.apache.org/jira/browse/HBASE-5364 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.0 Reporter: Jonathan Hsieh Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-5364-1.patch, hbase-5364-0.92.patch running 'mvn rat:check' shows that a few files have snuck in that do not have proper apache licenses. Ideally we should fix these before we cut another release/release candidate. This is a blocker for 0.94, and probably should be for the other branches as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5364) Fix source files missing licenses in 0.92 and trunk
[ https://issues.apache.org/jira/browse/HBASE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-5364: -- Affects Version/s: (was: 0.90.5) Summary: Fix source files missing licenses in 0.92 and trunk (was: Fix source files missing licenses) Fix source files missing licenses in 0.92 and trunk --- Key: HBASE-5364 URL: https://issues.apache.org/jira/browse/HBASE-5364 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.0 Reporter: Jonathan Hsieh Priority: Blocker Attachments: HBASE-5364-1.patch, hbase-5364-0.92.patch running 'mvn rat:check' shows that a few files have snuck in that do not have proper apache licenses. Ideally we should fix these before we cut another release/release candidate. This is a blocker for 0.94, and probably should be for the other branches as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5312) Closed parent region present in Hlog.lastSeqWritten
[ https://issues.apache.org/jira/browse/HBASE-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205596#comment-13205596 ] Jimmy Xiang commented on HBASE-5312: I checked the lock mechanism and it looks fine. If it is not a bug in Java's reentrant lock, I suspect the region is removed from the online regions list before it is properly closed, either during region splitting or region closing. Closed parent region present in Hlog.lastSeqWritten --- Key: HBASE-5312 URL: https://issues.apache.org/jira/browse/HBASE-5312 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: ramkrishna.s.vasudevan Fix For: 0.90.7 This is in reference to the mail sent to the dev mailing list, Closed parent region present in Hlog.lastSeqWritten. The scenario described is: We had a region that was split into two daughters. When the hlog roll tried to flush the region, there was an entry in HLog.lastSeqWritten that was not flushed or removed from lastSeqWritten during the parent close. 
Because this flush was not happening, subsequent flushes were getting blocked {code} 05:06:44,422 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=122, maxlogs=32; forcing flush of 1 regions(s): 2acaf8e3acfd2e8a5825a1f6f0aca4a8 05:06:44,422 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null, requester=null 05:10:48,666 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=123, maxlogs=32; forcing flush of 1 regions(s): 2acaf8e3acfd2e8a5825a1f6f0aca4a8 05:10:48,666 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null, requester=null 05:14:46,075 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=124, maxlogs=32; forcing flush of 1 regions(s): 2acaf8e3acfd2e8a5825a1f6f0aca4a8 05:14:46,075 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null, requester=null 05:15:41,584 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=125, maxlogs=32; forcing flush of 1 regions(s): 2acaf8e3acfd2e8a5825a1f6f0aca4a8 05:15:41,584 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null, {code} Let's see what happened for the region 2acaf8e3acfd2e8a5825a1f6f0aca4a8 {code} 2012-01-06 00:30:55,214 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/.tmp/1755862026714756815 to hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123 2012-01-06 00:30:58,946 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated Htable_UFDR_016,049790700093168-0456520,1325809837958.0ebe5bd7fcbc09ee074d5600b9d4e062. 
2012-01-06 00:30:59,614 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123, entries=7537, sequenceid=20312223, memsize=4.2m, filesize=2.9m 2012-01-06 00:30:59,787 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores 2012-01-06 00:30:59,787 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~133.5m for region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. in 21816ms, sequenceid=20312223, compaction requested=true 2012-01-06 00:30:59,787 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. because regionserver20020.cacheFlusher; priority=0, compaction queue size=5840 {code} A user-triggered split has been issued to this region, which can be seen in the above logs. The flushing of this region has resulted in a seq id 20312223. The region has been split and the parent region has been closed {code} 00:31:12,607 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.: disabling compactions flushes 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. 00:31:13,718 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed
[jira] [Created] (HBASE-5380) [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks
[book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks - Key: HBASE-5380 URL: https://issues.apache.org/jira/browse/HBASE-5380 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor book.xml * Adding comment in KeyValue section about KV's not being split across blocks. This was a recent question on the dist-list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5380) [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks
[ https://issues.apache.org/jira/browse/HBASE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5380: - Attachment: book_hbase_5380.xml.patch [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks - Key: HBASE-5380 URL: https://issues.apache.org/jira/browse/HBASE-5380 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_hbase_5380.xml.patch book.xml * Adding comment in KeyValue section about KV's not being split across blocks. This was a recent question on the dist-list.
[jira] [Updated] (HBASE-5380) [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks
[ https://issues.apache.org/jira/browse/HBASE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5380: - Status: Patch Available (was: Open) [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks - Key: HBASE-5380 URL: https://issues.apache.org/jira/browse/HBASE-5380 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_hbase_5380.xml.patch book.xml * Adding comment in KeyValue section about KV's not being split across blocks. This was a recent question on the dist-list.
[jira] [Updated] (HBASE-5380) [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks
[ https://issues.apache.org/jira/browse/HBASE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5380: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks - Key: HBASE-5380 URL: https://issues.apache.org/jira/browse/HBASE-5380 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_hbase_5380.xml.patch book.xml * Adding comment in KeyValue section about KV's not being split across blocks. This was a recent question on the dist-list.
[jira] [Commented] (HBASE-5323) Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master
[ https://issues.apache.org/jira/browse/HBASE-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205610#comment-13205610 ] Zhihong Yu commented on HBASE-5323: --- Some comments for patch v2. In MasterFileSystem.java, the following check should come first in the if statement because it is fast: {code} e instanceof HLogLengthMisMatchException {code} In HLog.java: {code} + if (e instanceof HLogLengthMisMatchException) { +throw new HLogLengthMisMatchException( {code} I don't think we need the above. Instead, we should check e.getCause() in MasterFileSystem.java. For HLogLengthMisMatchException.java, the M in Match should be lower case. I think it should be a generic exception, so maybe rename it to FileLengthMismatchException? For InstrumentedSequenceLogReader, please add javadoc for the class. This class should be generic as well. How about passing the exception that getPos() is supposed to throw to the ctor of InstrumentedSequenceLogReader? Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master Key: HBASE-5323 URL: https://issues.apache.org/jira/browse/HBASE-5323 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.94.0, 0.90.7 Attachments: HBASE-5323.patch, HBASE-5323.patch We know that while parsing the HLog we expect the proper length from HDFS. In WALReaderFSDataInputStream: {code} assert(realLength >= this.length); {code} We try to bail out if the above condition is not satisfied. But if SSH.splitLog() hits this problem, it lands in the run method of EventHandler. This kills the SSH thread, so further assignment does not happen. If ROOT and META are to be assigned, they cannot be. I think in this condition we should abort the master by catching such exceptions. Please do suggest.
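The restructuring the review asks for can be sketched as follows. This is a hypothetical stand-in, not HBase source: the class name follows the suggested FileLengthMismatchException rename, the cheap instanceof test runs first, and the wrapped cause is inspected via getCause() instead of re-wrapping and re-throwing.

```java
// Hypothetical sketch of the review's two suggestions: check the cheap
// instanceof condition first, and inspect e.getCause() rather than throwing
// a new wrapper exception. Names are illustrative, not actual HBase code.
class FileLengthMismatchException extends RuntimeException {
    FileLengthMismatchException(String msg) { super(msg); }
}

class SplitLogErrorCheck {
    /** Returns true when the failure, or its wrapped cause, is a length mismatch. */
    static boolean isLengthMismatch(Throwable e) {
        // Cheap type check first, as suggested for the if statement.
        if (e instanceof FileLengthMismatchException) {
            return true;
        }
        // Then look at the cause chain instead of re-throwing a new exception.
        return e.getCause() instanceof FileLengthMismatchException;
    }

    public static void main(String[] args) {
        System.out.println(isLengthMismatch(new FileLengthMismatchException("length mismatch"))); // true
        System.out.println(isLengthMismatch(new RuntimeException(new FileLengthMismatchException("wrapped")))); // true
        System.out.println(isLengthMismatch(new RuntimeException("unrelated"))); // false
    }
}
```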
[jira] [Created] (HBASE-5381) Make memstore.flush.size as a table level configuration
Make memstore.flush.size as a table level configuration --- Key: HBASE-5381 URL: https://issues.apache.org/jira/browse/HBASE-5381 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Currently the region server flushes a region's memstore based on the global memstore flush size limit and the global low water mark. However, this causes hot tables, which serve more write traffic, to flush too frequently even though the overall memstore heap usage is quite low. Too-frequent flushes also contribute to too many minor compactions. So if we make memstore.flush.size a table-level configuration, it would be more flexible to configure different tables with different desired memstore flush sizes based on compaction ratio, recovery time and put ops.
[jira] [Commented] (HBASE-5323) Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master
[ https://issues.apache.org/jira/browse/HBASE-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205612#comment-13205612 ] Zhihong Yu commented on HBASE-5323: --- I would suggest starting with a patch for TRUNK which can be verified by Hadoop QA. Once code review passes, you can backport to the 0.90 branch. Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master Key: HBASE-5323 URL: https://issues.apache.org/jira/browse/HBASE-5323 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.94.0, 0.90.7 Attachments: HBASE-5323.patch, HBASE-5323.patch We know that while parsing the HLog we expect the proper length from HDFS. In WALReaderFSDataInputStream: {code} assert(realLength >= this.length); {code} We try to bail out if the above condition is not satisfied. But if SSH.splitLog() hits this problem, it lands in the run method of EventHandler. This kills the SSH thread, so further assignment does not happen. If ROOT and META are to be assigned, they cannot be. I think in this condition we should abort the master by catching such exceptions. Please do suggest.
[jira] [Commented] (HBASE-5381) Make memstore.flush.size as a table level configuration
[ https://issues.apache.org/jira/browse/HBASE-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205617#comment-13205617 ] Jean-Daniel Cryans commented on HBASE-5381: --- It already is, see MEMSTORE_FLUSHSIZE in the shell or HTD.setMemStoreFlushSize(). Am I missing something? Make memstore.flush.size as a table level configuration --- Key: HBASE-5381 URL: https://issues.apache.org/jira/browse/HBASE-5381 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Currently the region server flushes a region's memstore based on the global memstore flush size limit and the global low water mark. However, this causes hot tables, which serve more write traffic, to flush too frequently even though the overall memstore heap usage is quite low. Too-frequent flushes also contribute to too many minor compactions. So if we make memstore.flush.size a table-level configuration, it would be more flexible to configure different tables with different desired memstore flush sizes based on compaction ratio, recovery time and put ops.
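The semantics JD points to (a per-table value set via HTD.setMemStoreFlushSize() overriding the cluster-wide default) can be sketched as a simple fallback rule. This is an illustrative stand-in, not HBase source; the 64 MB default and the "non-positive means unset" convention are assumptions for the example:

```java
// Illustrative sketch of a table-level flush size overriding a global default,
// mirroring the behavior JD describes for HTD.setMemStoreFlushSize().
// The constant and the "unset" convention are hypothetical.
class FlushSizeResolver {
    static final long GLOBAL_DEFAULT = 64L * 1024 * 1024; // assumed 64 MB cluster default

    /** tableLevelFlushSize <= 0 means "not set on the table descriptor". */
    static long effectiveFlushSize(long tableLevelFlushSize) {
        return tableLevelFlushSize > 0 ? tableLevelFlushSize : GLOBAL_DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(effectiveFlushSize(128L * 1024 * 1024)); // per-table override wins
        System.out.println(effectiveFlushSize(-1));                 // falls back to the global default
    }
}
```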
[jira] [Created] (HBASE-5382) Test that we always cache index and bloom blocks
Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Commented] (HBASE-5381) Make memstore.flush.size as a table level configuration
[ https://issues.apache.org/jira/browse/HBASE-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205621#comment-13205621 ] Zhihong Yu commented on HBASE-5381: --- I would suggest paying more attention to HBASE-5349. Make memstore.flush.size as a table level configuration --- Key: HBASE-5381 URL: https://issues.apache.org/jira/browse/HBASE-5381 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Currently the region server flushes a region's memstore based on the global memstore flush size limit and the global low water mark. However, this causes hot tables, which serve more write traffic, to flush too frequently even though the overall memstore heap usage is quite low. Too-frequent flushes also contribute to too many minor compactions. So if we make memstore.flush.size a table-level configuration, it would be more flexible to configure different tables with different desired memstore flush sizes based on compaction ratio, recovery time and put ops.
[jira] [Commented] (HBASE-4683) Always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205622#comment-13205622 ] Phabricator commented on HBASE-4683: jdcryans has accepted the revision [jira] [HBASE-4683] Test that we always cache index and bloom blocks. +1, sorry about that. Please open a new jira. REVISION DETAIL https://reviews.facebook.net/D1695 Always cache index and bloom blocks --- Key: HBASE-4683 URL: https://issues.apache.org/jira/browse/HBASE-4683 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Mikhail Bautin Priority: Minor Fix For: 0.94.0, 0.92.0 Attachments: 0001-Cache-important-block-types.patch, 4683-v2.txt, 4683.txt, D1695.1.patch, D807.1.patch, D807.2.patch, D807.3.patch, HBASE-4683-0.92-v2.patch, HBASE-4683-v3.patch This would add a new boolean config option: hfile.block.cache.datablocks Default would be true. Setting this to false puts HBase in a mode where only index blocks are cached, which is useful for analytical scenarios where a useful working set of the data cannot be expected to fit into the (aggregate) cache. This is the equivalent of setting cacheBlocks to false on all scans (including scans on behalf of gets). I would like to get a general feeling about what folks think about this. The change itself would be simple. Update (Mikhail): we probably don't need a new conf option. Instead, we will make index blocks cached by default.
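The caching policy HBASE-4683 settles on can be sketched as a small predicate: index and bloom blocks are always cached, while data blocks obey the proposed hfile.block.cache.datablocks flag. The enum and method names below are illustrative stand-ins, not HBase's actual API:

```java
// Hedged sketch of the HBASE-4683 policy: "important" blocks (index, bloom)
// are cached unconditionally; data blocks are cached only when the proposed
// hfile.block.cache.datablocks option is true. Names are illustrative.
class BlockCachePolicy {
    enum BlockType { DATA, INDEX, BLOOM }

    static boolean shouldCache(BlockType type, boolean cacheDataBlocks) {
        switch (type) {
            case INDEX:
            case BLOOM:
                return true;            // always cache index and bloom blocks
            default:
                return cacheDataBlocks; // data blocks follow the config flag
        }
    }

    public static void main(String[] args) {
        System.out.println(shouldCache(BlockType.INDEX, false)); // true
        System.out.println(shouldCache(BlockType.BLOOM, false)); // true
        System.out.println(shouldCache(BlockType.DATA, false));  // false
    }
}
```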
[jira] [Updated] (HBASE-5369) Compaction selection based on the hotness of the HFile's block in the block cache
[ https://issues.apache.org/jira/browse/HBASE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5369: -- Description: HBase reserves a large amount of memory for the block cache, and cached blocks are aged out in an LRU fashion. Obviously, we don't want to age out blocks which are still hot. However, when compactions start, these hot blocks may naturally be invalidated. Considering that the block cache already knows which HFiles these hot blocks come from, the compaction selection algorithm could simply skip compacting these HFiles until their blocks become cold. was: HBase reserves a large amount of memory for the block cache, and cached blocks are aged out in an LRU fashion. Obviously, we don't want to age out blocks which are still hot. However, when compactions start, these hot blocks may naturally be invalidated. Considering that the block cache already knows which HFiles these hot blocks come from, the compaction selection algorithm could simply skip compacting these HFiles until their blocks become cold. Furthermore, HBase could compact multiple HFiles into two HFiles. One of them would contain only hot blocks, which are supposed to be cached directly. Compaction selection based on the hotness of the HFile's block in the block cache - Key: HBASE-5369 URL: https://issues.apache.org/jira/browse/HBASE-5369 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang HBase reserves a large amount of memory for the block cache, and cached blocks are aged out in an LRU fashion. Obviously, we don't want to age out blocks which are still hot. However, when compactions start, these hot blocks may naturally be invalidated. Considering that the block cache already knows which HFiles these hot blocks come from, the compaction selection algorithm could simply skip compacting these HFiles until their blocks become cold.
[jira] [Commented] (HBASE-5376) Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten
[ https://issues.apache.org/jira/browse/HBASE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205636#comment-13205636 ] Jimmy Xiang commented on HBASE-5376: I was thinking of using YCSB to load lots of data while setting the region size small, so that lots of region splits will be triggered. How is that? Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten -- Key: HBASE-5376 URL: https://issues.apache.org/jira/browse/HBASE-5376 Project: HBase Issue Type: Sub-task Reporter: Jimmy Xiang Priority: Minor Fix For: 0.90.7 It is hard to find out what exactly caused HBASE-5312. Some logging will be helpful to shed some light.
[jira] [Updated] (HBASE-5376) Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten
[ https://issues.apache.org/jira/browse/HBASE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-5376: --- Attachment: hbase-5376.txt I added some warnings. Anywhere else I should add them? Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten -- Key: HBASE-5376 URL: https://issues.apache.org/jira/browse/HBASE-5376 Project: HBase Issue Type: Sub-task Reporter: Jimmy Xiang Priority: Minor Fix For: 0.90.7 Attachments: hbase-5376.txt It is hard to find out what exactly caused HBASE-5312. Some logging will be helpful to shed some light.
[jira] [Updated] (HBASE-5369) Compaction selection based on the hotness of the HFile's block in the block cache
[ https://issues.apache.org/jira/browse/HBASE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Tang updated HBASE-5369: -- Description: HBase reserves a large amount of memory for the block cache, and cached blocks are aged out in an LRU fashion. Obviously, we don't want to age out blocks which are still hot. However, when compactions start, these hot blocks may naturally be invalidated. Considering that the block cache already knows which HFiles these hot blocks come from, the compaction selection algorithm could simply skip compacting these HFiles until their blocks become cold. For example, if there is an HFile and 80% of its blocks are cached, which means this HFile is really hot, then just skip this HFile during compaction selection. The percentage of hot blocks should be configured as a high bar to make sure that HBase is still making progress on compactions. was: HBase reserves a large amount of memory for the block cache, and cached blocks are aged out in an LRU fashion. Obviously, we don't want to age out blocks which are still hot. However, when compactions start, these hot blocks may naturally be invalidated. Considering that the block cache already knows which HFiles these hot blocks come from, the compaction selection algorithm could simply skip compacting these HFiles until their blocks become cold. Compaction selection based on the hotness of the HFile's block in the block cache - Key: HBASE-5369 URL: https://issues.apache.org/jira/browse/HBASE-5369 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang HBase reserves a large amount of memory for the block cache, and cached blocks are aged out in an LRU fashion. Obviously, we don't want to age out blocks which are still hot. However, when compactions start, these hot blocks may naturally be invalidated. Considering that the block cache already knows which HFiles these hot blocks come from, the compaction selection algorithm could simply skip compacting these HFiles until their blocks become cold. For example, if there is an HFile and 80% of its blocks are cached, which means this HFile is really hot, then just skip this HFile during compaction selection. The percentage of hot blocks should be configured as a high bar to make sure that HBase is still making progress on compactions.
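The selection tweak described above reduces to a small predicate over each HFile's cached-block ratio. The sketch below is illustrative only; the class name, method signature, and the 0.8 threshold (the "high bar") are assumptions, not HBase code:

```java
// Sketch of the proposed compaction-selection filter: an HFile whose
// cached-block ratio meets a high threshold (e.g. 80%) is considered hot
// and skipped. Names and the threshold value are hypothetical.
class HotFileCompactionFilter {
    static final double HOT_RATIO = 0.8; // assumed "high bar" from the description

    static boolean skipForCompaction(int cachedBlocks, int totalBlocks) {
        if (totalBlocks == 0) return false; // empty file: nothing hot to protect
        return (double) cachedBlocks / totalBlocks >= HOT_RATIO;
    }

    public static void main(String[] args) {
        System.out.println(skipForCompaction(85, 100)); // hot file: skipped
        System.out.println(skipForCompaction(10, 100)); // cold file: eligible for compaction
    }
}
```

Keeping the bar high means only genuinely hot files are exempt, so compactions still make progress on the rest.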
[jira] [Commented] (HBASE-5263) Preserving cached data on compactions through cache-on-write
[ https://issues.apache.org/jira/browse/HBASE-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205645#comment-13205645 ] Kannan Muthukkaruppan commented on HBASE-5263: -- Promising idea! In terms of the implementation details, it would be nice to avoid some pathological cases... where cold data (which was in the cache but almost on its way out of the cache) becomes hot again. I am guessing a naive approach could have this pitfall, but something that additionally takes into consideration the hotness of the keys in the block and appropriately places the data in the correct place in the block cache LRU would not. Haven't thought through much about the implementation details... but wanted to throw out the initial thoughts at least. See also related idea by Liyin here: HBASE-5263. These could be complementary approaches. Preserving cached data on compactions through cache-on-write Key: HBASE-5263 URL: https://issues.apache.org/jira/browse/HBASE-5263 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor We are tackling HBASE-3976 and HBASE-5230 to make sure we don't trash the block cache on compactions if cache-on-write is enabled. However, it would be ideal to reduce the effect compactions have on the cached data. For every block we are writing for a compacted file we can decide whether it needs to be cached based on whether the original blocks containing the same data were already in cache. More precisely, for every HFile reader in a compaction we can maintain a boolean flag saying whether the current key-value came from a disk IO or the block cache. In the HFile writer for the compaction's output we can maintain a flag that is set if any of the key-values in the block being written came from a cached block, use that flag at the end of a block to decide whether to cache-on-write the block, and reset the flag to false on a block boundary. If such an inclusive approach would still trash the cache, we could restrict the total number of blocks to be cached per output HFile, switch to "and" logic instead of "or" logic for deciding whether to cache an output file block, or only cache a certain percentage of output file blocks that contain some of the previously cached data. Thanks to Nicolas for this elegant online algorithm idea!
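The online algorithm in the description can be simulated in a few lines: OR together the "came from cache" flags of the key-values written into each output block, decide cache-on-write at the block boundary, then reset. The sketch below is a stand-alone simulation with hypothetical names, not HBase's compaction writer:

```java
// Minimal simulation of the described online algorithm: per output block,
// accumulate an "or" of whether each key-value came from the block cache;
// at a block boundary, that flag is the cache-on-write decision and is reset.
// This is a sketch, not HBase source.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CompactionCacheOnWrite {
    /** kvFromCache[i] = true if key-value i was read from the block cache.
     *  Returns one decision per output block of kvsPerBlock key-values. */
    static List<Boolean> cacheDecisions(boolean[] kvFromCache, int kvsPerBlock) {
        List<Boolean> decisions = new ArrayList<>();
        boolean anyCached = false;
        for (int i = 0; i < kvFromCache.length; i++) {
            anyCached |= kvFromCache[i];          // "or" logic from the description
            if ((i + 1) % kvsPerBlock == 0) {     // block boundary
                decisions.add(anyCached);
                anyCached = false;                // reset for the next block
            }
        }
        if (kvFromCache.length % kvsPerBlock != 0) {
            decisions.add(anyCached);             // trailing partial block
        }
        return decisions;
    }

    public static void main(String[] args) {
        boolean[] fromCache = {true, false, false, false, true, true};
        System.out.println(cacheDecisions(fromCache, 2)); // [true, false, true]
    }
}
```

Switching to the stricter "and" logic mentioned at the end would replace the `|=` accumulation with `&=` seeded to true per block.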
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205647#comment-13205647 ] Prakash Khemani commented on HBASE-5347: Another advantage of this approach is that we will be able to get rid of the low/high water marks in LRUBlockCache and make block eviction synchronous with demand. The default values of the watermarks are 75% and 85% (in 89). That means we waste somewhere around 20% of the block cache today because of asynchronous garbage collection. GC free memory management in Level-1 Block Cache Key: HBASE-5347 URL: https://issues.apache.org/jira/browse/HBASE-5347 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani On eviction of a block from the block cache, instead of waiting for the garbage collector to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block cache. This will help us with * reducing GC pressure, especially in the old generation * making it possible to have non-java-heap memory backing the HFile blocks
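The reference-counting idea can be sketched as a block whose buffer returns to a slab free list the moment its count drops to zero, instead of waiting for garbage collection. This is a hypothetical minimal structure, not the HBASE-5347 implementation:

```java
// Sketch of GC-free block reuse via reference counting: the last release()
// pushes the buffer onto a free list for immediate reuse. Names and the
// single-creator-reference convention are illustrative assumptions.
import java.util.ArrayDeque;
import java.util.Deque;

class RefCountedBlock {
    private final byte[] buffer;
    private final Deque<byte[]> freeList; // slab of reusable buffers
    private int refCount = 1;             // the creator holds one reference

    RefCountedBlock(byte[] buffer, Deque<byte[]> freeList) {
        this.buffer = buffer;
        this.freeList = freeList;
    }

    synchronized void retain() { refCount++; }

    /** Returns true when this release recycled the buffer onto the free list. */
    synchronized boolean release() {
        if (--refCount == 0) {
            freeList.addLast(buffer); // reuse right away; no GC involved
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Deque<byte[]> freeList = new ArrayDeque<>();
        RefCountedBlock block = new RefCountedBlock(new byte[64 * 1024], freeList);
        block.retain();                      // a second reader pins the block
        System.out.println(block.release()); // false: still referenced
        System.out.println(block.release()); // true: buffer recycled
        System.out.println(freeList.size()); // 1
    }
}
```

With eviction synchronous like this, there is no need for low/high watermarks to leave headroom for an asynchronous collector.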
[jira] [Updated] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5382: -- Status: Patch Available (was: Open) Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Updated] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5382: -- Assignee: Mikhail Bautin Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Updated] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5382: -- Attachment: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Commented] (HBASE-5263) Preserving cached data on compactions through cache-on-write
[ https://issues.apache.org/jira/browse/HBASE-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205648#comment-13205648 ] Zhihong Yu commented on HBASE-5263: --- @Kannan: I think you were referring to HBASE-5369. Preserving cached data on compactions through cache-on-write Key: HBASE-5263 URL: https://issues.apache.org/jira/browse/HBASE-5263 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor We are tackling HBASE-3976 and HBASE-5230 to make sure we don't trash the block cache on compactions if cache-on-write is enabled. However, it would be ideal to reduce the effect compactions have on the cached data. For every block we are writing for a compacted file we can decide whether it needs to be cached based on whether the original blocks containing the same data were already in cache. More precisely, for every HFile reader in a compaction we can maintain a boolean flag saying whether the current key-value came from a disk IO or the block cache. In the HFile writer for the compaction's output we can maintain a flag that is set if any of the key-values in the block being written came from a cached block, use that flag at the end of a block to decide whether to cache-on-write the block, and reset the flag to false on a block boundary. If such an inclusive approach would still trash the cache, we could restrict the total number of blocks to be cached per output HFile, switch to "and" logic instead of "or" logic for deciding whether to cache an output file block, or only cache a certain percentage of output file blocks that contain some of the previously cached data. Thanks to Nicolas for this elegant online algorithm idea!
[jira] [Updated] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5382: -- Description: This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695. (was: This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.) Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205651#comment-13205651 ] Zhihong Yu commented on HBASE-5347: --- Agreed. This initiative is on the right track. GC free memory management in Level-1 Block Cache Key: HBASE-5347 URL: https://issues.apache.org/jira/browse/HBASE-5347 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani On eviction of a block from the block-cache, instead of waiting for the garbage collector to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block-cache. This will help us with * reducing gc pressure, especially in the old generation * making it possible to have non-java-heap memory backing the HFile blocks
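The eviction-plus-reference-count scheme described in HBASE-5347 can be sketched in a few lines of plain Java. This is an illustrative toy, not HBase's actual block cache: the class and method names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Toy sketch of the idea in HBASE-5347: each cached block carries a
// reference count; eviction drops the cache's reference, and the block's
// buffer is recycled to a free list as soon as the last reader releases
// it, instead of waiting for the garbage collector.
public class RefCountedBlockSketch {
    static class Block {
        final byte[] buf;
        final AtomicInteger refCount = new AtomicInteger(1); // the cache's own reference
        Block(byte[] buf) { this.buf = buf; }
    }

    private final Deque<byte[]> freeBuffers = new ArrayDeque<>();

    // A reader pins the block before using it.
    void retain(Block b) { b.refCount.incrementAndGet(); }

    // When the count reaches zero the buffer is reused right away.
    void release(Block b) {
        if (b.refCount.decrementAndGet() == 0) {
            freeBuffers.push(b.buf);
        }
    }

    // Eviction is just the cache dropping its reference.
    void evict(Block b) { release(b); }

    int freeBufferCount() { return freeBuffers.size(); }

    public static void main(String[] args) {
        RefCountedBlockSketch cache = new RefCountedBlockSketch();
        Block b = new Block(new byte[64 * 1024]);
        cache.retain(b);  // a scanner is still reading the block
        cache.evict(b);   // evicted, but the buffer is not reusable yet
        System.out.println(cache.freeBufferCount()); // 0
        cache.release(b); // last reader done: buffer goes to the free list
        System.out.println(cache.freeBufferCount()); // 1
    }
}
```

The free list is what would back the "blocks-out-of-slab" allocation the issue mentions: new blocks draw buffers from it instead of allocating on the Java heap.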
[jira] [Commented] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205652#comment-13205652 ] Jean-Daniel Cryans commented on HBASE-5382: --- +1 Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Issue Comment Edited] (HBASE-5263) Preserving cached data on compactions through cache-on-write
[ https://issues.apache.org/jira/browse/HBASE-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205645#comment-13205645 ] Kannan Muthukkaruppan edited comment on HBASE-5263 at 2/10/12 7:23 PM: --- Promising idea! In terms of the implementation details, it would be nice to avoid some pathological cases... where cold data (which was in the cache but almost on its way out of the cache) becomes hot again. I am guessing a naive approach could have this pitfall, but something that additionally takes into consideration the hotness of the keys in the block and appropriately places the data in the correct place in the blockcache LRU would not. Haven't thought through much about the implementation details... but wanted to throw out the initial thoughts at least. See also related idea by Liyin here: HBASE-5639. These could be complementary approaches. was (Author: kannanm): Promising idea! In terms of the implementation details, it would be nice to avoid some pathological cases... where cold data (which was in the cache but almost on its way out of the cache) becomes hot again. I am guessing a naive approach could have this pitfall, but something that additionally takes into consideration the hotness of the keys in the block and appropriately places the data in the correct place in the blockcache LRU would not. Haven't thought through much about the implementation details... but wanted to throw out the initial thoughts at least. See also related idea by Liyin here: HBASE-5263. These could be complementary approaches. Preserving cached data on compactions through cache-on-write Key: HBASE-5263 URL: https://issues.apache.org/jira/browse/HBASE-5263 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor We are tackling HBASE-3976 and HBASE-5230 to make sure we don't trash the block cache on compactions if cache-on-write is enabled.
However, it would be ideal to reduce the effect compactions have on the cached data. For every block we are writing for a compacted file we can decide whether it needs to be cached based on whether the original blocks containing the same data were already in cache. More precisely, for every HFile reader in a compaction we can maintain a boolean flag saying whether the current key-value came from a disk IO or the block cache. In the HFile writer for the compaction's output we can maintain a flag that is set if any of the key-values in the block being written came from a cached block, use that flag at the end of a block to decide whether to cache-on-write the block, and reset the flag to false on a block boundary. If such an inclusive approach would still trash the cache, we could restrict the total number of blocks to be cached per output HFile, switch to AND logic instead of OR logic for deciding whether to cache an output file block, or only cache a certain percentage of output file blocks that contain some of the previously cached data. Thanks to Nicolas for this elegant online algorithm idea!
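The online algorithm described above — OR together the "came from cache" bits of the key-values in the current output block, cache-on-write the block if the flag is set, and reset at the block boundary — can be sketched as follows. Class and method names are illustrative, not the actual HFile writer API.

```java
// Sketch of the per-block cache-on-write decision from HBASE-5263
// (illustrative, not HBase's HFile writer). The writer ORs together the
// "came from the block cache" flag of every appended key-value and uses
// the result at the block boundary to decide whether to cache the block.
public class CacheOnWriteFlagSketch {
    private boolean currentBlockHasCachedData = false;

    // Called for each key-value appended to the compaction output.
    void append(boolean kvCameFromBlockCache) {
        currentBlockHasCachedData |= kvCameFromBlockCache;
    }

    // Called when the output block is closed; returns the caching decision
    // and resets the flag for the next block.
    boolean finishBlockAndDecide() {
        boolean cacheThisBlock = currentBlockHasCachedData;
        currentBlockHasCachedData = false;
        return cacheThisBlock;
    }
}
```

Switching to the stricter AND logic mentioned in the issue would just replace the `|=` accumulation with `&=`, seeded to true at each block boundary.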
[jira] [Commented] (HBASE-5364) Fix source files missing licenses in 0.92 and trunk
[ https://issues.apache.org/jira/browse/HBASE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205685#comment-13205685 ] stack commented on HBASE-5364: -- You fellas going to commit? Fix source files missing licenses in 0.92 and trunk --- Key: HBASE-5364 URL: https://issues.apache.org/jira/browse/HBASE-5364 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.0 Reporter: Jonathan Hsieh Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-5364-1.patch, hbase-5364-0.92.patch running 'mvn rat:check' shows that a few files have snuck in that do not have proper apache licenses. Ideally we should fix these before we cut another release/release candidate. This is a blocker for 0.94, and probably should be for the other branches as well.
[jira] [Commented] (HBASE-5368) Move PrefixSplitKeyPolicy out of the src/test into src, so it is accessible in HBase installs
[ https://issues.apache.org/jira/browse/HBASE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205686#comment-13205686 ] stack commented on HBASE-5368: -- +1 Move PrefixSplitKeyPolicy out of the src/test into src, so it is accessible in HBase installs - Key: HBASE-5368 URL: https://issues.apache.org/jira/browse/HBASE-5368 Project: HBase Issue Type: Sub-task Components: regionserver Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5368.txt Very simple change to make PrefixSplitKeyPolicy accessible in HBase installs (user still needs to setup the table(s) accordingly). Right now it is in src/test/org.apache.hadoop.hbase.regionserver, I propose moving it to src/org.apache.hadoop.hbase.regionserver (alongside ConstantSizeRegionSplitPolicy), and maybe renaming it too.
[jira] [Commented] (HBASE-5381) Make memstore.flush.size as a table level configuration
[ https://issues.apache.org/jira/browse/HBASE-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205689#comment-13205689 ] Liyin Tang commented on HBASE-5381: --- Thanks Jean and Ted. I missed something before. Please close this jira for me. Thanks a lot Make memstore.flush.size as a table level configuration --- Key: HBASE-5381 URL: https://issues.apache.org/jira/browse/HBASE-5381 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Currently the region server will flush the mem store of a region based on the limitation of the global mem store flush size and global low water mark. However, it will cause hot tables, which serve more write traffic, to flush too frequently even though the overall mem store heap usage is quite low. Flushing too frequently would also contribute to too many minor compactions. So if we make memstore.flush.size a table-level configuration, it would be more flexible to configure different tables with different desired mem store flush sizes based on compaction ratio, recovery time and put ops.
[jira] [Commented] (HBASE-5364) Fix source files missing licenses in 0.92 and trunk
[ https://issues.apache.org/jira/browse/HBASE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205692#comment-13205692 ] Jonathan Hsieh commented on HBASE-5364: --- @Stack I'll commit it. I need to do a tweak on the maven stuff in HBASE-5363, and could you take a look at HBASE-5377? (I checked the web pages and info port and they looked good; not completely sure about other things I might have to worry about.) Fix source files missing licenses in 0.92 and trunk --- Key: HBASE-5364 URL: https://issues.apache.org/jira/browse/HBASE-5364 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.0 Reporter: Jonathan Hsieh Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-5364-1.patch, hbase-5364-0.92.patch running 'mvn rat:check' shows that a few files have snuck in that do not have proper apache licenses. Ideally we should fix these before we cut another release/release candidate. This is a blocker for 0.94, and probably should be for the other branches as well.
[jira] [Resolved] (HBASE-5381) Make memstore.flush.size as a table level configuration
[ https://issues.apache.org/jira/browse/HBASE-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu resolved HBASE-5381. --- Resolution: Not A Problem Already implemented. Make memstore.flush.size as a table level configuration --- Key: HBASE-5381 URL: https://issues.apache.org/jira/browse/HBASE-5381 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Currently the region server will flush the mem store of a region based on the limitation of the global mem store flush size and global low water mark. However, it will cause hot tables, which serve more write traffic, to flush too frequently even though the overall mem store heap usage is quite low. Flushing too frequently would also contribute to too many minor compactions. So if we make memstore.flush.size a table-level configuration, it would be more flexible to configure different tables with different desired mem store flush sizes based on compaction ratio, recovery time and put ops.
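The fallback such a table-level setting implies — use the table's flush size if one is configured, otherwise the global memstore.flush.size — can be sketched like this. This is toy code with invented names, not the region server's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a per-table memstore flush size with a global fallback,
// as discussed in HBASE-5381 (toy code, not HBase's implementation).
public class PerTableFlushSizeSketch {
    static final long GLOBAL_FLUSH_SIZE = 128L * 1024 * 1024; // global default

    private final Map<String, Long> perTableFlushSize = new HashMap<>();

    void setTableFlushSize(String table, long bytes) {
        perTableFlushSize.put(table, bytes);
    }

    // The table-level value wins; otherwise fall back to the global setting.
    long flushSizeFor(String table) {
        return perTableFlushSize.getOrDefault(table, GLOBAL_FLUSH_SIZE);
    }

    // A hot table configured with a larger threshold flushes less often.
    boolean shouldFlush(String table, long memstoreBytes) {
        return memstoreBytes >= flushSizeFor(table);
    }
}
```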
[jira] [Commented] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205698#comment-13205698 ] Hadoop QA commented on HBASE-5382: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514142/TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 156 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/941//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/941//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/941//console This message is automatically generated. Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. 
Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205707#comment-13205707 ] Zhihong Yu commented on HBASE-3134: --- For #3 above, in changePeerState() etc, we should distinguish KeeperException.NoNodeException from other KeeperException's: {code} + private void changePeerState(String id, PeerState state) throws IOException { ... +} catch (KeeperException e) { + throw new IOException("Unable to change state of the peer " + id, e); +} {code} Basically when a peer exists but the peer state znode doesn't exist, we should add the peer state znode. [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and we want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion.
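The handling the review asks for — treat a missing peer-state znode as "create it" rather than as a failure — can be sketched with a map standing in for ZooKeeper. This is toy code; the real patch would catch KeeperException.NoNodeException around actual ZooKeeper calls.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of the review comment on HBASE-3134: when a peer exists but
// its state znode does not, create the state znode instead of failing.
// A HashMap stands in for the ZooKeeper tree here.
public class PeerStateSketch {
    enum PeerState { ENABLED, DISABLED }

    private final Map<String, PeerState> znodes = new HashMap<>();

    // Returns true if the state znode had to be created (the
    // KeeperException.NoNodeException case in the real code).
    boolean changePeerState(String id, PeerState state) {
        String stateZnode = "/hbase/replication/peers/" + id + "/state";
        boolean created = !znodes.containsKey(stateZnode);
        znodes.put(stateZnode, state); // create-or-update; never fail on "no node"
        return created;
    }

    PeerState stateOf(String id) {
        return znodes.get("/hbase/replication/peers/" + id + "/state");
    }
}
```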
[jira] [Updated] (HBASE-5364) Fix source files missing licenses in 0.92 and trunk
[ https://issues.apache.org/jira/browse/HBASE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-5364: -- Attachment: hbase-5364-v2.patch Includes updated bin/hbase-jruby Fix source files missing licenses in 0.92 and trunk --- Key: HBASE-5364 URL: https://issues.apache.org/jira/browse/HBASE-5364 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.0 Reporter: Jonathan Hsieh Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-5364-1.patch, hbase-5364-0.92.patch, hbase-5364-v2.patch running 'mvn rat:check' shows that a few files have snuck in that do not have proper apache licenses. Ideally we should fix these before we cut another release/release candidate. This is a blocker for 0.94, and probably should be for the other branches as well.
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205710#comment-13205710 ] Zhihong Yu commented on HBASE-3134: --- For #1 above, is it about peerStateTrackers ? {code} this.peerClusters = new HashMap<String, ReplicationPeer>(); +this.peerStateTrackers = new HashMap<String, PeerStateTracker>(); {code} Its usage is in line with that of peerClusters. Since the key is the peer id and the peer should be registered first, I don't see a problem here. [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and we want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion.
[jira] [Commented] (HBASE-5364) Fix source files missing licenses in 0.92 and trunk
[ https://issues.apache.org/jira/browse/HBASE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205709#comment-13205709 ] Jonathan Hsieh commented on HBASE-5364: --- Committed. Thanks for the patch Elliott! Fix source files missing licenses in 0.92 and trunk --- Key: HBASE-5364 URL: https://issues.apache.org/jira/browse/HBASE-5364 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.92.0 Reporter: Jonathan Hsieh Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-5364-1.patch, hbase-5364-0.92.patch, hbase-5364-v2.patch running 'mvn rat:check' shows that a few files have snuck in that do not have proper apache licenses. Ideally we should fix these before we cut another release/release candidate. This is a blocker for 0.94, and probably should be for the other branches as well.
[jira] [Created] (HBASE-5383) Prevent the compaction read requests from changing the hotness of block cache
Prevent the compaction read requests from changing the hotness of block cache - Key: HBASE-5383 URL: https://issues.apache.org/jira/browse/HBASE-5383 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Block cache is organized in a sorted way based on LRU or some other algorithm, and it will age out blocks when the algorithm believes they are no longer hot. The motivation here is to prevent compaction read requests from changing the hotness of the block cache.
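One way to realize this is to let compaction reads look a block up without "touching" it, so the block's LRU position is unchanged. Below is a toy sketch (not HBase's actual cache) using an access-ordered LinkedHashMap to play the role of the block cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy sketch for HBASE-5383: reads issued on behalf of compactions peek
// at a block without updating its access order, so they cannot make cold
// blocks look hot. An access-ordered LinkedHashMap plays the block cache.
public class CompactionAwareLruSketch<K, V> {
    private final LinkedHashMap<K, V> lru = new LinkedHashMap<>(16, 0.75f, true);

    void put(K key, V value) { lru.put(key, value); }

    V get(K key, boolean isCompactionRead) {
        if (isCompactionRead) {
            // Iteration does not count as access: hotness is unchanged.
            for (Map.Entry<K, V> e : lru.entrySet()) {
                if (e.getKey().equals(key)) { return e.getValue(); }
            }
            return null;
        }
        return lru.get(key); // a normal read promotes the block
    }

    // The least recently accessed key, i.e. the next eviction candidate.
    K coldest() { return lru.keySet().iterator().next(); }
}
```

A compaction scan over the whole store then leaves the eviction order exactly as client traffic established it.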
[jira] [Commented] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205713#comment-13205713 ] Zhihong Yu commented on HBASE-5382: --- I think the test failure reported here: https://builds.apache.org/job/PreCommit-HBASE-Build/941//testReport/org.apache.hadoop.hbase.io.hfile/TestForceCacheImportantBlocks/testCacheBlocks_2_/ is in line with the failed TestHFileBlock tests we have been seeing on Apache Jenkins for the past two weeks. Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Commented] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205720#comment-13205720 ] Zhihong Yu commented on HBASE-5382: --- Looks like the patch has been checked in: {code} 1 out of 1 hunk ignored -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java.rej The next patch would create the file src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java, which already exists! Assume -R? [n] Apply anyway? [n] Skipping patch. 1 out of 1 hunk ignored -- saving rejects to file src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java.rej {code} I suggest reverting the patch until Hadoop QA can reliably show that TestForceCacheImportantBlocks passes. We already have two consistently failing tests. We don't want to make them three. Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Commented] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205727#comment-13205727 ] Mikhail Bautin commented on HBASE-5382: --- @Ted: I ran unit tests and the patch passed all of them (not just small and medium that Hadoop QA runs). I got a +1 on this from JD, and this code has been previously reviewed and approved as part of HBASE-4683. Sorry if this is a misunderstanding, but I thought we had plans of increasing the memory limit of HBase QA? Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Commented] (HBASE-4683) Always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205728#comment-13205728 ] Mikhail Bautin commented on HBASE-4683: --- The new JIRA with the unit test is HBASE-5382. Always cache index and bloom blocks --- Key: HBASE-4683 URL: https://issues.apache.org/jira/browse/HBASE-4683 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Mikhail Bautin Priority: Minor Fix For: 0.94.0, 0.92.0 Attachments: 0001-Cache-important-block-types.patch, 4683-v2.txt, 4683.txt, D1695.1.patch, D807.1.patch, D807.2.patch, D807.3.patch, HBASE-4683-0.92-v2.patch, HBASE-4683-v3.patch This would add a new boolean config option: hfile.block.cache.datablocks Default would be true. Setting this to false allows HBase in a mode where only index blocks are cached, which is useful for analytical scenarios where a useful working set of the data cannot be expected to fit into the (aggregate) cache. This is the equivalent of setting cacheBlocks to false on all scans (including scans on behalf of gets). I would like to get a general feeling about what folks think about this. The change itself would be simple. Update (Mikhail): we probably don't need a new conf option. Instead, we will make index blocks cached by default.
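The behavior the update describes — index and bloom blocks always cached, data blocks following the column family's block-cache setting — boils down to a decision like the following. The enum and method names are illustrative, not HBase's actual API.

```java
// Sketch of the caching rule from HBASE-4683: index and bloom blocks are
// cached unconditionally; only data blocks obey the column family's
// block-cache flag. Names are illustrative, not HBase's actual code.
public class CacheImportantBlocksSketch {
    enum BlockCategory { DATA, INDEX, BLOOM }

    static boolean shouldCacheOnRead(BlockCategory category, boolean cfBlockCacheEnabled) {
        if (category == BlockCategory.INDEX || category == BlockCategory.BLOOM) {
            return true; // always cache the "important" blocks
        }
        return cfBlockCacheEnabled; // data blocks follow the CF setting
    }
}
```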
[jira] [Commented] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205735#comment-13205735 ] Zhihong Yu commented on HBASE-5382: --- J-D's +1 came 44 minutes before the Hadoop QA report. I assume every +1 is contingent upon Hadoop QA's nod. I think after reverting the patch, you can submit a new patch to Hadoop QA with increased heap. Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695.
[jira] [Commented] (HBASE-5363) Automatically run rat check on mvn release builds
[ https://issues.apache.org/jira/browse/HBASE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205744#comment-13205744 ] Jonathan Hsieh commented on HBASE-5363: --- Figured it out. Kind of evil. 'mvn rat:check' is wrong; you are supposed to use 'mvn apache-rat:check'. I tried to update this wiki but don't have perms: http://wiki.apache.org/hadoop/Hbase/HowToRelease Automatically run rat check on mvn release builds - Key: HBASE-5363 URL: https://issues.apache.org/jira/browse/HBASE-5363 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.90.5, 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-5363-0.90.patch, hbase-5363.2.patch, hbase-5363.patch Some of the recent hbase releases failed rat checks (mvn rat:check). We should add checks, likely in the mvn package phase, so that this becomes a non-issue in the future. Here's an example from Whirr: https://github.com/apache/whirr/blob/trunk/pom.xml, line 388.
[jira] [Updated] (HBASE-3852) ThriftServer leaks scanners
[ https://issues.apache.org/jira/browse/HBASE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bob Copeland updated HBASE-3852: Attachment: thrift2-scanner.patch We hit this running thrift2 in production. With a lot of scanners buffering a lot (100) of rows per call, thrift servers would OOME in no time. This patch just periodically expires old ones. I can respin/rework the patch if the approach is sound. ThriftServer leaks scanners --- Key: HBASE-3852 URL: https://issues.apache.org/jira/browse/HBASE-3852 Project: HBase Issue Type: Bug Affects Versions: 0.90.2 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.94.0 Attachments: 3852.txt, thrift2-scanner.patch The scannerMap in ThriftServer relies on the user to clean it by closing the scanner. If that doesn't happen, the ResultScanner will stay in the thrift server's memory and if any pre-fetching was done, it will also start accumulating Results (with all their data). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
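[Editor's note] The expiry approach in thrift2-scanner.patch can be sketched with plain Java. This is a toy model with hypothetical names (ScannerMapSketch, MAX_IDLE_MS); the real patch works on the thrift2 server's ResultScanner map, and the injected `now` parameter mirrors the testable-clock idea raised later in this thread.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of a scanner map with periodic expiry of idle entries.
public class ScannerMapSketch {
    static final long MAX_IDLE_MS = 60_000;

    static final class Entry {
        final String scanner;       // stand-in for a ResultScanner
        volatile long lastAccessed; // updated on each scanner call
        Entry(String scanner, long now) { this.scanner = scanner; this.lastAccessed = now; }
    }

    final Map<Integer, Entry> scanners = new ConcurrentHashMap<>();

    // Drop entries idle longer than MAX_IDLE_MS. 'now' is injected so tests
    // can control the clock instead of calling System.currentTimeMillis().
    int expireOld(long now) {
        int removed = 0;
        for (Iterator<Map.Entry<Integer, Entry>> it = scanners.entrySet().iterator(); it.hasNext(); ) {
            if (now - it.next().getValue().lastAccessed > MAX_IDLE_MS) {
                it.remove();  // a real server would also close the scanner here
                removed++;
            }
        }
        return removed;
    }
}
```

A cleanup thread (or a check piggybacked on each request) would call expireOld periodically, bounding memory even when clients never close their scanners.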
[jira] [Commented] (HBASE-5380) [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks
[ https://issues.apache.org/jira/browse/HBASE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205767#comment-13205767 ] Hudson commented on HBASE-5380: --- Integrated in HBase-TRUNK #2658 (See [https://builds.apache.org/job/HBase-TRUNK/2658/]) hbase-5380. book.xml, comment about KeyValue instances not being split across blocks [book] book.xml - KeyValue, adding comment about KeyValue's not being split across blocks - Key: HBASE-5380 URL: https://issues.apache.org/jira/browse/HBASE-5380 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_hbase_5380.xml.patch book.xml * Adding comment in KeyValue section about KV's not being split across blocks. This was a recent question on the dist-list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205766#comment-13205766 ] Hudson commented on HBASE-5382: --- Integrated in HBase-TRUNK #2658 (See [https://builds.apache.org/job/HBase-TRUNK/2658/]) [jira] [HBASE-5382] Test that we always cache index and bloom blocks Summary: This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. Test Plan: Run unit tests Reviewers: JIRA, jdcryans, lhofhansl, Liyin Reviewed By: jdcryans CC: jdcryans Differential Revision: https://reviews.facebook.net/D1695 mbautin : Files : * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4683) Always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205765#comment-13205765 ] Hudson commented on HBASE-4683: --- Integrated in HBase-TRUNK #2658 (See [https://builds.apache.org/job/HBase-TRUNK/2658/]) [jira] [HBASE-5382] Test that we always cache index and bloom blocks Summary: This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. Test Plan: Run unit tests Reviewers: JIRA, jdcryans, lhofhansl, Liyin Reviewed By: jdcryans CC: jdcryans Differential Revision: https://reviews.facebook.net/D1695 mbautin : Files : * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java Always cache index and bloom blocks --- Key: HBASE-4683 URL: https://issues.apache.org/jira/browse/HBASE-4683 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Mikhail Bautin Priority: Minor Fix For: 0.94.0, 0.92.0 Attachments: 0001-Cache-important-block-types.patch, 4683-v2.txt, 4683.txt, D1695.1.patch, D807.1.patch, D807.2.patch, D807.3.patch, HBASE-4683-0.92-v2.patch, HBASE-4683-v3.patch This would add a new boolean config option: hfile.block.cache.datablocks Default would be true. Setting this to false allows HBase in a mode where only index blocks are cached, which is useful for analytical scenarios where a useful working set of the data cannot be expected to fit into the (aggregate) cache. This is the equivalent of setting cacheBlocks to false on all scans (including scans on behalf of gets). I would like to get a general feeling about what folks think about this. The change itself would be simple. 
Update (Mikhail): we probably don't need a new conf option. Instead, we will make index blocks cached by default. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5378) [book] book.xml - added link to coprocessor blog entry
[ https://issues.apache.org/jira/browse/HBASE-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205768#comment-13205768 ] Hudson commented on HBASE-5378: --- Integrated in HBase-TRUNK #2658 (See [https://builds.apache.org/job/HBase-TRUNK/2658/]) hbase-5378 book.xml - adding new section for coprocessors in Arch/RegionServer [book] book.xml - added link to coprocessor blog entry --- Key: HBASE-5378 URL: https://issues.apache.org/jira/browse/HBASE-5378 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5378.xml.patch book.xml * added section under Arch/RegionServer for Coprocessors, and a link to the blog entry on this subject. * updated the schema design chapter that mentioned coprocessors link to this new section. * minor update to compaction explanation in the 3rd example. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205764#comment-13205764 ] Jean-Daniel Cryans commented on HBASE-3134: --- bq. Its usage is in line with that of peerClusters. Since the key is peer id and peer should be registered first, I don't see a problem here. It should be changed too; originally it wasn't used by multiple threads because there was a maximum of one peer. Just to be safe. [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and we want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5327) Print a message when an invalid hbase.rootdir is passed
[ https://issues.apache.org/jira/browse/HBASE-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205778#comment-13205778 ] Jimmy Xiang commented on HBASE-5327: I looked into it. For new Path(path), the path doesn't have to be a complete and valid path. It could be a relative path, so it can't be validated. new Path(parent, child) takes two paths to form a new one (a String is converted to a Path implicitly). If parent = hdfs://localhost:999 and child = /test, the new path is hdfs://localhost:999/test, which is valid, and all are happy. However, if child = test, combining the two into a URI yields hdfs://localhost:999test, which is invalid. That's the reason for the URISyntaxException. The v2 patch doesn't look good, but I am ok with it. Print a message when an invalid hbase.rootdir is passed --- Key: HBASE-5327 URL: https://issues.apache.org/jira/browse/HBASE-5327 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Jean-Daniel Cryans Assignee: Jimmy Xiang Fix For: 0.94.0, 0.90.7, 0.92.1 Attachments: hbase-5327.txt, hbase-5327_v2.txt As seen on the mailing list: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/24124 If hbase.rootdir doesn't specify a folder on hdfs we crash while opening a path to .oldlogs: {noformat} 2012-02-02 23:07:26,292 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hdfs://sv4r11s38:9100.oldlogs at org.apache.hadoop.fs.Path.initialize(Path.java:148) at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:71) at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:50) at org.apache.hadoop.hbase.master.MasterFileSystem.&lt;init&gt;(MasterFileSystem.java:112) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:448) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326) at java.lang.Thread.run(Thread.java:662) Caused by: java.net.URISyntaxException: Relative path in absolute URI: hdfs://sv4r11s38:9100.oldlogs at java.net.URI.checkPath(URI.java:1787) at java.net.URI.&lt;init&gt;(URI.java:735) at org.apache.hadoop.fs.Path.initialize(Path.java:145) ... 6 more {noformat} It could also crash anywhere else, this just happens to be the first place we use hbase.rootdir. We need to verify that it's an actual folder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
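[Editor's note] The failure above can be reproduced with plain java.net.URI, which is what Hadoop's Path delegates to: a URI built from (scheme, authority, path) rejects any path that does not start with '/'. A minimal sketch (class name hypothetical):

```java
import java.net.URI;
import java.net.URISyntaxException;

// Demonstrates why "hdfs://sv4r11s38:9100" + ".oldlogs" blows up: Hadoop's
// Path builds a URI from (scheme, authority, path), and java.net.URI throws
// "Relative path in absolute URI" when the path lacks a leading '/'.
public class RootdirUriSketch {
    public static boolean isRejected(String child) {
        try {
            new URI("hdfs", "sv4r11s38:9100", child, null, null);
            return false; // well-formed absolute URI
        } catch (URISyntaxException e) {
            return true;  // relative path in absolute URI
        }
    }
}
```

So a rootdir of hdfs://sv4r11s38:9100 (no trailing folder) produces the child path ".oldlogs" instead of "/.oldlogs", hence the IllegalArgumentException at master startup.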
[jira] [Commented] (HBASE-5377) Fix licenses on the 0.90 branch.
[ https://issues.apache.org/jira/browse/HBASE-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205784#comment-13205784 ] stack commented on HBASE-5377: -- Why this: {code}
 </plugin>
+<plugin>
+  <artifactId>maven-surefire-report-plugin</artifactId>
+  <version>2.9</version>
+</plugin>
+<plugin>
+  <groupId>org.apache.avro</groupId>
+  <artifactId>avro-maven-plugin</artifactId>
+  <version>${avro.version}</version>
+</plugin>
+<plugin>
+  <groupId>org.codehaus.mojo</groupId>
+  <artifactId>build-helper-maven-plugin</artifactId>
+  <version>1.5</version>
+</plugin>
{code} Else patch looks good to me. If you can build site and the webapps work, commit I'd say. Fix licenses on the 0.90 branch. Key: HBASE-5377 URL: https://issues.apache.org/jira/browse/HBASE-5377 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-5377.patch There are a handful of empty files and several files missing apache licenses on the 0.90 branch. This patch fixes all of them and in conjunction with HBASE-5363 will allow it to pass RAT tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3852) ThriftServer leaks scanners
[ https://issues.apache.org/jira/browse/HBASE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205805#comment-13205805 ] Zhihong Yu commented on HBASE-3852: --- {code} + timestamp = System.currentTimeMillis(); {code} We have EnvironmentEdgeManager which provides the above service. {code} + for (Integer key: toremove) +removeScanner(key); {code} Please use curly braces around the removeScanner() call. ThriftServer leaks scanners --- Key: HBASE-3852 URL: https://issues.apache.org/jira/browse/HBASE-3852 Project: HBase Issue Type: Bug Affects Versions: 0.90.2 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.94.0 Attachments: 3852.txt, thrift2-scanner.patch The scannerMap in ThriftServer relies on the user to clean it by closing the scanner. If that doesn't happen, the ResultScanner will stay in the thrift server's memory and if any pre-fetching was done, it will also start accumulating Results (with all their data). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5384) Up heap used by hadoopqa
[ https://issues.apache.org/jira/browse/HBASE-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5384: - Attachment: hadoopqa_mavenopts.txt How is this? It seems to work locally. Up heap used by hadoopqa Key: HBASE-5384 URL: https://issues.apache.org/jira/browse/HBASE-5384 Project: HBase Issue Type: Bug Reporter: stack Attachments: hadoopqa_mavenopts.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5384) Up heap used by hadoopqa
[ https://issues.apache.org/jira/browse/HBASE-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5384: -- Status: Patch Available (was: Open) Patch looks good. Up heap used by hadoopqa Key: HBASE-5384 URL: https://issues.apache.org/jira/browse/HBASE-5384 Project: HBase Issue Type: Bug Reporter: stack Attachments: hadoopqa_mavenopts.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5363) Automatically run rat check on mvn release builds
[ https://issues.apache.org/jira/browse/HBASE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205887#comment-13205887 ] stack commented on HBASE-5363: -- Whats your hadoop wikiid Jon? Automatically run rat check on mvn release builds - Key: HBASE-5363 URL: https://issues.apache.org/jira/browse/HBASE-5363 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.90.5, 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-5363-0.90.patch, hbase-5363.2.patch, hbase-5363.patch Some of the recent hbase release failed rat checks (mvn rat:check). We should add checks likely in the mvn package phase so that this becomes a non-issue in the future. Here's an example from Whirr: https://github.com/apache/whirr/blob/trunk/pom.xml line 388 for an example. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5384) Up heap used by hadoopqa
[ https://issues.apache.org/jira/browse/HBASE-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5384: - Resolution: Fixed Fix Version/s: 0.94.0 Assignee: stack Release Note: Make hadoopqa mvn heap same as it is for other hbase builds (trunk, 0.92, etc.) Status: Resolved (was: Patch Available) Committed to TRUNK. Thanks for review Ted. Lets keep an eye on it and see if it helps w/ the OOMEs we've been seeing. Up heap used by hadoopqa Key: HBASE-5384 URL: https://issues.apache.org/jira/browse/HBASE-5384 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.94.0 Attachments: hadoopqa_mavenopts.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5313) Restructure hfiles layout for better compression
[ https://issues.apache.org/jira/browse/HBASE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205892#comment-13205892 ] He Yongqiang commented on HBASE-5313: - @Todd, with such a small block size and data also already sorted, I was also thinking it would be very hard to optimize the space. So we did some experiments by modifying today's HFileWriter. It turns out it can still save a lot if we play more tricks. Here are test results (block size is 16KB): *42MB HFile, with Delta compression and with LZO compression* (with default setting on Apache trunk) *30MB HFile, with Columnar, with Delta compression, and with LZO compression.* Inside one block, first put all row keys inside that block, and do delta compression, and then LZO compression. After row key, put all column family data in that block, and do Delta+LZO for it. And then similarly put column_qualifier, etc. *24MB HFile, with Columnar, Sort value column, Sort column_qualifier column, and with LZO compression.* Inside one block, first put all row keys inside that block, and do delta compression, and then LZO compression. After row key, put all column family data in that block, and do Delta+LZO for it. And then put column_qualifier, sort it, and then do Delta+LZO. TS column and Code column are processed the same as column family. The value column is processed the same as column_qualifier. So it is the same as the disk format for the 30MB HFile, except all data for 'column_qualifier' and 'value' are sorted separately. Out of the 24MB file, 6MB is used to store row keys, 7MB is used to store column_qualifier, and 6MB is used to store values. More ideas are welcome! Restructure hfiles layout for better compression Key: HBASE-5313 URL: https://issues.apache.org/jira/browse/HBASE-5313 Project: HBase Issue Type: Improvement Components: io Reporter: dhruba borthakur Assignee: dhruba borthakur An HFile block contains a stream of key-values.
Can we organize these kvs on the disk in a better way so that we get much greater compression ratios? One option (thanks Prakash) is to store all the keys in the beginning of the block (let's call this the key-section) and then store all their corresponding values towards the end of the block. This will allow us to not even decompress the values when we are scanning and skipping over rows in the block. Any other ideas? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
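[Editor's note] The two ideas in this issue — a key-section followed by a value-section inside each block, and sorted keys delta-encoded against their predecessor — can be sketched in plain Java. This is a toy model with hypothetical names, not the actual HFile format:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the proposed block layout: keys grouped at the front of the
// block, values at the back, with delta encoding of the sorted keys.
public class ColumnarBlockSketch {
    // Reorder interleaved (key, value) pairs into two contiguous sections,
    // so a scan can skip rows without touching the value section.
    public static List<String> toSections(List<String[]> kvs) {
        List<String> keys = new ArrayList<>(), values = new ArrayList<>();
        for (String[] kv : kvs) { keys.add(kv[0]); values.add(kv[1]); }
        List<String> block = new ArrayList<>(keys); // key-section first
        block.addAll(values);                       // then value-section
        return block;
    }

    // Delta-encode sorted keys against their predecessor: store only the
    // shared-prefix length plus the differing suffix.
    public static List<String> deltaEncode(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String k : sortedKeys) {
            int shared = 0;
            while (shared < Math.min(prev.length(), k.length())
                    && prev.charAt(shared) == k.charAt(shared)) shared++;
            out.add(shared + ":" + k.substring(shared));
            prev = k;
        }
        return out;
    }
}
```

Sorted, adjacent row keys share long prefixes, which is why the delta step (followed by a generic compressor such as LZO in the experiments above) shrinks the key-section so much.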
[jira] [Created] (HBASE-5385) Delete table/column should delete stored permissions on -acl- table
Delete table/column should delete stored permissions on -acl- table - Key: HBASE-5385 URL: https://issues.apache.org/jira/browse/HBASE-5385 Project: HBase Issue Type: Sub-task Components: security Affects Versions: 0.94.0 Reporter: Enis Soztutar Assignee: Enis Soztutar Deleting the table or a column does not cascade to the stored permissions at the -acl- table. We should also remove those permissions, otherwise, it can be a security leak, where freshly created tables contain permissions from previous same-named tables. We might also want to ensure, upon table creation, that no entries are already stored at the -acl- table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5313) Restructure hfiles layout for better compression
[ https://issues.apache.org/jira/browse/HBASE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205895#comment-13205895 ] stack commented on HBASE-5313: -- How do I read the above? Is it the same amount of kvs in each of the files? Restructure hfiles layout for better compression Key: HBASE-5313 URL: https://issues.apache.org/jira/browse/HBASE-5313 Project: HBase Issue Type: Improvement Components: io Reporter: dhruba borthakur Assignee: dhruba borthakur An HFile block contains a stream of key-values. Can we organize these kvs on the disk in a better way so that we get much greater compression ratios? One option (thanks Prakash) is to store all the keys in the beginning of the block (let's call this the key-section) and then store all their corresponding values towards the end of the block. This will allow us to not even decompress the values when we are scanning and skipping over rows in the block. Any other ideas? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5313) Restructure hfiles layout for better compression
[ https://issues.apache.org/jira/browse/HBASE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205909#comment-13205909 ] Zhihong Yu commented on HBASE-5313: --- @Yongqiang: Thanks for sharing the results. Can you also list the time it took writing the HFile for each of the three schemes ? If you can characterize the row keys and values, that would be nice too. Restructure hfiles layout for better compression Key: HBASE-5313 URL: https://issues.apache.org/jira/browse/HBASE-5313 Project: HBase Issue Type: Improvement Components: io Reporter: dhruba borthakur Assignee: dhruba borthakur A HFile block contain a stream of key-values. Can we can organize these kvs on the disk in a better way so that we get much greater compression ratios? One option (thanks Prakash) is to store all the keys in the beginning of the block (let's call this the key-section) and then store all their corresponding values towards the end of the block. This will allow us to not-even decompress the values when we are scanning and skipping over rows in the block. Any other ideas? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5386) [usability] Soft limit for eager region splitting of young tables
[usability] Soft limit for eager region splitting of young tables - Key: HBASE-5386 URL: https://issues.apache.org/jira/browse/HBASE-5386 Project: HBase Issue Type: New Feature Reporter: Jean-Daniel Cryans Fix For: 0.94.0 Coming out of HBASE-2375, we need a new functionality much like hypertable's where we would have a lower split size for new tables and it would grow up to a certain hard limit. This helps usability in different ways: - With that we can set the default split size much higher and users will still have good data distribution - No more messing with force splits - Not mandatory to pre-split your table in order to get good out of the box performance The way Doug Judd described how it works for them, they start with a low value and then double it every time it splits. For example if we started with a soft size of 32MB and a hard size of 2GB, it wouldn't be until you have 64 regions that you hit the ceiling. On the implementation side, we could add a new qualifier in .META. that has that soft limit. When that field doesn't exist, this feature doesn't kick in. It would be written by the region servers after a split and by the master when the table is created with 1 region. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
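[Editor's note] The doubling scheme described above can be sketched as follows. Names are hypothetical, and the model assumes the region count roughly doubles per split generation, matching the soft-32MB/hard-2GB/~64-regions example in the issue:

```java
// Sketch of the soft/hard split-limit idea: the effective split size
// starts at the soft limit and doubles with each generation of splits
// until it reaches the hard limit.
public class SplitLimitSketch {
    static final long MB = 1024L * 1024L;

    // regionCount is the table's current region count; one doubling of the
    // split size per doubling of the region count.
    public static long effectiveSplitSize(long soft, long hard, int regionCount) {
        long size = soft;
        for (int regions = 1; regions < regionCount && size < hard; regions *= 2) {
            size = Math.min(size * 2, hard);
        }
        return size;
    }
}
```

A new table splits early and often (good distribution without pre-splitting), while a mature table settles at the hard limit, e.g. 32MB soft / 2GB hard reaches the ceiling around 64 regions.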
[jira] [Updated] (HBASE-5368) Move PrefixSplitKeyPolicy out of the src/test into src, so it is accessible in HBase installs
[ https://issues.apache.org/jira/browse/HBASE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5368: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks for the review. Move PrefixSplitKeyPolicy out of the src/test into src, so it is accessible in HBase installs - Key: HBASE-5368 URL: https://issues.apache.org/jira/browse/HBASE-5368 Project: HBase Issue Type: Sub-task Components: regionserver Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5368.txt Very simple change to make PrefixSplitKeyPolicy accessible in HBase installs (user still needs to setup the table(s) accordingly). Right now it is in src/test/org.apache.hadoop.hbase.regionserver, I propose moving it to src/org.apache.hadoop.hbase.regionserver (alongside ConstantSizeRegionSplitPolicy), and maybe renaming it too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5313) Restructure hfiles layout for better compression
[ https://issues.apache.org/jira/browse/HBASE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205921#comment-13205921 ] dhruba borthakur commented on HBASE-5313: - The same amount of kvs in each file. total of 3 million kvs for this experiment. The blocksize is 16 KB. Restructure hfiles layout for better compression Key: HBASE-5313 URL: https://issues.apache.org/jira/browse/HBASE-5313 Project: HBase Issue Type: Improvement Components: io Reporter: dhruba borthakur Assignee: dhruba borthakur A HFile block contain a stream of key-values. Can we can organize these kvs on the disk in a better way so that we get much greater compression ratios? One option (thanks Prakash) is to store all the keys in the beginning of the block (let's call this the key-section) and then store all their corresponding values towards the end of the block. This will allow us to not-even decompress the values when we are scanning and skipping over rows in the block. Any other ideas? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5363) Automatically run rat check on mvn release builds
[ https://issues.apache.org/jira/browse/HBASE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205924#comment-13205924 ] Jonathan Hsieh commented on HBASE-5363: --- JonathanHsieh Sent from my iPhone Automatically run rat check on mvn release builds - Key: HBASE-5363 URL: https://issues.apache.org/jira/browse/HBASE-5363 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.90.5, 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-5363-0.90.patch, hbase-5363.2.patch, hbase-5363.patch Some of the recent hbase release failed rat checks (mvn rat:check). We should add checks likely in the mvn package phase so that this becomes a non-issue in the future. Here's an example from Whirr: https://github.com/apache/whirr/blob/trunk/pom.xml line 388 for an example. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5313) Restructure hfiles layout for better compression
[ https://issues.apache.org/jira/browse/HBASE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205931#comment-13205931 ] Jesse Yates commented on HBASE-5313: However, those compression numbers are pretty nice. I worry a little bit about having now an hfileV3, so soon on the heels of the last, leading to a proliferation of versions. My other concern is that the columnar storage doesn't make sense for all cases - Dremel is for a specific use case. That being said, I would love to see the ability to do Dremel in HBase. How about along with a new version/columnar data support comes the ability to select storage files on a per-table basis? That would enable some tables to be optimized for certain use cases, other tables for others, rather than having to use completely different clusters (continuing the multi-tenancy story). Restructure hfiles layout for better compression Key: HBASE-5313 URL: https://issues.apache.org/jira/browse/HBASE-5313 Project: HBase Issue Type: Improvement Components: io Reporter: dhruba borthakur Assignee: dhruba borthakur A HFile block contain a stream of key-values. Can we can organize these kvs on the disk in a better way so that we get much greater compression ratios? One option (thanks Prakash) is to store all the keys in the beginning of the block (let's call this the key-section) and then store all their corresponding values towards the end of the block. This will allow us to not-even decompress the values when we are scanning and skipping over rows in the block. Any other ideas? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
[ https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205936#comment-13205936 ] stack commented on HBASE-5209: --

bq. I have a multi-master HBase set up, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect.

If you do a getMaster, I'd think that you should get the active master, only, in HConnection. Are you saying that it'll give you an interface on the non-active master? That's broken, I'd say. For the name of the master, yeah, getServerName should be part of HMasterInterface.

On the patch:
{code}
+ private boolean isMasterRunning, isActiveMaster;
{code}
The above are the names of methods, not data members. They should be masterRunning and activeMaster.

What's going on here:
{code}
+this.master = master;
+this.isMasterRunning = isMasterRunning;
+this.isActiveMaster = isActiveMaster;
{code}
So, we could be reporting a master that is not running and not the active master? Why would we even care about it in that case?

getMasterInfo as a method name returning the master ServerName seems off. Is this the 'active' master or a non-running master? I think we need to be clear that ClusterStatus reports on the active master only (unless you want to add a list of all running masters, which I don't think is yet possible since they do not register until they assume mastership --- hmmm... looking further down in your patch, it looks like you are adding this facility to zk).

Is this of any use? + public boolean isMasterRunning() { I mean, if the master is not running, can you even get a ClusterStatus from the cluster?
Ditto for + public boolean isActiveMaster() { Won't this just be true anytime you get a ClusterStatus?

You up the ClusterStatus version number but you don't act on it (what if you are asked to deserialize an earlier version of ClusterStatus?). On MasterInterface, I'd suggest you don't bother upping the version number -- just add the new method at the end. That's usually OK.

Also, is isActiveMaster of any use even? (You could ask zk directly? Have hbaseadmin go ask zk rather than go via the master at all? Isn't the master znode name its ServerName? Isn't that what you need?)

I like your registering backup masters... and adding the list to the zk report.

HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
Key: HBASE-5209 URL: https://issues.apache.org/jira/browse/HBASE-5209 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0, 0.90.5, 0.92.0 Reporter: Aditya Acharya Assignee: David S. Wang Fix For: 0.94.0, 0.90.7, 0.92.1 Attachments: HBASE-5209-v0.diff, HBASE-5209-v1.diff
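On the "ask zk directly" suggestion above: HBase writes a ServerName in the form hostname,port,startcode, so a client that reads the master znode could recover the hostname without any new RPC method. The helper below is a hypothetical illustration (not HBase API), assuming the znode data is available as that comma-separated string form:

```java
// Hypothetical client-side helper: given the ServerName string read from
// the active master's znode (e.g. via ZooKeeper#getData on /hbase/master),
// extract the hostname the reporter is after.
public class ActiveMasterName {

    static String hostnameOf(String serverName) {
        int comma = serverName.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("not a ServerName: " + serverName);
        }
        // ServerName is "hostname,port,startcode"; the hostname is the
        // first comma-delimited field.
        return serverName.substring(0, comma);
    }

    public static void main(String[] args) {
        System.out.println(hostnameOf("sv4r11s38,60000,1328822400000"));
    }
}
```

This is essentially the alternative stack raises: since only the active master registers its znode, reading that znode answers "which master won the last race" without changing HMasterInterface at all.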
[jira] [Commented] (HBASE-5363) Automatically run rat check on mvn release builds
[ https://issues.apache.org/jira/browse/HBASE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205937#comment-13205937 ] stack commented on HBASE-5363: -- Try it now boss
[jira] [Commented] (HBASE-5327) Print a message when an invalid hbase.rootdir is passed
[ https://issues.apache.org/jira/browse/HBASE-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205939#comment-13205939 ] Jonathan Hsieh commented on HBASE-5327: --- @Jimmy. Thanks for looking into this. I'm +1 on v2; I plan on committing tomorrow unless I hear otherwise.

Print a message when an invalid hbase.rootdir is passed
Key: HBASE-5327 URL: https://issues.apache.org/jira/browse/HBASE-5327 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Jean-Daniel Cryans Assignee: Jimmy Xiang Fix For: 0.94.0, 0.90.7, 0.92.1 Attachments: hbase-5327.txt, hbase-5327_v2.txt

As seen on the mailing list: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/24124 If hbase.rootdir doesn't specify a folder on hdfs we crash while opening a path to .oldlogs:
{noformat}
2012-02-02 23:07:26,292 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hdfs://sv4r11s38:9100.oldlogs
	at org.apache.hadoop.fs.Path.initialize(Path.java:148)
	at org.apache.hadoop.fs.Path.<init>(Path.java:71)
	at org.apache.hadoop.fs.Path.<init>(Path.java:50)
	at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
	at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:448)
	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: hdfs://sv4r11s38:9100.oldlogs
	at java.net.URI.checkPath(URI.java:1787)
	at java.net.URI.<init>(URI.java:735)
	at org.apache.hadoop.fs.Path.initialize(Path.java:145)
	... 6 more
{noformat}
It could also crash anywhere else; this just happens to be the first place we use hbase.rootdir. We need to verify that it's an actual folder.
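The crash happens because hdfs://sv4r11s38:9100 parses as a URI with an empty path, so appending ".oldlogs" later produces a relative path inside an absolute URI. A minimal sketch of the kind of up-front check the issue asks for (this is a hypothetical helper using java.net.URI, not the actual patch):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RootDirCheck {

    // Returns true only when the configured hbase.rootdir names an actual
    // directory: a parseable URI with a non-empty path component.
    static boolean isValidRootDir(String rootDir) {
        try {
            URI uri = new URI(rootDir);
            String path = uri.getPath();
            return path != null && !path.isEmpty();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The value from the mailing-list report has no path component, so
        // later concatenation yields "hdfs://sv4r11s38:9100.oldlogs" and the
        // URISyntaxException seen in the log above.
        System.out.println(isValidRootDir("hdfs://sv4r11s38:9100"));
        System.out.println(isValidRootDir("hdfs://sv4r11s38:9100/hbase"));
    }
}
```

Failing this check at startup with a clear message would surface the misconfiguration immediately instead of deep inside MasterFileSystem initialization.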
[jira] [Commented] (HBASE-5363) Automatically run rat check on mvn release builds
[ https://issues.apache.org/jira/browse/HBASE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205941#comment-13205941 ] Jonathan Hsieh commented on HBASE-5363: --- Thanks! Works. Updated wiki.