[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425084#comment-13425084 ]

Andrew Purtell commented on HBASE-6427:

bq. There is no deterministic order in which X and Z's observers are registered.

Ted, every coprocessor is registered in order of priority by design. There is always a deterministic order of observers. Frankly, this is something basic about CPs you should already understand if you are providing design comments on the CP API.

Pluggable compaction policies via coprocessors

Key: HBASE-6427
URL: https://issues.apache.org/jira/browse/HBASE-6427
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Attachments: 6427-notReady.txt, 6427-v1.txt, 6427-v10.txt, 6427-v2.txt, 6427-v3.txt, 6427-v4.txt, 6427-v5.txt, 6427-v7.txt

When implementing higher level stores on top of HBase it is necessary to allow dynamic control over how long KVs must be kept around. Semi-static config options for ColumnFamilies (# of versions or TTL) are not sufficient. This can be done with a few additional coprocessor hooks, or by making Store.ScanInfo pluggable.

Was: The simplest way to achieve this is to have a pluggable class to determine the smallestReadpoint for Region. That way outside code can control what KVs to retain.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425086#comment-13425086 ]

Andrew Purtell commented on HBASE-6427:

@Lars, lgtm.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425109#comment-13425109 ]

Zhihong Ted Yu commented on HBASE-6427:

From https://blogs.apache.org/hbase/entry/coprocessor_introduction :

bq. We have not really discussed priority, but it should be reasonably clear how the priority given to a coprocessor affects how it integrates with other coprocessors. When calling out to registered observers, the framework executes their callbacks methods in the sorted order of their priority. Ties are broken arbitrarily.

This means there still might be scenarios where coprocessors for the same table have the same priority.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425120#comment-13425120 ]

Zhihong Ted Yu commented on HBASE-6427:

For RegionObserver.preFlushScannerOpen(), compaction is mentioned in its javadoc several times.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425124#comment-13425124 ]

Andrew Purtell commented on HBASE-6427:

bq. This means there still might be scenarios where coprocessors for the same table have the same priority.

Fine, that text needs an update. The tie is not broken arbitrarily; it is broken by load order. But you are missing the larger point that both Lars and I have mentioned above: CPs are not (currently, nor likely) going to be random user modules loaded blindly with respect to each other. They are deeply embedded in the HBase implementation. If as a system integrator you are deploying coprocessors, you will be engineering their load/initialization order as well as all other cluster details.

Again, this is an X-Y discussion. It would be best to stick to the issues in scope for this JIRA. If there are larger design issues you'd like to consider, let's open a JIRA for those.
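The ordering rule described in this comment (sort by priority, break ties by load order) can be sketched in plain Java. This is only an illustration under stated assumptions: `LoadedCoprocessor`, `CoprocessorOrderSketch`, and the `loadSeq` field are hypothetical names and are not HBase classes, and the sketch assumes that a lower priority value is called first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a loaded coprocessor; not an HBase class.
final class LoadedCoprocessor {
    final String name;
    final int priority; // assumption: a lower value is called first
    final int loadSeq;  // position in which the coprocessor was loaded

    LoadedCoprocessor(String name, int priority, int loadSeq) {
        this.name = name;
        this.priority = priority;
        this.loadSeq = loadSeq;
    }
}

final class CoprocessorOrderSketch {
    // Sort by priority first, then by load sequence, so ties are never arbitrary.
    static final Comparator<LoadedCoprocessor> ORDER =
        Comparator.<LoadedCoprocessor>comparingInt(c -> c.priority)
                  .thenComparingInt(c -> c.loadSeq);

    // Returns the names in the order their callbacks would be invoked.
    static List<String> callOrder(List<LoadedCoprocessor> loaded) {
        List<LoadedCoprocessor> sorted = new ArrayList<>(loaded);
        sorted.sort(ORDER);
        List<String> names = new ArrayList<>();
        for (LoadedCoprocessor c : sorted) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Z and X share a priority; Z was loaded first, so it is called first.
        List<LoadedCoprocessor> loaded = Arrays.asList(
            new LoadedCoprocessor("Z", 10, 0),
            new LoadedCoprocessor("X", 10, 1),
            new LoadedCoprocessor("A", 5, 2));
        System.out.println(callOrder(loaded)); // prints [A, Z, X]
    }
}
```

With a two-level comparator like this, two coprocessors with equal priority still get a reproducible order, which is the property the comment argues for.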
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425127#comment-13425127 ]

Zhihong Ted Yu commented on HBASE-6427:

I don't have further design level comments.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425142#comment-13425142 ]

Hadoop QA commented on HBASE-6427:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12538381/6427-v10.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 33 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).
-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2457//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2457//console

This message is automatically generated.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13425228#comment-13425228 ]

Lars Hofhansl commented on HBASE-6427:

Thanks Ted and Andy, and thanks for the discussion. I think we have two (maybe three) take-aways: (1) We need to look at the various scanner interfaces and see why we have so many diverging interfaces and (2) add more coprocessor documentation (maybe with some more examples) and potentially (3) think generally about what it means to extend HBase and when coprocessors are a good mechanism for that.

It seems to me that coprocessors are a good solution to affect existing processing at certain (including critical) spots, but maybe not suited to replace the entire logic. In that sense this issue represents a corner case, which is probably what spawned the longer discussion here (the creation of the StoreScanner is replaced, and this is done in the context of a bigger operation). In the future we might be able to use Guice to replace entire subsystems.

For this issue I'll fix up the Javadoc issues that Ted mentions and commit this to trunk and also make a 0.94 patch. Thanks again for the review and the discussion.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424622#comment-13424622 ]

Zhihong Ted Yu commented on HBASE-6427:

Review board doesn't show correct formatting. So I put the following here.

bq. to make a new Interface extending both InternalScanner and KeyValueScanner

I like the above approach. Currently we have:
{code}
$ find src/main -name '*.java' -exec grep 'ments.*InternalScanner' {} \; -print
 * also implements InternalScanner. WARNING: As is, if you try to use this
    implements KeyValueScanner, InternalScanner {
src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java
    implements KeyValueScanner, InternalScanner, ChangedReadersObserver {
src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
{code}
Combining InternalScanner and KeyValueScanner seems natural.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424655#comment-13424655 ]

Lars Hofhansl commented on HBASE-6427:

Thinking on this more... I think the pre{Flush|Compact}ScannerOpen hooks should just continue to return InternalScanner. Only that interface is needed by downstream code and we should not extend this only so that coprocessor chaining becomes simpler.

As it stands these hooks *can* use the passed InternalScanner, but that needs some understanding of HBase, which is needed anyway to correctly deal with chained scanners for flushes or compactions.

I.e. I propose leaving it with what the current patch provides. Not opposed to filing a separate ticket to bring more sense into the various scanner interfaces we have.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424657#comment-13424657 ]

Zhihong Ted Yu commented on HBASE-6427:

bq. to correctly deal with chained scanners

I personally haven't seen chained scanners in action. If we don't think through how chained scanners work, I wouldn't expect HBase users to use this mechanism.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424658#comment-13424658 ]

Lars Hofhansl commented on HBASE-6427:

It's used rarely, because coprocessors are not like stored procedures, but a method to extend HBase. We *have* thought it through (IMHO), and the coprocessors now can be chained, and due to the nature of these hooks that will be tricky.

For example, even if we managed to pass in a KeyValueScanner instance, folks would be tempted to just add this one to the List of scanners down the chain (as was your first thought above), which is tempting, but will not be the right thing to do; the downstream must do some complicated logic to merge with the previous scanner in ways that we cannot anticipate.
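As a rough illustration of the distinction drawn in this comment, a downstream hook can wrap the scanner it was handed instead of appending it to a list of sibling scanners. The sketch below uses hypothetical, simplified types (`BatchScanner`, `ListBackedScanner`, `FilteringScanner`); none of them is an HBase interface, and real merge logic between chained scanners would be considerably more involved than this filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified scanner shape; not the HBase InternalScanner.
interface BatchScanner {
    // Appends the next batch to 'out'; returns true if more batches remain.
    boolean next(List<String> out);
}

// Serves pre-built batches; stands in for the scanner an earlier hook returned.
final class ListBackedScanner implements BatchScanner {
    private final Iterator<List<String>> batches;

    ListBackedScanner(List<List<String>> batches) {
        this.batches = batches.iterator();
    }

    @Override
    public boolean next(List<String> out) {
        if (batches.hasNext()) {
            out.addAll(batches.next());
        }
        return batches.hasNext();
    }
}

// Wraps the upstream scanner and drops entries that the predicate rejects.
final class FilteringScanner implements BatchScanner {
    private final BatchScanner upstream;
    private final Predicate<String> keep;

    FilteringScanner(BatchScanner upstream, Predicate<String> keep) {
        this.upstream = upstream;
        this.keep = keep;
    }

    @Override
    public boolean next(List<String> out) {
        List<String> batch = new ArrayList<>();
        boolean more = upstream.next(batch);
        for (String cell : batch) {
            if (keep.test(cell)) {
                out.add(cell);
            }
        }
        return more;
    }
}

final class WrapScannerSketch {
    // Reads the scanner to exhaustion and returns everything it produced.
    static List<String> drain(BatchScanner scanner) {
        List<String> all = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(all);
        } while (more);
        return all;
    }

    public static void main(String[] args) {
        List<List<String>> batches = Arrays.asList(
            Arrays.asList("r1", "expired-r2"),
            Arrays.asList("r3"));
        BatchScanner wrapped = new FilteringScanner(
            new ListBackedScanner(batches), cell -> !cell.startsWith("expired"));
        System.out.println(drain(wrapped)); // prints [r1, r3]
    }
}
```

The wrapper sees every batch the upstream scanner emits, so it can drop or transform entries while leaving the upstream scanner's own logic in place.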
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424663#comment-13424663 ]

Andrew Purtell commented on HBASE-6427:

bq. I personally haven't seen chained scanners in action. If we don't think through how chained scanners work, I wouldn't expect HBase users to use this mechanism.

Ted, this is a great comment, because I think it illustrates an incorrect approach to CP API design. (Not meant to be a criticism of you. :-)) CPs are targeted as much for HBase developers as they might be for users. The fundamental goal of CPs is to avoid needing to patch core HBase code for implementing new features or researching design alternatives.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424664#comment-13424664 ]

Andrew Purtell commented on HBASE-6427:

[~lhofhansl] A follow up JIRA to discuss refactoring the internal scanner interfaces would be reasonable. We do want extensions to be able to wrap and merge scanners along CP chains.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424665#comment-13424665 ]

Andrew Purtell commented on HBASE-6427:

And a follow up note to my above comment: It is a goal to provide adequate extension surface (and I include here besides CPs also other pluggable interfaces where performance considerations are paramount) so new features or design alternatives require no core code patches.

IMHO, the design strategy for this goal should be incremental, driven by actual use cases, but with foresight added. This issue is a good example of that, I think: Lars has really improved the flush and compaction hooks, and this work hasn't been done in the abstract.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424667#comment-13424667 ]

Zhihong Ted Yu commented on HBASE-6427:

I understand the goal for this JIRA and am in support of it.

bq. the downstream must do some complicated logic to merge with the previous scanner in ways that we cannot anticipate.

This illustrates the intricacies of scanner chaining. If a scanner is designed for some specific purpose, I wouldn't expect it to function correctly when an arbitrary number of scanners are chained (both) upstream and downstream. In fact, chaining introduces an unnecessary burden on each individual scanner.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424669#comment-13424669 ]

Andrew Purtell commented on HBASE-6427:

bq. I understand the goal for this JIRA and am in support of it. [...] This illustrates the intricacies of scanner chaining. If a scanner is designed for some specific purpose, I wouldn't expect it to function correctly when an arbitrary number of scanners are chained (both) upstream and downstream.

Pardon Ted but I think this is an X-Y argument. I'm not sure we are discussing the chaining of _arbitrary_ scanners (unless I have this wrong.) Each CP hook is dealing with a constrained set of scanners. Flushes will be dealing with memstore scanners. Compactions will be dealing with store scanners.

The question has been what kind of interface input parameters and return types should share; there's some design give-and-take there. But we are not talking about, for example, combining memstore scanners with store scanners. (At least, I am not.)
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424671#comment-13424671 ]

Zhihong Ted Yu commented on HBASE-6427:
---

There is no intersection between memstore scanners and store scanners. The arbitrary number of scanners is due to the following construct:
{code}
+    for (RegionEnvironment env: coprocessors) {
+      if (env.getInstance() instanceof RegionObserver) {
+        ctx = ObserverContext.createAndPrepare(env, ctx);
+        try {
+          s = ((RegionObserver) env.getInstance()).preCompactScannerOpen(ctx, store, scanners,
+              scanType, earliestPutTs, s);
{code}
I think we should reduce the complexity (due to scanner chaining) for preCompactScannerOpen(). If we don't provide a working model for scanner chaining, there is no need to introduce a construct for chaining scanners.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424672#comment-13424672 ]

Lars Hofhansl commented on HBASE-6427:
---

bq. This illustrates the intricacies of scanner chaining.

I think that is a false transitive statement. :)
For this particular hook chaining is hard, because this is an intricate part of the HBase code. That does not mean that chaining is generally hard (it is not).

bq. I wouldn't expect it to function correctly when an arbitrary number of scanners are chained (both) upstream and downstream.

Same here... The point is: it is possible to chain region observers even in this case. If that is not desired, an implementer can break the chain (via the passed context, and by ignoring the InternalScanner argument).
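The chaining and chain-breaking described in this comment can be pictured with a small toy model. This is only a sketch under stated assumptions: `ChainSketch`, `Context`, `Observer`, `observer` and `runChain` are hypothetical names, a `String` stands in for the scanner, and running observers in ascending priority order is an assumption of the sketch. None of it is the actual RegionCoprocessorHost code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy model of observer chaining. None of these types are the real HBase classes. */
class ChainSketch {

    /** Hypothetical stand-in for ObserverContext: lets an observer stop the rest of the chain. */
    static final class Context {
        private boolean complete;
        void complete() { complete = true; }
        boolean isComplete() { return complete; }
    }

    /** Hypothetical stand-in for a RegionObserver hook; a String stands in for the scanner. */
    interface Observer {
        int priority();
        String open(Context ctx, String previous);
    }

    /** Builds a toy observer that either wraps the previous scanner or breaks the chain. */
    static Observer observer(final int priority, final String name, final boolean breaksChain) {
        return new Observer() {
            @Override public int priority() { return priority; }
            @Override public String open(Context ctx, String previous) {
                if (breaksChain) {
                    ctx.complete();   // stop the chain and ignore the previous scanner
                    return name;
                }
                return previous == null ? name : name + "(" + previous + ")";
            }
        };
    }

    /** Calls observers in ascending priority order (an assumption), passing each the previous result. */
    static String runChain(List<Observer> observers) {
        List<Observer> ordered = new ArrayList<>(observers);
        ordered.sort(Comparator.comparingInt(Observer::priority));
        String scanner = null;
        for (Observer o : ordered) {
            Context ctx = new Context();
            scanner = o.open(ctx, scanner);
            if (ctx.isComplete()) {
                break;                // remaining observers are skipped
            }
        }
        return scanner;
    }

    public static void main(String[] args) {
        System.out.println(runChain(Arrays.asList(observer(2, "B", false), observer(1, "A", false))));
        System.out.println(runChain(Arrays.asList(
            observer(1, "A", false), observer(2, "X", true), observer(3, "Z", false))));
    }
}
```

In the second call the observer named X completes the context, so the observer named Z is never invoked; whether that is acceptable is exactly the design question under discussion in this thread.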
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424673#comment-13424673 ]

Lars Hofhansl commented on HBASE-6427:
---

Also another thought: I wonder how hard it is to fold custom timestamps into this. An example of what I mean is a time dimension that uses monotonically increasing transaction numbers.

A slight API change to the StoreScanner constructor would make it possible to handle that too: pass in the absolute time at which a KV expires rather than the TTL (which it then internally translates to an absolute time anyway, and which in turn depends on the RegionServer's understanding of time, which is not configurable).
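The difference between a TTL and an absolute expiration point can be sketched as follows. This is a minimal illustration, not the StoreScanner API: `ExpirySketch`, `expiredByTtl` and `expiredByCutoff` are hypothetical names, and the cutoff arithmetic (`now - ttl`) is an assumption about how a TTL is turned into an absolute time.

```java
/** Toy comparison of TTL-based and cutoff-based expiry. Not HBase code. */
class ExpirySketch {

    /** TTL style: the cutoff is derived from the server's wall clock. */
    static boolean expiredByTtl(long kvTimestamp, long ttlMillis, long nowMillis) {
        long oldestUnexpired = nowMillis - ttlMillis;
        return kvTimestamp < oldestUnexpired;
    }

    /** Absolute style: the caller supplies the cutoff in whatever time dimension it uses. */
    static boolean expiredByCutoff(long kvTimestamp, long oldestUnexpired) {
        return kvTimestamp < oldestUnexpired;
    }

    public static void main(String[] args) {
        // Wall-clock timestamps: a KV written at t=1000 with a 500 ms TTL, checked at t=2000.
        System.out.println(expiredByTtl(1000L, 500L, 2000L));
        // Transaction numbers: everything before transaction 42 may be dropped.
        System.out.println(expiredByCutoff(41L, 42L));
    }
}
```

With the absolute form the server clock never enters the comparison, which is why it would also work when the "timestamps" are transaction numbers.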
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424674#comment-13424674 ]

Lars Hofhansl commented on HBASE-6427:
---

Re: Ted's last comment. So you want to enforce that only a single coprocessor can implement these hooks? I would disagree with that.
The chaining is fine (if complicated); there is no need to mandate what an implementer can and cannot do (see my previous comment).
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424676#comment-13424676 ]

Zhihong Ted Yu commented on HBASE-6427:
---

bq. If that is not desired an implementer can break the chain

If implementer X breaks the chain, how would implementer Z know that his/her implementation is not broken by someone else?
There is no deterministic order in which X and Z's observers are registered.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424680#comment-13424680 ]

Lars Hofhansl commented on HBASE-6427:
---

Ted, you are trying to redesign the overall coprocessor API. I would prefer that we keep the API as is, and then work on ways to make better sense of the scanners as part of a different jira.

As Andy said, a major raison d'etre for coprocessors is to extend HBase. Nobody would just run a 3rd party coprocessor (unless that party is highly trusted). A coprocessor could call System.exit() and shut down the RegionServer... And that was a design choice.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424690#comment-13424690 ]

Zhihong Ted Yu commented on HBASE-6427:
---

preCompactScannerOpen() / preFlushScannerOpen() are new APIs and serve a particular use case if implemented.
I feel there is no need to pass InternalScanner as a parameter.

Just my personal observation.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424697#comment-13424697 ]

Lars Hofhansl commented on HBASE-6427:
---

I see your point. But then you
# need a mechanism to only allow a single RegionObserver to implement these hooks.
# place an arbitrary limit on the API (just because it is hard to chain the hooks here does not mean it is impossible or not useful; as I said, one can always create a new implementing class of InternalScanner that does the right thing)

This follows the coprocessor API pattern. I think this should be kept.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424281#comment-13424281 ]

Hadoop QA commented on HBASE-6427:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12538229/6427-v5.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 33 new or modified tests.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

-1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
org.apache.hadoop.hbase.master.TestAssignmentManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2450//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2450//console

This message is automatically generated.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424367#comment-13424367 ]

Zhihong Ted Yu commented on HBASE-6427:
---

{code}
+        scanner = ((RegionObserver) env.getInstance()).preCompact(ctx, store, scanners,
+            scanType, earliestPutTs);
{code}
Maybe insert the returned scanner (if not null) into scanners?
{code}
+   * @deprecated use {@link #preCompact(ObserverContext, Store, List, ScanType, long)} instead
    */
   InternalScanner preCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
       final Store store, final InternalScanner scanner) throws IOException;
{code}
Do we have to deprecate the existing API? I feel the new API is much more involved in terms of technical internals. Maybe a poll on the user mailing list would help clarify.
{code}
+   * @param scanners the list {@link StoreFileScanner}s to be read from
...
+  InternalScanner preCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
+      final Store store, List<? extends KeyValueScanner> scanners, ScanType scanType, long earliestPutTs)
{code}
nit: the second line above is over 100 characters.
If scanners really should be StoreFileScanners, the method signature should match that expectation. See the following in RegionCoprocessorHost.java:
{code}
+   * See {@link RegionObserver#preCompact(ObserverContext, Store, InternalScanner)}
+   */
+  public InternalScanner preCompact(Store store, List<StoreFileScanner> scanners,
+      ScanType scanType, long earliestPutTs) throws IOException {
{code}
Review board would facilitate more detailed review.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424416#comment-13424416 ]

Zhihong Ted Yu commented on HBASE-6427:
---

Looking at the changes to Compactor.compact(), the new preCompact() takes precedence over the existing preCompact() method. This should be documented.

My comment above about scanner insertion was not valid.

It is not clear why more than one RegionObserver would return an InternalScanner from preCompact(). For simplicity, we can break out of the loop in RegionCoprocessorHost.preCompact() when a non-null InternalScanner is returned. This behavior should also be documented.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424425#comment-13424425 ]

Lars Hofhansl commented on HBASE-6427:
---

I think we should do what the current preScannerOpen is doing. Each RegionObserver is passed the previous InternalScanner (it would be a KeyValueScanner here, but the same principle applies).
A RegionObserver can break the loop via the passed Context; I do not think we should make that the default behavior. I'll have a patch for that soon.

As for the deprecation, I think I agree with Andy here. Having the two hooks is confusing. Everything that could be done with the old hook can also be done with the new one, and coprocessors are meant for extending HBase.

Re: StoreFileScanner vs ? extends KeyValueScanner... I typically prefer to express this in terms of interfaces rather than concrete classes. It also keeps preCompact and preFlush similar (one gets a list of StoreFileScanners, the other gets a StoreScanner). I do not feel strongly about this, though.

I'll change the patch and put it up on RB...
Thanks for the detailed review, Ted!
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424433#comment-13424433 ]

Andrew Purtell commented on HBASE-6427:
---

Agree that scanners should be passed along. The use case is subsequent observers wrapping scanners created by those earlier in the chain.

Also we should deprecate and remove the older, more limited hooks now that we have a superset interface that admits more possibilities.
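The "later observers wrap scanners created earlier in the chain" use case amounts to a decorator. A minimal sketch, assuming a much simpler scanner contract than the real InternalScanner: `WrapSketch`, `ToyScanner`, `over`, `dropOlderThan` and `drain` are hypothetical names, and cells are reduced to bare timestamps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy decorator over a scanner. ToyScanner is not the real InternalScanner interface. */
class WrapSketch {

    /** Hypothetical scanner: returns the next cell timestamp, or null when exhausted. */
    interface ToyScanner {
        Long next();
    }

    /** A base scanner over a fixed list, standing in for the scanner an earlier observer built. */
    static ToyScanner over(List<Long> timestamps) {
        final Iterator<Long> it = timestamps.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** A wrapper a later observer might return: it hides cells older than the cutoff. */
    static ToyScanner dropOlderThan(final ToyScanner delegate, final long cutoff) {
        return () -> {
            Long t;
            while ((t = delegate.next()) != null) {
                if (t >= cutoff) {
                    return t;
                }
            }
            return null;
        };
    }

    /** Reads a scanner to the end and collects what it returned. */
    static List<Long> drain(ToyScanner scanner) {
        List<Long> out = new ArrayList<>();
        Long t;
        while ((t = scanner.next()) != null) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        ToyScanner base = over(Arrays.asList(5L, 20L, 7L, 30L));
        System.out.println(drain(dropOlderThan(base, 10L)));
    }
}
```

Because a wrapper only depends on the `ToyScanner` contract, wrappers from several observers can be stacked without knowing about each other.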
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424449#comment-13424449 ]

Lars Hofhansl commented on HBASE-6427:
---

Interestingly, I would like to have the ability to bypass the default action from both preFlush and preCompact, for example to control how the store files are written.

With the new hooks there is no way to indicate that (null means create the default scanner, non-null means use the returned scanner, but still follow the default action).

The hook just prior to creating the scanner could create a new scanner (and hence decide how to filter the inputs); the hook right after scanner creation could then control how/where to write the store files.

So maybe have a preCompactScannerOpen and a preFlushScannerOpen (similar to my initial idea), and not deprecate the existing hooks?
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424451#comment-13424451 ]

Andrew Purtell commented on HBASE-6427:
---

Then you want to restore this behavior:

{noformat}
// NULL scanner returned from coprocessor hooks means skip normal processing
{noformat}

Can that work? I'd not be opposed to adding additional hooks, but that should be after exhausting other options here, IMHO, since they would be close to each other.

We could pass in the default StoreScanner. The hook could just return it if wanting default behavior. Might need to make StoreScanner lazy, move initialization out of the constructor. I'm remote and just have your patch to go by at the moment.
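The idea of passing a lazily initialized default scanner into the hook can be sketched like this. It is an illustration only: `LazySketch`, `LazyScanner`, `Hook` and `compact` are hypothetical names, the "open" step is reduced to a supplier call, and nothing here reflects how StoreScanner is actually constructed.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Toy model of a default scanner whose expensive setup is deferred until first use. */
class LazySketch {

    /** Hypothetical lazy scanner: the opener runs at most once, on first use. */
    static final class LazyScanner {
        private final Supplier<String> opener;
        private String opened;

        LazyScanner(Supplier<String> opener) {
            this.opener = opener;
        }

        String use() {
            if (opened == null) {
                opened = opener.get();   // initialization moved out of the constructor
            }
            return opened;
        }
    }

    /** Hypothetical hook: return the default scanner unchanged, or substitute another one. */
    interface Hook {
        LazyScanner preCompact(LazyScanner defaultScanner);
    }

    /** Builds the lazy default, offers it to the hook, then uses whatever came back. */
    static String compact(Hook hook, final AtomicInteger opens) {
        LazyScanner defaultScanner = new LazyScanner(() -> {
            opens.incrementAndGet();     // counts how often the default was really opened
            return "default";
        });
        return hook.preCompact(defaultScanner).use();
    }

    public static void main(String[] args) {
        AtomicInteger opens = new AtomicInteger();
        System.out.println(compact(d -> new LazyScanner(() -> "custom"), opens) + " opens=" + opens.get());
    }
}
```

The point of the laziness is that a hook which substitutes its own scanner never pays for opening the default one.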
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424453#comment-13424453 ]

Zhihong Ted Yu commented on HBASE-6427:
---

bq. null means create the default scanner

Maybe create a special class (called NullScanner?) implementing InternalScanner.
The class provides a singleton which can be returned by preCompact() to indicate skipping normal processing.
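A sentinel singleton of the kind suggested here could look roughly like the following. This is a sketch of the pattern, not a proposed patch: `SentinelSketch`, `ToyScanner`, `Action` and `interpret` are hypothetical, and `NullScanner` implements the toy interface rather than the real InternalScanner.

```java
/** Toy model of a sentinel scanner that distinguishes "skip" from "use the default". */
class SentinelSketch {

    /** Hypothetical scanner: returns the next cell timestamp, or null when exhausted. */
    interface ToyScanner {
        Long next();
    }

    /** Singleton sentinel; it is compared by identity and never read from. */
    static final class NullScanner implements ToyScanner {
        static final NullScanner INSTANCE = new NullScanner();

        private NullScanner() {}

        @Override public Long next() {
            return null;
        }
    }

    /** What the host would do with the value a hook returned. */
    enum Action { DEFAULT_SCANNER, SKIP, USE_RETURNED }

    static Action interpret(ToyScanner returned) {
        if (returned == null) {
            return Action.DEFAULT_SCANNER;   // null keeps meaning "create the default scanner"
        }
        if (returned == NullScanner.INSTANCE) {
            return Action.SKIP;              // sentinel means "skip normal processing"
        }
        return Action.USE_RETURNED;
    }

    public static void main(String[] args) {
        System.out.println(interpret(null));
        System.out.println(interpret(NullScanner.INSTANCE));
    }
}
```

The identity check keeps both meanings available at once, at the cost of a special value every hook implementer has to know about.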
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424458#comment-13424458 ]

Lars Hofhansl commented on HBASE-6427:
--------------------------------------

After reviewing all the use cases I have, this seems to be the best proposal:
* add these
{code}
void postFlush(final ObserverContext<RegionCoprocessorEnvironment> c, final Store store, final StoreFile resultFile) throws IOException;
InternalScanner preFlush(final ObserverContext<RegionCoprocessorEnvironment> c, final Store store, final InternalScanner scanner) throws IOException;
InternalScanner preFlushScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c, final Store store, final KeyValueScanner memstoreScanner, final InternalScanner s) throws IOException;
InternalScanner preCompactScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c, final Store store, List<? extends KeyValueScanner> scanners, final ScanType scanType, final long earliestPutTs, final InternalScanner s) throws IOException;
KeyValueScanner preStoreScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c, final Store store, final Scan scan, final NavigableSet<byte[]> targetCols, final KeyValueScanner s) throws IOException;
{code}
* deprecate these:
{code}
void postFlush(final ObserverContext<RegionCoprocessorEnvironment> c) throws IOException;
void preFlush(final ObserverContext<RegionCoprocessorEnvironment> c) throws IOException;
{code}

The new {pre|post}Flush are called per Store in analogy to {pre|post}Compact. pre{Flush|Compact}ScannerOpen are called before the flush/compaction scanner is built.

This gives maximum flexibility (I can control the reading scanners *and* how the storefiles are written for both flushes and compactions), makes more sense of {pre|post}Flush, and leaves existing functionality in place.

This is the first proposal that really feels right to me. I'll have a patch for that soon.
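To illustrate how the two kinds of hook in the proposal above could relate, here is a toy model of a compaction that first asks a "scanner open" hook for a scanner, falls back to a default when the hook returns null, and then lets the simpler pre-compact hook wrap the result. Everything here is an assumption made for the example: the `Scanner` and `Observer` types, the string cells and the "expired:" marker are hypothetical, and the hook signatures are much reduced from the ones listed in the proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TwoStageHookSketch {
    /** Hypothetical stand-in for a scanner: yields the cells that survive. */
    interface Scanner {
        List<String> scan();
    }

    /** Hypothetical observer exposing both compaction hooks, heavily simplified. */
    interface Observer {
        /** Runs before the scanner is built; null asks for the default scanner. */
        default Scanner preCompactScannerOpen(List<String> cells) {
            return null;
        }

        /** Runs once a scanner exists; may wrap or replace it. */
        default Scanner preCompact(Scanner scanner) {
            return scanner;
        }
    }

    /** Default scanner in this toy model: drops cells marked as expired. */
    static Scanner defaultScanner(List<String> cells) {
        return () -> {
            List<String> kept = new ArrayList<>();
            for (String cell : cells) {
                if (!cell.startsWith("expired:")) {
                    kept.add(cell);
                }
            }
            return kept;
        };
    }

    /** Calls the open hook, falls back to the default, then calls the wrap hook. */
    static List<String> compact(Observer observer, List<String> cells) {
        Scanner scanner = observer.preCompactScannerOpen(cells);
        if (scanner == null) {
            scanner = defaultScanner(cells);
        }
        return observer.preCompact(scanner).scan();
    }

    public static void main(String[] args) {
        List<String> cells = Arrays.asList("a", "expired:b", "c");

        // No overrides: the default retention applies.
        System.out.println(compact(new Observer() { }, cells));

        // Override the open hook to keep everything, including expired cells.
        Observer keepAll = new Observer() {
            @Override
            public Scanner preCompactScannerOpen(List<String> in) {
                return () -> new ArrayList<>(in);
            }
        };
        System.out.println(compact(keepAll, cells));
    }
}
```

In this model an implementer who only wants to wrap the output overrides the simple hook, while one who needs to change what is read in the first place overrides the open hook.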
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424463#comment-13424463 ]

Hadoop QA commented on HBASE-6427:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12538259/6427-v7.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 33 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).
-1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.master.TestMasterNoCluster

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2451//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2451//console

This message is automatically generated.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424464#comment-13424464 ]

Lars Hofhansl commented on HBASE-6427:
--------------------------------------

bq. I'd not be opposed to adding additional hooks but that should be after exhausting other options here, IMHO, since they would be close to each other.

Do you think the latest proposal is too heavy-handed? pre{Flush|Compact}ScannerOpen would be quite close to pre{Flush|Compact}. My reasoning was that an implementer could still override the relatively simple pre{Flush|Compact} hooks.

If these are too many, we can still have only the fewer hooks, and then we'd need some NullScanner approach I think.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424466#comment-13424466 ]

Andrew Purtell commented on HBASE-6427:
---------------------------------------

It's fine. I like how you made the new APIs about opening (internal) scanners for various things.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423694#comment-13423694 ]

Hadoop QA commented on HBASE-6427:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12538127/6427-v3.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 27 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).
-1 findbugs. The patch appears to introduce 14 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.replication.TestReplication
    org.apache.hadoop.hbase.master.TestAssignmentManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2445//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2445//console
This message is automatically generated.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424270#comment-13424270 ]

Zhihong Ted Yu commented on HBASE-6427:
---------------------------------------

Some new files need a license header.
{code}
+public class ScanPolicyCoprocessor extends BaseRegionObserver {
{code}
This class is an observer. Suggest renaming the class. Annotations for audience and stability should be added.

For RegionCoprocessorHost.preCompact():
{code}
+          scanner = ((RegionObserver) env.getInstance()).preCompact(ctx, store, scanners,
+              scanType, earliestPutTs);
{code}
If there are multiple RegionObservers, it seems only the scanner returned by the last one would be used. Is this intentional?
Similar observation for preStoreScannerOpen() and preFlush().
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424273#comment-13424273 ]

Lars Hofhansl commented on HBASE-6427:
--------------------------------------

Thanks Ted. You are right on all counts. Need to think about multiple coprocessors a bit more.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424275#comment-13424275 ]

Lars Hofhansl commented on HBASE-6427:
--------------------------------------

The current preCompact/preScannerOpen/other hooks handle this by passing the scanner from the previous coprocessor to the next one, so that each coprocessor has all the information needed.
I could do something similar here, although it would quickly get inscrutable for an implementer of these hooks; but I cannot think of anything better.
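The chaining described above, where each coprocessor receives the scanner produced by the previous one, can be sketched roughly as below. This is a simplified model, not the RegionCoprocessorHost code: `Observer`, `Scanner`, `wrapping` and `runPreCompact` are hypothetical names, and the sketch assumes that observers run in ascending order of a numeric priority.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ObserverChainSketch {
    /** Hypothetical stand-in for a scanner. */
    interface Scanner {
        String describe();
    }

    /** Hypothetical observer: a priority plus a hook that sees the previous result. */
    interface Observer {
        int priority();

        Scanner preCompact(Scanner previous);
    }

    /** Builds an observer that wraps whatever scanner it is handed. */
    static Observer wrapping(int prio, String tag) {
        return new Observer() {
            @Override
            public int priority() {
                return prio;
            }

            @Override
            public Scanner preCompact(Scanner previous) {
                return () -> tag + "(" + previous.describe() + ")";
            }
        };
    }

    /** Calls every observer in priority order, feeding each the prior result. */
    static Scanner runPreCompact(List<Observer> observers, Scanner initial) {
        List<Observer> ordered = new ArrayList<>(observers);
        // Assumption for this sketch: a lower number runs earlier.
        ordered.sort(Comparator.comparingInt(Observer::priority));
        Scanner current = initial;
        for (Observer observer : ordered) {
            current = observer.preCompact(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Observer> registered = new ArrayList<>();
        registered.add(wrapping(2, "Z")); // registered first, runs second
        registered.add(wrapping(1, "X"));
        Scanner result = runPreCompact(registered, () -> "default");
        System.out.println(result.describe());
    }
}
```

Because the order is fixed by priority rather than by registration order, the nesting of the wrappers is the same on every run, which is what lets a later observer reason about what it has been handed.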
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423305#comment-13423305 ]

Andrew Purtell commented on HBASE-6427:
---------------------------------------

Or use polymorphism?
{noformat}
+  public InternalScanner preFlush(final ObserverContext<RegionCoprocessorEnvironment> c,
+      Store store, KeyValueScanner scanner) throws IOException;
+
+  @Deprecated
   public void preFlush(ObserverContext<RegionCoprocessorEnvironment> e) throws IOException;
{noformat}
{noformat}
+  public InternalScanner preCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
+      Store store, List<? extends KeyValueScanner> scanners, ScanType scanType, long earliestPutTs)
+      throws IOException;
+
+  @Deprecated
   public void preCompact(ObserverContext<RegionCoprocessorEnvironment> e ...
{noformat}
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423325#comment-13423325 ]

Lars Hofhansl commented on HBASE-6427:
--------------------------------------

Uh... I like that.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423365#comment-13423365 ]

Lars Hofhansl commented on HBASE-6427:
--------------------------------------

The part I still have to think through is how to handle actual user scans. By default a user scan will also filter TTL/Versions, so it's one thing to prevent the KVs from being compacted away and another to actually make them visible to user scans.
A similar approach can be followed in preScannerOpen, as long as the coprocessor has enough access to internal region data structures to rebuild the default scanner.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423367#comment-13423367 ]

Andrew Purtell commented on HBASE-6427:
---------------------------------------

BaseRegionObserver should reimplement the default behavior in the new methods? Anybody who inherits would get it.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423368#comment-13423368 ]

Andrew Purtell commented on HBASE-6427:
---------------------------------------

Or, better yet, BaseRegionObserver calls out to a Store static method that does it, with some javadoc to make it clear what's going on?
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423372#comment-13423372 ]

Lars Hofhansl commented on HBASE-6427:
--------------------------------------

Hmm... The default behavior is followed when RegionObserver.pre{flush|compact} returns a null scanner, which is what BaseRegionObserver does by default. BaseRegionObserver implementing the default behavior would not really buy anything (unless I am missing something).

As for the last comment above, I think we'd need a preStoreScannerOpen, which would be called in Store.getScanner (right before the new StoreScanner is created) to allow the coprocessor to return a custom scanner here too.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423374#comment-13423374 ]

Andrew Purtell commented on HBASE-6427:
---------------------------------------

bq. The default behavior is followed when RegionObserver.pre{flush|compact} return a null scanner, which is what BaseRegionObserver does by default.

Fine, I was misled by the unit test code.

bq. I think we'd need a preStoreScannerOpen, which would be called in Store.getScanner (right before the new StoreScanner is created) to allow the coprocessor to return a custom scanner here too.

Sounds good to me.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423384#comment-13423384 ]

Lars Hofhansl commented on HBASE-6427:
--------------------------------------

Of course this has to be compared to simply making the ScanInfo pluggable in Store.java.
What I want to achieve here is to enable an external process (backup tool, transaction engine, etc.) to override HBase's default TTL/#Versions with very high fidelity (i.e. not via a dynamic schema change, which is too heavyweight/slow).

The coprocessor approach is nice, because it provides a lot of flexibility for other future use cases and it does not invent a new concept. At the same time it adds complexity.
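As a rough illustration of the goal stated above, the following toy model shows a retention setting that an external process can change at runtime and that a compaction consults each time it runs, with no schema change involved. It is a sketch under stated assumptions only: `RetentionPolicy` and `compact` are hypothetical, cells are reduced to bare timestamps, and nothing here reflects how ScanInfo or the proposed hooks are actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

final class DynamicRetentionSketch {
    /** Hypothetical holder for a TTL that an external process may change at runtime. */
    static final class RetentionPolicy {
        private final AtomicLong ttlMillis;

        RetentionPolicy(long initialTtlMillis) {
            this.ttlMillis = new AtomicLong(initialTtlMillis);
        }

        /** Called by the external process, e.g. a backup tool extending retention. */
        void setTtlMillis(long newTtlMillis) {
            ttlMillis.set(newTtlMillis);
        }

        /** A cell is retained while its age does not exceed the current TTL. */
        boolean retain(long cellTimestamp, long now) {
            return now - cellTimestamp <= ttlMillis.get();
        }
    }

    /** Toy compaction: keeps the timestamps the policy says to retain right now. */
    static List<Long> compact(List<Long> cellTimestamps, long now, RetentionPolicy policy) {
        List<Long> kept = new ArrayList<>();
        for (long timestamp : cellTimestamps) {
            if (policy.retain(timestamp, now)) {
                kept.add(timestamp);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        RetentionPolicy policy = new RetentionPolicy(100);
        List<Long> cells = Arrays.asList(850L, 950L, 990L);
        System.out.println(compact(cells, 1000, policy)); // with the initial TTL
        policy.setTtlMillis(200); // external process asks to keep data longer
        System.out.println(compact(cells, 1000, policy)); // with the extended TTL
    }
}
```

The design point the sketch tries to capture is that the policy is read at compaction time, so a change takes effect on the next compaction rather than requiring the column family to be altered.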
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423387#comment-13423387 ]

Andrew Purtell commented on HBASE-6427:
---------------------------------------

bq. The coprocessor approach is nice, because it provides a lot of flexibility for other future use cases and it does not invent a new concept. At the same time it adds complexity.

On balance the API change here is nice because it extends something that was too limited to address your use case such that now it works for you, and it also admits the possibility of others.
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423469#comment-13423469 ]

Andrew Purtell commented on HBASE-6427:
---------------------------------------

lgtm, good tests
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423501#comment-13423501 ]

Hadoop QA commented on HBASE-6427:
--

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12538075/6427-v2.txt
  against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 24 new or modified tests.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

-1 findbugs. The patch appears to introduce 14 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.catalog.TestMetaReaderEditor

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2440//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2440//console

This message is automatically generated.

Pluggable compaction policies via coprocessors
--
Key: HBASE-6427
URL: https://issues.apache.org/jira/browse/HBASE-6427
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Attachments: 6427-notReady.txt, 6427-v1.txt, 6427-v2.txt
[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423548#comment-13423548 ]

Lars Hofhansl commented on HBASE-6427:
--

I ran the failing tests locally, and they all pass. Will sit on this a bit longer, write a test that tests the actual scenario I am interested in, etc.

Pluggable compaction policies via coprocessors
--
Key: HBASE-6427
URL: https://issues.apache.org/jira/browse/HBASE-6427
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Attachments: 6427-notReady.txt, 6427-v1.txt, 6427-v2.txt