[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425084#comment-13425084
 ] 

Andrew Purtell commented on HBASE-6427:
---

bq. There is no deterministic order in which X and Z's observers are registered.

Ted, every coprocessor is registered in order of priority by design. There is 
always a deterministic order of observers. Frankly, this is something basic 
about CPs you should already understand if providing design comment on CP API.

 Pluggable compaction policies via coprocessors
 --

 Key: HBASE-6427
 URL: https://issues.apache.org/jira/browse/HBASE-6427
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Attachments: 6427-notReady.txt, 6427-v1.txt, 6427-v10.txt, 
 6427-v2.txt, 6427-v3.txt, 6427-v4.txt, 6427-v5.txt, 6427-v7.txt


 When implementing higher level stores on top of HBase it is necessary to 
 allow dynamic control over how long KVs must be kept around.
 Semi-static config options for ColumnFamilies (# of versions or TTL) are not 
 sufficient.
 This can be done with a few additional coprocessor hooks, or by making 
 Store.ScanInfo pluggable.
 Was:
 The simplest way to achieve this is to have a pluggable class to determine 
 the smallestReadpoint for Region. That way outside code can control what KVs 
 to retain.
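
As a rough illustration of what "dynamic control" over retention could mean, the sketch below decides per KV whether to keep it, using a policy object whose TTLs can change at runtime rather than a fixed ColumnFamily TTL. `StoredKv`, `RetentionPolicy`, `DynamicTtlPolicy`, and `RetentionDemo` are hypothetical names invented for this example; none of them is HBase API, and the actual patch works through coprocessor hooks rather than an interface like this one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a KeyValue: a row key plus a write timestamp (ms). */
final class StoredKv {
    final String row;
    final long timestamp;
    StoredKv(String row, long timestamp) { this.row = row; this.timestamp = timestamp; }
}

/** Hypothetical policy: decides whether a KV survives a flush or compaction. */
interface RetentionPolicy {
    boolean retain(StoredKv kv, long now);
}

/** TTLs keyed by row prefix; they can be changed while the process is running. */
final class DynamicTtlPolicy implements RetentionPolicy {
    private final Map<String, Long> ttlByPrefix = new ConcurrentHashMap<>();
    private final long defaultTtlMs;

    DynamicTtlPolicy(long defaultTtlMs) { this.defaultTtlMs = defaultTtlMs; }

    void setTtl(String rowPrefix, long ttlMs) { ttlByPrefix.put(rowPrefix, ttlMs); }

    @Override
    public boolean retain(StoredKv kv, long now) {
        // The longest matching prefix wins, so the result does not depend on map order.
        long ttl = defaultTtlMs;
        int bestLen = -1;
        for (Map.Entry<String, Long> e : ttlByPrefix.entrySet()) {
            String prefix = e.getKey();
            if (kv.row.startsWith(prefix) && prefix.length() > bestLen) {
                bestLen = prefix.length();
                ttl = e.getValue();
            }
        }
        return now - kv.timestamp <= ttl;
    }
}

final class RetentionDemo {
    /** Keeps only the KVs the policy wants to retain, as a compaction would. */
    static List<StoredKv> compact(List<StoredKv> kvs, RetentionPolicy policy, long now) {
        List<StoredKv> kept = new ArrayList<>();
        for (StoredKv kv : kvs) {
            if (policy.retain(kv, now)) {
                kept.add(kv);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        DynamicTtlPolicy policy = new DynamicTtlPolicy(1000L);
        List<StoredKv> kvs = Arrays.asList(new StoredKv("tx-1", 0L), new StoredKv("log-1", 0L));
        policy.setTtl("log-", 100L); // tighten retention for one key range at runtime
        for (StoredKv kv : compact(kvs, policy, 500L)) {
            System.out.println("kept " + kv.row);
        }
    }
}
```

The point of the sketch is only that the retention decision is taken when the data is rewritten and can consult state that changes over time, which static ColumnFamily settings cannot express.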

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425086#comment-13425086
 ] 

Andrew Purtell commented on HBASE-6427:
---

@Lars, lgtm.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-30 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425109#comment-13425109
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

From https://blogs.apache.org/hbase/entry/coprocessor_introduction :
bq. We have not really discussed priority, but it should be reasonably clear 
how the priority given to a coprocessor affects how it integrates with other 
coprocessors. When calling out to registered observers, the framework executes 
their callbacks methods in the sorted order of their priority. Ties are broken 
arbitrarily.
This means there still might be scenarios where coprocessors for the same table 
have the same priority.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-30 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425120#comment-13425120
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

For RegionObserver.preFlushScannerOpen(), compaction is mentioned in its 
javadoc several times.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425124#comment-13425124
 ] 

Andrew Purtell commented on HBASE-6427:
---

bq. This means there still might be scenarios where coprocessors for the same 
table have the same priority.

Fine, that text needs an update. Ties are not broken arbitrarily; they are 
broken by load order.
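
The ordering rule stated here (priority first, load order as the tie-breaker) can be sketched as a comparator. This is an illustration only: `ObserverEntry`, its fields, and `ObserverOrdering.executionOrder` are hypothetical names, and the convention that a numerically lower priority value runs first is an assumption of this sketch rather than something taken from the HBase source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical record of one loaded coprocessor. */
final class ObserverEntry {
    final String name;
    final int priority;     // assumption: lower value runs earlier
    final int loadSequence; // position in which the coprocessor was loaded

    ObserverEntry(String name, int priority, int loadSequence) {
        this.name = name;
        this.priority = priority;
        this.loadSequence = loadSequence;
    }
}

final class ObserverOrdering {
    /** Sort by priority; equal priorities fall back to load order. */
    static final Comparator<ObserverEntry> BY_PRIORITY_THEN_LOAD =
        Comparator.comparingInt((ObserverEntry e) -> e.priority)
                  .thenComparingInt(e -> e.loadSequence);

    /** Returns observer names in the order their callbacks would run. */
    static List<String> executionOrder(List<ObserverEntry> loaded) {
        List<ObserverEntry> sorted = new ArrayList<>(loaded);
        sorted.sort(BY_PRIORITY_THEN_LOAD);
        List<String> names = new ArrayList<>();
        for (ObserverEntry e : sorted) {
            names.add(e.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<ObserverEntry> loaded = Arrays.asList(
            new ObserverEntry("Z", 10, 0),
            new ObserverEntry("X", 10, 1),
            new ObserverEntry("A", 5, 2));
        System.out.println(executionOrder(loaded));
    }
}
```

With a total order like this, two observers X and Z that share a priority still run in a fixed sequence, which is the property the comment above relies on.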

But you are missing the larger point that both Lars and I have mentioned above: 
CPs are not (currently, nor likely) going to be random user modules loaded 
blindly with respect to each other. They are deeply embedded in the HBase 
implementation. If as a system integrator you are deploying coprocessors, you 
will be engineering their load/initialization order as well as all other 
cluster details. Again, this is an X-Y discussion. It would be best to stick to 
the issues in scope for this JIRA. If there are larger design issues you'd like 
to consider, let's open a JIRA for those.






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-30 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425127#comment-13425127
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

I don't have further design level comments.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425142#comment-13425142
 ] 

Hadoop QA commented on HBASE-6427:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538381/6427-v10.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 33 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2457//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2457//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2457//console

This message is automatically generated.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-30 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425228#comment-13425228
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Thanks Ted and Andy, and thanks for the discussion.
I think we have two (maybe three) take-aways: (1) We need to look at the 
various scanner interfaces and see why we have so many diverging interfaces, 
(2) add more coprocessor documentation (maybe with some more examples), and 
potentially (3) think generally about what it means to extend HBase and when 
coprocessors are a good mechanism for that.

It seems to me that coprocessors are a good solution to affect existing 
processing at certain (including critical) spots, but maybe not suited to 
replace the entire logic. In that sense this issue represents a corner case, 
which is probably what spawned the longer discussion here (the creation of the 
StoreScanner is replaced, and this is done in the context of a bigger 
operation).
In the future we might be able to use Guice to replace entire subsystems.

For this issue I'll fix up the Javadoc issues that Ted mentions and commit this 
to trunk and also make a 0.94 patch.

Thanks again for the review and the discussion.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424622#comment-13424622
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

Review board doesn't show correct formatting. So I put the following here.

bq. to make a new Interface extending both InternalScanner and KeyValueScanner
I like the above approach.
Currently we have:
{code}
$ find src/main -name '*.java' -exec grep 'ments.*InternalScanner' {} \; -print
 * also implements InternalScanner.  WARNING: As is, if you try to use this
implements KeyValueScanner, InternalScanner {
src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java
implements KeyValueScanner, InternalScanner, ChangedReadersObserver {
src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
{code}
Combining InternalScanner and KeyValueScanner seems natural.
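
The "new Interface extending both" idea could look roughly like the sketch below. The two base interfaces are deliberately tiny stand-ins with hypothetical names (`BatchScanner`, `SeekableScanner`), not the real `InternalScanner` and `KeyValueScanner`, whose method sets are larger; `CombinedScanner` and `ListScanner` are likewise made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in for the batch-oriented view (the role InternalScanner plays). */
interface BatchScanner {
    /** Appends the next value to out, if any; returns true if more values remain. */
    boolean next(List<String> out);
}

/** Stand-in for the cursor-oriented view (the role KeyValueScanner plays). */
interface SeekableScanner {
    /** Returns the next value without consuming it, or null at the end. */
    String peek();

    /** Moves forward to the first value >= key; returns false if there is none. */
    boolean seek(String key);
}

/** The combined interface: one type that callers can use through either view. */
interface CombinedScanner extends BatchScanner, SeekableScanner { }

/** Toy implementation over a sorted in-memory list (forward-only). */
final class ListScanner implements CombinedScanner {
    private final List<String> sorted;
    private int pos = 0;

    ListScanner(List<String> sortedValues) {
        this.sorted = new ArrayList<>(sortedValues);
    }

    @Override
    public String peek() {
        return pos < sorted.size() ? sorted.get(pos) : null;
    }

    @Override
    public boolean seek(String key) {
        while (pos < sorted.size() && sorted.get(pos).compareTo(key) < 0) {
            pos++;
        }
        return pos < sorted.size();
    }

    @Override
    public boolean next(List<String> out) {
        if (pos < sorted.size()) {
            out.add(sorted.get(pos++));
        }
        return pos < sorted.size();
    }

    public static void main(String[] args) {
        CombinedScanner scanner = new ListScanner(Arrays.asList("a", "c", "e"));
        scanner.seek("b");                 // used as a SeekableScanner
        List<String> out = new ArrayList<>();
        BatchScanner batchView = scanner;  // the same object, used as a BatchScanner
        batchView.next(out);
        System.out.println(out);
    }
}
```

A hook that returned the combined type would let downstream code use whichever view it needs without casting.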





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424655#comment-13424655
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Thinking on this more... I think the pre{Flush|Compact}ScannerOpen hooks should 
just continue to return InternalScanner. Only that interface is needed by 
downstream code and we should not extend this only so that coprocessor chaining 
becomes simpler.
As it stands these hooks *can* use the passed InternalScanner, but it needs 
some understanding of HBase... Which is needed anyway to correctly deal with 
chained scanners for flush or compactions.

I.e. I propose leaving it with what the current patch provides. Not opposed to 
filing a separate ticket to bring more sense into the various scanner 
interfaces we have.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424657#comment-13424657
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

bq. to correctly deal with chained scanners
I personally haven't seen chained scanners in action.

If we don't think through how chained scanners work, I wouldn't expect HBase 
users to use this mechanism.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424658#comment-13424658
 ] 

Lars Hofhansl commented on HBASE-6427:
--

It's used rarely, because coprocessors are not like stored procedures, but a 
method to extend HBase.

We *have* thought it through (IMHO), and the coprocessors now can be chained, 
and due to the nature of these hooks that will be tricky.
For example even if we managed to pass in a KeyValueScanner instance, folks 
would be tempted to just add this one to the List of scanners down the chain 
(as was your first thought above), which is tempting, but will not be the right 
thing to do; the downstream must do some complicated logic to merge with the 
previous scanner in ways that we cannot anticipate.
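
To make the chaining concern concrete, here is a sketch of a downstream hook wrapping the scanner it receives from an upstream hook instead of appending it to a list of scanners. `SimpleScanner` is a hypothetical one-method stand-in for `InternalScanner`, and the drop-by-predicate logic is only a placeholder for whatever merge logic a real coprocessor would need.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical one-method stand-in for InternalScanner. */
interface SimpleScanner {
    /** Appends the next value to out, if any; returns true if more values remain. */
    boolean next(List<String> out);
}

/** Bottom of the chain: plays the role of the scanner HBase itself would create. */
final class SourceScanner implements SimpleScanner {
    private final Iterator<String> it;

    SourceScanner(List<String> values) {
        this.it = values.iterator();
    }

    @Override
    public boolean next(List<String> out) {
        if (it.hasNext()) {
            out.add(it.next());
        }
        return it.hasNext();
    }
}

/** What a hook might return: a wrapper that filters whatever the upstream scanner emits. */
final class FilteringScanner implements SimpleScanner {
    private final SimpleScanner upstream;
    private final Predicate<String> keep;

    FilteringScanner(SimpleScanner upstream, Predicate<String> keep) {
        this.upstream = upstream;
        this.keep = keep;
    }

    @Override
    public boolean next(List<String> out) {
        List<String> batch = new ArrayList<>();
        boolean more = upstream.next(batch);
        for (String value : batch) {
            if (keep.test(value)) {
                out.add(value);
            }
        }
        return more;
    }
}

final class ChainDemo {
    /** Reads a scanner to the end and returns everything it emitted. */
    static List<String> drain(SimpleScanner scanner) {
        List<String> all = new ArrayList<>();
        boolean more;
        do {
            more = scanner.next(all);
        } while (more);
        return all;
    }

    public static void main(String[] args) {
        SimpleScanner base = new SourceScanner(Arrays.asList("a1", "b1", "a2", "b2"));
        SimpleScanner first = new FilteringScanner(base, v -> v.startsWith("a"));
        SimpleScanner second = new FilteringScanner(first, v -> v.endsWith("2"));
        System.out.println(drain(second));
    }
}
```

Each wrapper only sees what the previous one let through, so the hooks compose; adding the upstream scanner to a list next to freshly created scanners would instead read the same data twice, which is the trap described above.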






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424663#comment-13424663
 ] 

Andrew Purtell commented on HBASE-6427:
---

bq. I personally haven't seen chained scanners in action. If we don't think 
through how chained scanners work, I wouldn't expect HBase users to use this 
mechanism.

Ted, this is a great comment, because I think it illustrates an incorrect 
approach to CP API design. (Not meant to be a criticism of you. :-)) CPs are 
targeted as much for HBase developers as they might be for users. The 
fundamental goal of CPs is to avoid needing to patch core HBase code for 
implementing new features or researching design alternatives. 





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424664#comment-13424664
 ] 

Andrew Purtell commented on HBASE-6427:
---

[~lhofhansl] A follow-up JIRA to discuss refactoring the internal scanner 
interfaces would be reasonable. We do want extensions to be able to wrap 
and merge scanners along CP chains.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424665#comment-13424665
 ] 

Andrew Purtell commented on HBASE-6427:
---

And a follow up note to my above comment: It is a goal to provide adequate 
extension surface (and I include here besides CPs also other pluggable 
interfaces where performance considerations are paramount) so new features or 
design alternatives require no core code patches. IMHO, the design strategy for 
this goal should be incremental, driven by actual use cases, but with foresight 
added. This issue is a good example of that, I think: Lars has really improved 
the flush and compaction hooks, and this work hasn't been done in the abstract.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424667#comment-13424667
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

I understand the goal for this JIRA and am in support of it.

bq. the downstream must do some complicated logic to merge with the previous 
scanner in ways that we cannot anticipate.
This illustrates the intricacies of scanner chaining. If a scanner is designed 
for some specific purpose, I wouldn't expect it to function correctly when an 
arbitrary number of scanners are chained (both) upstream and downstream.
In fact, chaining introduces an unnecessary burden on individual scanners.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424669#comment-13424669
 ] 

Andrew Purtell commented on HBASE-6427:
---

bq. I understand the goal for this JIRA and am in support of it. [...] This 
illustrates the intricacies of scanner chaining. If a scanner is designed for 
some specific purpose, I wouldn't expect it to function correctly when an 
arbitrary number of scanners are chained (both) upstream and downstream.

Pardon, Ted, but I think this is an X-Y argument. I'm not sure we are discussing 
the chaining of _arbitrary_ scanners (unless I have this wrong). Each CP hook 
deals with a constrained set of scanners: flushes deal with memstore scanners, 
and compactions deal with store scanners. The question has been what kind of 
interface the input parameters and return types should share; there's some 
design give-and-take there. But we are not talking about, for example, combining 
memstore scanners with store scanners. (At least, I am not.)





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424671#comment-13424671
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

There is no intersection between memstore scanners and store scanners.

The arbitrary number of scanners is due to the following construct:
{code}
+for (RegionEnvironment env: coprocessors) {
+  if (env.getInstance() instanceof RegionObserver) {
+    ctx = ObserverContext.createAndPrepare(env, ctx);
+    try {
+      s = ((RegionObserver) env.getInstance()).preCompactScannerOpen(ctx, store, scanners,
+          scanType, earliestPutTs, s);
{code}
I think we should reduce the complexity (due to scanner chaining) of 
preCompactScannerOpen().
If we don't provide a working model for scanner chaining, there is no need to 
introduce a construct for chaining scanners.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424672#comment-13424672
 ] 

Lars Hofhansl commented on HBASE-6427:
--

bq. This illustrates the intricacies of scanner chaining.
I think that is a false transitive statement. :)
For this particular hook chaining is hard, because this is an intricate part of 
the HBase code. That does not mean that chaining is generally hard (it is not).

bq. I wouldn't expect it to function correctly when an arbitrary number of 
scanners are chained (both) upstream and downstream.
Same here... The point is: It is possible to chain region observers even in 
this case.
If that is not desired, an implementer can break the chain (via the passed 
context, and by ignoring the InternalScanner argument).
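
To make the chaining and chain-breaking behavior concrete, here is a minimal, self-contained sketch. It is a toy model only: ToyScanner, Context, Observer and runChain are hypothetical stand-ins invented for illustration, not the HBase InternalScanner, ObserverContext or RegionCoprocessorHost classes, and the real hook takes more arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of observer chaining; none of these types are HBase classes. */
final class ChainSketch {
  /** Hypothetical stand-in for a scanner: yields a list of cell names. */
  interface ToyScanner { List<String> cells(); }

  /** Hypothetical stand-in for the passed context: lets an observer end the chain. */
  static final class Context {
    private boolean complete;
    void complete() { complete = true; }
    boolean shouldComplete() { return complete; }
  }

  /** Hypothetical hook: receives the scanner built so far, returns the next one. */
  interface Observer { ToyScanner preCompactScannerOpen(Context ctx, ToyScanner previous); }

  /** Host loop: observers run in list order, each seeing its predecessor's result. */
  static ToyScanner runChain(List<Observer> observers, ToyScanner initial) {
    ToyScanner s = initial;
    for (Observer o : observers) {
      Context ctx = new Context();
      s = o.preCompactScannerOpen(ctx, s);
      if (ctx.shouldComplete()) {
        break; // remaining observers are skipped
      }
    }
    return s;
  }

  /** Observer that wraps its predecessor and drops cells with the given prefix. */
  static Observer dropPrefix(String prefix) {
    return (ctx, previous) -> () -> {
      List<String> out = new ArrayList<>();
      for (String c : previous.cells()) {
        if (!c.startsWith(prefix)) {
          out.add(c);
        }
      }
      return out;
    };
  }

  /** Observer that ignores the scanner it was handed and ends the chain. */
  static Observer replaceWith(String... cells) {
    return (ctx, previous) -> {
      ctx.complete();
      return () -> Arrays.asList(cells);
    };
  }

  public static void main(String[] args) {
    ToyScanner base = () -> Arrays.asList("a1", "b1", "a2");
    // Both observers participate; each filters the output of the one before it.
    System.out.println(runChain(Arrays.asList(dropPrefix("a"), dropPrefix("b")), base).cells());
    // The first observer breaks the chain, so the second never runs.
    System.out.println(runChain(Arrays.asList(replaceWith("x"), dropPrefix("x")), base).cells());
  }
}
```

In this model an observer that wants to cooperate wraps the scanner it receives, while one that cannot do so sensibly replaces it and marks the context complete.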






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424673#comment-13424673
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Also another thought: I wonder how hard it would be to fold custom timestamps 
into this. An example of what I mean is a time dimension that uses monotonically 
increasing transaction numbers.
A slight API change to the StoreScanner constructor would make it possible to 
handle that too: By passing in the absolute time at which a KV expires rather 
than the TTL (which it then internally translates to an absolute time anyway, 
and which in turn depends on the RegionServer's understanding of time, which is 
not configurable).
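
A rough sketch of the difference, under stated assumptions: the class and method names below are hypothetical and this is not the StoreScanner code. It assumes the TTL is turned into a cutoff by subtracting it from the server's wall clock, and that a cell is kept when its timestamp is at or after that cutoff.

```java
/** Toy expiry check; names are illustrative and not taken from HBase. */
final class ExpirySketch {
  /** TTL style: the cutoff is derived from the server's own clock. */
  static long cutoffFromTtl(long nowMillis, long ttlMillis) {
    return nowMillis - ttlMillis;
  }

  /** A cell is kept when its timestamp is at or after the cutoff. */
  static boolean isLive(long cellTimestamp, long oldestUnexpiredTs) {
    return cellTimestamp >= oldestUnexpiredTs;
  }

  public static void main(String[] args) {
    // Wall-clock timestamps: the server computes the cutoff from now and the TTL.
    long cutoff = cutoffFromTtl(10_000L, 3_000L);
    System.out.println(isLive(7_500L, cutoff));

    // Transaction numbers: "now minus TTL" is meaningless here, so the caller
    // supplies the absolute cutoff directly (say, the oldest open transaction).
    long oldestOpenTxn = 1_000L;
    System.out.println(isLive(999L, oldestOpenTxn));
  }
}
```

Passing the absolute cutoff leaves the choice of time dimension to the caller, which is the point of the suggested constructor change.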






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424674#comment-13424674
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Re: Ted's last comment. So you want to enforce that only a single coprocessor 
can implement these hooks? I would disagree with that.
The chaining is fine (if complicated), no need to mandate what an implementer 
can and cannot do (see my previous comment).






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424676#comment-13424676
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

bq. If that is not desired an implementer can break the chain
If implementer X breaks the chain, how would implementer Z know that his/her 
implementation is not broken by someone else?
There is no deterministic order in which X and Z's observers are registered.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424680#comment-13424680
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Ted, you are trying to redesign the overall coprocessor API.
I would prefer that we keep the API as is, and then work on ways to make 
better sense of the scanners as part of a different jira.

As Andy said, a major raison d'être for coprocessors is to extend HBase. Nobody 
would just run a 3rd party coprocessor (unless that party is highly trusted). A 
coprocessor could call System.exit() and shut down the RegionServer... And that 
was a design choice.






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424690#comment-13424690
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

preCompactScannerOpen() / preFlushScannerOpen() are new APIs and serve a 
particular use case if implemented.
I feel there is no need to pass InternalScanner as a parameter.

Just my personal observation.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424697#comment-13424697
 ] 

Lars Hofhansl commented on HBASE-6427:
--

I see your point.

But then you
# need a mechanism to only allow a single RegionObserver to implement these 
hooks.
# place an arbitrary limit on the API (just because it is hard to chain the 
hooks here does not mean it is impossible or not useful; as I said, one can 
always create a new implementing class of InternalScanner that does the right 
thing)

This follows the coprocessor API pattern. I think this should be kept.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424281#comment-13424281
 ] 

Hadoop QA commented on HBASE-6427:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538229/6427-v5.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 33 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster
  org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
  org.apache.hadoop.hbase.master.TestAssignmentManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2450//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2450//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2450//console

This message is automatically generated.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424367#comment-13424367
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

{code}
+  scanner = ((RegionObserver) env.getInstance()).preCompact(ctx, store, scanners,
+      scanType, earliestPutTs);
{code}
Maybe insert the returned scanner (if not null) into scanners?
{code}
+   * @deprecated use {@link #preCompact(ObserverContext, Store, List, ScanType, long)} instead
    */
   InternalScanner preCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
       final Store store, final InternalScanner scanner) throws IOException;
{code}
Do we have to deprecate the existing API? I feel the new API is much more 
involved in terms of technical internals. Maybe a poll on the user mailing list 
would help clarify.
{code}
+   * @param scanners the list {@link StoreFileScanner}s to be read from
...
+  InternalScanner preCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
+  final Store store, List<? extends KeyValueScanner> scanners, ScanType scanType, long earliestPutTs)
{code}
nit: the second line above is over 100 characters.
If scanners really should be StoreFileScanners, the method signature should 
match that expectation.
See the following in RegionCoprocessorHost.java:
{code}
+   * See {@link RegionObserver#preCompact(ObserverContext, Store, InternalScanner)}
+   */
+  public InternalScanner preCompact(Store store, List<StoreFileScanner> scanners,
+      ScanType scanType, long earliestPutTs) throws IOException {
{code}

Review board would facilitate more detailed review.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424416#comment-13424416
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

Looking at the changes to Compactor.compact(), the new preCompact() takes 
precedence over the existing preCompact() method.
This should be documented.

My comment above about scanner insertion was not valid.
It is not clear why more than one RegionObserver would return InternalScanner 
from preCompact(). For simplicity, we can break out of the loop in 
RegionCoprocessorHost.preCompact() when a non-null InternalScanner is returned.
This behavior should also be documented.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424425#comment-13424425
 ] 

Lars Hofhansl commented on HBASE-6427:
--

I think we should do what the current preScannerOpen is doing. Each RegionObserver 
is passed the previous InternalScanner (would be a KeyValueScanner here, but 
the same principle applies). A RegionObserver can break the loop via the passed 
Context; I do not think we should make that the default behavior. I'll have a 
patch for that soon.

As for the deprecation, I think I agree with Andy here. Having the two hooks is 
confusing. Everything that could be done with the old hook can also be done with 
the new hook, and coprocessors are meant for extending HBase.

Re: StoreFileScanner vs ? extends KeyValueScanner... I typically prefer to 
express this in terms of interfaces rather than concrete classes. It also keeps 
preCompact and preFlush similar (one gets a list of StoreFileScanners, the 
other gets a StoreScanner). I do not feel strongly about this, though.

I'll change the patch and put it up on RB... Thanks for the detailed review Ted!






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424433#comment-13424433
 ] 

Andrew Purtell commented on HBASE-6427:
---

Agree that scanners should be passed along.  The use case is subsequent 
observers wrapping scanners created by those earlier in the chain.
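
The wrapping use case can be sketched as below. This is an illustration only: RowScanner is a hypothetical, simplified interface (loosely shaped like a scanner whose next call fills a result list and reports whether more rows remain), and the helper names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of one scanner wrapping another; not HBase code. */
final class WrapSketch {
  /** Hypothetical scanner: appends the next row's cells to out, returns true if more rows remain. */
  interface RowScanner { boolean next(List<String> out); }

  /** Scanner over fixed rows, standing in for one created earlier in the chain. */
  static RowScanner fromRows(List<List<String>> rows) {
    Iterator<List<String>> it = rows.iterator();
    return out -> {
      if (it.hasNext()) {
        out.addAll(it.next());
      }
      return it.hasNext();
    };
  }

  /** Wrapper a later observer might return: delegates, then keeps only matching cells. */
  static RowScanner filtering(RowScanner delegate, Predicate<String> keep) {
    return out -> {
      List<String> row = new ArrayList<>();
      boolean more = delegate.next(row);
      for (String cell : row) {
        if (keep.test(cell)) {
          out.add(cell);
        }
      }
      return more;
    };
  }

  /** Drains a scanner into one flat list of cells. */
  static List<String> drain(RowScanner s) {
    List<String> all = new ArrayList<>();
    boolean more;
    do {
      more = s.next(all);
    } while (more);
    return all;
  }

  public static void main(String[] args) {
    RowScanner upstream = fromRows(Arrays.asList(
        Arrays.asList("a", "tmp1"), Arrays.asList("b")));
    RowScanner wrapped = filtering(upstream, cell -> !cell.startsWith("tmp"));
    System.out.println(drain(wrapped));
  }
}
```

The wrapper never needs to know how the upstream scanner was built, which is what lets a later observer layer its behavior on top of an earlier one.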

Also we should deprecate and remove the older more limited hooks now that we 
have a superset interface that admits more possibilities.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424449#comment-13424449
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Interestingly I would like to have the ability to bypass the default action 
from both preFlush and preCompact, for example to control how the store files 
are written.
With the new hooks there is no way to indicate that (null means create the 
default scanner, non-null means use the returned scanner, but still follow the 
default action).

The hook just prior to creating the scanner could create a new scanner (and 
hence decide how to filter the inputs); the hook right after scanner creation 
could then control how/where to write the store files.

So maybe have a preCompactScannerOpen, and preFlushScannerOpen (similar to my 
initial idea), and not deprecating the existing hooks?
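
A simplified sketch of how the two hooks could split the work. Everything here is hypothetical: the scanner is modeled as a plain list of cells, the hook signatures are heavily reduced, and bypass is modeled as a boolean return value rather than through the observer context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a two-hook compaction; signatures are simplified, not the proposed API. */
final class TwoHookSketch {
  interface Observer {
    /** First hook: return a replacement scanner, or null to get the default one. */
    default List<String> preCompactScannerOpen(List<String> storeFileCells) {
      return null;
    }

    /** Second hook: return true to bypass the default store file write. */
    default boolean preCompact(List<String> scannerCells) {
      return false;
    }
  }

  /** Runs one compaction and records which path each step took. */
  static List<String> compact(Observer o, List<String> storeFileCells) {
    List<String> trace = new ArrayList<>();
    List<String> scanner = o.preCompactScannerOpen(storeFileCells);
    if (scanner == null) {
      scanner = storeFileCells; // default scanner over all inputs
      trace.add("default scanner");
    } else {
      trace.add("custom scanner");
    }
    if (o.preCompact(scanner)) {
      trace.add("default write bypassed");
    } else {
      trace.add("default write: " + scanner);
    }
    return trace;
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("k1", "k2");
    // An observer that overrides nothing gets the default behavior at both steps.
    System.out.println(compact(new Observer() {}, input));
  }
}
```

With the hooks separated this way, filtering the inputs and taking over the write are independent decisions, so null from the first hook can keep meaning "use the default scanner".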






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424451#comment-13424451
 ] 

Andrew Purtell commented on HBASE-6427:
---

Then you want to restore this behavior:
{noformat}
// NULL scanner returned from coprocessor hooks means skip normal processing
{noformat}

Can that work? 

I'd not be opposed to adding additional hooks but that should be after 
exhausting other options here, IMHO, since they would be close to each other.

We could pass in the default StoreScanner. The hook could just return it if 
wanting default behavior. Might need to make StoreScanner lazy, move 
initialization out of the constructor. I'm remote and just have your patch to 
go by at the moment.
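
A minimal sketch of the "pass in the default scanner, make it lazy" idea follows. It is not taken from the patch: LazyStoreScanner, LazyScannerDemo and the hook type are hypothetical stand-ins, and Strings stand in for KVs. The point it illustrates is that if the constructor opens nothing, a hook can discard the default scanner without any cost.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for StoreScanner; the constructor does no real work.
final class LazyStoreScanner {
    private final List<String> source;
    private Iterator<String> cursor; // stays null until the first next() call

    LazyStoreScanner(List<String> source) {
        this.source = source; // only remember the inputs, open nothing yet
    }

    boolean isOpened() {
        return cursor != null;
    }

    // Returns the next value, or null when exhausted; opens on first use.
    String next() {
        if (cursor == null) {
            cursor = source.iterator();
        }
        return cursor.hasNext() ? cursor.next() : null;
    }
}

final class LazyScannerDemo {
    // The hook receives the default scanner and returns the scanner to use.
    static LazyStoreScanner open(LazyStoreScanner dflt, UnaryOperator<LazyStoreScanner> hook) {
        return hook.apply(dflt);
    }

    public static void main(String[] args) {
        LazyStoreScanner dflt = new LazyStoreScanner(Arrays.asList("kv1", "kv2"));
        // A hook that wants default behavior just hands the scanner back.
        System.out.println(open(dflt, s -> s) == dflt);
        // A hook that replaces the scanner never causes the default to be opened.
        LazyStoreScanner other = open(dflt, s -> new LazyStoreScanner(Arrays.asList("custom")));
        System.out.println(other.next());
        System.out.println(dflt.isOpened());
    }
}
```

With this shape, "return what you were given" means default behavior, so null would no longer need a special meaning.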





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424453#comment-13424453
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

bq. null means create the default scanner
Maybe create a special class (called NullScanner ?) implementing 
InternalScanner. The class provides a singleton which can be returned by 
preCompact() to indicate skipping normal processing.
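
The sentinel idea could look roughly like the sketch below. SketchScanner is a hypothetical stand-in for InternalScanner (the real interface has a different shape), and the three-way interpretation in decide() is an assumption about how a caller might read the hook's return value.

```java
import java.util.List;

// Hypothetical stand-in for HBase's InternalScanner; not the real interface.
interface SketchScanner {
    // Fills 'out' with the next batch; returns true if more data remains.
    boolean next(List<String> out);
}

// Sentinel: returning this exact instance from a hook means "skip normal processing".
final class NullScanner implements SketchScanner {
    static final NullScanner INSTANCE = new NullScanner();

    private NullScanner() { }

    @Override
    public boolean next(List<String> out) {
        return false; // never yields any data
    }
}

final class NullScannerDemo {
    // Interprets a hook's return value under the assumed three-way convention.
    static String decide(SketchScanner fromHook) {
        if (fromHook == null) {
            return "default"; // hook did not intervene: build the default scanner
        }
        if (fromHook == NullScanner.INSTANCE) {
            return "skip";    // hook asked to bypass the default action
        }
        return "custom";      // use the scanner the hook supplied
    }

    public static void main(String[] args) {
        System.out.println(decide(null));
        System.out.println(decide(NullScanner.INSTANCE));
        System.out.println(decide(out -> false));
    }
}
```

The identity comparison is what keeps null free to mean "create the default scanner".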





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424458#comment-13424458
 ] 

Lars Hofhansl commented on HBASE-6427:
--

After reviewing all the use cases I have, this seems to be the best proposal:
* add these
{code}
void postFlush(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, final StoreFile resultFile) throws IOException;
InternalScanner preFlush(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, final InternalScanner scanner) throws IOException;
InternalScanner preFlushScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, final KeyValueScanner memstoreScanner, final InternalScanner s)
    throws IOException;
InternalScanner preCompactScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, List<? extends KeyValueScanner> scanners, final ScanType scanType,
    final long earliestPutTs, final InternalScanner s) throws IOException;
KeyValueScanner preStoreScannerOpen(final ObserverContext<RegionCoprocessorEnvironment> c,
    final Store store, final Scan scan, final NavigableSet<byte[]> targetCols,
    final KeyValueScanner s) throws IOException;
{code}
* deprecate these:
{code}
void postFlush(final ObserverContext<RegionCoprocessorEnvironment> c) throws IOException;
void preFlush(final ObserverContext<RegionCoprocessorEnvironment> c) throws IOException;
{code}

The new {pre|post}Flush are called per Store in analogy to {pre|post}Compact.
pre{Flush|Compact}ScannerOpen are called before the flush/compaction scanner is 
built.

This gives maximum flexibility (I can control the reading scanners *and* how 
the storefiles are written for both flushes and compactions), makes more sense 
of {pre|post}Flush, and leaves existing functionality in place.

This is the first proposal that really feels right to me. I'll have a patch 
for that soon.
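
To make the intended call order concrete, here is a rough, self-contained sketch of a per-store flush driven by such hooks. Everything in it is illustrative: FlushScanner, FlushHooks and FlushSketch are hypothetical simplifications, Strings stand in for KVs and store files, and the "expired-" prefix is a made-up marker for data a policy wants dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a flush scanner; not HBase's InternalScanner.
interface FlushScanner {
    List<String> readAll();
}

// Hypothetical hook set with pass-through defaults, loosely modeled on the proposal.
class FlushHooks {
    // Called before the flush scanner is built; null means "build the default".
    FlushScanner preFlushScannerOpen(List<String> memstore) { return null; }

    // Called with the scanner that will be used; may wrap or replace it.
    FlushScanner preFlush(FlushScanner scanner) { return scanner; }

    // Called once the store file has been written.
    void postFlush(String resultFile) { }
}

final class FlushSketch {
    // Flushes one store: runs the hooks in order and returns the file name.
    static String flushStore(List<String> memstore, FlushHooks hooks, List<String> written) {
        FlushScanner scanner = hooks.preFlushScannerOpen(memstore);
        if (scanner == null) {
            scanner = () -> new ArrayList<>(memstore); // default: read everything
        }
        scanner = hooks.preFlush(scanner);
        written.addAll(scanner.readAll());
        String file = "storefile-" + written.size(); // made-up naming scheme
        hooks.postFlush(file);
        return file;
    }

    public static void main(String[] args) {
        List<String> memstore = Arrays.asList("a", "expired-b", "c");
        FlushHooks filtering = new FlushHooks() {
            @Override
            FlushScanner preFlushScannerOpen(List<String> mem) {
                // Custom scanner that decides which inputs are read at all.
                return () -> {
                    List<String> kept = new ArrayList<>();
                    for (String kv : mem) {
                        if (!kv.startsWith("expired-")) {
                            kept.add(kv);
                        }
                    }
                    return kept;
                };
            }
        };
        List<String> written = new ArrayList<>();
        System.out.println(flushStore(memstore, filtering, written) + " " + written);
    }
}
```

The scanner-open hook governs what is read, while preFlush sees the finished scanner and can influence what is written; that split is the flexibility argued for above.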





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424463#comment-13424463
 ] 

Hadoop QA commented on HBASE-6427:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538259/6427-v7.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 33 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestMasterNoCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2451//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2451//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2451//console

This message is automatically generated.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424464#comment-13424464
 ] 

Lars Hofhansl commented on HBASE-6427:
--

bq. I'd not be opposed to adding additional hooks but that should be after 
exhausting other options here, IMHO, since they would be close to each other.

Do you think the latest proposal is too heavy handed? 
pre{Flush|Compact}ScannerOpen would be quite close to pre{Flush|Compact}. My 
reasoning was that an implementer could still override the relatively simple 
pre{Flush|Compact} hooks.
If these are too many, we can still have only the fewer hooks, and then we'd 
need some NullScanner approach I think.






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-28 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424466#comment-13424466
 ] 

Andrew Purtell commented on HBASE-6427:
---

It's fine. I like how you made the new APIs about opening (internal) scanners 
for various things. 





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423694#comment-13423694
 ] 

Hadoop QA commented on HBASE-6427:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538127/6427-v3.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 27 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 14 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.master.TestAssignmentManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2445//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2445//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2445//console

This message is automatically generated.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-27 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424270#comment-13424270
 ] 

Zhihong Ted Yu commented on HBASE-6427:
---

Some new files need a license header.
{code}
+public class ScanPolicyCoprocessor extends BaseRegionObserver {
{code}
This class is an observer. Suggest renaming the class.
Annotations for audience and stability should be added.

For RegionCoprocessorHost.preCompact():
{code}
+  scanner = ((RegionObserver) env.getInstance()).preCompact(ctx, store, scanners,
+      scanType, earliestPutTs);
{code}
If there are multiple RegionObservers, it seems that only the scanner returned by 
the last one would be used. Is this intentional?
A similar observation applies to preStoreScannerOpen() and preFlush().





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-27 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424273#comment-13424273
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Thanks Ted. You are right on all counts. Need to think about multiple 
coprocessors a bit more.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-27 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424275#comment-13424275
 ] 

Lars Hofhansl commented on HBASE-6427:
--

The current preCompact/preScannerOpen/other hooks handle this by passing the 
scanner from the previous coprocessor to the next one, so that each coprocessor 
has all the information needed.
I could do something similar here, although it would quickly get inscrutable 
for an implementer of these hooks; but I cannot think of anything better.
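
As an illustration of that chaining, the sketch below threads one scanner through several observers. ChainScanner, ChainObserver and ObserverChain are hypothetical names, not HBase classes, and the sketch assumes observers run in ascending numeric priority with null standing for "no scanner yet".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a scanner; describe() just shows how it was built.
interface ChainScanner {
    String describe();
}

// Hypothetical observer: receives the previous observer's scanner (null for
// the first one) and returns the scanner the next observer should see.
interface ChainObserver {
    int priority();

    ChainScanner preCompact(ChainScanner previous);
}

final class ObserverChain {
    // Runs observers in ascending priority order, threading the scanner through.
    static ChainScanner run(List<ChainObserver> observers) {
        List<ChainObserver> ordered = new ArrayList<>(observers);
        ordered.sort(Comparator.comparingInt(ChainObserver::priority));
        ChainScanner current = null;
        for (ChainObserver observer : ordered) {
            current = observer.preCompact(current);
        }
        return current;
    }

    // Builds an observer that wraps whatever scanner it is handed.
    static ChainObserver wrapping(final int prio, final String name) {
        return new ChainObserver() {
            @Override
            public int priority() {
                return prio;
            }

            @Override
            public ChainScanner preCompact(ChainScanner previous) {
                final String inner = previous == null ? "default" : previous.describe();
                return () -> name + "(" + inner + ")";
            }
        };
    }

    public static void main(String[] args) {
        List<ChainObserver> observers = Arrays.asList(wrapping(2, "Z"), wrapping(1, "X"));
        System.out.println(run(observers).describe()); // prints Z(X(default))
    }
}
```

Because each observer sees its predecessor's result, no returned scanner is silently dropped; the cost is that every implementer must decide whether to wrap, replace, or pass through what it receives.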






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423305#comment-13423305
 ] 

Andrew Purtell commented on HBASE-6427:
---

Or use polymorphism?

{noformat}
+  public InternalScanner preFlush(final ObserverContext<RegionCoprocessorEnvironment> c,
+      Store store, KeyValueScanner scanner) throws IOException;
+
+  @Deprecated
   public void preFlush(ObserverContext<RegionCoprocessorEnvironment> e) throws IOException;
{noformat}

{noformat}
+  public InternalScanner preCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
+      Store store, List<? extends KeyValueScanner> scanners, ScanType scanType,
+      long earliestPutTs) throws IOException;
+
+  @Deprecated
   public void preCompact(ObserverContext<RegionCoprocessorEnvironment> e
...
{noformat}
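
For readers less familiar with the pattern, this is plain Java method overloading: the old and new hooks share a name but differ in signature, so both can coexist while the old one is deprecated. The sketch is hypothetical throughout; OverloadedObserver and OverloadDemo are made-up names, Strings stand in for the context, store and scanner types, and having the host call both overloads is an assumption, not something the patch is known to do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class; String stands in for the context and scanner types.
class OverloadedObserver {
    final List<String> calls = new ArrayList<>();

    // Old hook, kept so that existing subclasses continue to compile and run.
    @Deprecated
    public void preFlush(String ctx) {
        calls.add("old");
    }

    // New overload: same name, richer signature; the default passes the scanner through.
    public String preFlush(String ctx, String store, String scanner) {
        calls.add("new");
        return scanner;
    }
}

final class OverloadDemo {
    // Assumed host behavior: invoke the legacy hook, then the new one.
    @SuppressWarnings("deprecation")
    static String flush(OverloadedObserver observer) {
        observer.preFlush("ctx");
        return observer.preFlush("ctx", "cf", "default-scanner");
    }

    public static void main(String[] args) {
        OverloadedObserver observer = new OverloadedObserver();
        System.out.println(flush(observer) + " " + observer.calls);
    }
}
```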






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423325#comment-13423325
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Uh... I like that.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423365#comment-13423365
 ] 

Lars Hofhansl commented on HBASE-6427:
--

The part I still have to think through is how to handle actual user scans. By 
default a user scan will also filter by TTL/versions, so it's one thing to prevent 
the KVs from being compacted away and another to actually make them visible to 
user scans.
A similar approach can be followed in preScannerOpen, as long as the 
coprocessor has enough access to internal region data structures to rebuild the 
default scanner.






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423367#comment-13423367
 ] 

Andrew Purtell commented on HBASE-6427:
---

BaseRegionObserver should reimplement the default behavior in the new methods? 
Anybody who inherits would get it.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423368#comment-13423368
 ] 

Andrew Purtell commented on HBASE-6427:
---

Or, better yet, BaseRegionObserver calls out to a Store static method that does 
it, with some javadoc to make it clear what's going on?





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423372#comment-13423372
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Hmm... The default behavior is followed when RegionObserver.pre{flush|compact} 
returns a null scanner, which is what BaseRegionObserver does by default.
BaseRegionObserver implementing the default behavior would not really buy 
anything (unless I am missing something).

As for my last comment above, I think we'd need a preStoreScannerOpen, which would 
be called in Store.getScanner (right before the new StoreScanner is created) to 
allow the coprocessor to return a custom scanner there too.






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423374#comment-13423374
 ] 

Andrew Purtell commented on HBASE-6427:
---

bq. The default behavior is followed when RegionObserver.pre{flush|compact} 
return a null scanner, which is what BaseRegionObserver does by default.

Fine, I was misled by the unit test code.

bq. I think we'd need a preStoreScannerOpen, which would be called in 
Store.getScanner (right before the new StoreScanner is created) to allow the 
coprocessor to return a custom scanner here too.

Sounds good to me.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423384#comment-13423384
 ] 

Lars Hofhansl commented on HBASE-6427:
--

Of course this has to be compared to simply making the ScanInfo pluggable in 
Store.java. 

What I want to achieve here is to let an external process (backup tool, 
transaction engine, etc.) override HBase's default TTL/#Versions with very 
high fidelity (i.e. not via a dynamic schema change, which is too 
heavyweight/slow).

The coprocessor approach is nice, because it provides a lot of flexibility for 
other future use cases and it does not invent a new concept. At the same time 
it adds complexity.
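
As a rough illustration of the goal (retention that an outside process can 
change without a schema change), the sketch below keeps a TTL in an atomic 
field and consults it during a simulated compaction pass. Everything in it is 
hypothetical: Cell, DynamicTtl and compact are made-up names, not HBase APIs, 
and only TTL is modeled, not the number of versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class DynamicRetentionSketch {
    /** Hypothetical minimal cell: a key plus its write timestamp in milliseconds. */
    static final class Cell {
        final String key;
        final long timestampMs;

        Cell(String key, long timestampMs) {
            this.key = key;
            this.timestampMs = timestampMs;
        }
    }

    /** A TTL that an external process can change at any time. */
    static final class DynamicTtl {
        private final AtomicLong ttlMs;

        DynamicTtl(long initialTtlMs) {
            this.ttlMs = new AtomicLong(initialTtlMs);
        }

        void set(long newTtlMs) {
            ttlMs.set(newTtlMs);
        }

        /** Retain a cell while its age does not exceed the current TTL. */
        boolean retain(Cell c, long nowMs) {
            return nowMs - c.timestampMs <= ttlMs.get();
        }
    }

    /** Stand-in for a compaction pass: keep only the cells the policy retains. */
    static List<String> compact(List<Cell> cells, DynamicTtl policy, long nowMs) {
        List<String> kept = new ArrayList<>();
        for (Cell c : cells) {
            if (policy.retain(c, nowMs)) {
                kept.add(c.key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(new Cell("a", 1000L), new Cell("b", 5000L));
        DynamicTtl policy = new DynamicTtl(6000L);

        System.out.println(compact(cells, policy, 10000L)); // prints [b]

        policy.set(10000L); // e.g. a backup tool asks for longer retention
        System.out.println(compact(cells, policy, 10000L)); // prints [a, b]
    }
}
```

The point of the sketch is only that the retention decision is read at 
compaction time from state that outside code controls, which is the property 
both the coprocessor hooks and a pluggable ScanInfo would have to provide.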






[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423387#comment-13423387
 ] 

Andrew Purtell commented on HBASE-6427:
---

bq. The coprocessor approach is nice, because it provides a lot of flexibility 
for other future use cases and it does not invent a new concept. At the same 
time it adds complexity.

On balance the API change here is nice because it extends something that was 
too limited to address your use case such that now it works for you, and it 
also admits the possibility of others.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423469#comment-13423469
 ] 

Andrew Purtell commented on HBASE-6427:
---

lgtm, good tests

 Pluggable compaction policies via coprocessors
 --

 Key: HBASE-6427
 URL: https://issues.apache.org/jira/browse/HBASE-6427
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Attachments: 6427-notReady.txt, 6427-v1.txt, 6427-v2.txt


 When implementing higher level stores on top of HBase it is necessary to 
 allow dynamic control over how long KVs must be kept around.
 Semi-static config options for ColumnFamilies (# of versions or TTL) are not 
 sufficient.
 The simplest way to achieve this is to have a pluggable class to determine 
 the smallestReadpoint for Region. That way outside code can control what KVs 
 to retain.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423501#comment-13423501
 ] 

Hadoop QA commented on HBASE-6427:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538075/6427-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 24 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 14 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.backup.example.TestZooKeeperTableArchiveClient
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.catalog.TestMetaReaderEditor

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2440//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2440//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2440//console

This message is automatically generated.





[jira] [Commented] (HBASE-6427) Pluggable compaction policies via coprocessors

2012-07-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423548#comment-13423548
 ] 

Lars Hofhansl commented on HBASE-6427:
--

I ran the failing tests locally, and they all pass.
Will sit on this a bit longer, write a test that tests the actual scenario I am 
interested in, etc.

