[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-18 Thread Yonatan Gottesman (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584734#comment-16584734
 ] 

Yonatan Gottesman commented on OMID-102:


Yes, I uploaded the patch to here,

I am not a commiter, so we will wait for [~ohads] to return, apply the patch to 
phoenix-integration branch, and see everything is ok.

Meanwhile ill take a look at why omid fails to build under jenkins - OMID-109

> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
> Attachments: bug_fix.patch
>
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-16 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583153#comment-16583153
 ] 

James Taylor commented on OMID-102:
---

Thanks for committing a fix, [~yonigo]. Do the tests now pass that I mentioned 
here? 
https://issues.apache.org/jira/browse/PHOENIX-3623?focusedCommentId=16574976=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16574976

 

 

> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-09 Thread Mujtaba Chohan (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575473#comment-16575473
 ] 

Mujtaba Chohan commented on OMID-102:
-

[~jamestaylor] - it fails on my machine with the same error using mvn build 
parameters you listed. Can you try building on a Linux machine? Also not sure 
how this can ever work when the repo. it's trying to download from does not 
exist anymore. Try hitting this URL which returns a 404: 
[https://raw.github.com/synergian/wagon-git/releases] 

> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-07 Thread Yonatan Gottesman (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571535#comment-16571535
 ] 

Yonatan Gottesman commented on OMID-102:


[~jamestaylor] code is commited and ready to merge with phoenix-integration 
branch 

> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571343#comment-16571343
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user yonigottesman commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r208160776
  
--- Diff: 
hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java ---
@@ -227,24 +252,26 @@ private static boolean endsWith(byte[] value, int 
offset, int length, byte[] suf
 return result == 0;
 }
 
+private static boolean startsWith(byte[] value, int offset, int 
length, byte[] prefix) {
--- End diff --

No because Bytes.startWith doesnt work with offset 


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571342#comment-16571342
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user yonigottesman commented on the issue:

https://github.com/apache/incubator-omid/pull/41
  
All FlappingTransactionIT tests pass


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570303#comment-16570303
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r207921192
  
--- Diff: 
hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java
 ---
@@ -0,0 +1,244 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+
+import org.apache.hadoop.hbase.Cell;
+import org.apache.hadoop.hbase.CellUtil;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.filter.Filter;
+import org.apache.hadoop.hbase.filter.FilterBase;
+import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
+import org.apache.hadoop.hbase.util.Bytes;
+
+import java.io.IOException;
+import java.util.*;
+
+public class TransactionVisibilityFilter extends FilterBase {
+
+// optional sub-filter to apply to visible cells
+private final Filter userFilter;
+private final SnapshotFilterImpl snapshotFilter;
+private final Map commitCache;
+private final HBaseTransaction hbaseTransaction;
+
+// This cache is cleared when moving to the next row
+// So no need to keep row name
+private final Map familyDeletionCache;
+
+public SnapshotFilter getSnapshotFilter() {
+return snapshotFilter;
+}
+
+public TransactionVisibilityFilter(Filter cellFilter,
+   SnapshotFilterImpl snapshotFilter,
+   HBaseTransaction hbaseTransaction) {
+this.userFilter = cellFilter;
+this.snapshotFilter = snapshotFilter;
+commitCache = new HashMap<>();
+this.hbaseTransaction = hbaseTransaction;
+familyDeletionCache = new HashMap<>();
+}
+
+@Override
+public ReturnCode filterKeyValue(Cell v) throws IOException {
+if (CellUtils.isShadowCell(v)) {
+Long commitTs =  Bytes.toLong(CellUtil.cloneValue(v));
+commitCache.put(v.getTimestamp(), commitTs);
+// Continue getting shadow cells until one of them fits this 
transaction
+if (hbaseTransaction.getStartTimestamp() >= commitTs) {
+return ReturnCode.NEXT_COL;
+} else {
+return ReturnCode.SKIP;
+}
+}
+
+Optional ct = getCommitIfInSnapshot(v, 
CellUtils.isFamilyDeleteCell(v));
+if (ct.isPresent()) {
+commitCache.put(v.getTimestamp(), ct.get());
+if (hbaseTransaction.getVisibilityLevel() == 
AbstractTransaction.VisibilityLevel.SNAPSHOT_ALL &&
+snapshotFilter.getTSIfInTransaction(v, 
hbaseTransaction).isPresent()) {
+return runUserFilter(v, ReturnCode.INCLUDE);
+}
+if (CellUtils.isFamilyDeleteCell(v)) {
+familyDeletionCache.put(createImmutableBytesWritable(v), 
ct.get());
+if (hbaseTransaction.getVisibilityLevel() == 
AbstractTransaction.VisibilityLevel.SNAPSHOT_ALL) {
+return runUserFilter(v, 
ReturnCode.INCLUDE_AND_NEXT_COL);
+} else {
+return ReturnCode.NEXT_COL;
+}
+}
+Long deleteCommit = 
familyDeletionCache.get(createImmutableBytesWritable(v));
+if (deleteCommit != null && deleteCommit >= v.getTimestamp()) {
+if (hbaseTransaction.getVisibilityLevel() == 
AbstractTransaction.VisibilityLevel.SNAPSHOT_ALL) {
+return runUserFilter(v, 

[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570300#comment-16570300
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r207919955
  
--- Diff: 
hbase-coprocessor/src/test/java/org/apache/omid/transaction/TestSnapshotFilter.java
 ---
@@ -226,8 +222,115 @@ public void testGetFirstResult() throws Throwable {
 }
 
 @Test(timeOut = 60_000)
-public void testGetSecondResult() throws Throwable {
--- End diff --

Is there a test which would have failed without this patch (i.e. one that 
demonstrates the need for having the visibility filtering done as a pure HBase 
filter)?


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570293#comment-16570293
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r207918413
  
--- Diff: 
hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java ---
@@ -227,24 +252,26 @@ private static boolean endsWith(byte[] value, int 
offset, int length, byte[] suf
 return result == 0;
 }
 
+private static boolean startsWith(byte[] value, int offset, int 
length, byte[] prefix) {
--- End diff --

Can you just use Bytes.startsWith() instead?


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565538#comment-16565538
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user JamesRTaylor commented on the issue:

https://github.com/apache/incubator-omid/pull/41
  
This is great, @yonigottesman! Do the Phoenix unit tests 
FlappingTransactionIT.testInflightUpdateNotSeen() and 
testInflightDeleteNotSeen() pass with this change? You can try running them 
from the omid2 feature branch.


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565535#comment-16565535
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206936274
  
--- Diff: 
hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java ---
@@ -52,6 +52,7 @@
 static byte[] DELETE_TOMBSTONE = Bytes.toBytes("__OMID_TOMBSTONE__");
--- End diff --

Will Cells end up with these constant values and if so can we make them 
shorter?


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565532#comment-16565532
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206935434
  
--- Diff: 
hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java
 ---
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+import com.sun.istack.Nullable;
+import org.apache.hadoop.hbase.Cell;
+import org.apache.hadoop.hbase.CellUtil;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.filter.Filter;
+import org.apache.hadoop.hbase.filter.FilterBase;
+import org.apache.hadoop.hbase.util.Bytes;
+
+import java.io.IOException;
+import java.util.*;
+
+public class TransactionVisibilityFilter extends FilterBase {
+
+// optional sub-filter to apply to visible cells
+private final Filter userFilter;
+private final SnapshotFilterImpl snapshotFilter;
+private final Map shadowCellCache;
+private final HBaseTransaction hbaseTransaction;
+private final Map familyDeletionCache;
+
+public SnapshotFilter getSnapshotFilter() {
+return snapshotFilter;
+}
+
+public TransactionVisibilityFilter(@Nullable Filter cellFilter,
+   SnapshotFilterImpl snapshotFilter,
+   HBaseTransaction hbaseTransaction) {
+this.userFilter = cellFilter;
+this.snapshotFilter = snapshotFilter;
+shadowCellCache = new HashMap<>();
+this.hbaseTransaction = hbaseTransaction;
+familyDeletionCache = new HashMap();
+}
+
+@Override
+public ReturnCode filterKeyValue(Cell v) throws IOException {
+if (CellUtils.isShadowCell(v)) {
+Long commitTs =  Bytes.toLong(CellUtil.cloneValue(v));
+shadowCellCache.put(v.getTimestamp(), commitTs);
+// Continue getting shadow cells until one of them fits this 
transaction
+if (hbaseTransaction.getStartTimestamp() >= commitTs) {
+return ReturnCode.NEXT_COL;
+} else {
+return ReturnCode.SKIP;
+}
+} else if (CellUtils.isFamilyDeleteCell(v)) {
+//Delete is part of this transaction
+if (snapshotFilter.getTSIfInTransaction(v, 
hbaseTransaction).isPresent()) {
+
familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), 
v.getTimestamp());
+return ReturnCode.NEXT_COL;
+}
+
+if (shadowCellCache.containsKey(v.getTimestamp()) &&
+hbaseTransaction.getStartTimestamp() >= 
shadowCellCache.get(v.getTimestamp())) {
+
familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), 
shadowCellCache.get(v.getTimestamp()));
+return ReturnCode.NEXT_COL;
+}
+
+// Try to get shadow cell from region
+final Get get = new Get(CellUtil.cloneRow(v));
+get.setTimeStamp(v.getTimestamp()).setMaxVersions(1);
+get.addColumn(CellUtil.cloneFamily(v), 
CellUtils.addShadowCellSuffix(CellUtils.FAMILY_DELETE_QUALIFIER));
+Result deleteFamilySC = 
snapshotFilter.getTableAccessWrapper().get(get);
+
+if (!deleteFamilySC.isEmpty() &&
+
Bytes.toLong(CellUtil.cloneValue(deleteFamilySC.rawCells()[0] )) < 
hbaseTransaction.getStartTimestamp()){
+

[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565529#comment-16565529
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user JamesRTaylor commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206934998
  
--- Diff: 
hbase-coprocessor/src/test/java/org/apache/omid/transaction/TestSnapshotFilter.java
 ---
@@ -226,8 +222,115 @@ public void testGetFirstResult() throws Throwable {
 }
 
 @Test(timeOut = 60_000)
-public void testGetSecondResult() throws Throwable {
--- End diff --

Would be good to have a test that uses a scan with FirstKeyOnlyFilter that 
would get incorrect results without this new implementation.


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565272#comment-16565272
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user ohadshacham commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206867892
  
--- Diff: 
hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java
 ---
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+import com.sun.istack.Nullable;
+import org.apache.hadoop.hbase.Cell;
+import org.apache.hadoop.hbase.CellUtil;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.filter.Filter;
+import org.apache.hadoop.hbase.filter.FilterBase;
+import org.apache.hadoop.hbase.util.Bytes;
+
+import java.io.IOException;
+import java.util.*;
+
+public class TransactionVisibilityFilter extends FilterBase {
+
+// optional sub-filter to apply to visible cells
+private final Filter userFilter;
+private final SnapshotFilterImpl snapshotFilter;
+private final Map shadowCellCache;
+private final HBaseTransaction hbaseTransaction;
+private final Map familyDeletionCache;
+
+public SnapshotFilter getSnapshotFilter() {
+return snapshotFilter;
+}
+
+public TransactionVisibilityFilter(@Nullable Filter cellFilter,
+   SnapshotFilterImpl snapshotFilter,
+   HBaseTransaction hbaseTransaction) {
+this.userFilter = cellFilter;
+this.snapshotFilter = snapshotFilter;
+shadowCellCache = new HashMap<>();
+this.hbaseTransaction = hbaseTransaction;
+familyDeletionCache = new HashMap();
+}
+
+@Override
+public ReturnCode filterKeyValue(Cell v) throws IOException {
+if (CellUtils.isShadowCell(v)) {
+Long commitTs =  Bytes.toLong(CellUtil.cloneValue(v));
+shadowCellCache.put(v.getTimestamp(), commitTs);
+// Continue getting shadow cells until one of them fits this 
transaction
+if (hbaseTransaction.getStartTimestamp() >= commitTs) {
+return ReturnCode.NEXT_COL;
+} else {
+return ReturnCode.SKIP;
+}
+} else if (CellUtils.isFamilyDeleteCell(v)) {
+//Delete is part of this transaction
+if (snapshotFilter.getTSIfInTransaction(v, 
hbaseTransaction).isPresent()) {
+
familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), 
v.getTimestamp());
+return ReturnCode.NEXT_COL;
+}
+
+if (shadowCellCache.containsKey(v.getTimestamp()) &&
+hbaseTransaction.getStartTimestamp() >= 
shadowCellCache.get(v.getTimestamp())) {
+
familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), 
shadowCellCache.get(v.getTimestamp()));
+return ReturnCode.NEXT_COL;
+}
+
+// Try to get shadow cell from region
+final Get get = new Get(CellUtil.cloneRow(v));
+get.setTimeStamp(v.getTimestamp()).setMaxVersions(1);
+get.addColumn(CellUtil.cloneFamily(v), 
CellUtils.addShadowCellSuffix(CellUtils.FAMILY_DELETE_QUALIFIER));
+Result deleteFamilySC = 
snapshotFilter.getTableAccessWrapper().get(get);
+
+if (!deleteFamilySC.isEmpty() &&
+
Bytes.toLong(CellUtil.cloneValue(deleteFamilySC.rawCells()[0] )) < 
hbaseTransaction.getStartTimestamp()){
+

[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565269#comment-16565269
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user ohadshacham commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206860096
  
--- Diff: 
hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java
 ---
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+import com.sun.istack.Nullable;
+import org.apache.hadoop.hbase.Cell;
+import org.apache.hadoop.hbase.CellUtil;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.filter.Filter;
+import org.apache.hadoop.hbase.filter.FilterBase;
+import org.apache.hadoop.hbase.util.Bytes;
+
+import java.io.IOException;
+import java.util.*;
+
+public class TransactionVisibilityFilter extends FilterBase {
+
+// optional sub-filter to apply to visible cells
+private final Filter userFilter;
+private final SnapshotFilterImpl snapshotFilter;
+private final Map shadowCellCache;
+private final HBaseTransaction hbaseTransaction;
+private final Map familyDeletionCache;
--- End diff --

I would add a comment that the row info is redundant in here since reset is 
called between rows and we clear this map in reset.


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565262#comment-16565262
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user ohadshacham commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206852241
  
--- Diff: 
hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java
 ---
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+import com.sun.istack.Nullable;
+import org.apache.hadoop.hbase.Cell;
+import org.apache.hadoop.hbase.CellUtil;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.filter.Filter;
+import org.apache.hadoop.hbase.filter.FilterBase;
+import org.apache.hadoop.hbase.util.Bytes;
+
+import java.io.IOException;
+import java.util.*;
+
+public class TransactionVisibilityFilter extends FilterBase {
+
+// optional sub-filter to apply to visible cells
+private final Filter userFilter;
+private final SnapshotFilterImpl snapshotFilter;
+private final Map shadowCellCache;
+private final HBaseTransaction hbaseTransaction;
+private final Map familyDeletionCache;
+
+public SnapshotFilter getSnapshotFilter() {
+return snapshotFilter;
+}
+
+public TransactionVisibilityFilter(@Nullable Filter cellFilter,
+   SnapshotFilterImpl snapshotFilter,
+   HBaseTransaction hbaseTransaction) {
+this.userFilter = cellFilter;
+this.snapshotFilter = snapshotFilter;
+shadowCellCache = new HashMap<>();
+this.hbaseTransaction = hbaseTransaction;
+familyDeletionCache = new HashMap();
+}
+
+@Override
+public ReturnCode filterKeyValue(Cell v) throws IOException {
+if (CellUtils.isShadowCell(v)) {
+Long commitTs =  Bytes.toLong(CellUtil.cloneValue(v));
+shadowCellCache.put(v.getTimestamp(), commitTs);
+// Continue getting shadow cells until one of them fits this 
transaction
+if (hbaseTransaction.getStartTimestamp() >= commitTs) {
--- End diff --

Why do we keep ones that committed after the transaction timestamp?


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> 

[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565268#comment-16565268
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user ohadshacham commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206866085
  
--- Diff: 
hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java
 ---
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+import com.sun.istack.Nullable;
+import org.apache.hadoop.hbase.Cell;
+import org.apache.hadoop.hbase.CellUtil;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.filter.Filter;
+import org.apache.hadoop.hbase.filter.FilterBase;
+import org.apache.hadoop.hbase.util.Bytes;
+
+import java.io.IOException;
+import java.util.*;
+
+public class TransactionVisibilityFilter extends FilterBase {
+
+// optional sub-filter to apply to visible cells
+private final Filter userFilter;
+private final SnapshotFilterImpl snapshotFilter;
+private final Map shadowCellCache;
+private final HBaseTransaction hbaseTransaction;
+private final Map familyDeletionCache;
+
+public SnapshotFilter getSnapshotFilter() {
+return snapshotFilter;
+}
+
+public TransactionVisibilityFilter(@Nullable Filter cellFilter,
+   SnapshotFilterImpl snapshotFilter,
+   HBaseTransaction hbaseTransaction) {
+this.userFilter = cellFilter;
+this.snapshotFilter = snapshotFilter;
+shadowCellCache = new HashMap<>();
+this.hbaseTransaction = hbaseTransaction;
+familyDeletionCache = new HashMap();
+}
+
+@Override
+public ReturnCode filterKeyValue(Cell v) throws IOException {
+if (CellUtils.isShadowCell(v)) {
+Long commitTs =  Bytes.toLong(CellUtil.cloneValue(v));
+shadowCellCache.put(v.getTimestamp(), commitTs);
+// Continue getting shadow cells until one of them fits this 
transaction
+if (hbaseTransaction.getStartTimestamp() >= commitTs) {
+return ReturnCode.NEXT_COL;
+} else {
+return ReturnCode.SKIP;
+}
+} else if (CellUtils.isFamilyDeleteCell(v)) {
+//Delete is part of this transaction
+if (snapshotFilter.getTSIfInTransaction(v, 
hbaseTransaction).isPresent()) {
+
familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), 
v.getTimestamp());
+return ReturnCode.NEXT_COL;
+}
+
+if (shadowCellCache.containsKey(v.getTimestamp()) &&
+hbaseTransaction.getStartTimestamp() >= 
shadowCellCache.get(v.getTimestamp())) {
+
familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), 
shadowCellCache.get(v.getTimestamp()));
+return ReturnCode.NEXT_COL;
+}
+
+// Try to get shadow cell from region
+final Get get = new Get(CellUtil.cloneRow(v));
+get.setTimeStamp(v.getTimestamp()).setMaxVersions(1);
+get.addColumn(CellUtil.cloneFamily(v), 
CellUtils.addShadowCellSuffix(CellUtils.FAMILY_DELETE_QUALIFIER));
+Result deleteFamilySC = 
snapshotFilter.getTableAccessWrapper().get(get);
+
+if (!deleteFamilySC.isEmpty() &&
+
Bytes.toLong(CellUtil.cloneValue(deleteFamilySC.rawCells()[0] )) < 
hbaseTransaction.getStartTimestamp()){
+

[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565266#comment-16565266
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user ohadshacham commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206858386
  
--- Diff: 
hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java
 ---
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+import com.sun.istack.Nullable;
+import org.apache.hadoop.hbase.Cell;
+import org.apache.hadoop.hbase.CellUtil;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.filter.Filter;
+import org.apache.hadoop.hbase.filter.FilterBase;
+import org.apache.hadoop.hbase.util.Bytes;
+
+import java.io.IOException;
+import java.util.*;
+
+public class TransactionVisibilityFilter extends FilterBase {
+
+// optional sub-filter to apply to visible cells
+private final Filter userFilter;
+private final SnapshotFilterImpl snapshotFilter;
+private final Map shadowCellCache;
+private final HBaseTransaction hbaseTransaction;
+private final Map familyDeletionCache;
+
+public SnapshotFilter getSnapshotFilter() {
+return snapshotFilter;
+}
+
+public TransactionVisibilityFilter(@Nullable Filter cellFilter,
+   SnapshotFilterImpl snapshotFilter,
+   HBaseTransaction hbaseTransaction) {
+this.userFilter = cellFilter;
+this.snapshotFilter = snapshotFilter;
+shadowCellCache = new HashMap<>();
+this.hbaseTransaction = hbaseTransaction;
+familyDeletionCache = new HashMap();
+}
+
+@Override
+public ReturnCode filterKeyValue(Cell v) throws IOException {
+if (CellUtils.isShadowCell(v)) {
+Long commitTs =  Bytes.toLong(CellUtil.cloneValue(v));
+shadowCellCache.put(v.getTimestamp(), commitTs);
+// Continue getting shadow cells until one of them fits this 
transaction
+if (hbaseTransaction.getStartTimestamp() >= commitTs) {
+return ReturnCode.NEXT_COL;
+} else {
+return ReturnCode.SKIP;
+}
+} else if (CellUtils.isFamilyDeleteCell(v)) {
+//Delete is part of this transaction
+if (snapshotFilter.getTSIfInTransaction(v, 
hbaseTransaction).isPresent()) {
+
familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), 
v.getTimestamp());
+return ReturnCode.NEXT_COL;
+}
+
+if (shadowCellCache.containsKey(v.getTimestamp()) &&
+hbaseTransaction.getStartTimestamp() >= 
shadowCellCache.get(v.getTimestamp())) {
+
familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), 
shadowCellCache.get(v.getTimestamp()));
+return ReturnCode.NEXT_COL;
+}
+
+// Try to get shadow cell from region
+final Get get = new Get(CellUtil.cloneRow(v));
+get.setTimeStamp(v.getTimestamp()).setMaxVersions(1);
+get.addColumn(CellUtil.cloneFamily(v), 
CellUtils.addShadowCellSuffix(CellUtils.FAMILY_DELETE_QUALIFIER));
+Result deleteFamilySC = 
snapshotFilter.getTableAccessWrapper().get(get);
+
+if (!deleteFamilySC.isEmpty() &&
+
Bytes.toLong(CellUtil.cloneValue(deleteFamilySC.rawCells()[0] )) < 
hbaseTransaction.getStartTimestamp()){
+

[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565261#comment-16565261
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user ohadshacham commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206843754
  
--- Diff: 
hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java ---
@@ -382,13 +385,16 @@ public int hashCode() {
 hasher.putBytes(cell.getRowArray(), cell.getRowOffset(), 
cell.getRowLength());
 hasher.putBytes(cell.getFamilyArray(), cell.getFamilyOffset(), 
cell.getFamilyLength());
 int qualifierLength = cell.getQualifierLength();
+int qualifierOffset = cell.getQualifierOffset();
 if (isShadowCell()) { // Update qualifier length when 
qualifier is shadow cell
 qualifierLength = 
qualifierLengthFromShadowCellQualifier(cell.getQualifierArray(),
 cell.getQualifierOffset(),
 cell.getQualifierLength());
+qualifierOffset = qualifierOffset + 1;
--- End diff --

Will it work when the shadow cell prefix is absent? legacy data.


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565267#comment-16565267
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user ohadshacham commented on a diff in the pull request:

https://github.com/apache/incubator-omid/pull/41#discussion_r206849947
  
--- Diff: 
hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java
 ---
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.omid.transaction;
+
+import com.google.common.base.Optional;
+import com.sun.istack.Nullable;
--- End diff --

Different Nullable usage


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565061#comment-16565061
 ] 

ASF GitHub Bot commented on OMID-102:
-

GitHub user yonigottesman reopened a pull request:

https://github.com/apache/incubator-omid/pull/41

[OMID-102] Support for user Filter when using coprocessor for snapshot 
filtering



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yonigottesman/incubator-omid scan_filters

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-omid/pull/41.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #41


commit 41554ec4c3cdfdbfd271e9adbef866b3653c3382
Author: Yonatan Gottesman 
Date:   2018-07-25T06:40:14Z

Support for user Filter when using coprocessor for snapshot filtering

commit ba575384dcdca427bf80e55dcee58a1ed93bf744
Author: Yonatan Gottesman 
Date:   2018-07-25T07:41:42Z

In coprocessor filtering, get shadow cell of delete family before going to 
commit table

commit 26469a0893b4c9bc36e68d66583951fb6e82cf9d
Author: Yonatan Gottesman 
Date:   2018-07-26T08:15:27Z

add inMemoryCommitTable client option in omid coprocessor for testing

commit 8d632670ab580ca04e0c5cec555211d1fa71a11a
Author: Yonatan Gottesman 
Date:   2018-07-31T11:29:13Z

Fix visibilityFilter to check if delete family is in current TX

commit b2980c7cea2ef9c83a772d4ce3bd9a82f5ae4d81
Author: Yonatan Gottesman 
Date:   2018-08-01T09:34:00Z

Merge OMID-105 changes




> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565060#comment-16565060
 ] 

ASF GitHub Bot commented on OMID-102:
-

Github user yonigottesman closed the pull request at:

https://github.com/apache/incubator-omid/pull/41


> Implement visibility filter as pure HBase Filter
> 
>
> Key: OMID-102
> URL: https://issues.apache.org/jira/browse/OMID-102
> Project: Apache Omid
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Yonatan Gottesman
>Priority: Major
>
> The way Omid currently filters through it's own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters get messed up since 
> these filters will start to see cells that it shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and they way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)