[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584734#comment-16584734 ] Yonatan Gottesman commented on OMID-102: Yes, I uploaded the patch to here, I am not a commiter, so we will wait for [~ohads] to return, apply the patch to phoenix-integration branch, and see everything is ok. Meanwhile ill take a look at why omid fails to build under jenkins - OMID-109 > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > Attachments: bug_fix.patch > > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583153#comment-16583153 ] James Taylor commented on OMID-102: --- Thanks for committing a fix, [~yonigo]. Do the tests now pass that I mentioned here? https://issues.apache.org/jira/browse/PHOENIX-3623?focusedCommentId=16574976=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16574976 > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575473#comment-16575473 ] Mujtaba Chohan commented on OMID-102: - [~jamestaylor] - it fails on my machine with the same error using mvn build parameters you listed. Can you try building on a Linux machine? Also not sure how this can ever work when the repo. it's trying to download from does not exist anymore. Try hitting this URL which returns a 404: [https://raw.github.com/synergian/wagon-git/releases] > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571535#comment-16571535 ] Yonatan Gottesman commented on OMID-102: [~jamestaylor] code is commited and ready to merge with phoenix-integration branch > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571343#comment-16571343 ] ASF GitHub Bot commented on OMID-102: - Github user yonigottesman commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r208160776 --- Diff: hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java --- @@ -227,24 +252,26 @@ private static boolean endsWith(byte[] value, int offset, int length, byte[] suf return result == 0; } +private static boolean startsWith(byte[] value, int offset, int length, byte[] prefix) { --- End diff -- No because Bytes.startWith doesnt work with offset > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571342#comment-16571342 ] ASF GitHub Bot commented on OMID-102: - Github user yonigottesman commented on the issue: https://github.com/apache/incubator-omid/pull/41 All FlappingTransactionIT tests pass > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570303#comment-16570303 ] ASF GitHub Bot commented on OMID-102: - Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r207921192 --- Diff: hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java --- @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.omid.transaction; + +import com.google.common.base.Optional; + +import org.apache.hadoop.hbase.Cell; +import org.apache.hadoop.hbase.CellUtil; +import org.apache.hadoop.hbase.KeyValue; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.filter.Filter; +import org.apache.hadoop.hbase.filter.FilterBase; +import org.apache.hadoop.hbase.io.ImmutableBytesWritable; +import org.apache.hadoop.hbase.util.Bytes; + +import java.io.IOException; +import java.util.*; + +public class TransactionVisibilityFilter extends FilterBase { + +// optional sub-filter to apply to visible cells +private final Filter userFilter; +private final SnapshotFilterImpl snapshotFilter; +private final Map commitCache; +private final HBaseTransaction hbaseTransaction; + +// This cache is cleared when moving to the next row +// So no need to keep row name +private final Map familyDeletionCache; + +public SnapshotFilter getSnapshotFilter() { +return snapshotFilter; +} + +public TransactionVisibilityFilter(Filter cellFilter, + SnapshotFilterImpl snapshotFilter, + HBaseTransaction hbaseTransaction) { +this.userFilter = cellFilter; +this.snapshotFilter = snapshotFilter; +commitCache = new HashMap<>(); +this.hbaseTransaction = hbaseTransaction; +familyDeletionCache = new HashMap<>(); +} + +@Override +public ReturnCode filterKeyValue(Cell v) throws IOException { +if (CellUtils.isShadowCell(v)) { +Long commitTs = Bytes.toLong(CellUtil.cloneValue(v)); +commitCache.put(v.getTimestamp(), commitTs); +// Continue getting shadow cells until one of them fits this transaction +if (hbaseTransaction.getStartTimestamp() >= commitTs) { +return ReturnCode.NEXT_COL; +} else { +return ReturnCode.SKIP; +} +} + +Optional ct = getCommitIfInSnapshot(v, CellUtils.isFamilyDeleteCell(v)); +if (ct.isPresent()) { +commitCache.put(v.getTimestamp(), ct.get()); +if (hbaseTransaction.getVisibilityLevel() == AbstractTransaction.VisibilityLevel.SNAPSHOT_ALL && +snapshotFilter.getTSIfInTransaction(v, hbaseTransaction).isPresent()) { +return runUserFilter(v, ReturnCode.INCLUDE); +} +if (CellUtils.isFamilyDeleteCell(v)) { +familyDeletionCache.put(createImmutableBytesWritable(v), ct.get()); +if (hbaseTransaction.getVisibilityLevel() == AbstractTransaction.VisibilityLevel.SNAPSHOT_ALL) { +return runUserFilter(v, ReturnCode.INCLUDE_AND_NEXT_COL); +} else { +return ReturnCode.NEXT_COL; +} +} +Long deleteCommit = familyDeletionCache.get(createImmutableBytesWritable(v)); +if (deleteCommit != null && deleteCommit >= v.getTimestamp()) { +if (hbaseTransaction.getVisibilityLevel() == AbstractTransaction.VisibilityLevel.SNAPSHOT_ALL) { +return runUserFilter(v,
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570300#comment-16570300 ] ASF GitHub Bot commented on OMID-102: - Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r207919955 --- Diff: hbase-coprocessor/src/test/java/org/apache/omid/transaction/TestSnapshotFilter.java --- @@ -226,8 +222,115 @@ public void testGetFirstResult() throws Throwable { } @Test(timeOut = 60_000) -public void testGetSecondResult() throws Throwable { --- End diff -- Is there a test which would have failed without this patch (i.e. one that demonstrates the need for having the visibility filtering done as a pure HBase filter)? > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570293#comment-16570293 ] ASF GitHub Bot commented on OMID-102: - Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r207918413 --- Diff: hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java --- @@ -227,24 +252,26 @@ private static boolean endsWith(byte[] value, int offset, int length, byte[] suf return result == 0; } +private static boolean startsWith(byte[] value, int offset, int length, byte[] prefix) { --- End diff -- Can you just use Bytes.startsWith() instead? > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565538#comment-16565538 ] ASF GitHub Bot commented on OMID-102: - Github user JamesRTaylor commented on the issue: https://github.com/apache/incubator-omid/pull/41 This is great, @yonigottesman! Do the Phoenix unit tests FlappingTransactionIT.testInflightUpdateNotSeen() and testInflightDeleteNotSeen() pass with this change? You can try running them from the omid2 feature branch. > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565535#comment-16565535 ] ASF GitHub Bot commented on OMID-102: - Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206936274 --- Diff: hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java --- @@ -52,6 +52,7 @@ static byte[] DELETE_TOMBSTONE = Bytes.toBytes("__OMID_TOMBSTONE__"); --- End diff -- Will Cells end up with these constant values and if so can we make them shorter? > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565532#comment-16565532 ] ASF GitHub Bot commented on OMID-102: - Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206935434 --- Diff: hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.omid.transaction; + +import com.google.common.base.Optional; +import com.sun.istack.Nullable; +import org.apache.hadoop.hbase.Cell; +import org.apache.hadoop.hbase.CellUtil; +import org.apache.hadoop.hbase.KeyValue; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.filter.Filter; +import org.apache.hadoop.hbase.filter.FilterBase; +import org.apache.hadoop.hbase.util.Bytes; + +import java.io.IOException; +import java.util.*; + +public class TransactionVisibilityFilter extends FilterBase { + +// optional sub-filter to apply to visible cells +private final Filter userFilter; +private final SnapshotFilterImpl snapshotFilter; +private final Map shadowCellCache; +private final HBaseTransaction hbaseTransaction; +private final Map familyDeletionCache; + +public SnapshotFilter getSnapshotFilter() { +return snapshotFilter; +} + +public TransactionVisibilityFilter(@Nullable Filter cellFilter, + SnapshotFilterImpl snapshotFilter, + HBaseTransaction hbaseTransaction) { +this.userFilter = cellFilter; +this.snapshotFilter = snapshotFilter; +shadowCellCache = new HashMap<>(); +this.hbaseTransaction = hbaseTransaction; +familyDeletionCache = new HashMap(); +} + +@Override +public ReturnCode filterKeyValue(Cell v) throws IOException { +if (CellUtils.isShadowCell(v)) { +Long commitTs = Bytes.toLong(CellUtil.cloneValue(v)); +shadowCellCache.put(v.getTimestamp(), commitTs); +// Continue getting shadow cells until one of them fits this transaction +if (hbaseTransaction.getStartTimestamp() >= commitTs) { +return ReturnCode.NEXT_COL; +} else { +return ReturnCode.SKIP; +} +} else if (CellUtils.isFamilyDeleteCell(v)) { +//Delete is part of this transaction +if (snapshotFilter.getTSIfInTransaction(v, hbaseTransaction).isPresent()) { + familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), v.getTimestamp()); +return ReturnCode.NEXT_COL; +} + +if (shadowCellCache.containsKey(v.getTimestamp()) && +hbaseTransaction.getStartTimestamp() >= shadowCellCache.get(v.getTimestamp())) { + familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), shadowCellCache.get(v.getTimestamp())); +return ReturnCode.NEXT_COL; +} + +// Try to get shadow cell from region +final Get get = new Get(CellUtil.cloneRow(v)); +get.setTimeStamp(v.getTimestamp()).setMaxVersions(1); +get.addColumn(CellUtil.cloneFamily(v), CellUtils.addShadowCellSuffix(CellUtils.FAMILY_DELETE_QUALIFIER)); +Result deleteFamilySC = snapshotFilter.getTableAccessWrapper().get(get); + +if (!deleteFamilySC.isEmpty() && + Bytes.toLong(CellUtil.cloneValue(deleteFamilySC.rawCells()[0] )) < hbaseTransaction.getStartTimestamp()){ +
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565529#comment-16565529 ] ASF GitHub Bot commented on OMID-102: - Github user JamesRTaylor commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206934998 --- Diff: hbase-coprocessor/src/test/java/org/apache/omid/transaction/TestSnapshotFilter.java --- @@ -226,8 +222,115 @@ public void testGetFirstResult() throws Throwable { } @Test(timeOut = 60_000) -public void testGetSecondResult() throws Throwable { --- End diff -- Would be good to have a test that uses a scan with FirstKeyOnlyFilter that would get incorrect results without this new implementation. > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565272#comment-16565272 ] ASF GitHub Bot commented on OMID-102: - Github user ohadshacham commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206867892 --- Diff: hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.omid.transaction; + +import com.google.common.base.Optional; +import com.sun.istack.Nullable; +import org.apache.hadoop.hbase.Cell; +import org.apache.hadoop.hbase.CellUtil; +import org.apache.hadoop.hbase.KeyValue; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.filter.Filter; +import org.apache.hadoop.hbase.filter.FilterBase; +import org.apache.hadoop.hbase.util.Bytes; + +import java.io.IOException; +import java.util.*; + +public class TransactionVisibilityFilter extends FilterBase { + +// optional sub-filter to apply to visible cells +private final Filter userFilter; +private final SnapshotFilterImpl snapshotFilter; +private final Map shadowCellCache; +private final HBaseTransaction hbaseTransaction; +private final Map familyDeletionCache; + +public SnapshotFilter getSnapshotFilter() { +return snapshotFilter; +} + +public TransactionVisibilityFilter(@Nullable Filter cellFilter, + SnapshotFilterImpl snapshotFilter, + HBaseTransaction hbaseTransaction) { +this.userFilter = cellFilter; +this.snapshotFilter = snapshotFilter; +shadowCellCache = new HashMap<>(); +this.hbaseTransaction = hbaseTransaction; +familyDeletionCache = new HashMap(); +} + +@Override +public ReturnCode filterKeyValue(Cell v) throws IOException { +if (CellUtils.isShadowCell(v)) { +Long commitTs = Bytes.toLong(CellUtil.cloneValue(v)); +shadowCellCache.put(v.getTimestamp(), commitTs); +// Continue getting shadow cells until one of them fits this transaction +if (hbaseTransaction.getStartTimestamp() >= commitTs) { +return ReturnCode.NEXT_COL; +} else { +return ReturnCode.SKIP; +} +} else if (CellUtils.isFamilyDeleteCell(v)) { +//Delete is part of this transaction +if (snapshotFilter.getTSIfInTransaction(v, hbaseTransaction).isPresent()) { + familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), v.getTimestamp()); +return ReturnCode.NEXT_COL; +} + +if (shadowCellCache.containsKey(v.getTimestamp()) && +hbaseTransaction.getStartTimestamp() >= shadowCellCache.get(v.getTimestamp())) { + familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), shadowCellCache.get(v.getTimestamp())); +return ReturnCode.NEXT_COL; +} + +// Try to get shadow cell from region +final Get get = new Get(CellUtil.cloneRow(v)); +get.setTimeStamp(v.getTimestamp()).setMaxVersions(1); +get.addColumn(CellUtil.cloneFamily(v), CellUtils.addShadowCellSuffix(CellUtils.FAMILY_DELETE_QUALIFIER)); +Result deleteFamilySC = snapshotFilter.getTableAccessWrapper().get(get); + +if (!deleteFamilySC.isEmpty() && + Bytes.toLong(CellUtil.cloneValue(deleteFamilySC.rawCells()[0] )) < hbaseTransaction.getStartTimestamp()){ +
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565269#comment-16565269 ] ASF GitHub Bot commented on OMID-102: - Github user ohadshacham commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206860096 --- Diff: hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.omid.transaction; + +import com.google.common.base.Optional; +import com.sun.istack.Nullable; +import org.apache.hadoop.hbase.Cell; +import org.apache.hadoop.hbase.CellUtil; +import org.apache.hadoop.hbase.KeyValue; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.filter.Filter; +import org.apache.hadoop.hbase.filter.FilterBase; +import org.apache.hadoop.hbase.util.Bytes; + +import java.io.IOException; +import java.util.*; + +public class TransactionVisibilityFilter extends FilterBase { + +// optional sub-filter to apply to visible cells +private final Filter userFilter; +private final SnapshotFilterImpl snapshotFilter; +private final Map shadowCellCache; +private final HBaseTransaction hbaseTransaction; +private final Map familyDeletionCache; --- End diff -- I would add a comment that the row info is redundant in here since reset is called between rows and we clear this map in reset. > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565262#comment-16565262 ] ASF GitHub Bot commented on OMID-102: - Github user ohadshacham commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206852241 --- Diff: hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.omid.transaction; + +import com.google.common.base.Optional; +import com.sun.istack.Nullable; +import org.apache.hadoop.hbase.Cell; +import org.apache.hadoop.hbase.CellUtil; +import org.apache.hadoop.hbase.KeyValue; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.filter.Filter; +import org.apache.hadoop.hbase.filter.FilterBase; +import org.apache.hadoop.hbase.util.Bytes; + +import java.io.IOException; +import java.util.*; + +public class TransactionVisibilityFilter extends FilterBase { + +// optional sub-filter to apply to visible cells +private final Filter userFilter; +private final SnapshotFilterImpl snapshotFilter; +private final Map shadowCellCache; +private final HBaseTransaction hbaseTransaction; +private final Map familyDeletionCache; + +public SnapshotFilter getSnapshotFilter() { +return snapshotFilter; +} + +public TransactionVisibilityFilter(@Nullable Filter cellFilter, + SnapshotFilterImpl snapshotFilter, + HBaseTransaction hbaseTransaction) { +this.userFilter = cellFilter; +this.snapshotFilter = snapshotFilter; +shadowCellCache = new HashMap<>(); +this.hbaseTransaction = hbaseTransaction; +familyDeletionCache = new HashMap(); +} + +@Override +public ReturnCode filterKeyValue(Cell v) throws IOException { +if (CellUtils.isShadowCell(v)) { +Long commitTs = Bytes.toLong(CellUtil.cloneValue(v)); +shadowCellCache.put(v.getTimestamp(), commitTs); +// Continue getting shadow cells until one of them fits this transaction +if (hbaseTransaction.getStartTimestamp() >= commitTs) { --- End diff -- Why do we keep ones that committed after the transaction timestamp? > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to >
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565268#comment-16565268 ] ASF GitHub Bot commented on OMID-102: - Github user ohadshacham commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206866085 --- Diff: hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.omid.transaction; + +import com.google.common.base.Optional; +import com.sun.istack.Nullable; +import org.apache.hadoop.hbase.Cell; +import org.apache.hadoop.hbase.CellUtil; +import org.apache.hadoop.hbase.KeyValue; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.filter.Filter; +import org.apache.hadoop.hbase.filter.FilterBase; +import org.apache.hadoop.hbase.util.Bytes; + +import java.io.IOException; +import java.util.*; + +public class TransactionVisibilityFilter extends FilterBase { + +// optional sub-filter to apply to visible cells +private final Filter userFilter; +private final SnapshotFilterImpl snapshotFilter; +private final Map shadowCellCache; +private final HBaseTransaction hbaseTransaction; +private final Map familyDeletionCache; + +public SnapshotFilter getSnapshotFilter() { +return snapshotFilter; +} + +public TransactionVisibilityFilter(@Nullable Filter cellFilter, + SnapshotFilterImpl snapshotFilter, + HBaseTransaction hbaseTransaction) { +this.userFilter = cellFilter; +this.snapshotFilter = snapshotFilter; +shadowCellCache = new HashMap<>(); +this.hbaseTransaction = hbaseTransaction; +familyDeletionCache = new HashMap(); +} + +@Override +public ReturnCode filterKeyValue(Cell v) throws IOException { +if (CellUtils.isShadowCell(v)) { +Long commitTs = Bytes.toLong(CellUtil.cloneValue(v)); +shadowCellCache.put(v.getTimestamp(), commitTs); +// Continue getting shadow cells until one of them fits this transaction +if (hbaseTransaction.getStartTimestamp() >= commitTs) { +return ReturnCode.NEXT_COL; +} else { +return ReturnCode.SKIP; +} +} else if (CellUtils.isFamilyDeleteCell(v)) { +//Delete is part of this transaction +if (snapshotFilter.getTSIfInTransaction(v, hbaseTransaction).isPresent()) { + familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), v.getTimestamp()); +return ReturnCode.NEXT_COL; +} + +if (shadowCellCache.containsKey(v.getTimestamp()) && +hbaseTransaction.getStartTimestamp() >= shadowCellCache.get(v.getTimestamp())) { + familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), shadowCellCache.get(v.getTimestamp())); +return ReturnCode.NEXT_COL; +} + +// Try to get shadow cell from region +final Get get = new Get(CellUtil.cloneRow(v)); +get.setTimeStamp(v.getTimestamp()).setMaxVersions(1); +get.addColumn(CellUtil.cloneFamily(v), CellUtils.addShadowCellSuffix(CellUtils.FAMILY_DELETE_QUALIFIER)); +Result deleteFamilySC = snapshotFilter.getTableAccessWrapper().get(get); + +if (!deleteFamilySC.isEmpty() && + Bytes.toLong(CellUtil.cloneValue(deleteFamilySC.rawCells()[0] )) < hbaseTransaction.getStartTimestamp()){ +
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565266#comment-16565266 ] ASF GitHub Bot commented on OMID-102: - Github user ohadshacham commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206858386 --- Diff: hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.omid.transaction; + +import com.google.common.base.Optional; +import com.sun.istack.Nullable; +import org.apache.hadoop.hbase.Cell; +import org.apache.hadoop.hbase.CellUtil; +import org.apache.hadoop.hbase.KeyValue; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.filter.Filter; +import org.apache.hadoop.hbase.filter.FilterBase; +import org.apache.hadoop.hbase.util.Bytes; + +import java.io.IOException; +import java.util.*; + +public class TransactionVisibilityFilter extends FilterBase { + +// optional sub-filter to apply to visible cells +private final Filter userFilter; +private final SnapshotFilterImpl snapshotFilter; +private final Map shadowCellCache; +private final HBaseTransaction hbaseTransaction; +private final Map familyDeletionCache; + +public SnapshotFilter getSnapshotFilter() { +return snapshotFilter; +} + +public TransactionVisibilityFilter(@Nullable Filter cellFilter, + SnapshotFilterImpl snapshotFilter, + HBaseTransaction hbaseTransaction) { +this.userFilter = cellFilter; +this.snapshotFilter = snapshotFilter; +shadowCellCache = new HashMap<>(); +this.hbaseTransaction = hbaseTransaction; +familyDeletionCache = new HashMap(); +} + +@Override +public ReturnCode filterKeyValue(Cell v) throws IOException { +if (CellUtils.isShadowCell(v)) { +Long commitTs = Bytes.toLong(CellUtil.cloneValue(v)); +shadowCellCache.put(v.getTimestamp(), commitTs); +// Continue getting shadow cells until one of them fits this transaction +if (hbaseTransaction.getStartTimestamp() >= commitTs) { +return ReturnCode.NEXT_COL; +} else { +return ReturnCode.SKIP; +} +} else if (CellUtils.isFamilyDeleteCell(v)) { +//Delete is part of this transaction +if (snapshotFilter.getTSIfInTransaction(v, hbaseTransaction).isPresent()) { + familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), v.getTimestamp()); +return ReturnCode.NEXT_COL; +} + +if (shadowCellCache.containsKey(v.getTimestamp()) && +hbaseTransaction.getStartTimestamp() >= shadowCellCache.get(v.getTimestamp())) { + familyDeletionCache.put(Bytes.toString(CellUtil.cloneFamily(v)), shadowCellCache.get(v.getTimestamp())); +return ReturnCode.NEXT_COL; +} + +// Try to get shadow cell from region +final Get get = new Get(CellUtil.cloneRow(v)); +get.setTimeStamp(v.getTimestamp()).setMaxVersions(1); +get.addColumn(CellUtil.cloneFamily(v), CellUtils.addShadowCellSuffix(CellUtils.FAMILY_DELETE_QUALIFIER)); +Result deleteFamilySC = snapshotFilter.getTableAccessWrapper().get(get); + +if (!deleteFamilySC.isEmpty() && + Bytes.toLong(CellUtil.cloneValue(deleteFamilySC.rawCells()[0] )) < hbaseTransaction.getStartTimestamp()){ +
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565261#comment-16565261 ] ASF GitHub Bot commented on OMID-102: - Github user ohadshacham commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206843754 --- Diff: hbase-common/src/main/java/org/apache/omid/transaction/CellUtils.java --- @@ -382,13 +385,16 @@ public int hashCode() { hasher.putBytes(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength()); hasher.putBytes(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength()); int qualifierLength = cell.getQualifierLength(); +int qualifierOffset = cell.getQualifierOffset(); if (isShadowCell()) { // Update qualifier length when qualifier is shadow cell qualifierLength = qualifierLengthFromShadowCellQualifier(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength()); +qualifierOffset = qualifierOffset + 1; --- End diff -- Will it work when the shadow cell prefix is absent? legacy data. > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565267#comment-16565267 ] ASF GitHub Bot commented on OMID-102: - Github user ohadshacham commented on a diff in the pull request: https://github.com/apache/incubator-omid/pull/41#discussion_r206849947 --- Diff: hbase-coprocessor/src/main/java/org/apache/omid/transaction/TransactionVisibilityFilter.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.omid.transaction; + +import com.google.common.base.Optional; +import com.sun.istack.Nullable; --- End diff -- Different Nullable usage > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565061#comment-16565061 ] ASF GitHub Bot commented on OMID-102: - GitHub user yonigottesman reopened a pull request: https://github.com/apache/incubator-omid/pull/41 [OMID-102] Support for user Filter when using coprocessor for snapshot filtering You can merge this pull request into a Git repository by running: $ git pull https://github.com/yonigottesman/incubator-omid scan_filters Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-omid/pull/41.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #41 commit 41554ec4c3cdfdbfd271e9adbef866b3653c3382 Author: Yonatan Gottesman Date: 2018-07-25T06:40:14Z Support for user Filter when using coprocessor for snapshot filtering commit ba575384dcdca427bf80e55dcee58a1ed93bf744 Author: Yonatan Gottesman Date: 2018-07-25T07:41:42Z In coprocessor filtering, get shadow cell of delete family before going to commit table commit 26469a0893b4c9bc36e68d66583951fb6e82cf9d Author: Yonatan Gottesman Date: 2018-07-26T08:15:27Z add inMemoryCommitTable client option in omid coprocessor for testing commit 8d632670ab580ca04e0c5cec555211d1fa71a11a Author: Yonatan Gottesman Date: 2018-07-31T11:29:13Z Fix visibilityFilter to check if delete family is in current TX commit b2980c7cea2ef9c83a772d4ce3bd9a82f5ae4d81 Author: Yonatan Gottesman Date: 2018-08-01T09:34:00Z Merge OMID-105 changes > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-102) Implement visibility filter as pure HBase Filter
[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565060#comment-16565060 ] ASF GitHub Bot commented on OMID-102: - Github user yonigottesman closed the pull request at: https://github.com/apache/incubator-omid/pull/41 > Implement visibility filter as pure HBase Filter > > > Key: OMID-102 > URL: https://issues.apache.org/jira/browse/OMID-102 > Project: Apache Omid > Issue Type: Sub-task >Reporter: James Taylor >Assignee: Yonatan Gottesman >Priority: Major > > The way Omid currently filters through it's own RegionScanner won't work the > way it's implemented (i.e. the way the filtering is done *after* the next > call). The reason is that the state of HBase filters get messed up since > these filters will start to see cells that it shouldn't (i.e. cells that > would be filtered based on snapshot isolation). It cannot be worked around by > manually running filters afterwards because filters may issue seek calls > which are handled during the running of scans by HBase. > > Instead, the filtering needs to be implemented as a pure HBase filter and > that filter needs to delegate to the other, delegate filter once it's > determined that the cell is visible. See Tephra's TransactionVisibilityFilter > and they way it calls the delegate filter (cellFilters) only after it's > determined that the cell is visible. You may run into TEPHRA-169 without > including the CellSkipFilter too. > Because it'll be easier if you see shadow cells *before* their corresponding > real cells you can prefix instead of suffix the column qualifiers to > guarantee that you'd see the shadow cells prior to the actual cells. Or you > could buffer cells in your filter prior to omitting them. Another issue would > be if the shadow cells aren't found and you need to consult the commit table > - I suppose if the shadow cells are first, this logic would be easier to know > when it needs to be called. > > To reproduce, see the Phoenix unit tests > FlappingTransactionIT.testInflightUpdateNotSeen() and > testInflightDeleteNotSeen(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)