[ https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579365#comment-16579365 ]
Yonatan Gottesman commented on OMID-102:
----------------------------------------

Hi [~jamestaylor],

I cannot get testCheckpointAndRollback in Phoenix to pass. When I debug the cells passed to filterKeyValue(Cell v) in our filter, I can see that I'm getting the same cells with different versions even if I return INCLUDE_AND_NEXT_COL. I then noticed that Tephra wraps its filter with another filter, CellSkipFilter, which seems to take care of this problem. From the code comments:

/**
 * {@link Filter} that encapsulates another {@link Filter}. It remembers the last {@link KeyValue}
 * for which the underlying filter returned the {@link ReturnCode#NEXT_COL} or {@link ReturnCode#INCLUDE_AND_NEXT_COL},
 * so that when {@link #filterKeyValue} is called again for the same {@link KeyValue} with different
 * version, it returns {@link ReturnCode#NEXT_COL} directly without consulting the underlying {@link Filter}.
 * Please see TEPHRA-169 for more details.
 */

Is it a known issue in HBase that filterKeyValue is still called for cells with lower versions even if I return NEXT_COL?

> Implement visibility filter as pure HBase Filter
> ------------------------------------------------
>
>                 Key: OMID-102
>                 URL: https://issues.apache.org/jira/browse/OMID-102
>             Project: Apache Omid
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: Yonatan Gottesman
>            Priority: Major
>
> The way Omid currently filters through its own RegionScanner won't work the
> way it's implemented (i.e. the way the filtering is done *after* the next
> call). The reason is that the state of HBase filters gets messed up since
> these filters will start to see cells that it shouldn't (i.e. cells that
> would be filtered based on snapshot isolation). It cannot be worked around by
> manually running filters afterwards because filters may issue seek calls
> which are handled during the running of scans by HBase.
>
> Instead, the filtering needs to be implemented as a pure HBase filter and
> that filter needs to delegate to the other, delegate filter once it's
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter
> and the way it calls the delegate filter (cellFilters) only after it's
> determined that the cell is visible. You may run into TEPHRA-169 without
> including the CellSkipFilter too.
> Because it'll be easier if you see shadow cells *before* their corresponding
> real cells you can prefix instead of suffix the column qualifiers to
> guarantee that you'd see the shadow cells prior to the actual cells. Or you
> could buffer cells in your filter prior to omitting them. Another issue would
> be if the shadow cells aren't found and you need to consult the commit table
> - I suppose if the shadow cells are first, this logic would be easier to know
> when it needs to be called.
>
> To reproduce, see the Phoenix unit tests
> FlappingTransactionIT.testInflightUpdateNotSeen() and
> testInflightDeleteNotSeen().
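
For illustration, the wrapping behaviour described in the CellSkipFilter Javadoc quoted in the comment can be sketched without HBase on the classpath. This is only a rough sketch of the idea, not Tephra's actual code: Cell, ReturnCode, CellFilter and SkipWrapper below are hypothetical stand-ins, and matching on row/family/qualifier is an assumption about what "the same KeyValue with different version" means.

```java
import java.util.Objects;

/** Sketch only: every type here is a hypothetical stand-in, not an HBase or Tephra class. */
class SkipFilterSketch {

    /** Stand-in for the subset of filter return codes used in this sketch. */
    enum ReturnCode { INCLUDE, INCLUDE_AND_NEXT_COL, SKIP, NEXT_COL }

    /** Stand-in for a cell: column coordinates plus a version timestamp. */
    static final class Cell {
        final String row;
        final String family;
        final String qualifier;
        final long timestamp;

        Cell(String row, String family, String qualifier, long timestamp) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }

        /** True when both cells address the same row/family/qualifier, ignoring the version. */
        boolean sameColumn(Cell other) {
            return Objects.equals(row, other.row)
                    && Objects.equals(family, other.family)
                    && Objects.equals(qualifier, other.qualifier);
        }
    }

    /** Stand-in for the per-cell decision method of a filter. */
    interface CellFilter {
        ReturnCode filterCell(Cell cell);
    }

    /** Wraps a delegate and short-circuits older versions of a column the delegate is done with. */
    static final class SkipWrapper implements CellFilter {
        private final CellFilter delegate;
        // Last cell for which the delegate asked to move on to the next column, or null.
        private Cell skipColumn;

        SkipWrapper(CellFilter delegate) {
            this.delegate = Objects.requireNonNull(delegate);
        }

        @Override
        public ReturnCode filterCell(Cell cell) {
            // Same column as the remembered cell: answer NEXT_COL without consulting the delegate.
            if (skipColumn != null && skipColumn.sameColumn(cell)) {
                return ReturnCode.NEXT_COL;
            }
            skipColumn = null;
            ReturnCode code = delegate.filterCell(cell);
            if (code == ReturnCode.NEXT_COL || code == ReturnCode.INCLUDE_AND_NEXT_COL) {
                skipColumn = cell;
            }
            return code;
        }

        /** Forget the remembered column, e.g. when the scan moves to a new row. */
        void reset() {
            skipColumn = null;
        }
    }
}
```

With a delegate that always returns INCLUDE_AND_NEXT_COL, the wrapper passes the first version of a column through and answers NEXT_COL for any further versions of that column itself, so the delegate's state never sees them. In a real HBase filter the same pattern would presumably sit around the visibility filter, as the issue description suggests for TEPHRA-169.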