[ 
https://issues.apache.org/jira/browse/OMID-102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579365#comment-16579365
 ] 

Yonatan Gottesman commented on OMID-102:
----------------------------------------

Hi [~jamestaylor],

I cannot get testCheckpointAndRollback in Phoenix to pass.

When I debug the cells passed to filterKeyValue(Cell v) in our filter, I see
the same cells arriving again with different versions even after I return
INCLUDE_AND_NEXT_COL. I then noticed that Tephra wraps its filter in a
CellSkipFilter that seems to take care of this problem. From the code comments:

/**
 * {@link Filter} that encapsulates another {@link Filter}. It remembers the last {@link KeyValue}
 * for which the underlying filter returned the {@link ReturnCode#NEXT_COL} or {@link ReturnCode#INCLUDE_AND_NEXT_COL},
 * so that when {@link #filterKeyValue} is called again for the same {@link KeyValue} with different
 * version, it returns {@link ReturnCode#NEXT_COL} directly without consulting the underlying {@link Filter}.
 * Please see TEPHRA-169 for more details.
 */
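
The wrapping behavior that this Javadoc describes can be sketched roughly as
follows. This is only a minimal, self-contained illustration and not Tephra's
actual CellSkipFilter: SimpleCell, CellFilter, SkipWrapper and the local
ReturnCode enum are hypothetical stand-ins for the HBase types, and "same
column" is assumed to mean the same row, family and qualifier.

```java
// Hypothetical stand-ins for the HBase types, so the sketch compiles alone.
final class CellSkipSketch {

    enum ReturnCode { INCLUDE, INCLUDE_AND_NEXT_COL, SKIP, NEXT_COL }

    // A cell identified by its column coordinates plus a version timestamp.
    static final class SimpleCell {
        final String row;
        final String family;
        final String qualifier;
        final long timestamp;

        SimpleCell(String row, String family, String qualifier, long timestamp) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }

        // True when both cells belong to the same column (version ignored).
        boolean sameColumn(SimpleCell other) {
            return other != null
                    && row.equals(other.row)
                    && family.equals(other.family)
                    && qualifier.equals(other.qualifier);
        }
    }

    interface CellFilter {
        ReturnCode filterCell(SimpleCell cell);
    }

    // Wraps another filter and short-circuits repeated versions of a column
    // once the wrapped filter has asked to move on to the next column.
    static final class SkipWrapper implements CellFilter {
        private final CellFilter delegate;
        private SimpleCell skipColumn; // column the delegate is done with

        SkipWrapper(CellFilter delegate) {
            this.delegate = delegate;
        }

        @Override
        public ReturnCode filterCell(SimpleCell cell) {
            if (cell.sameColumn(skipColumn)) {
                // Another version of a finished column: do not consult the
                // delegate, so its internal state stays untouched.
                return ReturnCode.NEXT_COL;
            }
            skipColumn = null;
            ReturnCode code = delegate.filterCell(cell);
            if (code == ReturnCode.NEXT_COL
                    || code == ReturnCode.INCLUDE_AND_NEXT_COL) {
                skipColumn = cell;
            }
            return code;
        }
    }
}
```

With a wrapper like this, the inner filter sees at most one version of a column
after it has answered NEXT_COL or INCLUDE_AND_NEXT_COL, regardless of whether
the scanner hands further versions of that column to the filter.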

Is it a known issue in HBase that filterKeyValue is still called for lower
versions of a cell even after I return NEXT_COL?

 

 

> Implement visibility filter as pure HBase Filter
> ------------------------------------------------
>
>                 Key: OMID-102
>                 URL: https://issues.apache.org/jira/browse/OMID-102
>             Project: Apache Omid
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: Yonatan Gottesman
>            Priority: Major
>
> The way Omid currently filters through its own RegionScanner won't work the 
> way it's implemented (i.e. the way the filtering is done *after* the next 
> call). The reason is that the state of HBase filters gets messed up since 
> these filters will start to see cells that they shouldn't (i.e. cells that 
> would be filtered based on snapshot isolation). It cannot be worked around by 
> manually running filters afterwards because filters may issue seek calls 
> which are handled during the running of scans by HBase.
>  
> Instead, the filtering needs to be implemented as a pure HBase filter and 
> that filter needs to delegate to the other, delegate filter once it's 
> determined that the cell is visible. See Tephra's TransactionVisibilityFilter 
> and the way it calls the delegate filter (cellFilters) only after it's 
> determined that the cell is visible. You may run into TEPHRA-169 without 
> including the CellSkipFilter too. 
> Because it'll be easier if you see shadow cells *before* their corresponding 
> real cells you can prefix instead of suffix the column qualifiers to 
> guarantee that you'd see the shadow cells prior to the actual cells. Or you 
> could buffer cells in your filter prior to omitting them. Another issue would 
> be if the shadow cells aren't found and you need to consult the commit table 
> - I suppose if the shadow cells are first, this logic would be easier to know 
> when it needs to be called.
>  
> To reproduce, see the Phoenix unit tests 
> FlappingTransactionIT.testInflightUpdateNotSeen() and 
> testInflightDeleteNotSeen().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
