[ 
https://issues.apache.org/jira/browse/PHOENIX-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040141#comment-17040141
 ] 

chenglei edited comment on PHOENIX-5736 at 2/19/20 3:16 PM:
------------------------------------------------------------

[~kozdemir], sorry for the late response. Yes, you are right, this problem 
indeed exists only on the master branch. 

When I ran the ITs on master for PHOENIX-5494 some days ago, I discovered this 
problem, which caused some ITs to fail. To avoid it on master, I disabled the 
pre-scanning of all the rows in a batch when the {{ReplayWrite}} is not null, 
so in that case the behavior is the same as before PHOENIX-5494. This differs 
from the patch for the 4.x branches. See the following code in the 
{{CachedLocalState}} class on the master branch:

{code:java}
    @Override
    public List<Cell> getCurrentRowState(
            Mutation mutation,
            Collection<? extends ColumnReference> columnReferences,
            boolean ignoreNewerMutations) throws IOException {

        if (ignoreNewerMutations) {
            return doScan(mutation, columnReferences);
        }

        byte[] rowKey = mutation.getRow();
        return this.rowKeyPtrToCells.get(new ImmutableBytesPtr(rowKey));
    }

    private List<Cell> doScan(Mutation mutation,
            Collection<? extends ColumnReference> columnReferences) throws IOException {
        byte[] rowKey = mutation.getRow();
        // need to use a scan here so we can get raw state, which Get doesn't provide.
        Scan scan = IndexManagementUtil.newLocalStateScan(
                Collections.singletonList(columnReferences));
        scan.setStartRow(rowKey);
        scan.setStopRow(rowKey);

        // Provides a means of client indicating that newer cells should not be considered,
        // enabling mutations to be replayed to partially rebuild the index when a write fails.
        // When replaying mutations we want the oldest timestamp (as anything newer will be replayed)
        //long ts = getOldestTimestamp(m.getFamilyCellMap().values());
        long ts = getMutationTimestampWhenAllCellTimestampIsSame(mutation);
        scan.setTimeRange(0, ts);

        try (RegionScanner regionScanner = region.getScanner(scan)) {
            List<Cell> cells = new ArrayList<Cell>(1);
            boolean more = regionScanner.next(cells);
            assert !more : "Got more than one result when scanning"
                + " a single row in the primary table!";

            return cells;
        }
    }
{code}
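
To illustrate the dispatch outside of HBase, here is a minimal, self-contained sketch. It is not the Phoenix code: {{RowStateLookupSketch}} and all of its members are hypothetical names, plain Java collections stand in for the region and for {{rowKeyPtrToCells}}, and strings stand in for cells. It only shows the two paths: the replay path reads a single row bounded by the mutation timestamp (upper bound exclusive, as with {{Scan.setTimeRange}}), while the normal path serves the state that was pre-scanned for the whole batch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for CachedLocalState; not the Phoenix class.
final class RowStateLookupSketch {
    // Stand-in for the region: row key -> (timestamp -> value), all versions kept.
    private final Map<String, TreeMap<Long, String>> region = new HashMap<>();
    // Stand-in for rowKeyPtrToCells: row state read once for the whole batch.
    private final Map<String, List<String>> preScanned = new HashMap<>();

    void put(String rowKey, long ts, String value) {
        region.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(ts, value);
    }

    // Models the batch pre-scan: cache every version of each row in the batch.
    void preScan(List<String> rowKeys) {
        for (String rowKey : rowKeys) {
            TreeMap<Long, String> versions = region.get(rowKey);
            preScanned.put(rowKey, versions == null
                    ? new ArrayList<>()
                    : new ArrayList<>(versions.values()));
        }
    }

    List<String> getCurrentRowState(String rowKey, long mutationTs,
            boolean ignoreNewerMutations) {
        if (ignoreNewerMutations) {
            // Replay path: bypass the cache and read only versions in [0, mutationTs).
            return doScan(rowKey, mutationTs);
        }
        // Normal path: serve the pre-scanned state (empty if the row was not cached).
        List<String> cached = preScanned.get(rowKey);
        return cached == null ? new ArrayList<>() : cached;
    }

    // Models the per-row scan with an exclusive upper time bound.
    private List<String> doScan(String rowKey, long upperExclusive) {
        TreeMap<Long, String> versions = region.get(rowKey);
        if (versions == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(versions.headMap(upperExclusive).values());
    }
}
```

With versions written at timestamps 10, 20 and 30, a replayed mutation at timestamp 20 sees only the version at 10, whereas the cached path returns whatever the batch pre-scan collected.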


 



> Mutable global index rebuilds are incorrect after PHOENIX-5494
> --------------------------------------------------------------
>
>                 Key: PHOENIX-5736
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5736
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>            Reporter: Kadir OZDEMIR
>            Priority: Critical
>         Attachments: skipScanTest.txt
>
>
> PHOENIX-5494 uses skip scans to improve write performance for tables with 
> indexes. Before this jira, a separate scanner was opened for each data table 
> mutation to read all versions and delete markers for the row to be 
> mutated. With this jira, a single scanner is opened using a raw scan with a 
> skip scan filter to read all versions and delete markers of all the rows in a 
> batch. Reading existing data table rows is required to generate index updates.
> However, I have discovered that a raw scan with a skip scan filter does not 
> return all raw versions. This means that after PHOENIX-5494 index rebuilds 
> for global indexes will not be correct. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
