[
https://issues.apache.org/jira/browse/PHOENIX-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040141#comment-17040141
]
chenglei edited comment on PHOENIX-5736 at 2/19/20 3:19 PM:
------------------------------------------------------------
[~kozdemir], sorry for the late response. Yes, you are right: this problem
indeed exists only on the master branch.
While verifying the ITs on master for PHOENIX-5494 some days ago, I discovered
this problem, which caused some ITs to fail. To avoid it on master, I disabled
pre-scanning of all the rows in a batch when {{ReplayWrite}} is not null, so in
that case the behavior is the same as before PHOENIX-5494. This differs from
the patch for the 4.x branches, as the following code in the
{{CachedLocalState}} class on the master branch shows.
So I think the index rebuild is OK on master.
{code:java}
@Override
public List<Cell> getCurrentRowState(
        Mutation mutation,
        Collection<? extends ColumnReference> columnReferences,
        boolean ignoreNewerMutations) throws IOException {
    if (ignoreNewerMutations) {
        return doScan(mutation, columnReferences);
    }
    byte[] rowKey = mutation.getRow();
    return this.rowKeyPtrToCells.get(new ImmutableBytesPtr(rowKey));
}

private List<Cell> doScan(
        Mutation mutation,
        Collection<? extends ColumnReference> columnReferences) throws IOException {
    byte[] rowKey = mutation.getRow();
    // need to use a scan here so we can get raw state, which Get doesn't provide.
    Scan scan = IndexManagementUtil.newLocalStateScan(
            Collections.singletonList(columnReferences));
    scan.setStartRow(rowKey);
    scan.setStopRow(rowKey);
    // Provides a means of client indicating that newer cells should not be considered,
    // enabling mutations to be replayed to partially rebuild the index when a write fails.
    // When replaying mutations we want the oldest timestamp (as anything newer we be replayed)
    //long ts = getOldestTimestamp(m.getFamilyCellMap().values());
    long ts = getMutationTimestampWhenAllCellTimestampIsSame(mutation);
    scan.setTimeRange(0, ts);
    try (RegionScanner regionScanner = region.getScanner(scan)) {
        List<Cell> cells = new ArrayList<Cell>(1);
        boolean more = regionScanner.next(cells);
        assert !more : "Got more than one result when scanning"
                + " a single row in the primary table!";
        return cells;
    }
}
{code}
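To make the two read paths easier to compare, here is a minimal, self-contained sketch that models them with plain Java collections instead of HBase types. Every name in it ({{RowStateLookupSketch}}, {{Version}}, {{preScan}}, {{scanBefore}}) is hypothetical and exists only for illustration; it is not the Phoenix implementation. It assumes the upper bound of the time range is exclusive, as with {{Scan.setTimeRange}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model only; none of these names exist in Phoenix or HBase.
class RowStateLookupSketch {
    /** One stored version of a row: a value written at a timestamp. */
    static final class Version {
        final long timestamp;
        final String value;

        Version(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Stand-in for the region: every version of every row.
    private final Map<String, List<Version>> region = new HashMap<>();
    // Stand-in for the batch cache that a single pre-scan fills.
    private final Map<String, List<Version>> preScanned = new HashMap<>();

    void put(String row, long timestamp, String value) {
        region.computeIfAbsent(row, k -> new ArrayList<>())
              .add(new Version(timestamp, value));
    }

    /** Reads all rows of a batch once and caches a snapshot of their versions. */
    void preScan(Collection<String> rows) {
        for (String row : rows) {
            preScanned.put(row, new ArrayList<>(
                    region.getOrDefault(row, Collections.<Version>emptyList())));
        }
    }

    /** Replay path: per-row, time-bounded read. Normal path: cache lookup. */
    List<Version> getCurrentRowState(String row, long mutationTimestamp,
                                     boolean ignoreNewerMutations) {
        if (ignoreNewerMutations) {
            return scanBefore(row, mutationTimestamp);
        }
        return preScanned.getOrDefault(row, Collections.<Version>emptyList());
    }

    /** Returns only versions strictly older than the (exclusive) upper bound. */
    private List<Version> scanBefore(String row, long upperBoundExclusive) {
        List<Version> result = new ArrayList<>();
        for (Version v : region.getOrDefault(row, Collections.<Version>emptyList())) {
            if (v.timestamp < upperBoundExclusive) {
                result.add(v);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RowStateLookupSketch state = new RowStateLookupSketch();
        state.put("r1", 10L, "a");
        state.put("r1", 20L, "b");
        state.preScan(Arrays.asList("r1"));

        int normal = state.getCurrentRowState("r1", 20L, false).size();
        int replay = state.getCurrentRowState("r1", 20L, true).size();
        System.out.println("normal=" + normal + " replay=" + replay); // normal=2 replay=1
    }
}
```

In this model the replay path never consults the pre-scanned cache, which mirrors the point above: when {{ignoreNewerMutations}} is true, the row state comes from a dedicated per-row read bounded by the mutation timestamp.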
> Mutable global index rebuilds are incorrect after PHOENIX-5494
> --------------------------------------------------------------
>
> Key: PHOENIX-5736
> URL: https://issues.apache.org/jira/browse/PHOENIX-5736
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 5.0.0
> Reporter: Kadir OZDEMIR
> Priority: Critical
> Attachments: skipScanTest.txt
>
>
> PHOENIX-5494 uses skip scans to improve write performance for tables with
> indexes. Before this Jira, a separate scanner was opened for each data table
> mutation to read all versions and delete markers for the row to be
> mutated. With this Jira, a single scanner is opened using a raw scan with a
> skip scan filter to read all versions and delete markers of all the rows in a
> batch. Reading existing data table rows is required to generate index updates.
> However, I have discovered that a raw scan with a skip scan filter does not
> return all raw versions. This means that after PHOENIX-5494, index rebuilds
> for global indexes will not be correct.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)