[ https://issues.apache.org/jira/browse/PHOENIX-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667612#comment-16667612 ]
ASF GitHub Bot commented on PHOENIX-4997:
-----------------------------------------
Github user twdsilva commented on a diff in the pull request:
https://github.com/apache/phoenix/pull/397#discussion_r229055918
--- Diff: phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java ---
@@ -80,18 +83,39 @@ public boolean shouldStartNewScan(QueryPlan plan, List<Scan> scans,
}
}
+ /**
+ * Get list of region locations from SnapshotManifest
+ * BaseResultIterators assume that regions are sorted using RegionInfo.COMPARATOR
+ */
private List<HRegionLocation> getRegionLocationsFromManifest(SnapshotManifest manifest) {
List<SnapshotRegionManifest> regionManifests = manifest.getRegionManifests();
Preconditions.checkNotNull(regionManifests);
- List<HRegionLocation> regionLocations = Lists.newArrayListWithCapacity(regionManifests.size());
+ List<RegionInfo> regionInfos = Lists.newArrayListWithCapacity(regionManifests.size());
+ List<HRegionLocation> hRegionLocations = Lists.newArrayListWithCapacity(regionManifests.size());
for (SnapshotRegionManifest regionManifest : regionManifests) {
- regionLocations.add(new HRegionLocation(
- ProtobufUtil.toRegionInfo(regionManifest.getRegionInfo()), null));
+ RegionInfo regionInfo = ProtobufUtil.toRegionInfo(regionManifest.getRegionInfo());
+ if (isValidRegion(regionInfo)) {
+ regionInfos.add(regionInfo);
+ }
+ }
+
+ regionInfos.sort(RegionInfo.COMPARATOR);
+
+ for (RegionInfo regionInfo : regionInfos) {
+ hRegionLocations.add(new HRegionLocation(regionInfo, null));
}
- return regionLocations;
+ return hRegionLocations;
+ }
+
+ // Exclude offline split parent regions
+ private boolean isValidRegion(RegionInfo hri) {
--- End diff --
Maybe extract this to a util, since it's used in two classes.
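A rough sketch of what such a shared helper might look like, in case it helps the discussion. It is self-contained so it runs without HBase on the classpath: `Region` is a hypothetical stand-in for `RegionInfo`, the class and method names are made up, and ordering by start key alone is a simplification of `RegionInfo.COMPARATOR`. The validity check assumes "offline split parent" means both flags are set.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical utility; not part of the Phoenix or HBase code base.
class SnapshotRegionUtilSketch {

    /** Hypothetical stand-in for RegionInfo, holding only what this sketch needs. */
    public static final class Region {
        public final String startKey;
        public final boolean offline;
        public final boolean splitParent;

        public Region(String startKey, boolean offline, boolean splitParent) {
            this.startKey = startKey;
            this.offline = offline;
            this.splitParent = splitParent;
        }
    }

    /** Exclude offline split parent regions (assumed meaning: both flags set). */
    public static boolean isValidRegion(Region region) {
        return !(region.offline && region.splitParent);
    }

    /**
     * Keep only valid regions and return them sorted. Sorting by start key is
     * an assumption standing in for RegionInfo.COMPARATOR.
     */
    public static List<Region> validSortedRegions(List<Region> regions) {
        List<Region> valid = new ArrayList<>(regions.size());
        for (Region region : regions) {
            if (isValidRegion(region)) {
                valid.add(region);
            }
        }
        valid.sort(Comparator.comparing((Region r) -> r.startKey));
        return valid;
    }
}
```

Both callers could then share the filter and the sort instead of each keeping a private copy of `isValidRegion`.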
> Phoenix MR on snapshots can produce duplicate rows
> --------------------------------------------------
>
> Key: PHOENIX-4997
> URL: https://issues.apache.org/jira/browse/PHOENIX-4997
> Project: Phoenix
> Issue Type: Bug
> Reporter: Karan Mehta
> Assignee: Karan Mehta
> Priority: Major
> Attachments: PHOENIX-4997.master.001.patch
>
>
> Phoenix MR over snapshots uses the TableSnapshotResultIterator and
> SnapshotScanner classes to iterate and scan over snapshots. They were
> copied from the HBase classes TableSnapshotScanner and
> ClientSideRegionScanner and modified to fit Phoenix's requirements. This
> was done because some of the fields of these classes were private, so the
> classes could not be reused directly. HBASE-8369 is the main Jira.
> The framework had a bug that was fixed as part of HBASE-16011. However, the
> fix was not ported to Phoenix, so Phoenix MR over snapshots still has the
> bug. This Jira is to fix that issue.
> FYI [~akshita.malhotra]