[
https://issues.apache.org/jira/browse/PHOENIX-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868097#comment-16868097
]
Chinmay Kulkarni commented on PHOENIX-5313:
-------------------------------------------
[~tdsilva] [~vincentpoon] thanks for the feedback. Will attach a patch for
Hadoop QA to run.
[~vincentpoon] Thanks for pointing out the corner case! That would definitely
be a problem since, as you mentioned, the issued scans would not match the
queryPlan embedded in the mappers' iterators/ResultSet. We could potentially
miss some scans, or look for more than we actually require, since we check the
size of the original scan list. The resolved table would be as per the new
queryPlan, and there could be a mismatch there as well (considering the index
creation case you mentioned). Other intermediary metadata changes could have
similar repercussions.
I also fully agree that the mappers don't need all of the fields/information
inside the QueryPlan. Passing a subset of the QueryPlan's information to the
mappers, without having them regenerate the plans, seems like the best way
forward. I will create a follow-up JIRA for this overall improvement to the
Phoenix MR/Phoenix-Spark modules.
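As a rough illustration of that follow-up idea, the driver could serialize just the scan boundaries each mapper needs into its InputSplit, so the mapper never has to regenerate the QueryPlan (and never re-reads .META). This is only a hedged sketch: `PhoenixSplitInfo` and its fields are hypothetical names for illustration, not existing Phoenix API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical minimal per-split payload (not Phoenix API): just the key
// range this mapper's scan should cover, serialized Writable-style the way
// Hadoop InputSplits are.
public class PhoenixSplitInfo {
    private byte[] startRow; // inclusive scan start key for this split
    private byte[] stopRow;  // exclusive scan stop key for this split

    public PhoenixSplitInfo() {}

    public PhoenixSplitInfo(byte[] startRow, byte[] stopRow) {
        this.startRow = startRow;
        this.stopRow = stopRow;
    }

    // Writable-style write: length-prefixed byte arrays.
    public void write(DataOutput out) throws IOException {
        out.writeInt(startRow.length);
        out.write(startRow);
        out.writeInt(stopRow.length);
        out.write(stopRow);
    }

    // Writable-style readFields: mirror of write().
    public void readFields(DataInput in) throws IOException {
        startRow = new byte[in.readInt()];
        in.readFully(startRow);
        stopRow = new byte[in.readInt()];
        in.readFully(stopRow);
    }

    public byte[] getStartRow() { return startRow; }
    public byte[] getStopRow() { return stopRow; }

    public static void main(String[] args) throws IOException {
        PhoenixSplitInfo original =
            new PhoenixSplitInfo(new byte[] {0x01}, new byte[] {0x7f});

        // Round-trip through the Writable-style methods.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bos));

        PhoenixSplitInfo copy = new PhoenixSplitInfo();
        copy.readFields(new DataInputStream(
            new ByteArrayInputStream(bos.toByteArray())));

        if (!Arrays.equals(original.getStartRow(), copy.getStartRow())
                || !Arrays.equals(original.getStopRow(), copy.getStopRow())) {
            throw new AssertionError("round-trip mismatch");
        }
        System.out.println("round-trip ok");
    }
}
```

With a payload like this, the mapper builds its Scan directly from the deserialized keys instead of calling queryPlan.iterator() and resolving region locations again.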
> All mappers grab all RegionLocations from .META
> -----------------------------------------------
>
> Key: PHOENIX-5313
> URL: https://issues.apache.org/jira/browse/PHOENIX-5313
> Project: Phoenix
> Issue Type: Bug
> Reporter: Geoffrey Jacoby
> Assignee: Chinmay Kulkarni
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Phoenix's MapReduce integration lives in PhoenixInputFormat. It implements
> getSplits by calculating a QueryPlan for the provided SELECT query, and each
> split gets a mapper. As part of this QueryPlan generation, we grab all
> RegionLocations from .META
> In PhoenixInputFormat:getQueryPlan:
> {code:java}
> // Initialize the query plan so it sets up the parallel scans
> queryPlan.iterator(MapReduceParallelScanGrouper.getInstance());
> {code}
> In MapReduceParallelScanGrouper.getRegionBoundaries()
> {code:java}
> return context.getConnection().getQueryServices().getAllTableRegions(tableName);
> {code}
> This is fine.
> Unfortunately, each mapper Task spawned by the job will go through this
> _same_ exercise. It will pass a MapReduceParallelScanGrouper to
> queryPlan.iterator(), which I believe eventually causes
> getRegionBoundaries to be called when the scans are initialized in the
> result iterator.
> Since HBase 1.x and up got rid of .META prefetching and caching within the
> HBase client, not only will each _Job_ make potentially thousands of calls
> to .META, but potentially thousands of _Tasks_ will each make potentially
> thousands of calls to .META (e.g., 1,000 tasks over a 1,000-region table
> would issue on the order of a million .META lookups per job).
> We should get a QueryPlan and set up the scans without having to read all
> RegionLocations, either by using the mapper's internal knowledge of its split
> key range, or by serializing the query plan from the client and sending it to
> the mapper tasks for use there.
> Note that MapReduce tasks over snapshots are not affected by this, because
> region locations are stored in the snapshot manifest.
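The first alternative in the description, using the mapper's internal knowledge of its split key range, can be sketched as follows. This is a minimal illustration under the assumption that each mapper already knows its split's start/stop keys; `SplitRangeClamp` and all names are hypothetical, and the comparison follows the HBase convention that an empty stop key means "unbounded".

```java
import java.util.Arrays;

// Hypothetical helper (not Phoenix API): clamp the query's overall scan
// range to this mapper's split range locally, with no .META lookup.
public class SplitRangeClamp {

    // Unsigned lexicographic byte comparison, like HBase's Bytes.compareTo.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Empty stop key means "scan to the end of the table".
    static boolean unbounded(byte[] stop) { return stop.length == 0; }

    // Returns {start, stop} of the intersection, or null if the query range
    // and the split range do not overlap (that mapper has nothing to scan).
    static byte[][] clamp(byte[] qStart, byte[] qStop,
                          byte[] sStart, byte[] sStop) {
        // Start of the intersection is the larger of the two start keys.
        byte[] start = compare(qStart, sStart) >= 0 ? qStart : sStart;
        // Stop is the smaller of the two stop keys, honoring "unbounded".
        byte[] stop;
        if (unbounded(qStop)) stop = sStop;
        else if (unbounded(sStop)) stop = qStop;
        else stop = compare(qStop, sStop) <= 0 ? qStop : sStop;
        if (!unbounded(stop) && compare(start, stop) >= 0) return null;
        return new byte[][] { start, stop };
    }

    public static void main(String[] args) {
        // Query covers [b, f); this mapper's split covers [d, h).
        byte[][] r = clamp(new byte[]{'b'}, new byte[]{'f'},
                           new byte[]{'d'}, new byte[]{'h'});
        // Intersection is [d, f).
        System.out.println(Arrays.toString(r[0]) + " -> "
                           + Arrays.toString(r[1]));
    }
}
```

Each mapper would apply this clamp to derive its own scan boundaries, so no task needs the full RegionLocation list from .META.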
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)