[ https://issues.apache.org/jira/browse/PHOENIX-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856157#comment-16856157 ]

Karan Mehta commented on PHOENIX-5313:
--------------------------------------

If the original scan maps to a range that crosses several regions, we currently 
launch a container for each region and pass the same scan object to all of 
them. Each scan is intersected with its region's boundaries, and 
correspondingly smaller scans are issued to HBase. Also, region boundaries can 
change between the time the MR job is configured and the time it actually runs 
(due to scheduling); that is probably the main reason for doing it this way, so 
that results remain correct. It can still result in scans running over 
boundaries and throwing exceptions, which need to be handled well.
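The boundary intersection described above amounts to clipping the scan's row 
range to each region's [start, end) range. A minimal sketch of that logic, 
assuming HBase's convention that an empty byte[] means an unbounded row bound 
(illustrative only, not the actual Phoenix/HBase implementation):

{code:java}
import java.util.Arrays;

public class RangeClip {

    // Clip a scan's [startRow, stopRow) range to a region's
    // [regionStart, regionEnd) boundaries. Returns null when the scan
    // does not overlap the region at all. Empty byte[] follows the
    // HBase convention of "unbounded". Illustrative sketch only.
    public static byte[][] clip(byte[] startRow, byte[] stopRow,
                                byte[] regionStart, byte[] regionEnd) {
        byte[] lo = maxLow(startRow, regionStart);
        byte[] hi = minHigh(stopRow, regionEnd);
        // Only a bounded pair can produce an empty intersection.
        if (lo.length > 0 && hi.length > 0
                && Arrays.compareUnsigned(lo, hi) >= 0) {
            return null;
        }
        return new byte[][] { lo, hi };
    }

    // Larger of two low bounds; empty means negative infinity.
    private static byte[] maxLow(byte[] a, byte[] b) {
        if (a.length == 0) return b;
        if (b.length == 0) return a;
        return Arrays.compareUnsigned(a, b) >= 0 ? a : b;
    }

    // Smaller of two high bounds; empty means positive infinity.
    private static byte[] minHigh(byte[] a, byte[] b) {
        if (a.length == 0) return b;
        if (b.length == 0) return a;
        return Arrays.compareUnsigned(a, b) <= 0 ? a : b;
    }
}
{code}

A full-table scan (both bounds empty) clipped against a region simply inherits 
that region's boundaries, which is why passing one scan object to every 
container still yields correct per-region scans.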

To ease this, a consistent view should always be used, which snapshots can 
provide.

I raised this issue for MR over snapshots as well in PHOENIX-4009: even though 
it reads the manifest rather than .META, it potentially does so once per 
container. MR over snapshots also restores the snapshot for each container. 
Snapshots are lightweight, but they still temporarily create lots of small 
folders and files, which can put unnecessary load on the NameNode.

> All mappers grab all RegionLocations from .META
> -----------------------------------------------
>
>                 Key: PHOENIX-5313
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5313
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Geoffrey Jacoby
>            Assignee: Chinmay Kulkarni
>            Priority: Major
>
> Phoenix's MapReduce integration lives in PhoenixInputFormat. It implements 
> getSplits by calculating a QueryPlan for the provided SELECT query, and each 
> split gets a mapper. As part of this QueryPlan generation, we grab all 
> RegionLocations from .META
> In PhoenixInputFormat:getQueryPlan: 
> {code:java}
>  // Initialize the query plan so it sets up the parallel scans
>  queryPlan.iterator(MapReduceParallelScanGrouper.getInstance());
> {code}
> In MapReduceParallelScanGrouper.getRegionBoundaries()
> {code:java}
> return context.getConnection().getQueryServices().getAllTableRegions(tableName);
> {code}
> This is fine.
> Unfortunately, each mapper Task spawned by the job will go through this 
> _same_ exercise. It will pass a MapReduceParallelScanGrouper to 
> queryPlan.iterator(), which I believe is eventually causing 
> getRegionBoundaries to get called when the scans are initialized in the 
> result iterator.
> Since HBase 1.x removed .META prefetching and caching from the HBase client, 
> not only will each _Job_ make potentially thousands of calls to .META; 
> potentially thousands of _Tasks_ will each make potentially thousands of 
> calls to .META as well. 
> We should get a QueryPlan and setup the scans without having to read all 
> RegionLocations, either by using the mapper's internal knowledge of its split 
> key range, or by serializing the query plan from the client and sending it to 
> the mapper tasks for use there. 
> Note that MapReduce tasks over snapshots are not affected by this, because 
> region locations are stored in the snapshot manifest. 
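One way to avoid the per-task .META reads described in the issue would be to 
compute the split boundaries once on the client and ship them to the tasks 
through the job configuration. A minimal sketch of such a codec; the class and 
method names are assumptions for illustration, not Phoenix's actual API:

{code:java}
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

public class SplitKeyCodec {

    // Encode region split keys into a single string suitable for a
    // Configuration property, so mappers can recover the boundaries
    // without contacting .META. (Hypothetical helper, not Phoenix code.)
    public static String encode(List<byte[]> splitKeys) {
        StringBuilder sb = new StringBuilder();
        Base64.Encoder enc = Base64.getEncoder();
        for (byte[] key : splitKeys) {
            if (sb.length() > 0) sb.append(',');
            sb.append(enc.encodeToString(key));
        }
        return sb.toString();
    }

    // Inverse of encode(); called from the mapper side.
    public static List<byte[]> decode(String encoded) {
        List<byte[]> keys = new ArrayList<>();
        if (encoded.isEmpty()) return keys;
        Base64.Decoder dec = Base64.getDecoder();
        for (String part : encoded.split(",", -1)) {
            keys.add(dec.decode(part));
        }
        return keys;
    }
}
{code}

Serializing the whole query plan, as the issue suggests, would subsume this, 
but shipping just the boundaries is the smaller change and mirrors what the 
snapshot manifest already provides for snapshot-based jobs.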



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
