[ 
https://issues.apache.org/jira/browse/PHOENIX-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877620#comment-15877620
 ] 

James Taylor commented on PHOENIX-3536:
---------------------------------------

Agreed, [~sergey.soldatov]. Good catch. I'm also concerned with the copy/paste 
from ScanPlan. I think the idea of the patch is good, but the implementation 
can be improved. Perhaps QueryPlan (or ScanPlan) can be made serializable. 
Also, why is the overhead so high for recompiling the plan? Is it the creation 
of the HConnection? I think we need more data on that.

> Remove creating unnecessary phoenix connections in MR Tasks of Hive
> -------------------------------------------------------------------
>
>                 Key: PHOENIX-3536
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3536
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Jeongdae Kim
>            Assignee: Jeongdae Kim
>              Labels: HivePhoenix
>         Attachments: PHOENIX-3536.1.patch
>
>
> PhoenixStorageHandler creates Phoenix connections to build a QueryPlan in 
> both the getSplit phase (MR preparation) and the getRecordReader phase (Map) 
> while running an MR job.
> In Phoenix, creating the first connection (QueryServices) for a specific URL 
> takes a long time, because it checks and loads the Phoenix schema 
> information.
> I found that it is possible to avoid creating the query plan again in the Map 
> phase (getRecordReader()) by serializing the QueryPlan created by the input 
> format and passing this plan to the record reader.
> This approach improves scan performance by removing the unnecessary 
> connection attempt in the Map phase.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
