[
https://issues.apache.org/jira/browse/PHOENIX-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ankit Singhal updated PHOENIX-6334:
-----------------------------------
Fix Version/s: (was: 4.x)
               (was: 4.16.0)
               (was: 5.1.0)
> All map tasks should operate on the same restored snapshot
> ----------------------------------------------------------
>
> Key: PHOENIX-6334
> URL: https://issues.apache.org/jira/browse/PHOENIX-6334
> Project: Phoenix
> Issue Type: Bug
> Components: core
> Affects Versions: 5.0.0, 4.14.3
> Reporter: Saksham Gangwar
> Assignee: Rushabh Shah
> Priority: Major
>
> Recently we switched an MR application from scanning live tables to scanning
> snapshots (PHOENIX-3744). We ran into a severe performance issue, which
> turned out to be a correctness issue caused by the generation of overlapping
> scan splits. After some debugging we found that it had already been fixed
> via PHOENIX-4997.
> We also *need not restore the snapshot per map task*; the purpose of this
> Jira is to correct that behavior. Currently, we restore the snapshot once per
> map task into a temp directory. For large tables on big clusters, this
> creates a storm of NameNode (NN) RPCs. Instead, we can restore once per job
> and let all the map tasks operate on the same restored snapshot. HBase
> already did this in HBASE-18806; we can do something similar.
>
> All other performance suggestions are listed here:
> https://issues.apache.org/jira/browse/PHOENIX-6081
>
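For illustration, the difference between restoring per map task and restoring once per job, as described above, can be sketched with a toy model. This is not Phoenix or HBase code: the class SnapshotRestoreSketch and the methods restoreSnapshot, scan, runWithPerTaskRestore and runWithPerJobRestore are hypothetical names, the /tmp paths are made up, and each simulated restore merely stands in for the batch of NN RPCs a real restore would issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: no HBase or Phoenix classes are used, all names are hypothetical.
class SnapshotRestoreSketch {

    // Counts simulated restores; each one stands in for the NameNode RPCs
    // that a real snapshot restore would generate.
    static final AtomicInteger restoreCount = new AtomicInteger();

    // Hypothetical stand-in for restoring a snapshot into a directory.
    static String restoreSnapshot(String snapshotName, String restoreDir) {
        restoreCount.incrementAndGet();
        return restoreDir + "/" + snapshotName;
    }

    // Placeholder for a map task scanning its split of the restored snapshot.
    static void scan(String restoredDir, int taskId) {
        // Intentionally empty: the sketch only counts restores.
    }

    // Behavior described as current: every map task restores its own copy.
    static int runWithPerTaskRestore(String snapshotName, int numMapTasks) {
        restoreCount.set(0);
        for (int task = 0; task < numMapTasks; task++) {
            String dir = restoreSnapshot(snapshotName, "/tmp/restore-task-" + task);
            scan(dir, task);
        }
        return restoreCount.get();
    }

    // Behavior proposed: restore once at job setup, then share the directory.
    static int runWithPerJobRestore(String snapshotName, int numMapTasks) {
        restoreCount.set(0);
        String sharedDir = restoreSnapshot(snapshotName, "/tmp/restore-job");
        for (int task = 0; task < numMapTasks; task++) {
            scan(sharedDir, task);
        }
        return restoreCount.get();
    }

    public static void main(String[] args) {
        int tasks = 1000;
        System.out.println("per-task restores: " + runWithPerTaskRestore("snap", tasks));
        System.out.println("per-job restores: " + runWithPerJobRestore("snap", tasks));
    }
}
```

In this model the number of restores grows linearly with the number of map tasks in the first variant and stays at one in the second. How the shared restore directory would be handed to the tasks in a real job (for example through the job configuration) is left open here.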