[ https://issues.apache.org/jira/browse/PHOENIX-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ankit Singhal updated PHOENIX-6334:
-----------------------------------
    Fix Version/s:     (was: 4.x)
                       (was: 4.16.0)
                       (was: 5.1.0)

> All map tasks should operate on the same restored snapshot
> ----------------------------------------------------------
>
>                 Key: PHOENIX-6334
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6334
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.0.0, 4.14.3
>            Reporter: Saksham Gangwar
>            Assignee: Rushabh Shah
>            Priority: Major
>
> Recently we switched an MR application from scanning live tables to scanning
> snapshots (PHOENIX-3744). We ran into a severe performance issue, which
> turned out to be a correctness issue caused by the generation of overlapping
> scan splits. After some debugging we found that it had already been fixed by
> PHOENIX-4997.
> We also *need not restore the snapshot per map task*. The purpose of this
> Jira is to correct that behavior. Currently, we restore the snapshot once per
> map task into a temp directory. For large tables on big clusters, this
> creates a storm of NN (NameNode) RPCs. We can instead restore once per job
> and let all the map tasks operate on the same restored snapshot. HBase
> already did this in HBASE-18806; we can do something similar.
>
> All other performance suggestions are here:
> https://issues.apache.org/jira/browse/PHOENIX-6081
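
The difference between restoring per map task and restoring once per job can be sketched as below. This is only an illustrative simulation, not Phoenix or HBase code: the class SnapshotRestoreSketch, the method restoreSnapshot, and the /tmp/restore paths are all hypothetical, and each simulated restore simply stands in for the burst of NameNode RPCs that a real restore would issue.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.atomic.AtomicInteger;

public class SnapshotRestoreSketch {
    // Counts simulated restores; each one stands in for a burst of NameNode RPCs.
    static final AtomicInteger restoreCount = new AtomicInteger();

    // Hypothetical stand-in for restoring a snapshot into a directory.
    // A real implementation would create reference files on HDFS here.
    static Path restoreSnapshot(String snapshotName, Path restoreDir) {
        restoreCount.incrementAndGet();
        return restoreDir.resolve(snapshotName);
    }

    // Behavior described in the issue: every map task restores the snapshot
    // into its own temp directory before scanning.
    static int restoresPerTask(String snapshotName, int numMapTasks) {
        restoreCount.set(0);
        for (int task = 0; task < numMapTasks; task++) {
            Path taskDir = Paths.get("/tmp/restore", "task-" + task);
            restoreSnapshot(snapshotName, taskDir);
        }
        return restoreCount.get();
    }

    // Proposed behavior: restore once at job setup, then hand the same
    // restored directory to every map task.
    static int restoresPerJob(String snapshotName, int numMapTasks) {
        restoreCount.set(0);
        Path sharedDir = restoreSnapshot(snapshotName, Paths.get("/tmp/restore", "job"));
        for (int task = 0; task < numMapTasks; task++) {
            // Each task only reads from the shared directory; no restore happens here.
            if (!sharedDir.endsWith(snapshotName)) {
                throw new IllegalStateException("unexpected restore directory");
            }
        }
        return restoreCount.get();
    }

    public static void main(String[] args) {
        System.out.println("restores (per task): " + restoresPerTask("snap", 1000));
        System.out.println("restores (per job):  " + restoresPerJob("snap", 1000));
    }
}
```

With 1000 map tasks the per-task variant performs 1000 simulated restores while the per-job variant performs one, which is the reduction in NameNode load the issue is asking for.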