[jira] [Updated] (PHOENIX-6081) Improvements to snapshot based MR input format
[ https://issues.apache.org/jira/browse/PHOENIX-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated PHOENIX-6081: -- Fix Version/s: (was: 4.16.1) 4.16.2 > Improvements to snapshot based MR input format > -- > > Key: PHOENIX-6081 > URL: https://issues.apache.org/jira/browse/PHOENIX-6081 > Project: Phoenix > Issue Type: Improvement > Components: core >Affects Versions: 5.0.0, 4.14.3 >Reporter: Bharath Vissapragada >Priority: Major > Fix For: 4.17.0, 4.16.2 > > > Recently we switched an MR application from scanning live tables to scanning > snapshots (PHOENIX-3744). We ran into a severe performance issue, which > turned out to a correctness issue due to over-lapping scan splits generation. > After some debugging we figured that it has been fixed via PHOENIX-4997. Even > with that fix there are quite a few things we could improve about the > snapshot based input format. Listing them here, perhaps we can break them > into subtasks as needed. > - Do not restore the snapshot per map task. Currently we restore the snapshot > once per map task into a temp directory. For large tables on big clusters, > this creates a storm of NN RPCs. We can do this once per job and let all the > map tasks operate on the same restored snapshot. HBase already did this via > HBASE-18806, we can do something similar. > - Disable > [cacheBlocks|[https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-]] > on scans generated by input format. In our experiments block cache took a > lot of memory for MR jobs. For full table scans this isn't of much use and > can save a lot of memory. > - Short circuit live-table codepaths when snapshots are enabled. Currently > some codepaths make live table HBase RPCs to get a bunch of data. For example > {noformat} > private List generateSplits(final QueryPlan qplan, Configuration > config) throws IOException { > // We must call this in order to initialize the scans and splits from the > query plan > > // Get the RegionSizeCalculator > try(org.apache.hadoop.hbase.client.Connection connection = > > HBaseFactoryProvider.getHConnectionFactory().createConnection(config)) { > RegionLocator regionLocator = > connection.getRegionLocator(TableName.valueOf(tableName)); > RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(regionLocator, > connection > .getAdmin()); {noformat} > This defeats the purpose of using snapshots. Refactor the code in a way that > the snapshot based codepaths do minimal HBase RPCs and rely solely on > snapshot manifest. Even better, improve locality of task scheduling based on > snapshot's hfile block locations. > - Disable indexes for query plan for scanning over snapshots. If there is an > index based access path, getScans() can potentially return index based splits > which is not what we want for snapshots. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (PHOENIX-6081) Improvements to snapshot based MR input format
[ https://issues.apache.org/jira/browse/PHOENIX-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinmay Kulkarni updated PHOENIX-6081: -- Affects Version/s: (was: master) 5.0.0 > Improvements to snapshot based MR input format > -- > > Key: PHOENIX-6081 > URL: https://issues.apache.org/jira/browse/PHOENIX-6081 > Project: Phoenix > Issue Type: Improvement > Components: core >Affects Versions: 5.0.0, 4.14.3 >Reporter: Bharath Vissapragada >Priority: Major > > Recently we switched an MR application from scanning live tables to scanning > snapshots (PHOENIX-3744). We ran into a severe performance issue, which > turned out to a correctness issue due to over-lapping scan splits generation. > After some debugging we figured that it has been fixed via PHOENIX-4997. Even > with that fix there are quite a few things we could improve about the > snapshot based input format. Listing them here, perhaps we can break them > into subtasks as needed. > - Do not restore the snapshot per map task. Currently we restore the snapshot > once per map task into a temp directory. For large tables on big clusters, > this creates a storm of NN RPCs. We can do this once per job and let all the > map tasks operate on the same restored snapshot. HBase already did this via > HBASE-18806, we can do something similar. > - Disable > [cacheBlocks|[https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-]] > on scans generated by input format. In our experiments block cache took a > lot of memory for MR jobs. For full table scans this isn't of much use and > can save a lot of memory. > - Short circuit live-table codepaths when snapshots are enabled. Currently > some codepaths make live table HBase RPCs to get a bunch of data. For example > {noformat} > private List generateSplits(final QueryPlan qplan, Configuration > config) throws IOException { > // We must call this in order to initialize the scans and splits from the > query plan > > // Get the RegionSizeCalculator > try(org.apache.hadoop.hbase.client.Connection connection = > > HBaseFactoryProvider.getHConnectionFactory().createConnection(config)) { > RegionLocator regionLocator = > connection.getRegionLocator(TableName.valueOf(tableName)); > RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(regionLocator, > connection > .getAdmin()); {noformat} > This defeats the purpose of using snapshots. Refactor the code in a way that > the snapshot based codepaths do minimal HBase RPCs and rely solely on > snapshot manifest. Even better, improve locality of task scheduling based on > snapshot's hfile block locations. > - Disable indexes for query plan for scanning over snapshots. If there is an > index based access path, getScans() can potentially return index based splits > which is not what we want for snapshots. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (PHOENIX-6081) Improvements to snapshot based MR input format
[ https://issues.apache.org/jira/browse/PHOENIX-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinmay Kulkarni updated PHOENIX-6081: -- Fix Version/s: 4.16.1 > Improvements to snapshot based MR input format > -- > > Key: PHOENIX-6081 > URL: https://issues.apache.org/jira/browse/PHOENIX-6081 > Project: Phoenix > Issue Type: Improvement > Components: core >Affects Versions: 5.0.0, 4.14.3 >Reporter: Bharath Vissapragada >Priority: Major > Fix For: 4.16.1 > > > Recently we switched an MR application from scanning live tables to scanning > snapshots (PHOENIX-3744). We ran into a severe performance issue, which > turned out to a correctness issue due to over-lapping scan splits generation. > After some debugging we figured that it has been fixed via PHOENIX-4997. Even > with that fix there are quite a few things we could improve about the > snapshot based input format. Listing them here, perhaps we can break them > into subtasks as needed. > - Do not restore the snapshot per map task. Currently we restore the snapshot > once per map task into a temp directory. For large tables on big clusters, > this creates a storm of NN RPCs. We can do this once per job and let all the > map tasks operate on the same restored snapshot. HBase already did this via > HBASE-18806, we can do something similar. > - Disable > [cacheBlocks|[https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-]] > on scans generated by input format. In our experiments block cache took a > lot of memory for MR jobs. For full table scans this isn't of much use and > can save a lot of memory. > - Short circuit live-table codepaths when snapshots are enabled. Currently > some codepaths make live table HBase RPCs to get a bunch of data. For example > {noformat} > private List generateSplits(final QueryPlan qplan, Configuration > config) throws IOException { > // We must call this in order to initialize the scans and splits from the > query plan > > // Get the RegionSizeCalculator > try(org.apache.hadoop.hbase.client.Connection connection = > > HBaseFactoryProvider.getHConnectionFactory().createConnection(config)) { > RegionLocator regionLocator = > connection.getRegionLocator(TableName.valueOf(tableName)); > RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(regionLocator, > connection > .getAdmin()); {noformat} > This defeats the purpose of using snapshots. Refactor the code in a way that > the snapshot based codepaths do minimal HBase RPCs and rely solely on > snapshot manifest. Even better, improve locality of task scheduling based on > snapshot's hfile block locations. > - Disable indexes for query plan for scanning over snapshots. If there is an > index based access path, getScans() can potentially return index based splits > which is not what we want for snapshots. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (PHOENIX-6081) Improvements to snapshot based MR input format
[ https://issues.apache.org/jira/browse/PHOENIX-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinmay Kulkarni updated PHOENIX-6081: -- Fix Version/s: 4.17.0 > Improvements to snapshot based MR input format > -- > > Key: PHOENIX-6081 > URL: https://issues.apache.org/jira/browse/PHOENIX-6081 > Project: Phoenix > Issue Type: Improvement > Components: core >Affects Versions: 5.0.0, 4.14.3 >Reporter: Bharath Vissapragada >Priority: Major > Fix For: 4.16.1, 4.17.0 > > > Recently we switched an MR application from scanning live tables to scanning > snapshots (PHOENIX-3744). We ran into a severe performance issue, which > turned out to a correctness issue due to over-lapping scan splits generation. > After some debugging we figured that it has been fixed via PHOENIX-4997. Even > with that fix there are quite a few things we could improve about the > snapshot based input format. Listing them here, perhaps we can break them > into subtasks as needed. > - Do not restore the snapshot per map task. Currently we restore the snapshot > once per map task into a temp directory. For large tables on big clusters, > this creates a storm of NN RPCs. We can do this once per job and let all the > map tasks operate on the same restored snapshot. HBase already did this via > HBASE-18806, we can do something similar. > - Disable > [cacheBlocks|[https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-]] > on scans generated by input format. In our experiments block cache took a > lot of memory for MR jobs. For full table scans this isn't of much use and > can save a lot of memory. > - Short circuit live-table codepaths when snapshots are enabled. Currently > some codepaths make live table HBase RPCs to get a bunch of data. For example > {noformat} > private List generateSplits(final QueryPlan qplan, Configuration > config) throws IOException { > // We must call this in order to initialize the scans and splits from the > query plan > > // Get the RegionSizeCalculator > try(org.apache.hadoop.hbase.client.Connection connection = > > HBaseFactoryProvider.getHConnectionFactory().createConnection(config)) { > RegionLocator regionLocator = > connection.getRegionLocator(TableName.valueOf(tableName)); > RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(regionLocator, > connection > .getAdmin()); {noformat} > This defeats the purpose of using snapshots. Refactor the code in a way that > the snapshot based codepaths do minimal HBase RPCs and rely solely on > snapshot manifest. Even better, improve locality of task scheduling based on > snapshot's hfile block locations. > - Disable indexes for query plan for scanning over snapshots. If there is an > index based access path, getScans() can potentially return index based splits > which is not what we want for snapshots. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (PHOENIX-6081) Improvements to snapshot based MR input format
[ https://issues.apache.org/jira/browse/PHOENIX-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinmay Kulkarni updated PHOENIX-6081: -- Affects Version/s: (was: 4.15.1) > Improvements to snapshot based MR input format > -- > > Key: PHOENIX-6081 > URL: https://issues.apache.org/jira/browse/PHOENIX-6081 > Project: Phoenix > Issue Type: Improvement > Components: core >Affects Versions: 4.14.3, master >Reporter: Bharath Vissapragada >Priority: Major > > Recently we switched an MR application from scanning live tables to scanning > snapshots (PHOENIX-3744). We ran into a severe performance issue, which > turned out to a correctness issue due to over-lapping scan splits generation. > After some debugging we figured that it has been fixed via PHOENIX-4997. Even > with that fix there are quite a few things we could improve about the > snapshot based input format. Listing them here, perhaps we can break them > into subtasks as needed. > - Do not restore the snapshot per map task. Currently we restore the snapshot > once per map task into a temp directory. For large tables on big clusters, > this creates a storm of NN RPCs. We can do this once per job and let all the > map tasks operate on the same restored snapshot. HBase already did this via > HBASE-18806, we can do something similar. > - Disable > [cacheBlocks|[https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-]] > on scans generated by input format. In our experiments block cache took a > lot of memory for MR jobs. For full table scans this isn't of much use and > can save a lot of memory. > - Short circuit live-table codepaths when snapshots are enabled. Currently > some codepaths make live table HBase RPCs to get a bunch of data. For example > {noformat} > private List generateSplits(final QueryPlan qplan, Configuration > config) throws IOException { > // We must call this in order to initialize the scans and splits from the > query plan > > // Get the RegionSizeCalculator > try(org.apache.hadoop.hbase.client.Connection connection = > > HBaseFactoryProvider.getHConnectionFactory().createConnection(config)) { > RegionLocator regionLocator = > connection.getRegionLocator(TableName.valueOf(tableName)); > RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(regionLocator, > connection > .getAdmin()); {noformat} > This defeats the purpose of using snapshots. Refactor the code in a way that > the snapshot based codepaths do minimal HBase RPCs and rely solely on > snapshot manifest. Even better, improve locality of task scheduling based on > snapshot's hfile block locations. > - Disable indexes for query plan for scanning over snapshots. If there is an > index based access path, getScans() can potentially return index based splits > which is not what we want for snapshots. -- This message was sent by Atlassian Jira (v8.3.4#803005)