Bharath Vissapragada created PHOENIX-6081:
---------------------------------------------

             Summary: Improvements to snapshot based MR input format
                 Key: PHOENIX-6081
                 URL: https://issues.apache.org/jira/browse/PHOENIX-6081
             Project: Phoenix
          Issue Type: Improvement
          Components: core
    Affects Versions: 4.14.3, 4.15.1, master
            Reporter: Bharath Vissapragada


Recently we switched an MR application from scanning live tables to scanning 
snapshots (PHOENIX-3744). We ran into a severe performance issue, which turned 
out to a correctness issue due to over-lapping scan splits generation. After 
some debugging we figured that it has been fixed via PHOENIX-4997. Even with 
that fix there are quite a few things we could improve about the snapshot based 
input format. Listing them here, perhaps we can break them into subtasks as 
needed.

- Do not restore the snapshot per map task. Currently we restore the snapshot 
once per map task into a temp directory. For large tables on big clusters, this 
creates a storm of NN RPCs. We can do this once per job and let all the map 
tasks operate on the same restored snapshot. HBase already did this via 
HBASE-18806, we can do something similar.

- Disable 
[cacheBlocks|[https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks-boolean-]]
 on scans generated by input format. In our experiments block cache took a lot 
of memory for MR jobs. For full table scans this isn't of much use and can save 
a lot of memory.

- Short circuit live-table codepaths when snapshots are enabled. Currently some 
codepaths make live table HBase RPCs to get a bunch of data. For example
{noformat}
private List<InputSplit> generateSplits(final QueryPlan qplan, Configuration 
config) throws IOException {
    // We must call this in order to initialize the scans and splits from the 
query plan
  ....
// Get the RegionSizeCalculator
try(org.apache.hadoop.hbase.client.Connection connection =
            
HBaseFactoryProvider.getHConnectionFactory().createConnection(config)) {
RegionLocator regionLocator = 
connection.getRegionLocator(TableName.valueOf(tableName));
RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(regionLocator, 
connection
        .getAdmin()); {noformat}
This defeats the purpose of using snapshots. Refactor the code in a way that 
the snapshot based codepaths do minimal HBase RPCs and rely solely on snapshot 
manifest. Even better, improve locality of task scheduling based on snapshot's 
hfile block locations.

- Disable indexes for query plan for scanning over snapshots. If there is an 
index based access path, getScans() can potentially return index based splits 
which is not what we want for snapshots.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to