[ 
https://issues.apache.org/jira/browse/IMPALA-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17761920#comment-17761920
 ] 

ASF subversion and git services commented on IMPALA-12408:
----------------------------------------------------------

Commit c49f5d2778d10e988ab4d926e3326de043c20fe1 in impala's branch 
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c49f5d277 ]

IMPALA-12408: Optimize HdfsScanNode.computeScanRangeLocations()

computeScanRangeLocations() could be very slow for tables
with a large number of partitions. This patch tries to minimize
the use of two expensive function calls:
1. HdfsPartition.getLocation()
  - This looks like a simple property but actually decompresses
    the location string.
  - Was often called indirectly through getFsType().
  - After the patch it is only called once per partition.
2. hadoop.fs.FileSystem.getFileSystem()
  - Hadoop caches the FileSystem object but the key contains
    UserGroupInformation which is obtained with
    UserGroupInformation.getCurrentUser(), making the call costly.
  - As the user is always the same during Impala planning, we can
    cache the FileSystem simply by the scheme + authority part of the
    location URI. After the patch, getFileSystem() is called only if
    the scheme/authority differs from that of the previous partition,
    leading to a single call for most tables.
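
To illustrate the second point, here is a minimal sketch of a "remember
the previous scheme + authority" cache. It is not the code from the patch:
the class name LastAuthorityCache and the expensiveLookup function are
hypothetical stand-ins (the latter for the costly getFileSystem() call),
and plain java.net.URI is used so the example runs without Hadoop.

```java
import java.net.URI;
import java.util.Objects;
import java.util.function.Function;

/**
 * Hypothetical helper: remembers the value resolved for the most recent
 * scheme + authority pair and reuses it while consecutive locations match.
 */
final class LastAuthorityCache<T> {
  // Stand-in for the expensive call (an assumption, not Impala's API).
  private final Function<URI, T> expensiveLookup;
  private String lastScheme;
  private String lastAuthority;
  private T lastValue;
  private boolean hasValue = false;
  private int lookupCount = 0;

  LastAuthorityCache(Function<URI, T> expensiveLookup) {
    this.expensiveLookup = expensiveLookup;
  }

  /** Returns the cached value unless scheme or authority changed. */
  T get(URI location) {
    String scheme = location.getScheme();
    String authority = location.getAuthority();
    boolean sameAsPrevious = hasValue
        && Objects.equals(scheme, lastScheme)
        && Objects.equals(authority, lastAuthority);
    if (!sameAsPrevious) {
      lastValue = expensiveLookup.apply(location);
      lastScheme = scheme;
      lastAuthority = authority;
      hasValue = true;
      lookupCount++;
    }
    return lastValue;
  }

  /** Number of times the expensive lookup actually ran. */
  int lookupCount() {
    return lookupCount;
  }
}
```

Only the previous key is kept, so a table whose partitions alternate
between file systems would still trigger repeated lookups; a map keyed by
scheme + authority would avoid that at the cost of a little more state.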

Note that caching these values in HdfsPartition could also help,
but I preferred to avoid increasing the size of that class.

The patch also changes the implementation of how we count the number
of partitions per file system (to avoid the extra calls to
getFsType()). This made class SampledPartitionMetadata unnecessary and
reverted some of the changes in https://gerrit.cloudera.org/#/c/12282/

Benchmarks:
Measured using tpcds.store_sales (1824 partitions)
union all'd 256 times:
explain select * from tpcds_parquet.store_sales256;
Before patch: 8.8s
After patch: 1.1s

The improvement is also visible on the full tpcds benchmark:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCDS(2) | parquet / none / none | 0.53    | -8.99%     | 0.29       | -10.78%        |
+----------+-----------------------+---------+------------+------------+----------------+
The effect is less significant on higher scale factors.

Testing:
- ran core tests

Change-Id: Icf3e9c169d65c15df6a6762cc68fbb477fe64a7c
Reviewed-on: http://gerrit.cloudera.org:8080/20434
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Planner spends lot of time in HdfsPartition.getLocation()
> ---------------------------------------------------------
>
>                 Key: IMPALA-12408
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12408
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Csaba Ringhofer
>            Assignee: Csaba Ringhofer
>            Priority: Major
>              Labels: performance
>
> For queries with a lot of partitions, the majority of planning time can be spent 
> decoding compressed partition locations. This can often be avoided, e.g. by 
> caching FsType instead of always decompressing the path to get its prefix: 
> https://github.com/apache/impala/blob/218c4c447eadb14fadb8310db4b46ab8c04cb1ba/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L914



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
