Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16531 )
Change subject: IMPALA-9606: ABFS reads should use hdfsPreadFully
......................................................................

IMPALA-9606: ABFS reads should use hdfsPreadFully

Similar to IMPALA-8525, but for ABFS instead of S3A. I don't expect this
to make a major improvement in performance, like it did for S3A,
although I am still seeing a marginal improvement during some ad-hoc
testing (about 5% scan perf improvement). The reason is that the
implementations of the ABFS and S3A clients are very different: ABFS
already reads all the requested data in a single hdfsRead call.

I ran the query 'select * from abfs_test_store_sales order by
ss_net_profit limit 10;' several times to validate that perf does not
regress. In fact, it improves slightly for this query. The table
'abfs_test_store_sales' is just a copy of the mini-cluster's
tpcds_parquet.store_sales, except that it is not partitioned.

Testing:
* Tested against an ABFS storage account I have access to
* Ran several queries to validate there are no functional or perf
  regressions.

Change-Id: I994ea30cf31abc66f5d82d9b3c8e185d2bd06147
Reviewed-on: http://gerrit.cloudera.org:8080/16531
Reviewed-by: Joe McDonnell <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/runtime/io/hdfs-file-reader.cc
1 file changed, 3 insertions(+), 2 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I994ea30cf31abc66f5d82d9b3c8e185d2bd06147
Gerrit-Change-Number: 16531
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
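
For readers unfamiliar with the distinction the commit message draws, here is a minimal sketch of "read fully" semantics versus a plain positional read that may return fewer bytes than requested. It does not call libhdfs and is not the code from hdfs-file-reader.cc: FakeRemoteFile, Pread, and PreadFully are hypothetical names invented for illustration, and the in-memory "file" only mimics a client that can return short reads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical in-memory stand-in for a remote file. It is NOT libhdfs; it only
// mimics a client whose positional read may return fewer bytes than requested.
struct FakeRemoteFile {
  std::string data;
  std::size_t max_bytes_per_call = 4;  // caps each read to simulate short reads
  int calls = 0;                       // number of read calls issued so far

  // Reads up to `len` bytes starting at `offset`. Returns the number of bytes
  // actually read, or 0 at end of file.
  std::int64_t Pread(std::size_t offset, char* buf, std::size_t len) {
    ++calls;
    if (offset >= data.size()) return 0;
    std::size_t n = std::min({len, max_bytes_per_call, data.size() - offset});
    std::memcpy(buf, data.data() + offset, n);
    return static_cast<std::int64_t>(n);
  }
};

// "Read fully" semantics: keep issuing positional reads until `len` bytes have
// been read or end of file is reached. Returns true only if the whole range
// [offset, offset + len) was read into `buf`.
bool PreadFully(FakeRemoteFile& file, std::size_t offset, char* buf,
                std::size_t len) {
  assert(buf != nullptr);
  std::size_t done = 0;
  while (done < len) {
    std::int64_t n = file.Pread(offset + done, buf + done, len - done);
    if (n <= 0) return false;  // hit EOF before the range was satisfied
    done += static_cast<std::size_t>(n);
  }
  return true;
}
```

In this sketch, a client that returns short reads needs several calls to satisfy one range, while a client that already returns the whole range in one call (as the commit message says ABFS does) goes through the loop once. That is consistent with the modest gain reported above.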
