Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/18616 )
Change subject: IMPALA-10545: Higher data_cache_write_concurrency for SSDs
......................................................................
Patch Set 2:
Tested with the following script:
#!/usr/bin/env bash
# Compare data cache behavior at write concurrency 1 and 8.
DATA_CACHE=${DATA_CACHE:-~/cache}
QUERY='select l_orderkey, l_linestatus, l_shipdate, l_comment from tpch.lineitem order by l_orderkey'

# Run 1: fresh cache directory, data_cache_write_concurrency=1.
rm -rf "${DATA_CACHE}" && mkdir -p "${DATA_CACHE}"
start-impala-cluster.py --cluster_size=1 --data_cache_dir="${DATA_CACHE}" \
  --data_cache_size=10G \
  --impalad_args='--always_use_data_cache=true --data_cache_write_concurrency=1'
# The query runs twice; the second run reads whatever the first run cached.
impala-shell.sh --show_profiles --query "$QUERY; $QUERY" \
  --output_file /dev/null > test1.out

# Run 2: fresh cache directory, data_cache_write_concurrency=8.
rm -rf "${DATA_CACHE}" && mkdir -p "${DATA_CACHE}"
start-impala-cluster.py --cluster_size=1 --data_cache_dir="${DATA_CACHE}" \
  --data_cache_size=10G \
  --impalad_args='--always_use_data_cache=true --data_cache_write_concurrency=8'
impala-shell.sh --show_profiles --query "$QUERY; $QUERY" \
  --output_file /dev/null > test8.out
For the second run of the query in each configuration, the profile showed:
| Metric | Concurrency=1 | Concurrency=8 |
| ------------------------ | ------------- | ------------- |
| DataCacheHitBytes | 415.56 MB | 719.25 MB |
| DataCacheHitCount | 57 | 95 |
| DataCacheMissBytes | 303.69 MB | 0 |
| DataCacheMissCount | 38 | 0 |
| DataCachePartialHitCount | 5 | 0 |
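
As a rough cross-check, the byte counters in the table can be turned into a cache hit ratio, i.e. hit bytes divided by hit plus miss bytes. The sketch below does only that arithmetic; `hit_ratio` is a hypothetical helper written for illustration and is not part of Impala or of the test script.

```shell
#!/usr/bin/env bash
# hit_ratio HIT_MB MISS_MB
# Hypothetical helper: prints the share of bytes served from the data cache.
hit_ratio() {
  awk -v hit="$1" -v miss="$2" 'BEGIN {
    total = hit + miss
    if (total == 0) { print "n/a"; exit }   # nothing was read at all
    printf "%.1f%%\n", 100 * hit / total
  }'
}

hit_ratio 415.56 303.69   # Concurrency=1 values from the table
hit_ratio 719.25 0        # Concurrency=8 values from the table
```

With the numbers above this gives about 57.8% for concurrency 1 and 100.0% for concurrency 8. The hit and miss bytes at concurrency 1 add up to 719.25 MB, the same total that is served entirely from cache at concurrency 8.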
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60761faa2710f4795f1f3eaf66da866b5553f609
Gerrit-Change-Number: 18616
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Comment-Date: Tue, 21 Jun 2022 21:15:28 +0000
Gerrit-HasComments: No