[ https://issues.apache.org/jira/browse/IMPALA-10545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573189#comment-17573189 ]
ASF subversion and git services commented on IMPALA-10545:
----------------------------------------------------------
Commit 89c3e1f821ccd335c3c5507496bb53b80c1cc07a in impala's branch
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=89c3e1f82 ]
IMPALA-10545: Higher data_cache_write_concurrency for SSDs
Provide device-specific defaults for `data_cache_write_concurrency`
based on device type. Rotational disks continue to use a default of 1,
while non-rotational disks use a default of 8. The option itself now
defaults to 0, which selects this device-specific mode.
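
A minimal sketch of the selection rule described above, written in Python
for brevity rather than in Impala's C++ backend. The function and constant
names are hypothetical. Treating any positive flag value as an explicit
override, and rejecting negative values, are assumptions not spelled out in
this commit message.

```python
# Device-specific defaults as stated in the commit message.
ROTATIONAL_DEFAULT = 1
NON_ROTATIONAL_DEFAULT = 8


def effective_write_concurrency(flag_value: int, is_rotational: bool) -> int:
    """Resolve the write concurrency from the flag value and the device type.

    flag_value == 0 selects the device-specific default (from the commit).
    flag_value > 0 is returned unchanged (assumption: explicit override).
    flag_value < 0 is rejected (assumption: not a meaningful setting).
    """
    if flag_value < 0:
        raise ValueError("data_cache_write_concurrency must be >= 0")
    if flag_value > 0:
        return flag_value
    return ROTATIONAL_DEFAULT if is_rotational else NON_ROTATIONAL_DEFAULT
```

With the flag left at 0, a non-rotational device resolves to 8, which matches
the log line observed below for nvme0n1.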
Added unit test confirming concurrency based on mocked partitions and
block device info. Replaced FRIEND_TEST macros for a test that no longer
exists.
Started cluster with
  start-impala-cluster.py --data_cache_dir=/home/michael/cache
    --data_cache_size=1G --impalad_args=--always_use_data_cache=true
and observed
> Default data_cache_write_concurrency=8 for non-rotational disk nvme0n1
Change-Id: I60761faa2710f4795f1f3eaf66da866b5553f609
Reviewed-on: http://gerrit.cloudera.org:8080/18616
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Joe McDonnell <[email protected]>
> Tune data_cache_write_concurrency based on the type of IO device
> ----------------------------------------------------------------
>
> Key: IMPALA-10545
> URL: https://issues.apache.org/jira/browse/IMPALA-10545
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 4.0.0
> Reporter: Joe McDonnell
> Assignee: Michael Smith
> Priority: Major
> Labels: ramp-up
> Attachments: test.sh, test1.out, test8.out
>
>
> The data cache limits concurrent writes to the cache to avoid overwhelming
> the underlying IO device. This is controlled by the
> data_cache_write_concurrency flag and defaults to 1. For SSDs, we should be
> able to increase this to allow more concurrent writes to the data cache. This
> would allow the data cache to warm up faster and stay more up to date.
> One option is to detect the underlying IO device (similar to how we do this
> for other parts of Disk IO Mgr) and tune this parameter higher for SSDs.
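
One way to detect the device type on Linux is to read the kernel's
queue/rotational attribute under /sys/block, which contains 1 for rotational
disks and 0 otherwise. The sketch below is only an illustration in Python, not
the code used by Impala; the function name and the sysfs_root parameter are
hypothetical, and it assumes the caller already has the whole-disk device name
(for example nvme0n1) rather than a partition or a cache directory path.

```python
from pathlib import Path


def is_rotational(device: str, sysfs_root: str = "/sys/block") -> bool:
    """Return True if the kernel reports `device` as a rotational disk.

    Reads <sysfs_root>/<device>/queue/rotational. The sysfs_root parameter
    exists only so the lookup can be pointed at a mocked directory tree.
    """
    flag_file = Path(sysfs_root) / device / "queue" / "rotational"
    return flag_file.read_text().strip() == "1"
```

Mapping a cache directory to its partition and then to the parent block device
is a separate step that this sketch leaves out.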