[ 
https://issues.apache.org/jira/browse/IMPALA-10545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573189#comment-17573189
 ] 

ASF subversion and git services commented on IMPALA-10545:
----------------------------------------------------------

Commit 89c3e1f821ccd335c3c5507496bb53b80c1cc07a in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=89c3e1f82 ]

IMPALA-10545: Higher data_cache_write_concurrency for SSDs

Provide device-specific defaults for `data_cache_write_concurrency`.
Rotational disks continue to use a default of 1, while non-rotational
disks use a default of 8. The option's default value is 0, which selects
this device-based mode.
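
As a rough illustration of the selection rule described above, the
following is a minimal sketch and not the code from the commit. The
function and constant names are hypothetical, it assumes that any
positive flag value is used as given, and it assumes the device type is
read from the Linux sysfs attribute /sys/block/<device>/queue/rotational.
The actual Impala implementation may detect devices differently.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical constants mirroring the defaults named in the commit message.
constexpr int kRotationalDefault = 1;
constexpr int kNonRotationalDefault = 8;

// Returns the effective write concurrency. A flag value of 0 means
// "choose based on device type"; a positive value is used as given
// (an assumption of this sketch).
int ResolveWriteConcurrency(int flag_value, bool is_rotational) {
  assert(flag_value >= 0);
  if (flag_value > 0) return flag_value;
  return is_rotational ? kRotationalDefault : kNonRotationalDefault;
}

// Interprets the contents of a sysfs "rotational" attribute, where "0"
// means non-rotational. Anything else, including empty input, is treated
// as rotational, which yields the more conservative default.
bool ParseRotational(const std::string& contents) {
  return contents.empty() || contents[0] != '0';
}

// Reads the attribute for a block device name such as "nvme0n1".
// Falls back to "rotational" if the file cannot be read.
bool IsRotational(const std::string& device) {
  std::ifstream in("/sys/block/" + device + "/queue/rotational");
  std::string contents;
  if (!(in >> contents)) return true;
  return ParseRotational(contents);
}
```

With these helpers, ResolveWriteConcurrency(0, IsRotational("nvme0n1"))
would yield 8 on a machine where that device reports itself as
non-rotational, while an explicit flag value such as 4 would be returned
unchanged.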

Added a unit test confirming that concurrency is chosen based on mocked
partitions and block device info. Replaced FRIEND_TEST macros for a test
that no longer exists.

Started a cluster with
    start-impala-cluster.py --data_cache_dir=/home/michael/cache \
      --data_cache_size=1G --impalad_args=--always_use_data_cache=true

and observed
> Default data_cache_write_concurrency=8 for non-rotational disk nvme0n1

Change-Id: I60761faa2710f4795f1f3eaf66da866b5553f609
Reviewed-on: http://gerrit.cloudera.org:8080/18616
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Joe McDonnell <[email protected]>


> Tune data_cache_write_concurrency based on the type of IO device
> ----------------------------------------------------------------
>
>                 Key: IMPALA-10545
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10545
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.0.0
>            Reporter: Joe McDonnell
>            Assignee: Michael Smith
>            Priority: Major
>              Labels: ramp-up
>         Attachments: test.sh, test1.out, test8.out
>
>
> The data cache limits concurrent writes to the cache to avoid overwhelming 
> the underlying IO device. This is controlled by the 
> data_cache_write_concurrency flag, which defaults to 1. For SSDs, we should be 
> able to increase this to allow more concurrent writes to the data cache. This 
> would allow the data cache to warm up faster and stay more up to date.
> One option is to detect the underlying IO device (similar to how we do this 
> for other parts of Disk IO Mgr) and tune this parameter higher for SSDs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

