[
https://issues.apache.org/jira/browse/IMPALA-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Abhishek Rawat updated IMPALA-9697:
-----------------------------------
Description:
The `--scratch_dirs` startup flag uses the given scratch directories in a round
robin manner. This may not always be ideal since these directories could come
from different classes of storage system volumes with different performance
characteristics (SSD vs HDD, local storage vs network attached storage, etc.).
Giving users an option to configure the priority of their scratch directories
could help them optimize their workload based on their storage system
configuration.
One possible way is for the user to pass the priority as part of the
`--scratch_dirs` startup flag using <directory>:<spill_priority>. The
directories will be selected for spilling based on their priorities, and if
multiple directories have the same priority they will be selected in a
round robin fashion. In the example below, dir1 will be used as a spill victim
until it is full, and then dir2, dir3, and dir4 will be used in a round robin
fashion.
{code:java}
--scratch_dirs="dir1:200GB:0, dir2:1024GB:1, dir3:1024GB:1, dir4:1024GB:1"{code}
was:
The `--scratch_dirs` startup flag uses the given scratch directories in a round
robin manner. This may not always be ideal since these directories could come
from different classes of storage system volumes with different performance
characteristics (SSD vs HDD, local storage vs network attached storage, etc.).
Giving users an option to configure the priority of their scratch directories
could help them optimize their workload based on their storage system
configuration.
One possible way is for the user to pass the priority as part of the
`--scratch_dirs` startup flag using <directory>:<spill_priority>. The
directories will be selected for spilling based on their priorities, and if
multiple directories have the same priority they will be selected in a
round robin fashion. In the example below, dir1 will be used as a spill victim
until it is full, and then dir2, dir3, and dir4 will be used in a round robin
fashion.
{code:java}
--scratch_dirs="dir1:0, dir2:1, dir3:1, dir4:1"{code}
> Support priority based scratch directory selection
> ---------------------------------------------------
>
> Key: IMPALA-9697
> URL: https://issues.apache.org/jira/browse/IMPALA-9697
> Project: IMPALA
> Issue Type: Task
> Reporter: Abhishek Rawat
> Assignee: Abhishek Rawat
> Priority: Major
> Fix For: Impala 4.0
>
>
> The `--scratch_dirs` startup flag uses the given scratch directories in a
> round robin manner. This may not always be ideal since these directories
> could come from different classes of storage system volumes with different
> performance characteristics (SSD vs HDD, local storage vs network attached
> storage, etc.). Giving users an option to configure the priority of their
> scratch directories could help them optimize their workload based on their
> storage system configuration.
> One possible way is for the user to pass the priority as part of the
> `--scratch_dirs` startup flag using <directory>:<spill_priority>. The
> directories will be selected for spilling based on their priorities, and if
> multiple directories have the same priority they will be selected in a
> round robin fashion. In the example below, dir1 will be used as a spill
> victim until it is full, and then dir2, dir3, and dir4 will be used in a
> round robin fashion.
> {code:java}
> --scratch_dirs="dir1:200GB:0, dir2:1024GB:1, dir3:1024GB:1,
> dir4:1024GB:1"{code}
>
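The selection policy described in the issue can be illustrated with a small
sketch. This is not Impala's implementation; it is a minimal Python model
written under several assumptions: each entry has the three-field form
<directory>:<limit>:<priority> used in the example flag value, a lower number
means a higher priority (inferred from dir1 with priority 0 being used first),
and directory paths contain no colons. The names ScratchDir, parse_size,
parse_scratch_dirs, and ScratchDirSelector are hypothetical and exist only for
this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional

# Binary size suffixes assumed for this sketch only.
_UNITS = {"KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30, "TB": 1 << 40}


@dataclass
class ScratchDir:
    path: str
    limit_bytes: int
    priority: int
    used_bytes: int = 0


def parse_size(text: str) -> int:
    """Convert a string such as '200GB' to a byte count."""
    text = text.strip().upper()
    for suffix, factor in _UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)  # no suffix: treat the value as plain bytes


def parse_scratch_dirs(flag_value: str) -> List[ScratchDir]:
    """Parse 'dir:limit:priority' entries separated by commas."""
    dirs = []
    for entry in flag_value.split(","):
        path, limit, priority = (part.strip() for part in entry.split(":"))
        dirs.append(ScratchDir(path, parse_size(limit), int(priority)))
    return dirs


class ScratchDirSelector:
    """Pick directories by priority, round robin within one priority."""

    def __init__(self, dirs: List[ScratchDir]) -> None:
        self._by_priority: Dict[int, List[ScratchDir]] = defaultdict(list)
        for d in dirs:
            self._by_priority[d.priority].append(d)
        # Per-priority cursor pointing at the next round-robin candidate.
        self._next = {p: 0 for p in self._by_priority}

    def pick(self, nbytes: int) -> Optional[ScratchDir]:
        """Reserve nbytes in the best directory with room, or return None."""
        for priority in sorted(self._by_priority):  # assumed: lowest first
            group = self._by_priority[priority]
            start = self._next[priority]
            for offset in range(len(group)):
                idx = (start + offset) % len(group)
                d = group[idx]
                if d.used_bytes + nbytes <= d.limit_bytes:
                    d.used_bytes += nbytes
                    self._next[priority] = (idx + 1) % len(group)
                    return d
        return None  # every directory is full
```

With the flag value from the example, six successive 100 GB reservations in
this model land on dir1, dir1, dir2, dir3, dir4, dir2: dir1 absorbs writes
until its 200 GB limit is reached, after which the three priority-1
directories take turns.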