[jira] [Commented] (IMPALA-10791) Add Support of Batch Reading for Spilling to Remote FS

ASF subversion and git services (Jira) Wed, 28 Sep 2022 07:30:38 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-10791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610600#comment-17610600
 ]


ASF subversion and git services commented on IMPALA-10791:
----------------------------------------------------------

Commit 6dfab93fe9cb54aceed0b5203275827980752074 in impala's branch 
refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=6dfab93fe ]

IMPALA-10791 Add batch reading for remote temporary files

This patch adds batch reading from remote temporary files to
improve read performance for data spilled to a remote
filesystem.

The original design used a local disk file as the buffer for
batch reads from the remote file, but in practice that did not
improve performance. The design was therefore changed to use
memory as the read buffer.

Currently, each TmpFileRemote has two DiskFiles: one for the
remote file and one for the local buffer. The patch adds MemBlocks
to the local buffer file. Each local buffer file is divided evenly
into several MemBlocks. To guarantee that a single page is never
split across two blocks, block sizes may differ slightly from one
another in practice. The default block size is the smaller of the
default file size and MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_BYTES,
which is 16MB.
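
The block layout described above can be sketched as follows. This
is an illustration only, not the code from the patch: the function
names are hypothetical, and the rule of closing a block before a
page would overflow it is an assumption (the commit message only
says that block sizes may differ slightly).

```python
from typing import List

# 16MB, the value quoted in the commit message.
MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_BYTES = 16 * 1024 * 1024


def default_block_size(default_file_size: int) -> int:
    """Return the smaller of the file size and the 16MB threshold."""
    return min(default_file_size, MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_BYTES)


def assign_pages_to_blocks(page_sizes: List[int], block_size: int) -> List[List[int]]:
    """Group consecutive pages into blocks without splitting any page.

    Assumption of this sketch: a block is closed as soon as the next
    page would push it past block_size, and a page larger than
    block_size gets a block of its own.
    """
    blocks: List[List[int]] = []
    current: List[int] = []
    used = 0
    for size in page_sizes:
        if current and used + size > block_size:
            # The next page does not fit: close the current block.
            blocks.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        blocks.append(current)
    return blocks
```

With four 6-byte pages and a 16-byte block size, the sketch yields
two blocks of two pages each, so each block holds 12 bytes rather
than the nominal 16.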

When a page is pinned, the system checks whether there is enough
memory for the block that holds the page. If there is, the block
is kept in memory until all of its pages have been read or the
query ends. If not, the page is read directly and the block is
disabled, which avoids reading the same content from the remote
filesystem more than once.
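
The pin-time decision can be modelled roughly as below. The class
and method names are hypothetical and exist only for this sketch;
the real logic lives in Impala's C++ backend and also releases
blocks when the query ends, which is not modelled here.

```python
class MemBlock:
    """Hypothetical stand-in for one block of the local buffer file."""

    def __init__(self, size: int, num_pages: int):
        self.size = size
        self.pages_left = num_pages  # pages not yet read from this block
        self.in_memory = False
        self.disabled = False


class ReadBuffer:
    """Tracks how much read-buffer memory is still available."""

    def __init__(self, limit_bytes: int):
        self.available = limit_bytes

    def read_page(self, block: MemBlock) -> str:
        """Return where the page was served from: 'memory' or 'direct'."""
        if not block.in_memory and not block.disabled:
            if block.size <= self.available:
                # Enough memory: fetch the whole block once and keep it.
                self.available -= block.size
                block.in_memory = True
            else:
                # Not enough memory: never batch-read this block later.
                block.disabled = True
        if not block.in_memory:
            return "direct"
        block.pages_left -= 1
        if block.pages_left == 0:
            # All pages consumed: return the memory immediately.
            self.available += block.size
            block.in_memory = False
            block.disabled = True  # nothing left to serve from this block
        return "memory"
```

In this model, a block that was disabled for lack of memory stays
disabled even if memory frees up later, so its pages are only ever
fetched by direct reads.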

One challenge with the read buffer is where to get the extra
memory for it, because Impala only starts spilling when the
process is short of memory. By default, the Impala process
reserves 20% of the total system memory as unused memory. The
read buffer draws on this unused memory, which is reasonable for
an emergency case like spilling, and the memory is returned
immediately after use.

For system reliability, the read buffer memory is limited to at
most 10% of the total system memory and at most 50% of the unused
memory. In addition, if the unused memory is less than 5% of the
total system memory, the read buffer is disabled.
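
As a worked example of these limits, the following sketch computes
the read buffer cap from the three percentages stated above. The
function name and parameters are hypothetical, and the sketch does
not account for any startup options.

```python
def read_buffer_limit(total_system_memory: int, unused_memory: int) -> int:
    """Return the maximum read buffer size in bytes, or 0 if disabled."""
    # Disabled when unused memory is below 5% of total system memory.
    # unused < total * 5 / 100 is written as unused * 20 < total so
    # that the comparison stays in integer arithmetic.
    if unused_memory * 20 < total_system_memory:
        return 0
    # At most 10% of total system memory and at most 50% of unused memory.
    return min(total_system_memory // 10, unused_memory // 2)
```

For a host with 100GB of memory and the default 20% (20GB) left
unused, both limits work out to 10GB. If only 8GB is unused, the
50% rule dominates and the cap drops to 4GB.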

Two startup options have been added for the new feature.

1. remote_batch_read. Default is false. If set to true, batch
reading is enabled.
2. remote_read_memory_buffer_size. Default is 1G. The maximum memory
that the read buffer can use. The value is also capped at 10% of
the process memory limit.

Added metrics ScratchReadsUseMem/ScratchBytesReadUseMem/
ScratchBytesReadUseLocalDisk to the query profile.

The patch also increases MAX_REMOTE_TMPFILE_SIZE_THRESHOLD_MB
from 256 to 512.

Tests:
Ran core and exhaustive tests.
Added and ran TmpFileMgrTest::TestBatchReadingFromRemote.
Added e2e test test_scratch_dirs_batch_reading.

Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Reviewed-on: http://gerrit.cloudera.org:8080/17979
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Add Support of Batch Reading for Spilling to Remote FS
> ------------------------------------------------------
>
>                 Key: IMPALA-10791
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10791
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Yida Wu
>            Assignee: Yida Wu
>            Priority: Major
>
> Impala allows spilling to a remote filesystem, such as S3. Uploading 
> spilled data to the remote filesystem is fast, but reading it back is 
> slow because each read fetches only one page from the remote 
> filesystem, whereas uploads are done once per file.
>  
> This task aims to improve read performance when spilling to a remote 
> filesystem by using batch reading.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
