Yida Wu created IMPALA-13992:
--------------------------------
Summary: Incorrect file path in logging while spilling to remote filesystem
Key: IMPALA-13992
URL: https://issues.apache.org/jira/browse/IMPALA-13992
Project: IMPALA
Issue Type: Bug
Components: Backend
Reporter: Yida Wu
Assignee: Yida Wu
In
[https://github.com/apache/impala/blob/ef8f8ca27b52f7fd842a7a887d5c9a8db9831f79/be/src/runtime/io/disk-io-mgr.cc#L280C31-L280C36],
when spilling data to a remote filesystem, a write range is used to first
write the buffer to local storage. However, the file path set in the write
range may incorrectly point to the remote file path, as assigned in
[https://github.com/apache/impala/blob/ef8f8ca27b52f7fd842a7a887d5c9a8db9831f79/be/src/runtime/tmp-file-mgr.cc#L1987].
The actual write works correctly because it uses the writer's configured
file rather than the path from the write range. The logging, however, relies
on the file path from the write range, so the logs misleadingly indicate that
the data is being written directly to the remote file.
The difficulty is that the write range is used in two different modes: either
as a buffer before uploading to remote storage, or as a purely local file,
which can lead to confusion. This logic should be reviewed to ensure the
logging accurately reflects the file that is actually written.