[
https://issues.apache.org/jira/browse/IMPALA-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806995#comment-17806995
]
ASF subversion and git services commented on IMPALA-12681:
----------------------------------------------------------
Commit 02d004a12166c5549d591b7352ec1463c5ee8ba3 in impala's branch
refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=02d004a12 ]
IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests
check_deleted_file_fd() was introduced in IMPALA-12681 to detect a
remote-spilling bug in which local temporary file handles were not
released after deletion. However, the tests that use this function
appear flaky in exhaustive builds: occasionally, some HDFS files are
not released promptly after deletion. Locally, I observed that these
files are eventually removed from /proc/xx/fd within a few minutes.
The reason is not yet clear.
To fix the flaky build failures, this patch restricts
check_deleted_file_fd() to detect only files whose path contains the
keyword "scratch". Because the HDFS files are eventually released and
the issue does not seem urgent, a separate Jira will be filed to track
and investigate this behavior.
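The restricted check described above could look roughly like the sketch below. This is not the code from the patch: the function name find_deleted_fds, its parameters, and the return shape are hypothetical, chosen only to illustrate filtering deleted-but-open files by the keyword "scratch". It assumes a Linux-style fd directory such as /proc/<pid>/fd, where the link target of an unlinked file ends in " (deleted)".

```python
import os


def find_deleted_fds(fd_dir, keyword="scratch"):
    """Return (fd_name, target) pairs for deleted files whose path contains keyword.

    Hypothetical helper: fd_dir is expected to be a directory of symlinks
    such as /proc/<pid>/fd on Linux.
    """
    matches = []
    for name in sorted(os.listdir(fd_dir)):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # Not a symlink, or the descriptor was closed after listdir().
            continue
        # Linux appends " (deleted)" to the link target of an unlinked file.
        if target.endswith(" (deleted)") and keyword in target:
            matches.append((name, target))
    return matches
```

Calling find_deleted_fds("/proc/<pid>/fd") for a running process would then report only leaked scratch files and ignore other deleted files, such as the HDFS files mentioned above.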
Testing:
Reran the tests a couple of times; they passed.
Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Reviewed-on: http://gerrit.cloudera.org:8080/20898
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Some local file descriptors not released when using remote spilling
> -------------------------------------------------------------------
>
> Key: IMPALA-12681
> URL: https://issues.apache.org/jira/browse/IMPALA-12681
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.0.0, Impala 4.1.0, Impala 4.2.0, Impala 4.3.0
> Reporter: Yida Wu
> Assignee: Yida Wu
> Priority: Major
> Fix For: Impala 4.4.0
>
>
> The bug occurs during remote spilling when spilled data is written to local
> buffers. If a file is not completely filled, for example because no more data
> arrives at the end of spilling, it may be only partially written. Such a file
> might be physically removed without the associated file descriptor being
> properly released. This issue can be observed in cases like the one described
> below.
> {code:java}
> find /proc/*/fd -ls | grep '(deleted)'
> 288574785 0 lrwx------ 1 impala impala 64 Jan 3 14:24 /proc/x/fd/xxxx -> /opt/impala/scratch/impala-scratch/impala-scratch-xxxxxxxxx-xxxx-xxxx\ (deleted)
> {code}
> In such a scenario, the disk space occupied by the file may not be reclaimed
> because the file descriptor still maintains a reference to the file.
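The effect described above, where a removed file keeps its data as long as a descriptor is open, can be reproduced with a small POSIX sketch. The helper name unlink_while_open and the "scratch-demo-" prefix are made up for illustration and are unrelated to Impala's code.

```python
import os
import tempfile


def unlink_while_open(size=4096):
    """Unlink a file while its descriptor is open.

    Returns (path_exists, size_via_fd) as observed after the unlink.
    """
    fd, path = tempfile.mkstemp(prefix="scratch-demo-")
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)                  # The directory entry is removed ...
        size_via_fd = os.fstat(fd).st_size  # ... but the data is still reachable.
        return os.path.exists(path), size_via_fd
    finally:
        # The space can only be reclaimed once the last descriptor is closed.
        os.close(fd)
```

On a POSIX system the path no longer exists after the unlink, yet fstat() on the descriptor still reports the written size, which is why a leaked descriptor keeps the disk space in use.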