[ https://issues.apache.org/jira/browse/IMPALA-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815209#comment-17815209 ]
ASF subversion and git services commented on IMPALA-12681:
----------------------------------------------------------
Commit 99e8170997f18db0f63d451af89ca32320ebb465 in impala's branch
refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=99e817099 ]
IMPALA-12721: Fix flaky tests involving check_deleted_file_fd()
check_deleted_file_fd() was introduced in IMPALA-12681, but some
spilling test cases that use it appear to be flaky.
This patch fixes the issue by adding a retry mechanism to
check_deleted_file_fd(). If a check fails, the function retries the
check for file descriptors that reference deleted files.
Based on my local tests, the file is removed after the test even when
the test fails, and the call to delete the file handle happens before
the call to remove the file (this has been confirmed through
additional test logging). While there is no explanation yet for why
the check fails, the retry mechanism allowed the test case to run 200
times without a failure. It is possible that a delay somewhere in the
process leads to this kind of failure.
Tests:
Reran the testcase 200 times without a failure.
Change-Id: I900aab7dc9833015ce140253ff40da28a6ed3ba6
Reviewed-on: http://gerrit.cloudera.org:8080/21000
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
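
The retry approach described in the commit message could look roughly
like the sketch below. This is not the code from the patch: the helper
name retry_until_true() and its attempts and delay_s parameters are
hypothetical, and the real check_deleted_file_fd() may be structured
differently.

```python
import time


def retry_until_true(check, attempts=5, delay_s=1.0):
    """Call check() until it returns True or the attempts run out.

    Hypothetical helper for illustration only. Returns True on the
    first successful check and False if every attempt failed.
    """
    for attempt in range(attempts):
        if check():
            return True
        # Sleep only between attempts, not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

A test would pass its one-shot check as the check argument, so that a
short delay in releasing the descriptor no longer fails the test on
the first attempt.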
> Some local file descriptors not released when using remote spilling
> -------------------------------------------------------------------
>
> Key: IMPALA-12681
> URL: https://issues.apache.org/jira/browse/IMPALA-12681
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.0.0, Impala 4.1.0, Impala 4.2.0, Impala 4.3.0
> Reporter: Yida Wu
> Assignee: Yida Wu
> Priority: Major
> Fix For: Impala 4.4.0
>
>
> The bug occurs during remote spilling when spilled data is written to local
> buffers. If the files are not completely filled, for example because no
> more data arrives at the end of spilling, they may be only partially
> written, and they might then be physically removed without the associated
> file descriptor being properly released. The issue can be observed in cases
> like the one below.
> {code:java}
> find /proc/*/fd -ls | grep '(deleted)'
> 288574785 0 lrwx------ 1 impala impala 64 Jan 3 14:24 /proc/x/fd/xxxx ->
> /opt/impala/scratch/impala-scratch/impala-scratch-xxxxxxxxx-xxxx-xxxx\
> (deleted) {code}
> In such a scenario, the disk space occupied by the file may not be reclaimed
> because the file descriptor still maintains a reference to the file.
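
The check done with find in the description above can also be sketched
in Python for a single process. This assumes a Linux procfs mounted at
/proc; the function name deleted_open_files() and its parameters are
hypothetical and not taken from the Impala test suite.

```python
import os


def deleted_open_files(pid="self", path_prefix=""):
    """Return link targets in /proc/<pid>/fd that point at deleted files.

    Hypothetical helper for illustration only. path_prefix can narrow
    the result, for example to a scratch directory.
    """
    fd_dir = "/proc/{}/fd".format(pid)
    found = []
    try:
        entries = os.listdir(fd_dir)
    except OSError:
        return found  # no procfs, no such process, or no permission
    for name in entries:
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed between listdir() and readlink()
        # The kernel appends " (deleted)" to the target of an unlinked file.
        if target.endswith(" (deleted)") and target.startswith(path_prefix):
            found.append(target)
    return found
```

An empty result for the scratch directory prefix would indicate that no
descriptor in that process still references a removed scratch file.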