[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG@7 PS3, Line 7: IMPALA-12681: Release file descriptors for partial written temporary files nit: partially http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG@9 PS3, Line 9: The patch fixes a bug that the partial written temporary files are nit: This patch fixes a bug where partially written temporary files... -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 3 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Thu, 04 Jan 2024 13:07:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14873/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 3 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Thu, 04 Jan 2024 12:52:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG@16 PS2, Line 16: Testing: > can you add a bit more context here or in the Jira on how to trigger this? The query that can spill (contains something like aggregation or hash join or others) should be good enough, in my test, I use tpcds q78, dont need to cancel, simply monitor the /proc/x/fd. Added some comments. -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 3 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Thu, 04 Jan 2024 12:34:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Yida Wu has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. IMPALA-12681: Release file descriptors for partial written temporary files The patch fixes a bug that the partial written temporary files are removed without releasing the file descriptors. This patch fixes the bug by adding a call to Close() of the local file writer during the Delete() of the DiskFile class, which could be called when the local buffer file is being evicted or the query ends, ensuring proper release of the file handle. Testing: Manual testing of remote spilling, configure a local path as the local buffer and a S3 remote path for remote storage in scratch dir. Ran tpcds q78 with a low buffer_pool_limit in the query option to trigger spilling, and checked the path of local buffer path of the scratch dir that the temporary files did exist during running the query. Without the fix, after running the query, we can find references of deleted files in /proc/x/fd. With the patch that there are no references to the deleted files in the /proc/x/fd/. Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 --- M be/src/runtime/io/disk-file.cc M be/src/runtime/io/local-file-writer.cc M be/src/runtime/io/local-file-writer.h 3 files changed, 11 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/3 -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 3 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yida Wu
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG@16 PS2, Line 16: Testing: can you add a bit more context here or in the Jira on how to trigger this? e.g. what kind of query is needed, do we need to cancel the query -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Thu, 04 Jan 2024 11:11:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14872/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Thu, 04 Jan 2024 09:27:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc File be/src/runtime/io/disk-file.cc: http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc@41 PS1, Line 41: // Close the file writer to release the file handle. : RETURN_IF_ERROR(file_writer_->Close()); : } > Is it safe to delete it here, so can we exclude the possibility of having p Thanks Csaba! Yeah, it is better to call Close() here and add a DCHECK in the destructor of file writer. Changed. -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Thu, 04 Jan 2024 09:06:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Yida Wu has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. IMPALA-12681: Release file descriptors for partial written temporary files The patch fixes a bug that the partial written temporary files are removed without releasing the file descriptors. This patch fixes the bug by adding a call to Close() of the local file writer during the Delete() of the DiskFile class, which could be called when the local buffer file is being evicted or the query ends, ensuring proper release of the file handle. Testing: Manual testing of remote spilling, with the patch that there are no references to the deleted files in the /proc/x/fd/. Conversely, in the absence of the patch, such references are observed. Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 --- M be/src/runtime/io/disk-file.cc M be/src/runtime/io/local-file-writer.cc M be/src/runtime/io/local-file-writer.h 3 files changed, 11 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/2 -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc File be/src/runtime/io/disk-file.cc: http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc@41 PS1, Line 41: if (file_writer_ != nullptr) { : file_writer_.reset(); : } Is it safe to delete it here, so can we exclude the possibility of having pointers to file_writer_ elsewhere? It seems safer to me to call Close here and possibly add a DCHECK to the destructor of LocalFileWriter to check that the file was closed. -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 1 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 04 Jan 2024 08:28:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20852 ) Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/14869/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 1 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 04 Jan 2024 07:49:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files
Yida Wu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20852 Change subject: IMPALA-12681: Release file descriptors for partial written temporary files .. IMPALA-12681: Release file descriptors for partial written temporary files The patch fixes a bug that the partial written temporary files are removed without releasing the file descriptors. This patch fixes the bug by adding a call to Close() during the destruction of the LocalFileWriter class, ensuring proper release of the file handle. Additionally, the patch introduces the immediate release of the file handle by adding the file writer reset when the Delete() is called within the DiskFile class. Testing: Manual testing of remote spilling, with the patch that there are no references to the deleted files in the /proc/x/fd/. Conversely, in the absence of the patch, such references are observed. Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 --- M be/src/runtime/io/disk-file.cc M be/src/runtime/io/local-file-writer.cc M be/src/runtime/io/local-file-writer.h 3 files changed, 13 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/1 -- To view, visit http://gerrit.cloudera.org:8080/20852 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14 Gerrit-Change-Number: 20852 Gerrit-PatchSet: 1 Gerrit-Owner: Yida Wu