[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-12681: Release file descriptors for partial written 
temporary files
nit: partially


http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG@9
PS3, Line 9: The patch fixes a bug that the partial written temporary files are
nit: This patch fixes a bug where partially written temporary files...



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 13:07:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14873/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 12:52:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG@16
PS2, Line 16: Testing:
> can you add a bit more context here or in the Jira on how to trigger this?
The query that can spill (contains something like aggregation or hash join or 
others) should be good enough, in my test, I use tpcds q78, dont need to 
cancel, simply monitor the /proc/x/fd. Added some comments.



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 12:34:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..

IMPALA-12681: Release file descriptors for partial written temporary files

The patch fixes a bug that the partial written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() of the local file writer
during the Delete() of the DiskFile class, which could be called
when the local buffer file is being evicted or the query ends,
ensuring proper release of the file handle.

Testing:
Manual testing of remote spilling, configure a local path as the
local buffer and a S3 remote path for remote storage in scratch dir.
Ran tpcds q78 with a low buffer_pool_limit in the query option to
trigger spilling, and checked the path of local buffer path of the
scratch dir that the temporary files did exist during running the
query.
Without the fix, after running the query, we can find references
of deleted files in /proc/x/fd.
With the patch that there are no references to the deleted
files in the /proc/x/fd/.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
3 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/3
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG@16
PS2, Line 16: Testing:
can you add a bit more context here or in the Jira on how to trigger this? e.g. 
what kind of query is needed, do we need to cancel the query



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 11:11:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14872/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 09:27:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc
File be/src/runtime/io/disk-file.cc:

http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc@41
PS1, Line 41:   // Close the file writer to release the file handle.
:   RETURN_IF_ERROR(file_writer_->Close());
: }
> Is it safe to delete it here, so can we exclude the possibility of having p
Thanks Csaba! Yeah, it is better to call Close() here and add a DCHECK in the 
destructor of file writer. Changed.



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 09:06:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..

IMPALA-12681: Release file descriptors for partial written temporary files

The patch fixes a bug that the partial written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() of the local file writer
during the Delete() of the DiskFile class, which could be called
when the local buffer file is being evicted or the query ends,
ensuring proper release of the file handle.

Testing:
Manual testing of remote spilling, with the patch that there are
no references to the deleted files in the /proc/x/fd/. Conversely,
in the absence of the patch, such references are observed.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
3 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/2
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc
File be/src/runtime/io/disk-file.cc:

http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc@41
PS1, Line 41: if (file_writer_ != nullptr) {
:   file_writer_.reset();
: }
Is it safe to delete it here, so can we exclude the possibility of having 
pointers to file_writer_ elsewhere?

It seems safer to me to call Close here and possibly add a DCHECK to the 
destructor of LocalFileWriter to check that the file was closed.



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Jan 2024 08:28:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/14869/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 04 Jan 2024 07:49:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-03 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20852


Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..

IMPALA-12681: Release file descriptors for partial written temporary files

The patch fixes a bug that the partial written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() during the destruction of the
LocalFileWriter class, ensuring proper release of the file handle.
Additionally, the patch introduces the immediate release of the
file handle by adding the file writer reset when the Delete()
is called within the DiskFile class.

Testing:
Manual testing of remote spilling, with the patch that there are
no references to the deleted files in the /proc/x/fd/. Conversely,
in the absence of the patch, such references are observed.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
3 files changed, 13 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/1
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu