RushabhK commented on PR #9844:
URL: 
https://github.com/apache/incubator-gluten/pull/9844#issuecomment-2954824586

   > > @JkSelf I added some logs to show which files abortTask deletes. In 
every reproducing scenario, abortTask sees 0 files: 
https://github.com/RushabhK/incubator-gluten/blob/v1.3.0-fixes/backends-velox/src/main/scala/org/apache/spark/sql/execution/SparkWriteFilesCommitProtocol.scala#L104
 Sample log:
   > > ```
   > > ERROR SparkWriteFilesCommitProtocol: Filenames info: 0 files, file 
names: 
   > > ERROR SparkWriteFilesCommitProtocol: Filenames info: 0 files, file 
names: 
   > > ```
   > > 
   > > The log in the catch block also reports 0 files: 
https://github.com/RushabhK/incubator-gluten/blob/v1.3.0-fixes/backends-velox/src/main/scala/org/apache/spark/sql/execution/VeloxColumnarWriteFilesExec.scala#L244
 Sample log:
   > > ```
   > > ERROR VeloxColumnarWriteFilesRDD: Commit failed, aborting task. 
fileNames size: 0Deleting staging files
   > > ERROR VeloxColumnarWriteFilesRDD: Commit failed, aborting task. 
fileNames size: 0Deleting staging files
   > > ```
   > > 
   > > I also added logs to check the state of fileNames at every point. These 
show a fileNames size of 1 in every log line: 
https://github.com/RushabhK/incubator-gluten/blob/v1.3.0-fixes/backends-velox/src/main/scala/org/apache/spark/sql/execution/VeloxColumnarWriteFilesExec.scala#L139
 Sample logs:
   > > ```
   > > ERROR VeloxColumnarWriteFilesRDD: Current filenames size: 1, filenames: 
date_key=2025-05-26/hour=00/gluten-part-b37dc941-ec8f-4a26-a189-0b9119014c8b.zstd.parquet
   > > ERROR VeloxColumnarWriteFilesRDD: Current filenames size: 1, filenames: 
date_key=2025-05-26/hour=00/gluten-part-c924ab72-003b-4ae0-9f90-54695ce851d4.zstd.parquet
   > > ERROR VeloxColumnarWriteFilesRDD: Current filenames size: 1, filenames: 
date_key=2025-05-26/hour=00/gluten-part-6cac7566-982f-486a-b918-e2c0e2bed5a2.zstd.parquet
   > > ERROR VeloxColumnarWriteFilesRDD: Current filenames size: 1, filenames: 
date_key=2025-05-26/hour=00/gluten-part-6cac7566-982f-486a-b918-e2c0e2bed5a2.zstd.parquet
   > > ```
   > > 
   > > The problem is that 0 files have been collected by the time abortTask is 
called, which is why no files are deleted on abort. This is the issue that 
needs to be addressed. @JkSelf @FelixYBW Can you think of any possible reasons 
for this? Let me know which steps or logs I can add to troubleshoot this 
further.
   > 
   > Are you saying that `fileNames` is populated when 
`collectNativeWriteFilesMetrics` is called, but is empty when `abortTask` is 
invoked?
   
   Yes, `fileNames` is empty when `abortTask` is invoked. I also tried 
maintaining `localFileNames` inside `compute` to rule out a scoping issue, 
but that shows the same behavior.
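
   One further check that might help is to log object identities at both call 
sites, to see whether the code that populates `fileNames` and the code that 
reads it in `abortTask` are looking at the same instance. The sketch below is 
only an illustration of that idea: `TrackingCommitter`, `describe`, and 
`IdentityCheck` are hypothetical names, not Gluten or Spark classes, and the 
two-instance setup is an assumption made to show what differing identities 
would look like in the logs.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a commit protocol that tracks written files.
// This is NOT the Gluten class; it only illustrates identity logging.
class TrackingCommitter {
  val fileNames: ArrayBuffer[String] = ArrayBuffer.empty[String]

  // Build a log line tagged with the identity of this instance and of its
  // buffer, so two call sites can be compared in the logs.
  def describe(where: String): String =
    s"$where: committer@${System.identityHashCode(this)} " +
      s"fileNames@${System.identityHashCode(fileNames)} " +
      s"size=${fileNames.size}"
}

object IdentityCheck {
  def main(args: Array[String]): Unit = {
    // Instance that receives the file name (assumed "collect" call site).
    val writer = new TrackingCommitter
    writer.fileNames += "gluten-part-example.zstd.parquet"

    // A second, fresh instance (assumed "abort" call site) has its own,
    // empty buffer even though the class is the same.
    val aborter = new TrackingCommitter

    println(writer.describe("collect"))
    println(aborter.describe("abort"))
  }
}
```

   If the identities printed at the two call sites differ in the real job, 
that would point to two separate instances rather than a scoping problem. If 
they match, the buffer is being cleared or replaced somewhere in between.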


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

