RushabhK commented on issue #9801:
URL: 
https://github.com/apache/incubator-gluten/issues/9801#issuecomment-2965552203

   Hello @FelixYBW, I have a question about the file output committers in Gluten.
   I was using the Manifest committer in one of my jobs with Gluten, and in the driver logs the file count and data size were 0 in every `manifest.json`, so the Manifest committer does not appear to be doing anything with Gluten.
   So my question is: who renames/moves these files in Gluten when the Manifest committer is used? Does Gluten have its own mechanism for handling commits?
   
   These are the driver logs I saw when running with Gluten:
   ```
   25/06/11 15:37:21 [main] INFO PathOutputCommitterFactory: Using OutputCommitter factory class class org.apache.hadoop.mapreduce.lib.output.committer.manifest.ManifestCommitterFactory from key mapreduce.outputcommitter.factory.scheme.gs
   25/06/11 15:37:21 [main] INFO ManifestCommitter: Created ManifestCommitter with JobID job_202506111537213028082964935343887_0000, Task Attempt attempt_202506111537213028082964935343887_0000_m_000000_0 and destination gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb
   25/06/11 15:48:52 [main] INFO AbstractJobOrTaskStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Executing Stage job_stage_load_manifests
   25/06/11 15:48:52 [main] INFO LoadManifestsStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Executing Manifest Job Commit with manifests in gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb/_temporary/fa204b99-986e-4d65-8912-a107afc4105c/00/manifests
   
   25/06/11 15:48:52 [manifest-committer-fa204b99-986e-4d65-8912-a107afc4105c_0-31] INFO LoadManifestsStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Task Attempt attempt_202506111537213403690146387912535_0002_m_000031_1633 file gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb/_temporary/fa204b99-986e-4d65-8912-a107afc4105c/00/manifests/task_202506111537213403690146387912535_0002_m_000031-manifest.json: File count: 0; data size=0
   25/06/11 15:48:52 [manifest-committer-fa204b99-986e-4d65-8912-a107afc4105c_0-24] INFO LoadManifestsStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Task Attempt attempt_202506111537213403690146387912535_0002_m_000024_1695 file gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb/_temporary/fa204b99-986e-4d65-8912-a107afc4105c/00/manifests/task_202506111537213403690146387912535_0002_m_000024-manifest.json: File count: 0; data size=0
   25/06/11 15:48:52 [manifest-committer-fa204b99-986e-4d65-8912-a107afc4105c_0-16] INFO LoadManifestsStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Task Attempt attempt_202506111537213403690146387912535_0002_m_000016_1683 file gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb/_temporary/fa204b99-986e-4d65-8912-a107afc4105c/00/manifests/task_202506111537213403690146387912535_0002_m_000016-manifest.json: File count: 0; data size=0
   25/06/11 15:48:52 [manifest-committer-fa204b99-986e-4d65-8912-a107afc4105c_0-23] INFO LoadManifestsStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Task Attempt attempt_202506111537213403690146387912535_0002_m_000023_1682 file gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb/_temporary/fa204b99-986e-4d65-8912-a107afc4105c/00/manifests/task_202506111537213403690146387912535_0002_m_000023-manifest.json: File count: 0; data size=0
   25/06/11 15:48:52 [manifest-committer-fa204b99-986e-4d65-8912-a107afc4105c_0-25] INFO LoadManifestsStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Task Attempt attempt_202506111537213403690146387912535_0002_m_000025_1630 file gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb/_temporary/fa204b99-986e-4d65-8912-a107afc4105c/00/manifests/task_202506111537213403690146387912535_0002_m_000025-manifest.json: File count: 0; data size=0
   25/06/11 15:48:52 [manifest-committer-fa204b99-986e-4d65-8912-a107afc4105c_0-26] INFO LoadManifestsStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Task Attempt attempt_202506111537213403690146387912535_0002_m_000026_1636 file gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb/_temporary/fa204b99-986e-4d65-8912-a107afc4105c/00/manifests/task_202506111537213403690146387912535_0002_m_000026-manifest.json: File count: 0; data size=0
   25/06/11 15:48:52 [manifest-committer-fa204b99-986e-4d65-8912-a107afc4105c_0-7] INFO LoadManifestsStage: [Job-Attempt fa204b99-986e-4d65-8912-a107afc4105c/00]: Task Attempt attempt_202506111537213403690146387912535_0002_m_000007_1824 file gs://<some_path>/.spark-staging-a0487498-59b2-4317-a70f-b72f303e3bfb/_temporary/fa204b99-986e-4d65-8912-a107afc4105c/00/manifests/task_202506111537213403690146387912535_0002_m_000007-manifest.json: File count: 0; data size=0
   ```
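
To check this across a whole driver log rather than by eye, a minimal Python sketch along these lines could tally the `File count` / `data size` values that `LoadManifestsStage` reports. The names `ManifestEntry`, `parse_manifest_entries` and `summarize`, and the regular expression itself, are assumptions based only on the log format shown above; they are not part of Hadoop, Spark or Gluten.

```python
import re
from typing import Dict, List, NamedTuple


class ManifestEntry(NamedTuple):
    """One 'Task Attempt ... File count: N; data size=M' log record."""
    task_attempt: str
    manifest_path: str
    file_count: int
    data_size: int


# Assumed pattern, derived from the LoadManifestsStage lines in the log above.
_ENTRY = re.compile(
    r"Task Attempt (?P<attempt>attempt_\S+) file "
    r"(?P<path>\S+-manifest\.json): "
    r"File count: (?P<count>\d+); data size=(?P<size>\d+)"
)


def parse_manifest_entries(log_text: str) -> List[ManifestEntry]:
    """Extract manifest records from driver log text.

    Whitespace is collapsed first so that records which were hard-wrapped
    across several lines are matched the same way as single-line records.
    """
    flat = " ".join(log_text.split())
    return [
        ManifestEntry(m["attempt"], m["path"], int(m["count"]), int(m["size"]))
        for m in _ENTRY.finditer(flat)
    ]


def summarize(entries: List[ManifestEntry]) -> Dict[str, int]:
    """Aggregate counts so empty manifests are easy to spot."""
    return {
        "manifests": len(entries),
        "empty_manifests": sum(1 for e in entries if e.file_count == 0),
        "total_files": sum(e.file_count for e in entries),
        "total_bytes": sum(e.data_size for e in entries),
    }
```

Calling `summarize(parse_manifest_entries(text))` on the driver log text returns the four totals; if `empty_manifests` equals `manifests`, every task manifest the committer loaded was empty, which is the situation described above.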


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


