pvary commented on pull request #1407:
URL: https://github.com/apache/iceberg/pull/1407#issuecomment-687074236


   > > Relies on listing of the target dir.
   > 
   > Can we find out in job commit how many writer tasks there were? Then we 
could use well-known locations and make sure each one is read.
   
   I suspect that the JobContext contains only input information about the 
number of mappers/reducers. I have only debugged the LocalJobRunner code for 
now, but I did not see anything which would indicate that we have up-to-date 
information there.
   The only solution I was able to come up with is creating a new JobClient 
to get the info from the server. I have not been able to make it work for the 
LocalJobRunner yet, and I think this would be too specific to MR.
   How does this work for Spark writes? Do we have any other places where MR 
write is already implemented for Iceberg?
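
   For reference, a rough sketch of the "well-known locations" idea quoted above, assuming the number of writer tasks were somehow available at job commit (which is exactly the open question). Everything here is hypothetical: the class name, the `task-NNNNN.committed` naming scheme, and the helper methods are made up for illustration and are not part of Iceberg or Hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from Iceberg or Hadoop.
final class WellKnownTaskFiles {
  private WellKnownTaskFiles() {}

  // Assumed naming scheme: one committed file per writer task, keyed by task index.
  static Path committedFile(Path jobDir, int taskIndex) {
    return jobDir.resolve(String.format("task-%05d.committed", taskIndex));
  }

  // Collects the file of every task 0..taskCount-1 and fails if one is missing,
  // instead of trusting whatever a directory listing happens to return.
  static List<Path> expectedFiles(Path jobDir, int taskCount) throws IOException {
    List<Path> found = new ArrayList<>();
    for (int i = 0; i < taskCount; i++) {
      Path file = committedFile(jobDir, i);
      if (!Files.isRegularFile(file)) {
        throw new IOException("Missing output of task " + i + ": " + file);
      }
      found.add(file);
    }
    return found;
  }

  public static void main(String[] args) throws IOException {
    Path jobDir = Files.createTempDirectory("job");
    Files.createFile(committedFile(jobDir, 0));
    Files.createFile(committedFile(jobDir, 1));
    // Both task files are present, so the job commit could read each of them.
    System.out.println(expectedFiles(jobDir, 2));
  }
}
```

   The point of the lookup by index is that a missing task output becomes a hard error at job commit, whereas a plain listing of the target dir would silently skip it.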
   
   Updated the PR to commit the task only in IcebergOutputCommitter.commitTask, 
and no longer in IcebergRecordWriter.close.
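
   To illustrate the general pattern (this is not the code in the PR), here is a minimal stand-alone sketch where closing the writer only finishes an attempt-local file, and a separate task-commit step publishes it. `DeferredTaskCommit`, `AttemptWriter` and the paths are hypothetical names; the real classes work against the Hadoop committer API rather than `java.nio.file`.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical model of "commit in commitTask, not in close".
final class DeferredTaskCommit {
  private DeferredTaskCommit() {}

  // Stand-in for a record writer: close() only finishes the attempt-local file.
  static final class AttemptWriter implements AutoCloseable {
    private final BufferedWriter out;

    AttemptWriter(Path attemptFile) throws IOException {
      this.out = Files.newBufferedWriter(attemptFile, StandardCharsets.UTF_8);
    }

    void write(String record) throws IOException {
      out.write(record);
      out.newLine();
    }

    @Override
    public void close() throws IOException {
      out.close(); // nothing is published here
    }
  }

  // Stand-in for the task commit: the only place where output becomes visible.
  static void commitTask(Path attemptFile, Path committedFile) throws IOException {
    Files.move(attemptFile, committedFile, StandardCopyOption.REPLACE_EXISTING);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("task");
    Path attempt = dir.resolve("attempt_0.tmp");
    Path committed = dir.resolve("task-00000.committed");
    try (AttemptWriter writer = new AttemptWriter(attempt)) {
      writer.write("row-1");
    }
    System.out.println("visible after close: " + Files.exists(committed));
    commitTask(attempt, committed);
    System.out.println("visible after commitTask: " + Files.exists(committed));
  }
}
```

   With this split, a failed or speculative attempt that closes its writer leaves nothing behind in the committed location.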


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
