[GitHub] [spark] otterc edited a comment on pull request #30433: [SPARK-32916][SHUFFLE][test-maven][test-hadoop2.7] Ensure the number of chunks in meta file and index file are equal

GitBox Wed, 02 Dec 2020 12:47:28 -0800


otterc edited a comment on pull request #30433:
URL: https://github.com/apache/spark/pull/30433#issuecomment-737477581



   While making these changes and adding the unit tests, I came to think it's 
better not to rely on the file pointer from `RandomAccessFile` and to maintain 
our own pointer instead, and thus not use `RandomAccessFile` at all.
   The reason: suppose `seek` fails for the index file, after which we abort 
merging any new blocks. When we finalize that partition, we still need the 
pointer to the last successful update made to the index file of this 
partition.
   Since the seek failed, we can't rely on the pointer from RAF. I think 
maintaining our own pointer also simplifies the exception handling code.
   Also, `RAF.getFilePointer` can itself throw `IOException`, and by now it 
seems to me that using RAF is not simplifying the code but making it more 
complicated.
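
   To make the idea concrete, here is a minimal sketch of tracking our own 
position, assuming positional writes through a `FileChannel`. The class 
`TrackedIndexFile` and its methods are hypothetical names for illustration 
only, not the code in this PR.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration; not the actual class in the PR. */
final class TrackedIndexFile implements AutoCloseable {
  private final FileChannel channel;
  // Offset just past the last fully written entry. Advanced only after a
  // write completes, so it stays valid even if a later write throws.
  private long lastGoodPos = 0L;

  TrackedIndexFile(Path path) throws IOException {
    this.channel = FileChannel.open(
        path, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
  }

  /** Appends one 8-byte chunk offset at the tracked position. */
  void appendOffset(long chunkOffset) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
    buf.putLong(chunkOffset);
    buf.flip();
    long pos = lastGoodPos;
    while (buf.hasRemaining()) {
      // Positional write: no seek, and the channel's own position is unused.
      pos += channel.write(buf, pos);
    }
    // Reached only if every byte was written without an IOException.
    lastGoodPos = pos;
  }

  /** Plain field read; unlike RandomAccessFile.getFilePointer, cannot throw. */
  long getPos() {
    return lastGoodPos;
  }

  /** Drops any partially written bytes past the last successful entry. */
  void finalizeFile() throws IOException {
    channel.truncate(lastGoodPos);
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}
```

   With this shape, a failed write leaves `lastGoodPos` untouched, so 
finalization can still use it, and reading the position needs no exception 
handling.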
   
   Let me know if you think otherwise.
   @mridulm @Victsm @Ngone51 
   




