otterc edited a comment on pull request #30433: URL: https://github.com/apache/spark/pull/30433#issuecomment-737477581
While making these changes and adding the unit tests, I came to think it's better not to rely on the file pointer from `RandomAccessFile` and to maintain our own pointer instead, which means not using `RandomAccessFile` at all. The reason: suppose `seek` fails for the index file, and after that we abort merging any new blocks for that partition. When we finalize the partition, we still need the position of the last successful update made to its index file. Since the `seek` failed, we can't rely on the pointer from RAF. Maintaining our own pointer also simplifies the exception handling code. In addition, `RAF.getFilePointer` can itself throw `IOException`, so at this point using RAF seems to be making the code more complicated rather than simpler. Let me know if you think otherwise. @mridulm @Victsm @Ngone51
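To make the idea concrete, here is a rough sketch of what maintaining our own pointer could look like. It is not the code in this PR: the class `TrackedIndexFile` and its methods are hypothetical names, and using `FileChannel` positional writes in place of `RandomAccessFile` is an assumption made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper (not from the PR): tracks the end of the last successful write itself. */
final class TrackedIndexFile implements AutoCloseable {
  private final FileChannel channel;
  // Position just past the last fully written entry; advanced only after a write succeeds.
  private long pos = 0L;

  TrackedIndexFile(Path path) throws IOException {
    this.channel = FileChannel.open(path, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
  }

  /** Writes one 8-byte offset at the tracked position; no seek is involved. */
  void appendOffset(long offset) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
    buf.putLong(offset);
    buf.flip();
    long writeAt = pos;
    while (buf.hasRemaining()) {
      // Positional write: does not touch the channel's own position.
      writeAt += channel.write(buf, writeAt);
    }
    pos = writeAt; // reached only if every write above succeeded
  }

  /** Plain field read, so unlike RAF.getFilePointer it cannot throw IOException. */
  long getPos() {
    return pos;
  }

  /** On finalize, drops any bytes left past the last successful update. */
  void finalizeFile() throws IOException {
    channel.truncate(pos);
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}
```

With this shape, a failed write leaves `pos` untouched, so finalizing a partition can still read the last good position without any further I/O call that might throw.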
