siknezevic commented on issue #27246: [SPARK-30536][CORE][SQL] Sort-merge join 
operator spilling performance improvements
URL: https://github.com/apache/spark/pull/27246#issuecomment-577499526
 
 
   > Regarding the "Implement lazy initialization of UnsafeSorterSpillReader" 
portion of the changes:
   > 
   > It looks like @liutang123 previously proposed a similar change in 2018 in 
#20184 / [SPARK-22987](https://issues.apache.org/jira/browse/SPARK-22987). I 
stumbled across that PR earlier this year and [left some 
comments](https://github.com/apache/spark/pull/20184#issuecomment-511240279): 
overall, I think that lazy initialization of the individual spill readers makes 
sense.
   > 
   > In order to enable incremental review and merge of these changes, do you 
think it could make sense to split the lazy spill reader initialization changes 
into a separate PR? I haven't compared the implementations in detail, so I 
haven't formed any opinion on the approach here vs. the one in #20184.
   
   Thank you for the comments.
   Yes, I am OK with submitting a new PR that covers only the lazy spill reader 
initialization.
   I looked at #20184 and I can see that there is still I/O in the constructor:
   Line 69:  `numRecords = numRecordsRemaining = dis.readInt();`
   But I am open to suggestions. Please let me know which approach would 
make more sense.
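
   To make the distinction concrete, here is a minimal sketch of what "no I/O in the constructor" could look like. It is not the code from this PR or from #20184: the class `LazySpillReader`, its method names, and the simplified file layout (an `int` record count followed by `int` records) are all assumptions made for illustration. The only point is that the file is opened, and the record count read, on the first call that needs data rather than at construction time.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.NoSuchElementException;

// Hypothetical illustration only; this is NOT Spark's UnsafeSorterSpillReader.
// Assumed file layout: [int numRecords][int record]...[int record]
final class LazySpillReader implements AutoCloseable {
  private final File file;
  private DataInputStream in;       // stays null until the first read
  private int numRecordsRemaining;  // unknown until the header is read
  private boolean initialized = false;

  LazySpillReader(File file) {
    // The constructor only remembers the file: no open, no read.
    this.file = file;
  }

  // Opens the file and reads the record-count header exactly once.
  private void ensureInitialized() throws IOException {
    if (initialized) {
      return;
    }
    DataInputStream stream =
        new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
    try {
      numRecordsRemaining = stream.readInt();
    } catch (IOException e) {
      stream.close();  // do not leak the handle if the header read fails
      throw e;
    }
    in = stream;
    initialized = true;
  }

  boolean isInitialized() {
    return initialized;
  }

  boolean hasNext() throws IOException {
    ensureInitialized();
    return numRecordsRemaining > 0;
  }

  int next() throws IOException {
    ensureInitialized();
    if (numRecordsRemaining <= 0) {
      throw new NoSuchElementException();
    }
    int value = in.readInt();
    numRecordsRemaining--;
    if (numRecordsRemaining == 0) {
      close();  // release the file handle as soon as the last record is read
    }
    return value;
  }

  @Override
  public void close() throws IOException {
    if (in != null) {
      in.close();
      in = null;
    }
  }
}
```

   With this shape, creating many readers up front costs no file handles or read buffers; each reader pays for its stream only when it is first consumed. The trade-off is that the record count is not available before the first read, so any caller that needs it at construction time would have to obtain it some other way.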

