nsivabalan commented on pull request #2319:
URL: https://github.com/apache/hudi/pull/2319#issuecomment-743289804


   Yeah, the [same](https://github.com/apache/hudi/pull/1721) change was brought 
up before, and we didn't proceed since it needed some perf analysis. 
   This was the rationale:
   even though explodeRecordRDDWithFileComparisons(...) is called in two 
places, the first call only does a count by key, which may not shuffle any 
actual data, whereas the second call could incur shuffling. Hence the usage 
patterns differ. More info can be found in the linked PR. 
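
   To make the difference concrete, here is a rough, non-Spark sketch in plain 
Python. It assumes that a count by key pre-aggregates inside each partition, so 
only small (key, count) summaries leave a partition, while a key-based 
repartition moves every record. The toy data and the helper names 
`count_by_key` and `shuffle_by_key` are hypothetical and are not Hudi or Spark 
APIs.

```python
from collections import Counter

# Hypothetical toy data: each inner list stands in for one partition of
# (file_id, record_key) pairs, similar in shape to the exploded RDD.
partitions = [
    [(f"f{i % 2}", f"k{p}-{i}") for i in range(1000)]
    for p in range(2)
]


def count_by_key(parts):
    """Count records per key, aggregating inside each partition first."""
    partials = [Counter(key for key, _ in part) for part in parts]
    # Only the per-partition (key, count) entries need to leave a partition.
    moved = sum(len(partial) for partial in partials)
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total), moved


def shuffle_by_key(parts, num_targets):
    """Repartition records by key; every record is sent to a target."""
    targets = [[] for _ in range(num_targets)]
    moved = 0
    for part in parts:
        for key, value in part:
            targets[hash(key) % num_targets].append((key, value))
            moved += 1
    return targets, moved


counts, moved_for_count = count_by_key(partitions)
_, moved_for_shuffle = shuffle_by_key(partitions, num_targets=2)
print(counts, moved_for_count, moved_for_shuffle)
```

   In this toy setup the count path moves a handful of summary entries while 
the repartition path moves every record, which is why the cost of recomputing 
versus caching could differ between the two call sites. Real numbers would 
still need the perf analysis mentioned above.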
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

