[GitHub] spark pull request: [SPARK-13250][SQL] Make broadcast join handle ...

nongli Tue, 09 Feb 2016 16:09:11 -0800

Github user nongli commented on the pull request:

    https://github.com/apache/spark/pull/11141#issuecomment-182141070
  
    @davies Yes, I am. I don't think we're going to add a ton more operators, and 
for all the ones you mentioned, we should think hard about serializing directly 
into the in-memory structure the operator wants rather than just copying. For 
example, CartesianProduct should probably serialize all the rows on one side 
contiguously; similarly, sort should serialize into Tungsten pages. The 
in-memory cache should use a columnar format and not need to go through 
UnsafeRow first.
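
    To illustrate what "serialize all the rows on one side contiguously" could 
mean, here is a minimal sketch. It is not Spark's implementation: the class 
`ContiguousRows` and its `pack`/`unpack` methods are hypothetical names, and rows 
are assumed to be plain byte arrays rather than UnsafeRow. Each row is written 
as a length-prefixed record into a single buffer, so the consumer scans one 
block of memory instead of chasing one object per row.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names and layout are assumptions.
public class ContiguousRows {

    /** Packs rows back to back as [int length][row bytes] records in one buffer. */
    static ByteBuffer pack(byte[][] rows) {
        int total = 0;
        for (byte[] row : rows) {
            total += Integer.BYTES + row.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(total);
        for (byte[] row : rows) {
            buf.putInt(row.length); // length prefix
            buf.put(row);           // row payload, immediately after the prefix
        }
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    /** Reads the records back in order without modifying the caller's buffer. */
    static List<byte[]> unpack(ByteBuffer packed) {
        ByteBuffer view = packed.duplicate(); // independent position, shared bytes
        List<byte[]> rows = new ArrayList<>();
        while (view.hasRemaining()) {
            byte[] row = new byte[view.getInt()];
            view.get(row);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        byte[][] rows = { {1, 2, 3}, {}, {4, 5} };
        ByteBuffer packed = pack(rows);
        System.out.println(packed.remaining() + " bytes for "
                + unpack(packed).size() + " rows");
    }
}
```

    A real operator would presumably write into its own page or columnar 
structure instead of a heap `ByteBuffer`; the point of the sketch is only that 
the layout is chosen by the consumer, not produced by a generic copy.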
    
    I think we can take what I've done here and make it more componentized so 
that reusing it elsewhere requires less code duplication. I'm okay with not 
doing this in general in the planner. The operators that need to accumulate 
memory should think hard about how to do it.



