Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1165#issuecomment-49682053

Hey @andrewor14, one question here just to make sure I understand: if the data is supposed to be stored as MEMORY_ONLY_SER, will this code still unroll it in deserialized form before testing whether it can be put? I guess this is okay, but it would be better to write directly to a serialized stream in that case. We would then also have to track whether the serialized stream becomes too big to store. Also, it seems that in this case, even if the array of deserialized elements fits in memory, we allocate some extra space as we write the objects to a byte stream. Not horrible, but it's another reason to try serializing directly.
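The "write directly to a serialized stream while tracking size" idea could look roughly like the sketch below. This is a hypothetical illustration using plain Java serialization and a made-up `tryUnrollSerialized` helper, not Spark's actual BlockManager or SerializerInstance APIs: it serializes elements one at a time and checks the accumulated byte count after each write, bailing out before the block exceeds the memory limit.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Iterator;
import java.util.List;

public class SerializedUnroll {

    /**
     * Hypothetical sketch: serialize values straight into a byte stream,
     * checking the accumulated size after each element so we can give up
     * before exceeding maxBytes. Returns the serialized bytes if the
     * whole iterator fit, or null if it grew too large.
     */
    static byte[] tryUnrollSerialized(Iterator<? extends Serializable> values,
                                      long maxBytes) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        while (values.hasNext()) {
            oos.writeObject(values.next());
            oos.flush();
            if (bos.size() > maxBytes) {
                return null; // too big to cache serialized in memory
            }
        }
        oos.close();
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A few small strings easily fit under a 1 MB limit.
        byte[] small = tryUnrollSerialized(
                List.of("a", "b", "c").iterator(), 1 << 20);
        // A 100k-int array blows past a 16-byte limit immediately.
        byte[] big = tryUnrollSerialized(
                List.of(new int[100000]).iterator(), 16);
        System.out.println((small != null) + " " + (big == null));
    }
}
```

The size check only happens between elements, so a single huge element can still overshoot the limit transiently; a real implementation would likely also account for that, and for the temporary buffer space the point above mentions.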