[ https://issues.apache.org/jira/browse/SPARK-20340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971071#comment-15971071 ]

Thomas Graves commented on SPARK-20340:
---------------------------------------

Right, I figured it was probably for performance. The thing is that when it's 
wrong it causes the job to fail, and that can happen unexpectedly.  Meaning a 
production job has been running fine for months, then the data it gets is all 
of a sudden different/skewed due to, say, a high-traffic day, and a critical 
job fails.  To me this is not good for a production environment; if we want 
Spark to continue to be adopted by larger companies in production environments, 
this sort of thing has to be very reliable.

It also looks very bad for Spark that my user said this runs fine on PIG. It makes 
users very wary about switching to Spark, as they have doubts about its 
scalability and reliability.

Anyway, I think this either needs a sanity check at some point to notice that the 
estimate is really off and fall back to the real size, or it should switch to always 
using the real size.  For shuffle data we should know what the size is, as we just 
transferred it.  That is abstracted away a bit at this point in the code, though, so I 
need to understand the code more.
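
To make the idea concrete, here is a rough sketch in plain Scala of what I mean by 
falling back to a known size when the estimate looks way off. This is not the actual 
spill code path; the names and the 2x drift threshold are made up for illustration.

{code:scala}
// Rough sketch only, not actual Spark code: decide whether to trust the
// sampling-based size estimate or fall back to a known byte count
// (e.g. the bytes we just transferred for this partition in the shuffle).
object SpillSizeCheck {

  // How far the estimate may drift from the known size before we stop
  // trusting it. The factor of 2 is made up for illustration.
  private val MaxEstimateDrift = 2.0

  /**
   * Size (in bytes) to feed into the spill decision.
   *
   * @param estimatedBytes size reported by sampling-based estimation
   * @param knownBytesOpt  exact bytes read for this partition, if tracked
   */
  def sizeForSpillDecision(estimatedBytes: Long, knownBytesOpt: Option[Long]): Long = {
    knownBytesOpt match {
      case Some(knownBytes) if knownBytes > 0 =>
        val drift = estimatedBytes.toDouble / knownBytes
        if (drift > MaxEstimateDrift || drift < 1.0 / MaxEstimateDrift) {
          // Estimate is way off; use the real size so we spill in time.
          knownBytes
        } else {
          estimatedBytes
        }
      case _ =>
        // No exact size available, the estimate is all we have.
        estimatedBytes
    }
  }
}
{code}

Something along these lines would at least keep a badly-off estimate from delaying 
the spill until we are already out of memory.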

 [~rxin]  [~joshrosen] any thoughts on this?

> Size estimate very wrong in ExternalAppendOnlyMap from CoGroupedRDD, cause OOM
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-20340
>                 URL: https://issues.apache.org/jira/browse/SPARK-20340
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: Thomas Graves
>
> I had a user doing a basic join operation. The values are images' binary 
> data (in base64 format) and vary widely in size.
> The job failed with out of memory. It originally failed on YARN from using too 
> much overhead memory; after turning spark.shuffle.io.preferDirectBufs to false 
> it then failed with out of heap memory.  I debugged it down to the shuffle: when 
> CoGroupedRDD is putting things into the ExternalAppendOnlyMap, it 
> computes an estimated size to determine when to spill.  In this case 
> SizeEstimator handles arrays such that if an array is larger than 400 elements, it 
> samples 100 elements. The estimate is coming back GBs different from the 
> actual size.  It claims 1GB when it is actually using close to 5GB. 
> A temporary workaround is to increase the memory to be very large (10GB 
> executors), but that isn't really acceptable here.  The user did the same thing in 
> pig and it easily handled the data with 1.5GB of memory.
> It seems risky to be using an estimate in such a critical thing. If the 
> estimate is wrong you are going to run out of memory and fail the job.
> I'm looking closer at the users data still to get more insights.
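
For anyone who wants to see how the sampling described above can go wrong, here is 
a toy standalone Scala example. This is not Spark's actual SizeEstimator, and the 
record sizes and skew are invented; it just samples 100 elements of a large, skewed 
collection and scales up the average, the same basic strategy.

{code:scala}
// Toy illustration only, not Spark's SizeEstimator: shows how sampling a
// fixed number of elements from a large, skewed collection and scaling up
// the average can mis-estimate the total size by a lot.
object SamplingSkewDemo {
  def main(args: Array[String]): Unit = {
    val rng = new scala.util.Random(42)

    // 100,000 "record sizes": most are ~1 KB, but 1% are ~5 MB
    // (think large base64-encoded images). All numbers are made up.
    val sizes: Array[Long] = Array.fill(100000) {
      if (rng.nextDouble() < 0.01) 5L * 1024 * 1024 else 1024L
    }

    val actualTotal = sizes.map(_.toDouble).sum

    // Mimic the sampling strategy: look at 100 elements, take the average,
    // and scale up to the full length of the collection.
    val sample = rng.shuffle(sizes.toSeq).take(100)
    val estimatedTotal = sample.map(_.toDouble).sum / sample.size * sizes.length

    println(f"actual total    : ${actualTotal / 1e9}%.2f GB")
    println(f"estimated total : ${estimatedTotal / 1e9}%.2f GB")
  }
}
{code}

With a skewed distribution like this, whether the sample happens to include the rare 
huge records decides whether the estimate is off by gigabytes in either direction.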



