[ https://issues.apache.org/jira/browse/SPARK-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576230#comment-14576230 ]

Yuri Makhno commented on SPARK-8147:
------------------------------------

[~srowen] we want something similar to the following logic applied to every 
iterator:
{code}
iterator.map { x =>
  if (!jvmHasEnoughFreeMemory()) {
    throw new NotEnoughExecutorMemoryException()
  }
  x
}
{code}

I can wrap iterators in my own RDD implementations, but when I use Spark SQL it 
creates RDD chains for queries behind the scenes, and it's impossible to handle 
situations where:
 * there is enough memory on the executor to load the data via our custom RDD 
implementations, but
 * there is not enough memory to perform a join (or some other SQL query)
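For reference, the kind of wrapping I do in my own RDD implementations can be sketched, independently of Spark, as a plain iterator decorator. This is only a sketch: {{NotEnoughExecutorMemoryException}} and the 10% free-memory threshold are illustrative names and values, not existing Spark API.
{code}
// Sketch only: a memory-guarding iterator decorator, independent of Spark.
// The exception name and the 10% threshold are illustrative assumptions.
class NotEnoughExecutorMemoryException(msg: String) extends RuntimeException(msg)

class MemoryGuardIterator[T](underlying: Iterator[T],
                             minFreeFraction: Double = 0.1) extends Iterator[T] {

  // Estimate the fraction of the JVM's max heap that is still available.
  private def jvmHasEnoughFreeMemory(): Boolean = {
    val rt = Runtime.getRuntime
    val used = rt.totalMemory() - rt.freeMemory()
    val free = rt.maxMemory() - used
    free.toDouble / rt.maxMemory() > minFreeFraction
  }

  override def hasNext: Boolean = underlying.hasNext

  override def next(): T = {
    if (!jvmHasEnoughFreeMemory()) {
      throw new NotEnoughExecutorMemoryException("executor JVM is low on memory")
    }
    underlying.next()
  }
}
{code}
Wrapping is then just {{new MemoryGuardIterator(parentIterator)}}; the decorator passes elements through untouched until the heap check fails. The problem described above is that for RDDs created internally by Spark SQL there is no hook to apply such a wrapper.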


> Add ability to decorate RDD iterators
> -------------------------------------
>
>                 Key: SPARK-8147
>                 URL: https://issues.apache.org/jira/browse/SPARK-8147
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.3.1
>            Reporter: Yuri Makhno
>
> In Spark all computations are done through iterators, which are created by the 
> RDD.iterator method. It would be good if we could specify some 
> RDDIteratorDecoratorFactory in SparkConf and thereby decorate all RDD 
> iterators created in the executor JVM. 
> For us this would be extremely useful because we want to monitor the executor's 
> memory and, instead of letting the executor die with an OutOfMemoryError, fail 
> the job with a NotEnoughMemory reason when we see that no memory is left. We 
> also want to collect some computation statistics on the executor.
> I can provide a PR if this improvement is approved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
