[
https://issues.apache.org/jira/browse/SPARK-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576230#comment-14576230
]
Yuri Makhno commented on SPARK-8147:
------------------------------------
[~srowen] we want something similar to the following logic applied to every
iterator:
{code}
iterator.map { x =>
  // hypothetical helpers -- these names are ours, not an existing Spark API
  if (!jvmHasEnoughFreeMemory()) {
    throw new NotEnoughExecutorMemoryException()
  }
  x
}
{code}
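To make the idea concrete, here is a minimal, self-contained sketch of such a wrapper in plain Scala. The names ({{NotEnoughExecutorMemoryException}}, {{jvmHasEnoughFreeMemory}}) and the threshold are illustrative assumptions, not an existing Spark API; the free-heap estimate uses {{java.lang.Runtime}}:
{code}
// Sketch of a memory-guarding iterator wrapper (illustrative names only).
class NotEnoughExecutorMemoryException(msg: String) extends RuntimeException(msg)

object MemoryGuard {
  // Hypothetical check: estimate the heap still available to the JVM and
  // compare it against a minimum.
  def jvmHasEnoughFreeMemory(minFreeBytes: Long): Boolean = {
    val rt = Runtime.getRuntime
    val stillAllocatable = rt.maxMemory - (rt.totalMemory - rt.freeMemory)
    stillAllocatable >= minFreeBytes
  }

  // Decorate any iterator so that each element pull re-checks free memory.
  def guard[T](it: Iterator[T], minFreeBytes: Long): Iterator[T] =
    it.map { x =>
      if (!jvmHasEnoughFreeMemory(minFreeBytes)) {
        throw new NotEnoughExecutorMemoryException(
          s"less than $minFreeBytes bytes of heap remain")
      }
      x
    }
}

// Usage: with a tiny threshold the guard passes and the data flows through.
val guarded = MemoryGuard.guard(Iterator(1, 2, 3), minFreeBytes = 1L)
println(guarded.sum)
{code}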
I can wrap iterators in my own RDD implementations, but when I use Spark SQL it
creates RDD chains for queries behind the scenes, and it's impossible to handle
situations where:
* there is enough memory on the executor to load data via our custom RDD
implementations, but
* there is not enough memory to perform a join (or another SQL query)
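For reference, wrapping iterators manually looks roughly like the sketch below (against the Spark 1.3-era RDD API): a pass-through RDD that decorates its parent's iterator. This is exactly what we can already do by hand, and exactly what we cannot inject into the RDD chains Spark SQL builds itself. The memory-check call is the same hypothetical helper as in the snippet above:
{code}
import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

import scala.reflect.ClassTag

// A pass-through RDD that decorates the parent's iterator with a check on
// every element. Works only for RDDs we construct ourselves.
class GuardedRDD[T: ClassTag](parent: RDD[T]) extends RDD[T](parent) {

  override def getPartitions: Array[Partition] = firstParent[T].partitions

  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    firstParent[T].iterator(split, context).map { x =>
      if (!jvmHasEnoughFreeMemory()) {  // hypothetical helper, see above
        throw new NotEnoughExecutorMemoryException()
      }
      x
    }
}
{code}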
> Add ability to decorate RDD iterators
> -------------------------------------
>
> Key: SPARK-8147
> URL: https://issues.apache.org/jira/browse/SPARK-8147
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 1.3.1
> Reporter: Yuri Makhno
>
> In Spark, all computations are done through iterators created by the
> RDD.iterator method. It would be good if we could specify an
> RDDIteratorDecoratorFactory in SparkConf and thereby decorate all RDD
> iterators created in the executor JVM.
> For us this would be extremely useful: we want to control the executor's
> memory and, instead of hitting an OutOfMemoryError on the executor, fail the
> job with a NotEnoughMemory reason when we see that no more memory is
> available. We also want to collect computation statistics on the executor.
> I can provide a PR if this improvement is approved.
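A possible shape for the proposed hook, sketched under the assumption that the executor would look the factory up in SparkConf and apply it around every iterator it creates. None of these names exist in Spark today; the signature is one guess among several:
{code}
import org.apache.spark.TaskContext

// Hypothetical SPI for the proposed feature (not an existing Spark API).
trait RDDIteratorDecoratorFactory extends Serializable {
  def decorate[T](rddId: Int, split: Int, context: TaskContext,
                  underlying: Iterator[T]): Iterator[T]
}

// Example implementation: count records per partition, which covers the
// "collect computation statistics" use case from the description.
class CountingDecoratorFactory extends RDDIteratorDecoratorFactory {
  def decorate[T](rddId: Int, split: Int, context: TaskContext,
                  underlying: Iterator[T]): Iterator[T] = {
    var count = 0L
    underlying.map { x =>
      count += 1  // could be reported through a metrics sink on completion
      x
    }
  }
}
{code}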
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)