[ https://issues.apache.org/jira/browse/SPARK-24727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531107#comment-16531107 ]

ant_nebula commented on SPARK-24727:
------------------------------------

No. Spark performs DAG scheduling for each streaming batchDuration job.

If the jobs in one of my streaming batchDurations completely fill the cache of 100 
entries, then the overflow causes janino to recompile the generated code on every 
streaming batchDuration job.
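The thrashing described above can be sketched with a toy model. This is not Spark's actual code: the real `CodeGenerator` uses a Guava cache with `maximumSize(100)`, whose size-based eviction is approximately LRU. Here a plain `LinkedHashMap` in access order stands in for it, and the class name `CacheThrash` and its workload parameters are hypothetical. When the number of distinct generated classes per batch exceeds the cap, eviction makes every lookup miss again on the next batch, so every batch pays the full compile cost.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CacheThrash {
    // Hypothetical stand-in for the codegen cache: an access-order LRU map
    // capped at maxSize entries, mirroring maximumSize(100).
    static <K, V> Map<K, V> boundedCache(final int maxSize) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Counts simulated "janino compiles" for `batches` batches that each
    // reference `distinctKeys` distinct generated classes.
    static int compilesForWorkload(int maxSize, int batches, int distinctKeys) {
        Map<Integer, String> cache = boundedCache(maxSize);
        int compiles = 0;
        for (int batch = 0; batch < batches; batch++) {
            for (int key = 0; key < distinctKeys; key++) {
                if (!cache.containsKey(key)) {
                    compiles++;                  // simulated janino compile
                    cache.put(key, "class" + key);
                }
            }
        }
        return compiles;
    }

    public static void main(String[] args) {
        // Working set of 120 classes against a cap of 100: LRU eviction
        // churns the whole cache, so all 120 recompile every batch.
        System.out.println(compilesForWorkload(100, 3, 120)); // prints 360
        // Raise the cap above the working set and only the first batch compiles.
        System.out.println(compilesForWorkload(1000, 3, 120)); // prints 120
    }
}
```

The second call illustrates the suggestion in the issue: once the cap exceeds the per-batch working set, compilation happens once and every later batch hits the cache.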

> The cache 100 in CodeGenerator is too small for streaming
> ---------------------------------------------------------
>
>                 Key: SPARK-24727
>                 URL: https://issues.apache.org/jira/browse/SPARK-24727
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: ant_nebula
>            Priority: Major
>
> {code:java}
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator 
> private val cache = CacheBuilder.newBuilder().maximumSize(100).build{code}
> The cache of 100 in CodeGenerator is too small for real-time streaming 
> computation, although it is fine for offline computation, because real-time 
> streaming computation usually runs more complex plans in a single driver 
> and is performance sensitive.
> I suggest Spark make this configurable for users, with a default of 100, 
> e.g. spark.codegen.cache=1000
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
