[
https://issues.apache.org/jira/browse/SPARK-20352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
hosein updated SPARK-20352:
---------------------------
Environment:
Ubuntu 12
Spark 2.1
JRE 8.0
Python 2.7
was:
linux ubunto 12
spark 2.1
JRE 8.0
> PySpark SparkSession initialization takes longer every iteration in a single
> application
> ---------------------------------------------------------------------------------------
>
> Key: SPARK-20352
> URL: https://issues.apache.org/jira/browse/SPARK-20352
> Project: Spark
> Issue Type: Question
> Components: PySpark
> Affects Versions: 2.1.0
> Environment: Ubuntu 12
> Spark 2.1
> JRE 8.0
> Python 2.7
> Reporter: hosein
> Fix For: 2.1.0
>
>
> I run Spark on a standalone Ubuntu server with 128 GB of memory and a
> 32-core CPU, launching spark-submit my_code.py without any additional
> configuration parameters.
> In a while loop I start a SparkSession, analyze data, and then stop the
> context; this process repeats every 10 seconds.
> #####################
> from pyspark.sql import SparkSession
>
> while True:
>     spark = SparkSession.builder \
>         .appName("sync_task") \
>         .config('spark.driver.maxResultSize', '5g') \
>         .getOrCreate()
>     sc = spark.sparkContext
>     # some processing and analysis
>     spark.stop()
> #######################
> When the program starts, it works perfectly, but after it has been running
> for many hours, SparkSession initialization starts taking a long time: 10
> or 20 seconds just to initialize Spark.
> So what is the problem?
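To pin down whether initialization time really grows with every iteration (rather than spiking occasionally), one could time each create/stop cycle. A minimal, pure-standard-library sketch follows; `create_session` and `time_setups` are hypothetical names, and `create_session` stands in for the real `SparkSession.builder...getOrCreate()` call from the loop above:

```python
import time

def time_setups(create_session, iterations=5):
    """Time each session create/stop cycle.

    create_session is a stand-in for something like
    SparkSession.builder.appName("sync_task").getOrCreate().
    Steadily growing timings would suggest state accumulating
    across iterations; flat timings with occasional spikes
    would point elsewhere (e.g. GC pauses or I/O contention).
    """
    timings = []
    for _ in range(iterations):
        start = time.monotonic()
        session = create_session()
        timings.append(time.monotonic() - start)
        session.stop()
    return timings
```

Wrapping the real `getOrCreate()` call this way and logging the returned list periodically would show whether the slowdown is monotonic over hours of runtime.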
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]