[jira] [Assigned] (SPARK-25003) Pyspark Does not use Spark Sql Extensions

2018-10-17 Thread Hyukjin Kwon (JIRA)


 [ https://issues.apache.org/jira/browse/SPARK-25003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-25003:


Assignee: Russell Spitzer

> Pyspark Does not use Spark Sql Extensions
> -----------------------------------------
>
> Key: SPARK-25003
> URL: https://issues.apache.org/jira/browse/SPARK-25003
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 2.2.2, 2.3.1
> Reporter: Russell Spitzer
> Assignee: Russell Spitzer
> Priority: Major
> Fix For: 3.0.0
>
>
> When creating a SparkSession here
> [https://github.com/apache/spark/blob/v2.2.2/python/pyspark/sql/session.py#L216]
> {code:python}
> if jsparkSession is None:
>   jsparkSession = self._jvm.SparkSession(self._jsc.sc())
> self._jsparkSession = jsparkSession
> {code}
> I believe it ends up calling the constructor here
> https://github.com/apache/spark/blob/v2.2.2/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L85-L87
> {code:scala}
>   private[sql] def this(sc: SparkContext) {
> this(sc, None, None, new SparkSessionExtensions)
>   }
> {code}
> This creates a fresh SparkSessionExtensions object and never picks up extensions 
> that may have been set in the config, unlike the companion object's getOrCreate:
> https://github.com/apache/spark/blob/v2.2.2/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L928-L944
> {code:scala}
> // in getOrCreate
> // Initialize extensions if the user has defined a configurator class.
> val extensionConfOption = sparkContext.conf.get(StaticSQLConf.SPARK_SESSION_EXTENSIONS)
> if (extensionConfOption.isDefined) {
>   val extensionConfClassName = extensionConfOption.get
>   try {
>     val extensionConfClass = Utils.classForName(extensionConfClassName)
>     val extensionConf = extensionConfClass.newInstance()
>       .asInstanceOf[SparkSessionExtensions => Unit]
>     extensionConf(extensions)
>   } catch {
>     // Ignore the error if we cannot find the class or when the class has the wrong type.
>     case e @ (_: ClassCastException |
>               _: ClassNotFoundException |
>               _: NoClassDefFoundError) =>
>       logWarning(s"Cannot use $extensionConfClassName to configure session extensions.", e)
>   }
> }
> {code}
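>
> For illustration, the configurator class that the config is expected to name is 
> just a function from SparkSessionExtensions to Unit. A minimal sketch might look 
> like this (the class name and the no-op rule are made up; only the shape of the 
> class matters):
> {code:scala}
> import org.apache.spark.sql.SparkSessionExtensions
> import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
> import org.apache.spark.sql.catalyst.rules.Rule
>
> // Hypothetical configurator, registered by setting the extensions config key
> // (StaticSQLConf.SPARK_SESSION_EXTENSIONS, i.e. spark.sql.extensions) to
> // com.example.MyExtensions before the session is created.
> class MyExtensions extends (SparkSessionExtensions => Unit) {
>   override def apply(extensions: SparkSessionExtensions): Unit = {
>     // Inject a no-op optimizer rule just to show the registration hook.
>     extensions.injectOptimizerRule { session =>
>       new Rule[LogicalPlan] {
>         override def apply(plan: LogicalPlan): LogicalPlan = plan
>       }
>     }
>   }
> }
> {code}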
> I think a quick fix would be to call the getOrCreate method on the companion 
> object instead of invoking the constructor with the SparkContext. Alternatively, 
> we could fix this by ensuring that every constructor attempts to pick up custom 
> extensions when they are set; a Scala sketch of the companion path follows.
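>
> To illustrate the first option: when the extensions config is set, the companion 
> object's getOrCreate applies it, so a PySpark session that delegated to that path 
> (for example via the JVM builder) would inherit the behaviour. This is a minimal 
> Scala sketch, reusing the hypothetical configurator class from above:
> {code:scala}
> import org.apache.spark.sql.SparkSession
>
> // Companion-object path: getOrCreate reads the static extensions config
> // (spark.sql.extensions) and applies the configurator before returning.
> val spark = SparkSession.builder()
>   .master("local[*]")
>   .config("spark.sql.extensions", "com.example.MyExtensions")  // hypothetical class
>   .getOrCreate()
> {code}
> The bare constructor that PySpark currently invokes skips this step entirely, 
> which is exactly the behaviour described above.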





[jira] [Assigned] (SPARK-25003) Pyspark Does not use Spark Sql Extensions

2018-08-03 Thread Apache Spark (JIRA)



Apache Spark reassigned SPARK-25003:


Assignee: (was: Apache Spark)







[jira] [Assigned] (SPARK-25003) Pyspark Does not use Spark Sql Extensions

2018-08-03 Thread Apache Spark (JIRA)



Apache Spark reassigned SPARK-25003:


Assignee: Apache Spark



