[GitHub] [spark] HyukjinKwon commented on a change in pull request #34559: [SPARK-37291][PYTHON][SQL] PySpark init SparkSession should copy conf to sharedState
HyukjinKwon commented on a change in pull request #34559: URL: https://github.com/apache/spark/pull/34559#discussion_r758008380 ## File path: python/pyspark/sql/session.py ## @@ -301,6 +306,9 @@ def __init__(self, sparkContext: SparkContext, jsparkSession: Optional[JavaObjec jsparkSession = self._jvm.SparkSession.getDefaultSession().get() else: jsparkSession = self._jvm.SparkSession(self._jsc.sc()) Review comment: Yup, let's create a followup PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34559: [SPARK-37291][PYTHON][SQL] PySpark init SparkSession should copy conf to sharedState
HyukjinKwon commented on a change in pull request #34559: URL: https://github.com/apache/spark/pull/34559#discussion_r758005218 ## File path: python/pyspark/sql/session.py ## @@ -301,6 +306,9 @@ def __init__(self, sparkContext: SparkContext, jsparkSession: Optional[JavaObjec jsparkSession = self._jvm.SparkSession.getDefaultSession().get() else: jsparkSession = self._jvm.SparkSession(self._jsc.sc()) Review comment: Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34559: [SPARK-37291][PYTHON][SQL] PySpark init SparkSession should copy conf to sharedState
HyukjinKwon commented on a change in pull request #34559: URL: https://github.com/apache/spark/pull/34559#discussion_r758001741 ## File path: python/pyspark/sql/session.py ## @@ -301,6 +306,9 @@ def __init__(self, sparkContext: SparkContext, jsparkSession: Optional[JavaObjec jsparkSession = self._jvm.SparkSession.getDefaultSession().get() else: jsparkSession = self._jvm.SparkSession(self._jsc.sc()) Review comment: LGTM with a couple of nits: @AngersZh, - Can we actually leverage existing constructor on `SparkSession` to pass the initial options instead of setting it manually? Here unlike Scala, it initiates `sharedState` always. I think it's best to keep the code path matched. - Another nit is that: It's always preferred to use less Py4J connections which exposes potential flakiness. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34559: [SPARK-37291][PYTHON][SQL] PySpark init SparkSession should copy conf to sharedState
HyukjinKwon commented on a change in pull request #34559: URL: https://github.com/apache/spark/pull/34559#discussion_r758001741 ## File path: python/pyspark/sql/session.py ## @@ -301,6 +306,9 @@ def __init__(self, sparkContext: SparkContext, jsparkSession: Optional[JavaObjec jsparkSession = self._jvm.SparkSession.getDefaultSession().get() else: jsparkSession = self._jvm.SparkSession(self._jsc.sc()) Review comment: LGTM with a couple of nits: @AngersZh, - Can we actually leverage existing constructor on `SparkSession` to pass the initial options instead of manually? Here unlike Scala, it initiates `sharedState` always. I think it's best to keep the code path matched. - Another nit is that: It's always preferred to use less Py4J connections which exposes potential flakiness. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34559: [SPARK-37291][PYTHON][SQL] PySpark init SparkSession should copy conf to sharedState
HyukjinKwon commented on a change in pull request #34559: URL: https://github.com/apache/spark/pull/34559#discussion_r748779639 ## File path: python/pyspark/sql/session.py ## @@ -287,7 +287,12 @@ def getOrCreate(self) -> "SparkSession": _instantiatedSession: ClassVar[Optional["SparkSession"]] = None _activeSession: ClassVar[Optional["SparkSession"]] = None -def __init__(self, sparkContext: SparkContext, jsparkSession: Optional[JavaObject] = None): +def __init__( +self, +sparkContext: SparkContext, +jsparkSession: Optional[JavaObject] = None, +options: Optional[Dict[str, Any]] = None, +): Review comment: Yeah I don't think this is breaking. Let me double check closely by tmr EOD but from a cursory look it seems fine. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34559: [SPARK-37291][PYTHON][SQL] PySpark init SparkSession should copy conf to sharedState
HyukjinKwon commented on a change in pull request #34559: URL: https://github.com/apache/spark/pull/34559#discussion_r748779639 ## File path: python/pyspark/sql/session.py ## @@ -287,7 +287,12 @@ def getOrCreate(self) -> "SparkSession": _instantiatedSession: ClassVar[Optional["SparkSession"]] = None _activeSession: ClassVar[Optional["SparkSession"]] = None -def __init__(self, sparkContext: SparkContext, jsparkSession: Optional[JavaObject] = None): +def __init__( +self, +sparkContext: SparkContext, +jsparkSession: Optional[JavaObject] = None, +options: Optional[Dict[str, Any]] = None, +): Review comment: Yeah I don't think this is breaking. Let me double check closely by tmr EOD but from a coursory look it seems fine. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34559: [SPARK-37291][PYTHON][SQL] PySpark init SparkSession should copy conf to sharedState
HyukjinKwon commented on a change in pull request #34559: URL: https://github.com/apache/spark/pull/34559#discussion_r748779639 ## File path: python/pyspark/sql/session.py ## @@ -287,7 +287,12 @@ def getOrCreate(self) -> "SparkSession": _instantiatedSession: ClassVar[Optional["SparkSession"]] = None _activeSession: ClassVar[Optional["SparkSession"]] = None -def __init__(self, sparkContext: SparkContext, jsparkSession: Optional[JavaObject] = None): +def __init__( +self, +sparkContext: SparkContext, +jsparkSession: Optional[JavaObject] = None, +options: Optional[Dict[str, Any]] = None, +): Review comment: Yeah I don't think this is breaking. Let me double check closely by tmr EODbut from a coursory look it seems fine. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org