ueshin commented on code in PR #44697:
URL: https://github.com/apache/spark/pull/44697#discussion_r1456459431
##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SessionHolder.scala:
##########
@@ -371,6 +373,14 @@ case class SessionHolder(userId: String, sessionId: String, session: SparkSessio
private[connect] def listListenerIds(): Seq[String] = {
listenerCache.keySet().asScala.toSeq
}
+
+ /**
+ * An accumulator for Python executors.
+ *
+ * The accumulated results will be sent to the Python client via observed_metrics message.
+ */
+ private[connect] val pythonAccumulator: Option[PythonAccumulator] =
+ Try(session.sparkContext.collectionAccumulator[Array[Byte]]).toOption
Review Comment:
> if the profile is disabled, we shouldn't probably create this accumulator to avoid performance issue.
The session holder always needs the accumulator because:
- it can't know whether or when the profiler is enabled
- registered UDFs need to be supported
What kind of performance issue are you concerned about?
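
A minimal sketch of the registered-UDF point above, in plain Scala with no Spark dependency. `ResultAccumulator`, `RegisteredUdf`, and `AccumulatorSketch` are hypothetical stand-ins, not classes from this PR, and the sketch assumes a registered UDF keeps whatever accumulator reference it was given at registration time. Under that assumption, a UDF registered while profiling is off has nowhere to send results once profiling is turned on later, unless the accumulator already existed.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins for illustration only; none of these are Spark classes.
final class ResultAccumulator {
  private val results = ArrayBuffer.empty[Array[Byte]]
  def add(value: Array[Byte]): Unit = synchronized { results += value }
  def size: Int = synchronized { results.size }
}

// Assumption: a registered UDF keeps the accumulator reference it was given
// at registration time and cannot obtain one later.
final class RegisteredUdf(accumulator: Option[ResultAccumulator]) {
  def run(profilerEnabled: Boolean): Unit =
    if (profilerEnabled) accumulator.foreach(_.add(Array[Byte](1)))
}

object AccumulatorSketch {
  // UDF registered without an accumulator because profiling was off at that time.
  val conditionalUdf = new RegisteredUdf(None)

  // UDF registered with an accumulator that is always created.
  val alwaysAccumulator = new ResultAccumulator
  val alwaysUdf = new RegisteredUdf(Some(alwaysAccumulator))

  def main(args: Array[String]): Unit = {
    // Profiling is switched on only after both UDFs were registered.
    conditionalUdf.run(profilerEnabled = true) // the result has nowhere to go
    alwaysUdf.run(profilerEnabled = true)      // the result is collected
    println(s"collected with always-on accumulator: ${alwaysAccumulator.size}")
  }
}
```

The sketch only models the ordering problem; it says nothing about the actual cost of creating the accumulator.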
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]