Marc Arndt created SPARK-27850:
----------------------------------
Summary: Make SparkPlan#doExecuteBroadcast public
Key: SPARK-27850
URL: https://issues.apache.org/jira/browse/SPARK-27850
Project: Spark
Issue Type: Improvement
Components: Optimizer, Spark Core, SQL
Affects Versions: 2.4.3
Reporter: Marc Arndt
Broadcasts of SparkPlan objects are handled inside the
SparkPlan#executeBroadcast method. According to the documentation of SparkPlan,
custom broadcast functionality should be provided by overriding the
`doExecuteBroadcast` method, as indicated by the comment:
{code:scala}
  /**
   * Returns the result of this query as a broadcast variable by delegating to `doExecuteBroadcast`
   * after preparations.
   *
   * Concrete implementations of SparkPlan should override `doExecuteBroadcast`.
   */
  final def executeBroadcast[T](): broadcast.Broadcast[T] = executeQuery {
    if (isCanonicalizedPlan) {
      throw new IllegalStateException("A canonicalized plan is not supposed to be executed.")
    }
    doExecuteBroadcast()
  }
{code}
When looking at the definition of SparkPlan#doExecuteBroadcast:
{code:scala}
  /**
   * Produces the result of the query as a broadcast variable.
   *
   * Overridden by concrete implementations of SparkPlan.
   */
  protected[sql] def doExecuteBroadcast[T](): broadcast.Broadcast[T] = {
    throw new UnsupportedOperationException(s"$nodeName does not implement doExecuteBroadcast")
  }
{code}
it becomes apparent that it is not possible to override the method from
user-defined SparkPlan implementations outside the org.apache.spark.sql
package, because the method is declared `protected[sql]` (package protected).
To allow custom SparkPlan implementations to provide their own broadcast
operations, I propose making SparkPlan#doExecuteBroadcast a public method, so
that all SparkPlan implementations, independent of the package they belong to,
can override it.
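The requested change can be illustrated with a minimal, Spark-free sketch. All names here (`engine.sql`, `userland`, `Plan`, `MyPlan`, `Demo`) are hypothetical stand-ins and not part of Spark; `Plan` only mimics the shape of SparkPlan, with `doExecuteBroadcast` declared public as proposed, so that a class in an unrelated package can override it:
{code:scala}
// Hypothetical stand-in for the package that owns the plan base class.
package engine.sql {
  abstract class Plan {
    def nodeName: String = getClass.getSimpleName

    // Entry point: final, delegates to the overridable hook.
    final def executeBroadcast[T](): T = doExecuteBroadcast[T]()

    // Public hook, as proposed in this issue (no access qualifier).
    def doExecuteBroadcast[T](): T =
      throw new UnsupportedOperationException(
        s"$nodeName does not implement doExecuteBroadcast")
  }
}

// Hypothetical user package, unrelated to engine.sql.
package userland {
  class MyPlan extends engine.sql.Plan {
    // Custom broadcast logic would go here; a fixed value keeps the sketch small.
    override def doExecuteBroadcast[T](): T = "payload".asInstanceOf[T]
  }

  object Demo extends App {
    println(new MyPlan().executeBroadcast[String]())
  }
}
{code}
The sketch only shows the shape of the proposed API; it makes no claim about how Spark itself would implement or expose the change.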