Lior Chaga created SPARK-35508:
----------------------------------
Summary: job group and description do not apply on broadcasts
Key: SPARK-35508
URL: https://issues.apache.org/jira/browse/SPARK-35508
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.1.0, 3.0.0
Reporter: Lior Chaga
Given the following code:
{code:java}
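import static org.apache.spark.sql.functions.broadcast;
import java.util.List;
import com.google.common.collect.Lists;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkContext context = new SparkContext("local", "test");
SparkSession session = new SparkSession(context);
List<String> strings = Lists.newArrayList("a", "b", "c");
List<String> otherString = Lists.newArrayList("b", "c", "d");
Dataset<Row> broadcastedDf = session.createDataset(strings, Encoders.STRING()).toDF();
Dataset<Row> dataframe = session.createDataset(otherString, Encoders.STRING()).toDF();
// Set the job group and description before running the query.
context.setJobGroup("my group", "my job", false);
dataframe.join(broadcast(broadcastedDf), "value").count();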
{code}
The job group and description are not applied to the job that creates the
broadcast DataFrame.
With Spark 2.x, the broadcast creation job is given the same job description as
the query itself.
This seems to be broken in Spark 3.x.
See the attached images:
!image-2021-05-25-09-39-36-816.png!
!image-2021-05-25-09-40-12-210.png!
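The same thing can probably be checked without the UI by logging the group and description of every job as it starts. The sketch below is not part of the original reproduction. `JobGroupCheck` and `describe` are hypothetical names, and the two property keys are what I understand `SparkContext.setJobGroup` to write into the job's local properties, so treat them as an assumption. The listener wiring is left as a comment so the helper compiles without Spark on the classpath.

```java
import java.util.Properties;

final class JobGroupCheck {
    // Assumed property keys: the ones SparkContext.setJobGroup is believed to set.
    static final String GROUP_KEY = "spark.jobGroup.id";
    static final String DESCRIPTION_KEY = "spark.job.description";

    /** Formats the job group and description found in a job's properties. */
    static String describe(int jobId, Properties props) {
        String group = props == null ? null : props.getProperty(GROUP_KEY);
        String description = props == null ? null : props.getProperty(DESCRIPTION_KEY);
        return "job " + jobId
                + ": group=" + (group == null ? "<unset>" : group)
                + ", description=" + (description == null ? "<unset>" : description);
    }

    // Wiring, left as a comment because it needs Spark on the classpath:
    //
    // context.addSparkListener(new SparkListener() {
    //     @Override
    //     public void onJobStart(SparkListenerJobStart jobStart) {
    //         System.out.println(
    //             JobGroupCheck.describe(jobStart.jobId(), jobStart.properties()));
    //     }
    // });
}
```

Registering the listener before the `setJobGroup` call in the code above should print one line per job. If the behaviour is as described, the job that builds the broadcast would be expected to show a group and description other than "my group" and "my job" on 3.x.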