Matei Zaharia created SPARK-3084:
------------------------------------

             Summary: Collect broadcasted tables in parallel in joins
                 Key: SPARK-3084
                 URL: https://issues.apache.org/jira/browse/SPARK-3084
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Matei Zaharia
            Assignee: Matei Zaharia

BroadcastHashJoin has a broadcastFuture variable that is meant to collect the broadcasted table in a separate thread. This doesn't help, because broadcastFuture is a lazy val that only gets initialized when you attempt to build the RDD. As a result, queries that broadcast multiple tables collect and broadcast them sequentially.
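
The effect described above can be reproduced outside Spark. The following is a minimal sketch, not the actual BroadcastHashJoin code: Join, LazyJoin, EagerJoin, collectTable and submittedAfterConstruction are hypothetical names invented for illustration, and the sleep stands in for collecting a table. It only shows that a Future held in a lazy val is not submitted until the field is first read, whereas one held in a plain val is submitted as soon as the operator is constructed.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object LazyFutureDemo {
  // Hypothetical stand-in for collecting one broadcast table.
  // The counter is bumped synchronously, so it counts submitted tasks.
  def collectTable(name: String, submitted: AtomicInteger): Future[String] = {
    submitted.incrementAndGet()
    Future {
      Thread.sleep(100) // pretend to collect the table
      name
    }
  }

  trait Join {
    def broadcastFuture: Future[String]
  }

  // Mirrors the reported pattern: nothing is submitted until first access.
  class LazyJoin(name: String, submitted: AtomicInteger) extends Join {
    lazy val broadcastFuture: Future[String] = collectTable(name, submitted)
  }

  // Eager variant: the future is submitted when the operator is constructed.
  class EagerJoin(name: String, submitted: AtomicInteger) extends Join {
    val broadcastFuture: Future[String] = collectTable(name, submitted)
  }

  // Builds two joins and reports how many collections had been submitted
  // before anything read broadcastFuture.
  def submittedAfterConstruction(eager: Boolean): Int = {
    val submitted = new AtomicInteger(0)
    val joins: Seq[Join] = Seq("t1", "t2").map { n =>
      val j: Join =
        if (eager) new EagerJoin(n, submitted) else new LazyJoin(n, submitted)
      j
    }
    val count = submitted.get()
    // Force and await every future so no work is left running.
    joins.foreach(j => Await.result(j.broadcastFuture, 5.seconds))
    count
  }

  def main(args: Array[String]): Unit = {
    println(s"lazy val: ${submittedAfterConstruction(eager = false)} submitted")
    println(s"val:      ${submittedAfterConstruction(eager = true)} submitted")
  }
}
```

In the lazy case, the second collection is not even submitted until something reads its broadcastFuture, so when that first read happens while each join's RDD is built in turn, the collections run one after another. Making the field eager is only one possible remedy; the sketch does not claim it is the fix adopted for this issue.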