Zhan Zhang created SPARK-20006:
----------------------------------
Summary: Separate threshold for broadcast and shuffled hash join
Key: SPARK-20006
URL: https://issues.apache.org/jira/browse/SPARK-20006
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.1.0
Reporter: Zhan Zhang
Priority: Minor
Currently both canBroadcast and canBuildLocalHashMap use the same
configuration: AUTO_BROADCASTJOIN_THRESHOLD.
But the memory model may be different. For broadcast, the hash map is
currently always built on heap. For shuffledHashJoin, the hash map may be built
on heap (longHash) or off heap (other maps, if off heap is enabled). Sharing
one configuration makes it hard to tune (how to allocate memory between
on heap and off heap). Propose to use a separate configuration for each.
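
To make the proposal concrete, here is a rough sketch of the two checks with
separate thresholds. It is a simplified Python model, not the actual planner
code: the class JoinConf and the setting shuffled_hash_join_threshold are
hypothetical names, and the per-partition scaling in
can_build_local_hash_map is my assumption about how the current check
behaves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinConf:
    # Existing setting (AUTO_BROADCASTJOIN_THRESHOLD), in bytes.
    auto_broadcast_join_threshold: int
    # Hypothetical new setting proposed by this issue, in bytes.
    shuffled_hash_join_threshold: int
    # Number of shuffle partitions (assumed to scale the local-map check).
    num_shuffle_partitions: int = 200


def can_broadcast(size_in_bytes: int, conf: JoinConf) -> bool:
    """Broadcast side: the whole hash map is built on heap."""
    return 0 <= size_in_bytes <= conf.auto_broadcast_join_threshold


def can_build_local_hash_map(size_in_bytes: int, conf: JoinConf) -> bool:
    """Shuffled hash join: each partition builds its own map, on or off heap.

    Uses its own threshold instead of reusing the broadcast one.
    """
    limit = conf.shuffled_hash_join_threshold * conf.num_shuffle_partitions
    return size_in_bytes < limit


MB = 1024 * 1024
conf = JoinConf(
    auto_broadcast_join_threshold=10 * MB,
    shuffled_hash_join_threshold=1 * MB,
)
```

With the two values split, lowering the shuffled hash join threshold (for
example because off heap memory is small) no longer changes which relations
qualify for broadcast.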