[ https://issues.apache.org/jira/browse/SPARK-41277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ohad Raviv updated SPARK-41277:
-------------------------------
    Summary: Leverage shuffle key as bucketing properties  (was: Save and leverage shuffle key in tblproperties)

> Leverage shuffle key as bucketing properties
> --------------------------------------------
>
>                 Key: SPARK-41277
>                 URL: https://issues.apache.org/jira/browse/SPARK-41277
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.1
>            Reporter: Ohad Raviv
>            Priority: Minor
>
> I may be missing something trivial here.
> In a typical pipeline, many datasets are materialized, often right after a
> shuffle (e.g. a join). They are then used in further actions, often on the
> same key.
> Wouldn't it make sense to save the shuffle key along with the table to avoid
> unnecessary shuffles?
> Also, the implementation seems quite straightforward: just leverage the
> bucketing mechanism.
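
To make the proposal concrete, here is a minimal conceptual sketch in plain Python, not Spark code. It assumes that a table written after a shuffle keeps its partitioning key and bucket count as metadata, so that a later join on the same key can pair bucket i with bucket i instead of redistributing rows. The names `bucket_of`, `write_bucketed`, `bucketed_join` and the constant `NUM_BUCKETS` are hypothetical, and `zlib.crc32` merely stands in for whatever hash function the engine uses. In Spark itself, bucketed output is written through `DataFrameWriter.bucketBy`.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count; must match on both tables


def bucket_of(key, num_buckets=NUM_BUCKETS):
    """Deterministic bucket id for a key (crc32 is a stand-in hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets


def write_bucketed(rows, key_index=0, num_buckets=NUM_BUCKETS):
    """Materialize rows into buckets and record the key as table metadata."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_of(row[key_index], num_buckets)].append(row)
    return {
        "bucket_key": key_index,
        "num_buckets": num_buckets,
        "buckets": dict(buckets),
    }


def bucketed_join(left, right):
    """Join bucket i of left only with bucket i of right (no redistribution)."""
    assert left["num_buckets"] == right["num_buckets"]
    lk, rk = left["bucket_key"], right["bucket_key"]
    out = []
    for b, left_rows in left["buckets"].items():
        # Index the matching right-hand bucket by join key.
        index = defaultdict(list)
        for r in right["buckets"].get(b, []):
            index[r[rk]].append(r)
        for l in left_rows:
            for r in index.get(l[lk], []):
                out.append((l, r))
    return out


orders = write_bucketed([(1, "book"), (2, "pen"), (3, "lamp"), (1, "ink")])
users = write_bucketed([(1, "ann"), (2, "bob"), (4, "cy")])
joined = bucketed_join(orders, users)
```

Because both tables were bucketed on the same key with the same bucket count, matching rows always land in the same bucket, which is what would let a planner skip the extra shuffle.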