[
https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989560#comment-15989560
]
Tejas Patil commented on SPARK-19256:
-------------------------------------
[~cloud_fan], [~sameerag]: I was looking at trunk and noticed changes which
would affect the plan (more from an implementation perspective than the
high-level design).
`InsertIntoHiveTable` is now a `RunnableCommand` (it was previously a
`UnaryExecNode`). With an exec node, it was possible to set the
requiredDistribution and requiredOrdering and let the planner
(`EnsureRequirements`) take care of managing things. With `RunnableCommand`,
the model seems to be that these requirements have to be handled separately
(so far there is only one place which does that: [0]). Two comments:
- I feel that this is somewhat ugly, as one would expect `EnsureRequirements`
to be the single place for handling this.
- We might miss out on optimizations. E.g. if I add an extra shuffle in
`InsertIntoHiveTable` and the previous node is a shuffle as well, the code
for merging these two shuffle nodes into a single shuffle would have to be
duplicated from `EnsureRequirements`.
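To make the second point concrete, here is a toy model of the difference. It
is only a sketch: `Plan`, `Scan`, `Shuffle`, `BucketedInsert` and `ToyPlanner`
are hypothetical stand-ins and not Spark's classes, and the rule is far simpler
than the real `EnsureRequirements`. It assumes a shuffle's only effect is the
partitioning of its output, so a shuffle directly below the insert can be
replaced rather than stacked.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
sealed trait Plan { def partitionedBy: Option[Seq[String]] }

// A leaf scan with no known output partitioning.
case class Scan(table: String) extends Plan {
  val partitionedBy: Option[Seq[String]] = None
}

// A shuffle that hash-partitions its child's output by the given columns.
case class Shuffle(keys: Seq[String], child: Plan) extends Plan {
  val partitionedBy: Option[Seq[String]] = Some(keys)
}

// A bucketed write that needs its input clustered by the bucket columns.
case class BucketedInsert(bucketCols: Seq[String], child: Plan) extends Plan {
  val partitionedBy: Option[Seq[String]] = child.partitionedBy
}

object ToyPlanner {
  // Command-style handling: the insert adds its own shuffle without
  // inspecting what the child already provides.
  def naiveInsert(bucketCols: Seq[String], child: Plan): Plan =
    BucketedInsert(bucketCols, Shuffle(bucketCols, child))

  // Planner-style handling: one rule checks the requirement and adds a
  // shuffle only when the child's partitioning does not already satisfy it.
  def ensureRequirements(plan: Plan): Plan = plan match {
    case BucketedInsert(cols, child) =>
      val planned = ensureRequirements(child)
      val satisfied = planned.partitionedBy.contains(cols)
      val fixed = planned match {
        case _ if satisfied         => planned                   // reuse existing shuffle
        case Shuffle(_, grandChild) => Shuffle(cols, grandChild) // replace, do not stack
        case other                  => Shuffle(cols, other)      // add the missing shuffle
      }
      BucketedInsert(cols, fixed)
    case Shuffle(keys, child) => Shuffle(keys, ensureRequirements(child))
    case scan: Scan           => scan
  }

  // Counts the shuffle nodes in a plan.
  def countShuffles(plan: Plan): Int = plan match {
    case Shuffle(_, c)        => 1 + countShuffles(c)
    case BucketedInsert(_, c) => countShuffles(c)
    case _: Scan              => 0
  }
}
```

With an upstream `Shuffle(Seq("id"), Scan("t"))`, `naiveInsert` yields a plan
with two shuffles, while `ensureRequirements` keeps a single one. Getting the
same result in the command-style path would mean repeating that check there.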
Would it be OK to make `InsertIntoHiveTable` a `UnaryExecNode` again?
Why was it recently made a `RunnableCommand`
(https://github.com/apache/spark/pull/16517)? cc [~smilegator]
[0] :
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L173
> Hive bucketing support
> ----------------------
>
> Key: SPARK-19256
> URL: https://issues.apache.org/jira/browse/SPARK-19256
> Project: Spark
> Issue Type: Umbrella
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Tejas Patil
> Priority: Minor
>
> JIRA to track design discussions and tasks related to Hive bucketing support
> in Spark.
> Proposal :
> https://docs.google.com/document/d/1a8IDh23RAkrkg9YYAeO51F4aGO8-xAlupKwdshve2fc/edit?usp=sharing