Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/13496#discussion_r66730126
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -452,6 +452,17 @@ class Analyzer(
def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
case i @ InsertIntoTable(u: UnresolvedRelation, parts, child, _, _)
if child.resolved =>
+ // A partitioned relation's schema can be different from the input logicalPlan, since
+ // partition columns are all moved after data columns. We Project to adjust the ordering.
+ val input = if (parts.nonEmpty) {
+ val (inputPartCols, inputDataCols) = child.output.partition { attr =>
+ parts.contains(attr.name)
+ }
+ Project(inputDataCols ++ inputPartCols, child)
+ } else {
+ child
+ }
--- End diff --
@gatorsmile good catch! The reason we have `insertInto` is to have a SQL
INSERT INTO version in `DataFrameWriter`. We should use `saveAsTable` if we
need by-name resolution.
I have reverted this PR. @viirya, do you mind opening a new PR to also remove
this logic in `insertInto`, to make it consistent with the SQL version? Thanks!
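
For reference, the reordering in the diff above can be sketched without Spark. This is only an illustration under assumptions: `Attr`, `ReorderSketch`, and `reorderForInsert` are hypothetical names, and the partition spec is simplified to a `Set[String]` of column names rather than the `parts` map used in `Analyzer.scala`.

```scala
object ReorderSketch {
  // Hypothetical stand-in for a Catalyst attribute; only the name matters here.
  final case class Attr(name: String)

  // Move partition columns after data columns, keeping the relative
  // order inside each group (Seq.partition is order-preserving).
  def reorderForInsert(output: Seq[Attr], partCols: Set[String]): Seq[Attr] =
    if (partCols.isEmpty) {
      output
    } else {
      val (inputPartCols, inputDataCols) =
        output.partition(attr => partCols.contains(attr.name))
      inputDataCols ++ inputPartCols
    }

  def main(args: Array[String]): Unit = {
    val input = Seq(Attr("year"), Attr("id"), Attr("value"))
    // "year" is the only partition column, so it moves to the end.
    println(reorderForInsert(input, Set("year")).map(_.name).mkString(", "))
  }
}
```

The sketch matches columns by name before the insert. SQL INSERT INTO matches columns by position, which is why this logic would be removed from `insertInto` and by-name resolution left to `saveAsTable`.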