Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/22124#discussion_r211208353
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -384,7 +384,12 @@ object RemoveRedundantAliases extends Rule[LogicalPlan] {
       }
   }
-  def apply(plan: LogicalPlan): LogicalPlan = removeRedundantAliases(plan, AttributeSet.empty)
+  def apply(plan: LogicalPlan): LogicalPlan = {
+    plan match {
+      case c: Command => c
+      case _ => removeRedundantAliases(plan, AttributeSet.empty)
--- End diff --
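Yes, this is correct. Without this PR, `RemoveRedundantAliases` also rewrites the query under the command, so the command's output columns change from `[COL1#10L, COL2#11L]` to `[col1#8L, col2#9L]`. In the log, each row of the side-by-side output appears as two entries, the plan before the rule and then the plan after it, and rows marked `!` are the ones that changed: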
```
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases ===
InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-ae504f50-9543-49fb-a
InsertIntoHadoopFsRelationCommand file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-ae504f50-9543-49fb-acf
Database: default
Database: default
Table: table2
Table: table2
Owner: yumwang
Owner: yumwang
Created Time: Mon Aug 20 03:03:52 PDT 2018
Created Time: Mon Aug 20 03:03:52 PDT 2018
Last Access: Wed Dec 31 16:00:00 PST 1969
Last Access: Wed Dec 31 16:00:00 PST 1969
Created By: Spark 2.4.0-SNAPSHOT
Created By: Spark 2.4.0-SNAPSHOT
Type: MANAGED
Type: MANAGED
Provider: hive
Provider: hive
Table Properties: [transient_lastDdlTime=1534759432]
Table Properties: [transient_lastDdlTime=1534759432]
Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-ae504f50-9543-49fb-acf0-8b2736665d26/table2
Location: file:/private/var/folders/tg/f5mz46090wg7swzgdc69f8q03965_0/T/warehouse-ae504f50-9543-49fb-acf0-8b2736665d26/table2
Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Storage Properties: [serialization.format=1]
Storage Properties: [serialization.format=1]
Partition Provider: Catalog
Partition Provider: Catalog
Schema: root
Schema: root
-- COL1: long (nullable = true)
|-- COL1: long (nullable = true)
-- COL2: long (nullable = true)
|-- COL2: long (nullable = true)
!), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@60582d55, [COL1#10L, COL2#11L]
), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@60582d55, [col1#8L, col2#9L]
!+- Project [col1#8L AS col1#10L, col2#9L AS col2#11L]
+- Project [col1#8L, col2#9L]
   +- Filter (col1#8L > -20)
   +- Filter (col1#8L > -20)
      +- Relation[col1#8L,col2#9L] parquet
      +- Relation[col1#8L,col2#9L] parquet
```
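The effect in the log above, and the guard from the diff, can be sketched with a toy plan model. This is only an illustration: `Attr`, `Alias`, `Project`, `InsertCommand` and `ToyRule` are hypothetical stand-ins written for this example, not Catalyst's classes, and the alias check is simplified to "same name".

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
case class Attr(name: String, id: Int)
case class Alias(child: Attr, as: Attr)

sealed trait Plan { def output: Seq[Attr] }
case class Relation(output: Seq[Attr]) extends Plan
case class Project(aliases: Seq[Alias], child: Plan) extends Plan {
  def output: Seq[Attr] = aliases.map(_.as)
}
// Stand-in for a write command: it keeps its own list of output columns
// and wraps the query that produces the rows.
case class InsertCommand(outputColumns: Seq[Attr], query: Plan) extends Plan {
  def output: Seq[Attr] = Nil
}

object ToyRule {
  // Rewrites a plan and returns a map from removed alias ids to the
  // attributes they wrapped, so parents can be re-pointed.
  private def rewrite(plan: Plan): (Plan, Map[Int, Attr]) = plan match {
    case r: Relation => (r, Map.empty)
    case Project(aliases, child) =>
      val (newChild, _) = rewrite(child)
      // Simplification: an alias is "redundant" when it keeps the same name.
      val mapping = aliases
        .filter(a => a.as.name == a.child.name)
        .map(a => a.as.id -> a.child)
        .toMap
      val kept = aliases.map { a =>
        if (a.as.name == a.child.name) Alias(a.child, a.child) else a
      }
      (Project(kept, newChild), mapping)
    case InsertCommand(cols, query) =>
      val (newQuery, mapping) = rewrite(query)
      // Re-pointing the columns replaces COL1#10 with col1#8: the name changes too.
      (InsertCommand(cols.map(c => mapping.getOrElse(c.id, c)), newQuery), mapping)
  }

  // Behaviour before the change: every plan is rewritten.
  def applyUnguarded(plan: Plan): Plan = rewrite(plan)._1

  // Behaviour with the guard from the diff: commands are returned untouched.
  def applyGuarded(plan: Plan): Plan = plan match {
    case c: InsertCommand => c
    case _ => rewrite(plan)._1
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val relation = Relation(Seq(Attr("col1", 8), Attr("col2", 9)))
    val project = Project(
      Seq(Alias(Attr("col1", 8), Attr("col1", 10)), Alias(Attr("col2", 9), Attr("col2", 11))),
      relation)
    val insert = InsertCommand(Seq(Attr("COL1", 10), Attr("COL2", 11)), project)

    println(ToyRule.applyUnguarded(insert)) // output columns are now col1#8, col2#9
    println(ToyRule.applyGuarded(insert))   // identical to the input plan
  }
}
```

In this sketch the unguarded call changes the command's column names from `COL1`/`COL2` to `col1`/`col2`, mirroring the two `!` rows in the log, while the guarded call leaves the command as it was.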
---