AngersZhuuuu opened a new pull request #29739:
URL: https://github.com/apache/spark/pull/29739
### What changes were proposed in this pull request?
Currently, explaining a SQL plan that contains a HiveTableRelation prints a
large amount of information about the HiveTableRelation's pruned partitions,
which makes the plan hard to read. This PR simplifies that information.
For example, the following unit test:
```
test("Make HiveTableScanExec message simple") {
  // Read through the Hive SerDe path and allow fully dynamic partition writes.
  withSQLConf(HiveUtils.CONVERT_METASTORE_ORC.key -> "false",
    "hive.exec.dynamic.partition.mode" -> "nonstrict") {
    withTable("df1", "df2") {
      // df1: 1000 rows, one partition per value of k.
      spark.range(1000)
        .select(col("id"), col("id").as("k"))
        .write
        .partitionBy("k")
        .format("hive")
        .mode("overwrite")
        .saveAsTable("df1")
      // df2: 100 rows, one partition per value of k.
      spark.range(100)
        .select(col("id"), col("id").as("k"))
        .write
        .partitionBy("k")
        .format("hive")
        .mode("overwrite")
        .saveAsTable("df2")
      val df = sql(
        "SELECT df1.id, df2.k FROM df1 JOIN df2 ON df1.k = df2.k AND df2.id < 2")
      df.explain(true)
    }
  }
}
```
prints the following plans:
```
== Parsed Logical Plan ==
'Project ['df1.id, 'df2.k]
+- 'Join Inner, (('df1.k = 'df2.k) AND ('df2.id < 2))
   :- 'UnresolvedRelation [df1], []
   +- 'UnresolvedRelation [df2], []

== Analyzed Logical Plan ==
id: bigint, k: bigint
Project [id#22L, k#25L]
+- Join Inner, ((k#23L = k#25L) AND (id#24L < cast(2 as bigint)))
   :- SubqueryAlias spark_catalog.default.df1
   :  +- HiveTableRelation [`default`.`df1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#22L], Partition Cols: [k#23L], Statistic: sizeInBytes=8.0 EiB]
   +- SubqueryAlias spark_catalog.default.df2
      +- HiveTableRelation [`default`.`df2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#24L], Partition Cols: [k#25L], Statistic: sizeInBytes=8.0 EiB]

== Optimized Logical Plan ==
Project [id#22L, k#25L]
+- Join Inner, (k#23L = k#25L)
   :- Filter isnotnull(k#23L)
   :  +- HiveTableRelation [`default`.`df1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#22L], Partition Cols: [k#23L], Pruned Partitions: [k=0, k=1, k=10, k=100, k=101, k=102, k=103, k=104, k=105, k=106, k=107, k=108, k=109, k=11, k=11..., Statistic: sizeInBytes=8.0 EiB]
   +- Project [k#25L]
      +- Filter ((isnotnull(id#24L) AND (id#24L < 2)) AND isnotnull(k#25L))
         +- HiveTableRelation [`default`.`df2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#24L], Partition Cols: [k#25L], Pruned Partitions: [k=0, k=1, k=10, k=11, k=12, k=13, k=14, k=15, k=16, k=17, k=18, k=19, k=2, k=20, k=21, k=22, k=2..., Statistic: sizeInBytes=8.0 EiB]

== Physical Plan ==
*(2) Project [id#22L, k#25L]
+- *(2) BroadcastHashJoin [k#23L], [k#25L], Inner, BuildRight, false
   :- Scan hive default.df1 [id#22L, k#23L], HiveTableRelation [`default`.`df1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#22L], Partition Cols: [k#23L], Pruned Partitions: [k=0, k=1, k=10, k=100, k=101, k=102, k=103, k=104, k=105, k=106, k=107, k=108, k=109, k=11, k=11..., Statistic: sizeInBytes=8.0 EiB], [isnotnull(k#23L)]
   +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true]),false), [id=#53]
      +- *(1) Project [k#25L]
         +- *(1) Filter (isnotnull(id#24L) AND (id#24L < 2))
            +- Scan hive default.df2 [id#24L, k#25L], HiveTableRelation [`default`.`df2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#24L], Partition Cols: [k#25L], Pruned Partitions: [k=0, k=1, k=10, k=11, k=12, k=13, k=14, k=15, k=16, k=17, k=18, k=19, k=2, k=20, k=21, k=22, k=2..., Statistic: sizeInBytes=8.0 EiB], [isnotnull(k#25L)]
```
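The `Pruned Partitions` lists in the plans above are cut off with `...` rather than printed in full. As a rough illustration of that idea, here is a minimal, self-contained sketch of abbreviating a long partition list to a fixed width. The object name `PrunedPartitionSummary`, the helpers `abbreviate` and `summarize`, and the 100-character width are assumptions made for this example; they are not taken from the patch and may not match how Spark builds the string.
```scala
// Hypothetical sketch, not the code in this PR.
object PrunedPartitionSummary {

  // Keep at most maxWidth characters; when the text is cut, the last three
  // characters of the result are "...".
  def abbreviate(text: String, maxWidth: Int): String = {
    require(maxWidth > 3, "maxWidth must leave room for the ellipsis")
    if (text.length <= maxWidth) text else text.take(maxWidth - 3) + "..."
  }

  // Render partition specs such as "k=0" as a bracketed, comma-separated list,
  // then abbreviate the whole string. The default width of 100 is an assumption.
  def summarize(partitionSpecs: Seq[String], maxWidth: Int = 100): String =
    abbreviate(partitionSpecs.mkString("[", ", ", "]"), maxWidth)

  def main(args: Array[String]): Unit = {
    // 1000 partitions, sorted as strings (so "k=10" comes before "k=2").
    val specs = (0 until 1000).map(i => s"k=$i").sorted
    println(summarize(specs))
  }
}
```
Running `main` prints a 100-character string that starts with `[k=0, k=1, k=10, k=100,` and ends with `k=11...`; a short list such as `Seq("k=0", "k=1")` is returned unabbreviated as `[k=0, k=1]`.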
### Why are the changes needed?
To make the explain output for plans containing HiveTableRelation more readable.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
No