LuciferYang commented on code in PR #39598:
URL: https://github.com/apache/spark/pull/39598#discussion_r1071229541
##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestHelper.scala:
##########
@@ -47,6 +47,7 @@ trait SQLQueryTestHelper {
.replaceAll("Last Access.*", s"Last Access $notIncludedMsg")
.replaceAll("Partition Statistics\t\\d+", s"Partition Statistics\t$notIncludedMsg")
.replaceAll("\\*\\(\\d+\\) ", "*") // remove the WholeStageCodegen codegenStageIds
+ .replaceAll("@[0-9a-z]+,", ",") // remove hashCode
Review Comment:
https://github.com/apache/spark/blob/cedc9d2d351443ccb013adaafbdd4f7b7acf56ae/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestHelper.scala#L44
It seems that the problem is here. The current `replaceAll` uses a greedy pattern, so it matches the longest possible string, as in the following plan:
```
== Analyzed Logical Plan ==
InsertIntoHadoopFsRelationCommand file:/Users/yangjie01/SourceCode/git/spark-mine-12/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/explain_temp5, false, [val#x], Parquet, [path=file:/Users/yangjie01/SourceCode/git/spark-mine-12/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/explain_temp5], Append, `spark_catalog`.`default`.`explain_temp5`, org.apache.spark.sql.execution.datasources.CatalogFileIndex(file:/Users/yangjie01/SourceCode/git/spark-mine-12/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/explain_temp5), [key, val]
+- Project [key#x, val#x]
+- SubqueryAlias spark_catalog.default.explain_temp4
+- Relation spark_catalog.default.explain_temp4[key#x,val#x] parquet
```
After the replaceAll, it becomes
```
== Analyzed Logical Plan ==
InsertIntoHadoopFsRelationCommand Location [not included in comparison]/{warehouse_dir}/explain_temp5), [key, val]
+- Project [key#x, val#x]
+- SubqueryAlias spark_catalog.default.explain_temp4
+- Relation spark_catalog.default.explain_temp4[key#x,val#x] parquet
```
instead of
```
== Analyzed Logical Plan ==
InsertIntoHadoopFsRelationCommand Location [not included in comparison]/{warehouse_dir}/explain_temp5, false, [val#x], Parquet, [path=Location [not included in comparison]/{warehouse_dir}/explain_temp5], Append, `spark_catalog`.`default`.`explain_temp5`, org.apache.spark.sql.execution.datasources.CatalogFileIndex(fLocation [not included in comparison]/{warehouse_dir}/explain_temp5), [key, val]
+- Project [key#x, val#x]
+- SubqueryAlias spark_catalog.default.explain_temp4
+- Relation spark_catalog.default.explain_temp4[key#x,val#x] parquet
```
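To illustrate the over-match: a minimal sketch (not the actual Spark regex; the paths and replacement text here are made up) showing how a greedy `.*` in `replaceAll` swallows everything up to the last occurrence of the path on the line, while a lazy `.*?` replaces each occurrence separately:

```scala
object GreedyRegexDemo {
  def main(args: Array[String]): Unit = {
    // A line containing the same path three times, like the plan above.
    val line = "Cmd file:/tmp/wh/t1, false, [path=file:/tmp/wh/t1], Index(file:/tmp/wh/t1)"

    // Greedy ".*" extends to the LAST "/tmp/wh/t1" on the line, so the
    // intermediate ", false, [path=...]" text is swallowed by the match.
    val greedy = line.replaceAll("file:.*/tmp/wh/t1", "LOC/t1")
    println(greedy) // Cmd LOC/t1)

    // Lazy ".*?" stops at the first "/tmp/wh/t1", so each path is
    // normalized independently and the surrounding text is preserved.
    val nonGreedy = line.replaceAll("file:.*?/tmp/wh/t1", "LOC/t1")
    println(nonGreedy) // Cmd LOC/t1, false, [path=LOC/t1], Index(LOC/t1)
  }
}
```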
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]