LuciferYang commented on code in PR #39598:
URL: https://github.com/apache/spark/pull/39598#discussion_r1071229541


##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestHelper.scala:
##########
@@ -47,6 +47,7 @@ trait SQLQueryTestHelper {
       .replaceAll("Last Access.*", s"Last Access $notIncludedMsg")
       .replaceAll("Partition Statistics\t\\d+", s"Partition Statistics\t$notIncludedMsg")
       .replaceAll("\\*\\(\\d+\\) ", "*") // remove the WholeStageCodegen codegenStageIds
+      .replaceAll("@[0-9a-z]+,", ",") // remove hashCode

Review Comment:
   
https://github.com/apache/spark/blob/cedc9d2d351443ccb013adaafbdd4f7b7acf56ae/sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestHelper.scala#L44
   
   It seems that the problem is here. The current `replaceAll` uses a greedy pattern, so it matches the longest possible string. For example, given the following plan:
   
   
   ```
   == Analyzed Logical Plan ==
   InsertIntoHadoopFsRelationCommand file:/Users/yangjie01/SourceCode/git/spark-mine-12/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/explain_temp5, false, [val#x], Parquet, [path=file:/Users/yangjie01/SourceCode/git/spark-mine-12/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/explain_temp5], Append, `spark_catalog`.`default`.`explain_temp5`, org.apache.spark.sql.execution.datasources.CatalogFileIndex(file:/Users/yangjie01/SourceCode/git/spark-mine-12/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/explain_temp5), [key, val]
   +- Project [key#x, val#x]
      +- SubqueryAlias spark_catalog.default.explain_temp4
         +- Relation spark_catalog.default.explain_temp4[key#x,val#x] parquet
   ```
   
   
   After the replaceAll, it becomes
   
   ```
   == Analyzed Logical Plan ==
   InsertIntoHadoopFsRelationCommand Location [not included in comparison]/{warehouse_dir}/explain_temp5), [key, val]
   +- Project [key#x, val#x]
      +- SubqueryAlias spark_catalog.default.explain_temp4
         +- Relation spark_catalog.default.explain_temp4[key#x,val#x] parquet
   ```
   
   instead of
   
   
   ```
   == Analyzed Logical Plan ==
   InsertIntoHadoopFsRelationCommand Location [not included in comparison]/{warehouse_dir}/explain_temp5, false, [val#x], Parquet, [path=Location [not included in comparison]/{warehouse_dir}/explain_temp5], Append, `spark_catalog`.`default`.`explain_temp5`, org.apache.spark.sql.execution.datasources.CatalogFileIndex(fLocation [not included in comparison]/{warehouse_dir}/explain_temp5), [key, val]
   +- Project [key#x, val#x]
      +- SubqueryAlias spark_catalog.default.explain_temp4
         +- Relation spark_catalog.default.explain_temp4[key#x,val#x] parquet
   ```
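   The greedy-match behavior can be reproduced with a minimal sketch outside the test helper (the path and trailing fields below are made up for illustration; they are not the actual pattern from `SQLQueryTestHelper`):

   ```scala
   object GreedyMatchDemo extends App {
     // A simplified plan fragment; path and field values are hypothetical.
     val line = "InsertIntoHadoopFsRelationCommand file:/tmp/warehouse/explain_temp5, false, [val#x], Parquet"

     // Greedy: "file:.*" consumes everything to the end of the line,
     // so the fields after the path are swallowed along with it.
     println(line.replaceAll("file:.*", "Location [not included in comparison]"))
     // InsertIntoHadoopFsRelationCommand Location [not included in comparison]

     // Bounded alternative: "[^,]*" stops at the first comma, so only the
     // path itself is normalized and the remaining fields survive.
     println(line.replaceAll("file:[^,]*", "Location [not included in comparison]"))
     // InsertIntoHadoopFsRelationCommand Location [not included in comparison], false, [val#x], Parquet
   }
   ```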
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

