HeartSaVioR commented on code in PR #37042:
URL: https://github.com/apache/spark/pull/37042#discussion_r911903258


##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestHelper.scala:
##########
@@ -44,6 +44,13 @@ trait SQLQueryTestHelper {
       .replaceAll("Last Access.*", s"Last Access $notIncludedMsg")
       .replaceAll("Partition Statistics\t\\d+", s"Partition Statistics\t$notIncludedMsg")
       .replaceAll("\\*\\(\\d+\\) ", "*") // remove the WholeStageCodegen codegenStageIds
+
+      // Below is needed since the catalog table in LogicalRelation can produce serde class
+      // "optionally" if CatalogTable is presented and has a serde information in the storage.
+      // This assumes LogicalRelation contains a catalog table, otherwise it would not match with
+      // this pattern.
+      .replaceAll("Arguments: (.+), (\\[.+\\]), (`.+`\\.`.+`)(, .+)?, (false|true)",

Review Comment:
   The 2nd argument must be `[...]`, which is basically a Seq or Array.
   The 3rd argument must be a backtick-quoted table name.
   The last argument (4th, or 5th when the optional one is present) must be either false or true.
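
To make these constraints concrete, here is a minimal sketch that runs the pattern from the diff against two sample lines. The sample lines, the `org.example.FakeSerDe` class name, and the `ArgumentsPatternDemo` object are hypothetical and only illustrate the shape of the input; they are not taken from actual golden files.

```scala
object ArgumentsPatternDemo {
  // Pattern copied from the diff above.
  val pattern =
    "Arguments: (.+), (\\[.+\\]), (`.+`\\.`.+`)(, .+)?, (false|true)".r

  // Returns the five capture groups; the optional 4th group is None when absent.
  def parse(line: String): Option[(String, String, String, Option[String], String)] =
    pattern.findFirstMatchIn(line).map { m =>
      (m.group(1), m.group(2), m.group(3), Option(m.group(4)), m.group(5))
    }

  // Hypothetical inputs: one without and one with the optional argument.
  val withoutSerde =
    "Arguments: parquet, [key#x, val#x], `default`.`t1`, false"
  val withSerde =
    "Arguments: parquet, [key#x, val#x], `default`.`t1`, org.example.FakeSerDe, false"

  def main(args: Array[String]): Unit = {
    println(parse(withoutSerde))
    println(parse(withSerde))
  }
}
```

In the second sample the optional group captures `, org.example.FakeSerDe`, leading comma included, so a replacement can drop it without touching the other arguments.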
   
   This is really a matter of probability. We can't guarantee the pattern won't
   break against arbitrary future changes, but it works for the current output.
   
   If we want a stricter rule to avoid accidental matches, we could make the
   regex more specific. Some ideas: for the 2nd argument, we could match the
   attribute format, `colName#x`. For the 3rd argument, we could extend the
   pattern to match an arbitrary number of name levels if we need to.
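
As a rough sketch of what the stricter variant could look like (an assumption, not what the PR does): the 2nd group only accepts a list of `name#x` attributes, and the 3rd group accepts one or more backtick-quoted name parts separated by dots. The `strict` pattern, the sample lines, and the `StricterPatternDemo` object are hypothetical, and `\w+` is a simplification that would miss column names containing other characters.

```scala
object StricterPatternDemo {
  // Pattern from the diff, kept for comparison.
  val loose =
    "Arguments: (.+), (\\[.+\\]), (`.+`\\.`.+`)(, .+)?, (false|true)".r

  // Hypothetical stricter variant:
  //   2nd group: [name#x, name#x, ...]
  //   3rd group: `part` followed by any number of .`part`
  val strict = (
    "Arguments: (.+), (\\[\\w+#x(?:, \\w+#x)*\\]), " +
      "(`[^`]+`(?:\\.`[^`]+`)*)(, .+)?, (false|true)"
  ).r

  // A line shaped like the intended target (hypothetical).
  val relationLike =
    "Arguments: parquet, [key#x, val#x], `spark_catalog`.`default`.`t1`, false"
  // An unrelated line that only happens to have the same overall shape (hypothetical).
  val innocent = "Arguments: foo, [1, 2, 3], `a`.`b`, true"

  def main(args: Array[String]): Unit = {
    println(loose.findFirstIn(innocent).isDefined)      // loose pattern matches
    println(strict.findFirstIn(innocent).isDefined)     // strict pattern does not
    println(strict.findFirstIn(relationLike).isDefined) // strict pattern matches
  }
}
```

The trade-off is the usual one: the stricter the pattern, the fewer accidental matches, but the more likely it is to need updating when the plan output format changes.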



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

