HeartSaVioR commented on code in PR #37042:
URL: https://github.com/apache/spark/pull/37042#discussion_r911903258
##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestHelper.scala:
##########
@@ -44,6 +44,13 @@ trait SQLQueryTestHelper {
.replaceAll("Last Access.*", s"Last Access $notIncludedMsg")
.replaceAll("Partition Statistics\t\\d+", s"Partition Statistics\t$notIncludedMsg")
.replaceAll("\\*\\(\\d+\\) ", "*") // remove the WholeStageCodegen codegenStageIds
+
+ // Below is needed since the catalog table in LogicalRelation can produce serde class
+ // "optionally" if CatalogTable is presented and has a serde information in the storage.
+ // This assumes LogicalRelation contains a catalog table, otherwise it would not match with
+ // this pattern.
+ .replaceAll("Arguments: (.+), (\\[.+\\]), (`.+`\\.`.+`)(, .+)?, (false|true)",
Review Comment:
The 2nd argument requires `[...]`, which is basically a Seq or an Array.
The 3rd argument requires a quoted table name.
The 4th (or 5th) argument requires either false or true.
This is really a matter of probability. We cannot guarantee that the pattern
won't be broken by arbitrary future changes, but it works for the current
output.
If we would like a stricter rule to avoid accidental matches, we could make
the regex more specific. Some possible ideas: in the 2nd argument, we could
match the attribute format, `colName#x`. In the 3rd argument, we could extend
the pattern to match an arbitrary number of levels if we need to.
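
To make the two ideas concrete, here is a minimal sketch rather than the code in the PR. The object name `ArgumentsNormalizer`, the helper `normalize`, and the replacement string are all hypothetical (the replacement argument is not visible in the hunk above). Only `loosePattern` is quoted from the diff.

```scala
object ArgumentsNormalizer {
  // Pattern quoted from the diff above.
  val loosePattern: String =
    "Arguments: (.+), (\\[.+\\]), (`.+`\\.`.+`)(, .+)?, (false|true)"

  // Hypothetical stricter pieces, following the ideas in this comment:
  // every element of the 2nd argument must look like `colName#x` ...
  val attrList: String = "\\[\\w+#x(?:, \\w+#x)*\\]"
  // ... and the 3rd argument is two or more backquoted name parts joined by dots.
  val tableName: String = "`[^`]+`(?:\\.`[^`]+`)+"

  val strictPattern: String =
    s"Arguments: (.+), ($attrList), ($tableName)(, .+)?, (false|true)"

  // Assumed replacement: keep every group except the optional 4th one.
  val replacement: String = "Arguments: $1, $2, $3, $5"

  def normalize(pattern: String, line: String): String =
    line.replaceAll(pattern, replacement)
}
```

With a made-up input such as "Arguments: parquet, [key#x, val#x], `default`.`t`, some.SerDe, false", both patterns drop the optional 4th argument. A line whose 2nd argument is `[a, b]` is rewritten only by the loose pattern, which is the kind of accidental match the stricter rule would avoid.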
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]