[ https://issues.apache.org/jira/browse/SPARK-35327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340226#comment-17340226 ]
Apache Spark commented on SPARK-35327:
--------------------------------------
User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/32454
> Merge similar v1.4/v2.7 TPCDS queries
> -------------------------------------
>
> Key: SPARK-35327
> URL: https://issues.apache.org/jira/browse/SPARK-35327
> Project: Spark
> Issue Type: Test
> Components: SQL, Tests
> Affects Versions: 3.0.2, 3.1.1, 3.2.0
> Reporter: Takeshi Yamamuro
> Priority: Major
>
> This ticket aims to merge similar
> v1.4 (`resources/tpcds`) and v2.7 (`resources/tpcds-v2.7.0`) TPCDS queries; it
> copies 13 query files (q6,q11,q12,q20,q24,q34,q47,q57,q64,q74,q75,q78,q98)
> from `resources/tpcds-v2.7.0` to `resources/tpcds`, and then removes the files
> in `resources/tpcds-v2.7.0`.
> I saw `TPCDSQueryTestSuite` fail nondeterministically because the output row
> order differed from that in the golden files. For example, the
> failure in the GA job,
> https://github.com/linhongliu-db/spark/runs/2507928605?check_suite_focus=true,
> happened because the `tpcds/q6.sql` query output rows were only sorted by
> `cnt`:
> https://github.com/apache/spark/blob/a0c76a8755a148e2bd774edcda12fe20f2f38c75/sql/core/src/test/resources/tpcds/q6.sql#L20
> Actually, `tpcds/q6.sql` and `tpcds-v2.7.0/q6.sql` are almost the same and
> the only difference is that `tpcds-v2.7.0/q6.sql` sorts by both `cnt` and
> `a.ca_state`:
> https://github.com/apache/spark/blob/a0c76a8755a148e2bd774edcda12fe20f2f38c75/sql/core/src/test/resources/tpcds-v2.7.0/q6.sql#L22
> So, I think it's okay just to use `tpcds-v2.7.0/q6.sql` for stable testing in
> this case.
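
The ordering problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Spark test code: the rows are made-up (state, cnt) pairs, the two input lists stand in for the same result set arriving in a different order on two runs, and the helper names `sort_by_cnt` and `sort_by_cnt_and_state` are hypothetical. Python's stable sort is used here to imitate how rows that tie on `cnt` can keep whatever order they arrived in.

```python
from operator import itemgetter

# Hypothetical (state, cnt) rows; the values are invented for illustration.
rows_run_a = [("VT", 10), ("AK", 10), ("TX", 12), ("CA", 11)]
# The same rows in a different arrival order, standing in for a second run.
rows_run_b = [("AK", 10), ("CA", 11), ("VT", 10), ("TX", 12)]


def sort_by_cnt(rows):
    # Analogous to sorting by cnt only: rows with equal cnt keep arrival order.
    return sorted(rows, key=itemgetter(1))


def sort_by_cnt_and_state(rows):
    # Analogous to sorting by cnt, then state: the second key breaks ties.
    return sorted(rows, key=itemgetter(1, 0))


cnt_only_a = sort_by_cnt(rows_run_a)
cnt_only_b = sort_by_cnt(rows_run_b)
tie_broken_a = sort_by_cnt_and_state(rows_run_a)
tie_broken_b = sort_by_cnt_and_state(rows_run_b)

# Sorting by cnt alone gives different row orders for the two runs,
# while adding the tie-breaking key gives the same order for both.
print(cnt_only_a == cnt_only_b)
print(tie_broken_a == tie_broken_b)
```

With a single sort key, the tied rows ("VT", 10) and ("AK", 10) come out in whichever order they came in, so a comparison against a fixed golden file can pass on one run and fail on another. Adding the second key makes the output independent of arrival order, which matches the reasoning given for using the `tpcds-v2.7.0/q6.sql` version.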