CodiumAI-Agent commented on PR #9454: URL: https://github.com/apache/incubator-gluten/pull/9454#issuecomment-2849090720
## PR Code Suggestions ✨ <!-- 7c69a64 --> <table><thead><tr><td><strong>Category</strong></td><td align=left><strong>Suggestion </strong></td><td align=center><strong>Impact</strong></td></tr><tbody><tr><td rowspan=5>General</td> <td> <details><summary>Invoke test case setup</summary> ___ **The generated test suite doesn’t invoke the test case setup, so no TPCH queries will <br>run. Call <code>setupTestCase()</code> in the class body to register the tests defined by <br><code>testCases</code> and <code>testCasesWithConfig</code>.** [backends-clickhouse/src/test/scala/org/apache/gluten/execution/GlutenClickHouseTPCHAbstractSuite.scala [343-346]](https://github.com/apache/incubator-gluten/pull/9454/files#diff-7dea6c248140a3959e816b97496ebb6de7ea4069fad3da9377b20d10a4e07727R343-R346) ```diff class CreateMergeTreeSuite extends GlutenClickHouseTPCHAbstractSuite with TPCHMergeTreeResult - with TPCHParquetSource {} + with TPCHParquetSource { + setupTestCase() +} + ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: Calling `setupTestCase()` is essential to register the TPCH tests for `CreateMergeTreeSuite`, otherwise no tests will run. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Register TPCH tests</summary> ___ **The suite defines no tests because <code>setupTestCase()</code> is never called. Add <br><code>setupTestCase()</code> into the class body to initialize the TPCH test cases.** [backends-clickhouse/src/test/scala/org/apache/gluten/execution/GlutenClickHouseTPCHAbstractSuite.scala [353-356]](https://github.com/apache/incubator-gluten/pull/9454/files#diff-7dea6c248140a3959e816b97496ebb6de7ea4069fad3da9377b20d10a4e07727R353-R356) ```diff class MergeTreeSuite extends GlutenClickHouseTPCHAbstractSuite with TPCHMergeTreeResult - with TPCHMergeTreeSource {} + with TPCHMergeTreeSource { + setupTestCase() +} + ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: Without invoking `setupTestCase()`, `MergeTreeSuite` defines no tests and the new test harness is unused. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Enable TPCH test registration</summary> ___ **This suite doesn’t execute any TPCH queries since <code>setupTestCase()</code> isn’t invoked. Add <br>a class body with <code>setupTestCase()</code> to register the defined tests.** [backends-clickhouse/src/test/scala/org/apache/gluten/execution/GlutenClickHouseTPCHAbstractSuite.scala [370-373]](https://github.com/apache/incubator-gluten/pull/9454/files#diff-7dea6c248140a3959e816b97496ebb6de7ea4069fad3da9377b20d10a4e07727R370-R373) ```diff class ParquetTPCHSuite extends GlutenClickHouseTPCHAbstractSuite with TPCHParquetResult - with TPCHParquetSource + with TPCHParquetSource { + setupTestCase() +} + ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: `ParquetTPCHSuite` will not execute any TPCH queries unless `setupTestCase()` is called to register its test cases. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Remove redundant quoting</summary> ___ **Remove the extra single quotes around the boolean property so it is interpreted <br>correctly. The value should be <code>"true"</code>, not <code>"'true'"</code>.** [backends-clickhouse/src-delta-32/test/scala/org/apache/spark/gluten/delta/GlutenDeltaMergeTreeDeletionVectorSuite.scala [39]](https://github.com/apache/incubator-gluten/pull/9454/files#diff-c619c9b0fe45b957f2b4fcb70c26d5e9e5b82c3fc1574994b17dcd04f3ab691eR39-R39) ```diff -.withProps(Map("delta.enableDeletionVectors" -> "'true'")) +.withProps(Map("delta.enableDeletionVectors" -> "true")) ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The extra single quotes around the config value cause `delta.enableDeletionVectors` to receive `"'true'"` instead of `"true"`, so removing them ensures the boolean flag is interpreted correctly. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>move setupTestCase to beforeAll</summary> ___ **Invoking <code>setupTestCase()</code> directly in the class body may run before Spark is <br>initialized. Move it into <code>beforeAll</code> to ensure proper ordering.** [backends-clickhouse/src/test/scala/org/apache/gluten/execution/GlutenClickHouseTPCHSuite.scala [53]](https://github.com/apache/incubator-gluten/pull/9454/files#diff-a2d431fb083fc7044f0bcde0fcedda291310aa6b6dfa4eb30a40a384a902cdfdR53-R53) ```diff -setupTestCase() +override def beforeAll(): Unit = { + super.beforeAll() + setupTestCase() +} ``` <details><summary>Suggestion importance[1-10]: 3</summary> __ Why: Calling `setupTestCase()` in the class body may execute before Spark is ready, so moving it into `beforeAll` improves initialization order. </details></details></td><td align=center>Low </td></tr><tr><td rowspan=3>Possible issue</td> <td> <details><summary>include all test case IDs</summary> ___ **The <code>testCases</code> list omits the queries configured in <code>testCasesWithConfig</code> (7, 8, 14, <br>17, 18), so those cases will never run. Include them in <code>testCases</code> to cover all <br>intended tests.** [backends-clickhouse/src/test/scala/org/apache/gluten/execution/GlutenClickHouseTPCHSuite.scala [36-38]](https://github.com/apache/incubator-gluten/pull/9454/files#diff-a2d431fb083fc7044f0bcde0fcedda291310aa6b6dfa4eb30a40a384a902cdfdR36-R38) ```diff final override val testCases: Seq[Int] = Seq( - 4, 6, 9, 10, 11, 12, 13, 15, 16, 19, 20, 21, 22 + 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22 ) ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: `testCases` omits queries 7, 8, 14, 17, and 18 which are configured in `testCasesWithConfig`, causing those tests not to run. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>replace undefined logLevel</summary> ___ **<code>logLevel</code> is not defined in this scope and will cause a compile error. Replace it <br>with a concrete level or fetch from Spark configuration.** [backends-clickhouse/src/test/scala/org/apache/gluten/execution/GlutenClickHouseWholeStageTransformerSuite.scala [122]](https://github.com/apache/incubator-gluten/pull/9454/files#diff-bd29efab37b77d993d8ea4dd22f91368f11f0fcc1e4fc1a6a81275395bc05770R122-R122) ```diff -spark.sparkContext.setLogLevel(logLevel) +spark.sparkContext.setLogLevel( + spark.sparkContext.getConf.get("spark.logLevel", "INFO") +) ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: `logLevel` is not defined in the suite, leading to a compile error when calling `setLogLevel(logLevel)`. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Use safe head access</summary> ___ **Guard against empty collections to avoid <code>NoSuchElementException</code> by using <code>headOption</code> <br>and providing a clear error if missing. This makes the test failure more <br>informative.** [backends-clickhouse/src-celeborn/test/scala/org/apache/gluten/execution/GlutenClickHouseRSSColumnarShuffleAQESuite.scala [56-58]](https://github.com/apache/incubator-gluten/pull/9454/files#diff-ab9f1b814f0adc26b64fc8ad080f233d7db5626146295d370dc641c465e4412bR56-R58) ```diff -val coalescedPartitionSpec0 = colCustomShuffleReaderExecs.head - .partitionSpecs.head +val coalescedPartitionSpec0 = colCustomShuffleReaderExecs.headOption + .getOrElse(throw new AssertionError("Expected at least one AQEShuffleReadExec")) + .partitionSpecs.headOption + .getOrElse(throw new AssertionError("Expected at least one partitionSpec")) .asInstanceOf[CoalescedPartitionSpec] ``` <details><summary>Suggestion importance[1-10]: 4</summary> __ Why: Introducing `headOption` checks adds clearer error messages but the test already asserts `colCustomShuffleReaderExecs.size == 2`, so it only marginally improves diagnostics. </details></details></td><td align=center>Low </td></tr></tr></tbody></table> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
