sarutak opened a new pull request, #56452: URL: https://github.com/apache/spark/pull/56452
### What changes were proposed in this pull request?
This PR backports #56406 to `branch-4.0`.
Replace the `show()`-based doctest for `sql_keywords()` with a `.columns` check.

### Why are the changes needed?
SPARK-57133 (#56247) added 7 new non-reserved keywords (BIN, WIDTH, ALIGN, etc.), which changed the top-20 row output of `sql_keywords().show()` and consequently the column width in the formatted output. This broke the `pyspark-connect-old-client` CI job, which runs `branch-4.0` client tests against a `master` server. The `branch-4.0` doctest still expects the old column width.

https://github.com/sarutak/spark/actions/runs/27188469096/job/80265105548

```
**********************************************************************
File "/__w/spark/spark-4.0/python/pyspark/sql/connect/tvf.py", line ?, in pyspark.sql.connect.tvf.TableValuedFunction.sql_keywords
Failed example:
    spark.tvf.sql_keywords().show()
Expected:
    +-------------+--------+
    |      keyword|reserved|
    +-------------+--------+
    ...
    +-------------+--------+...
Got:
    +----------+--------+
    |   keyword|reserved|
    +----------+--------+
    |       ADD|   false|
    |     AFTER|   false|
    | AGGREGATE|   false|
    |     ALIGN|   false|
    |       ALL|   false|
    |     ALTER|   false|
    |    ALWAYS|   false|
    |   ANALYZE|   false|
    |       AND|   false|
    |      ANTI|   false|
    |       ANY|   false|
    | ANY_VALUE|   false|
    |    APPROX|   false|
    |   ARCHIVE|   false|
    |     ARRAY|   false|
    |        AS|   false|
    |       ASC|   false|
    |ASENSITIVE|   false|
    |        AT|   false|
    |    ATOMIC|   false|
    +----------+--------+
    only showing top 20 rows
**********************************************************************
1 of 1 in pyspark.sql.connect.tvf.TableValuedFunction.sql_keywords
***Test Failed*** 1 failures.
```

The `show()` output is inherently fragile for this TVF because any keyword addition changes the formatting. Since a dedicated unittest (`test_sql_keywords` in `test_tvf.py`) already verifies the full output via `assertDataFrameEqual`, the doctest only needs to confirm that the method returns a valid DataFrame.
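
The width sensitivity described above can be illustrated with a small pure-Python sketch. `render_top_rows` is a hypothetical stand-in that right-aligns cells and fits the column to the longest visible value; it is not Spark's formatter, and the keyword lists are toy data chosen only to show the effect.

```python
def render_top_rows(keywords, limit=3):
    """Toy width-fitted table printer; a stand-in, not Spark's formatter."""
    header = "keyword"
    rows = sorted(keywords)[:limit]
    # The column is as wide as the longest visible cell, header included.
    width = max([len(header)] + [len(k) for k in rows])
    border = "+" + "-" * width + "+"
    lines = [border, "|" + header.rjust(width) + "|", border]
    lines += ["|" + k.rjust(width) + "|" for k in rows]
    lines.append(border)
    return "\n".join(lines)


# Toy keyword lists (illustrative only, not the real Spark keyword set).
old_keywords = ["ADD", "AFTER", "AUTHORIZATION"]
new_keywords = old_keywords + ["ALIGN"]  # one keyword added upstream

old_table = render_top_rows(old_keywords)
new_table = render_top_rows(new_keywords)

print(old_table)
print(new_table)
# Adding ALIGN pushes AUTHORIZATION out of the visible rows, so the
# border line (and every padded cell) changes.
print(old_table.splitlines()[0] != new_table.splitlines()[0])
```

In this sketch a single added keyword changes every line of the rendered table, even though the column names are unchanged, which is why the doctest only needs a check that does not depend on the displayed rows.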
Using `.columns` achieves this without being sensitive to keyword list changes.

### Does this PR introduce *any* user-facing change?
No.

### How was this patch tested?
Existing `test_sql_keywords` unittest continues to pass.

### Was this patch authored or co-authored using generative AI tooling?
Kiro CLI / Claude

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
