HyukjinKwon opened a new pull request, #39034: URL: https://github.com/apache/spark/pull/39034
### What changes were proposed in this pull request?

This PR is a followup of https://github.com/apache/spark/pull/38970 that makes the test pass with ANSI mode on.

### Why are the changes needed?

To recover the build with ANSI mode on. It is currently broken as follows:

```
======================================================================
ERROR [2.651s]: test_cast (pyspark.sql.tests.connect.test_connect_column.SparkConnectTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_column.py", line 119, in test_cast
    df.select(df.id.cast(x)).toPandas(), df2.select(df2.id.cast(x)).toPandas()
  File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 1466, in toPandas
    return self._session.client._to_pandas(query)
  File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 333, in _to_pandas
    return self._execute_and_fetch(req)
  File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 418, in _execute_and_fetch
    for b in self._stub.ExecutePlan(req, metadata=self._builder.metadata()):
  File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 426, in __next__
    return self._next()
  File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 826, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "[DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION] Cannot resolve "id" due to data type mismatch: cannot cast "BIGINT" to "BINARY" with ANSI mode on.
If you have to cast "BIGINT" to "BINARY", you can set "spark.sql.ansi.enabled" as 'false'.;
'Project [unresolvedalias(cast(id#31L as binary), None)]
+- SubqueryAlias spark_catalog.default.test_connect_basic_table_1
   +- Relation spark_catalog.default.test_connect_basic_table_1[id#31L,name#32] parquet
"
	debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:15002 {created_time:"2022-12-09T01:54:45.378316841+00:00", grpc_status:2, grpc_message:"[DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION] Cannot resolve \"id\" due to data type mismatch: cannot cast \"BIGINT\" to \"BINARY\" with ANSI mode on.\nIf you have to cast \"BIGINT\" to \"BINARY\", you can set \"spark.sql.ansi.enabled\" as \'false\'.;\n\'Project [unresolvedalias(cast(id#31L as binary), None)]\n+- SubqueryAlias spark_catalog.default.test_connect_basic_table_1\n +- Relation spark_catalog.default.test_connect_basic_table_1[id#31L,name#32] parquet\n"}"
>
```

https://github.com/apache/spark/actions/runs/3671813752

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

This PR fixes the unit test so that it passes. I tested it manually.
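
For illustration only: one way a test could tolerate both settings is to skip the cast targets that the analyzer rejects under ANSI mode. This is a minimal sketch in plain Python and not necessarily what this PR does. The names `cast_targets` and `ANSI_REJECTED_FROM_BIGINT` are hypothetical, and the only pair taken from the traceback is `BIGINT` to `BINARY`.

```python
# Hypothetical helper, not taken from the PR.
# The traceback shows only one rejected pair under ANSI mode: BIGINT -> BINARY.
ANSI_REJECTED_FROM_BIGINT = frozenset({"binary"})


def cast_targets(candidates, ansi_enabled):
    """Return the target types a BIGINT column may be cast to in the test.

    With ANSI mode off, every candidate is kept. With ANSI mode on, targets
    known to fail analysis are dropped so the comparison loop never hits them.
    """
    if not ansi_enabled:
        return list(candidates)
    return [t for t in candidates if t not in ANSI_REJECTED_FROM_BIGINT]


if __name__ == "__main__":
    candidates = ["string", "binary", "double"]
    print(cast_targets(candidates, ansi_enabled=True))
    print(cast_targets(candidates, ansi_enabled=False))
```

In a real test the flag would presumably be derived from the session's `spark.sql.ansi.enabled` setting rather than passed in by hand.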
