HyukjinKwon opened a new pull request, #39034:
URL: https://github.com/apache/spark/pull/39034

   ### What changes were proposed in this pull request?
   
   This PR is a follow-up to https://github.com/apache/spark/pull/38970 that makes the test pass with ANSI mode on.
   
   ### Why are the changes needed?
   
   To recover the build with ANSI mode on. It is currently broken, as shown below:
   
   ```
   ======================================================================
   ERROR [2.651s]: test_cast (pyspark.sql.tests.connect.test_connect_column.SparkConnectTests)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_column.py", line 119, in test_cast
       df.select(df.id.cast(x)).toPandas(), df2.select(df2.id.cast(x)).toPandas()
     File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 1466, in toPandas
       return self._session.client._to_pandas(query)
     File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 333, in _to_pandas
       return self._execute_and_fetch(req)
     File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 418, in _execute_and_fetch
       for b in self._stub.ExecutePlan(req, metadata=self._builder.metadata()):
     File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 426, in __next__
       return self._next()
     File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 826, in _next
       raise self
   grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.UNKNOWN
        details = "[DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION] Cannot resolve "id" due to data type mismatch: cannot cast "BIGINT" to "BINARY" with ANSI mode on.
   If you have to cast "BIGINT" to "BINARY", you can set "spark.sql.ansi.enabled" as 'false'.;
   'Project [unresolvedalias(cast(id#31L as binary), None)]
   +- SubqueryAlias spark_catalog.default.test_connect_basic_table_1
      +- Relation spark_catalog.default.test_connect_basic_table_1[id#31L,name#32] parquet
   "
        debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:15002 {created_time:"2022-12-09T01:54:45.378316841+00:00", grpc_status:2, grpc_message:"[DATATYPE_MISMATCH.CAST_WITH_CONF_SUGGESTION] Cannot resolve \"id\" due to data type mismatch: cannot cast \"BIGINT\" to \"BINARY\" with ANSI mode on.\nIf you have to cast \"BIGINT\" to \"BINARY\", you can set \"spark.sql.ansi.enabled\" as \'false\'.;\n\'Project [unresolvedalias(cast(id#31L as binary), None)]\n+- SubqueryAlias spark_catalog.default.test_connect_basic_table_1\n   +- Relation spark_catalog.default.test_connect_basic_table_1[id#31L,name#32] parquet\n"}"
   >
   ```
   
   https://github.com/apache/spark/actions/runs/3671813752
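
The failure above comes from casting a `BIGINT` column to `BINARY`, which is rejected when `spark.sql.ansi.enabled` is on. As a rough illustration only (not necessarily what this PR does), a test could pick its cast targets based on the ANSI setting. The helper name and the type list below are hypothetical and are not taken from the Spark test suite; the only fact drawn from the log is that `BIGINT` to `BINARY` fails under ANSI mode.

```python
from typing import List

# Hypothetical list of cast targets a test might iterate over.
ALL_CAST_TARGETS = ["string", "int", "bigint", "double", "binary"]

# Targets that the error above shows are rejected for a BIGINT column
# when ANSI mode is on. Only "binary" is confirmed by the log.
ANSI_REJECTED_FROM_BIGINT = {"binary"}


def cast_targets_for_bigint(ansi_enabled: bool) -> List[str]:
    """Return the cast targets a test could safely use for a BIGINT column."""
    if not ansi_enabled:
        # With ANSI mode off, every target in the list is attempted.
        return list(ALL_CAST_TARGETS)
    # With ANSI mode on, skip targets known to fail analysis.
    return [t for t in ALL_CAST_TARGETS if t not in ANSI_REJECTED_FROM_BIGINT]
```

In a real test the flag would presumably be read from the session configuration, and each returned name passed to `df.id.cast(...)` as in the traceback.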
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, test-only.
   
   ### How was this patch tested?
   
   This PR fixes the unit test so that it passes. I tested it manually.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

