[ https://issues.apache.org/jira/browse/SPARK-41548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41548:
------------------------------------

    Assignee:     (was: Apache Spark)

> Disable ANSI mode in pyspark.sql.tests.connect.test_connect_functions
> ---------------------------------------------------------------------
>
>                 Key: SPARK-41548
>                 URL: https://issues.apache.org/jira/browse/SPARK-41548
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect, Tests
>    Affects Versions: 3.4.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> There are failures in {{test_connect_functions}} with ANSI mode enabled 
> (https://github.com/apache/spark/actions/runs/3709431687/jobs/6288067223). I 
> tried to fix them, but they are tricky to fix because Spark Connect does not 
> respect the runtime configuration on the server side.
> It is also tricky to make the test pass with ANSI mode both on and off. 
> Therefore, this issue temporarily disables the test so that the other tests 
> can pass. Note that PySpark tests stop in the middle if one fails. The 
> failures look like the following (a sketch of disabling ANSI mode at the 
> session level is shown after the output):
> {code}
> ======================================================================
> ERROR [0.264s]: test_date_ts_functions (pyspark.sql.tests.connect.test_connect_function.SparkConnectFunctionTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_function.py", line 1149, in test_date_ts_functions
>     cdf.select(cfunc(cdf.ts1)).toPandas(),
>   File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 1533, in toPandas
>     return self._session.client._to_pandas(query)
>   File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 333, in _to_pandas
>     return self._execute_and_fetch(req)
>   File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 418, in _execute_and_fetch
>     for b in self._stub.ExecutePlan(req, metadata=self._builder.metadata()):
>   File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 426, in __next__
>     return self._next()
>   File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 826, in _next
>     raise self
> grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
>   status = StatusCode.UNKNOWN
>   details = "[CAST_INVALID_INPUT] The value '1997/02/28 10:30:00' of the type "STRING" cannot be cast to "DATE" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error."
>   debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:15002 {grpc_message:"[CAST_INVALID_INPUT] The value \'1997/02/28 10:30:00\' of the type \"STRING\" cannot be cast to \"DATE\" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set \"spark.sql.ansi.enabled\" to \"false\" to bypass this error.", grpc_status:2, created_time:"2022-12-16T01:49:15.71844837+00:00"}"
> >
> 
> ======================================================================
> ERROR [0.527s]: test_string_functions_one_arg (pyspark.sql.tests.connect.test_connect_function.SparkConnectFunctionTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_function.py", line 985, in test_string_functions_one_arg
>     cdf.select(cfunc("a"), cfunc(cdf.b)).toPandas(),
>   File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 1533, in toPandas
>     return self._session.client._to_pandas(query)
>   File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 333, in _to_pandas
>     return self._execute_and_fetch(req)
>   File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 418, in _execute_and_fetch
>     for b in self._stub.ExecutePlan(req, metadata=self._builder.metadata()):
>   File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 426, in __next__
>     return self._next()
>   File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 826, in _next
>     raise self
> grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
>   status = StatusCode.UNKNOWN
>   details = "[CAST_INVALID_INPUT] The value '   ab   ' of the type "STRING" cannot be cast to "BIGINT" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error."
>   debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:15002 {grpc_message:"[CAST_INVALID_INPUT] The value \'   ab   \' of the type \"STRING\" cannot be cast to \"BIGINT\" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set \"spark.sql.ansi.enabled\" to \"false\" to bypass this error.", grpc_status:2, created_time:"2022-12-16T01:49:25.529953492+00:00"}"
> >
> 
> ----------------------------------------------------------------------
> Ran 14 tests in 40.832s
> {code}
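>
> Below is a minimal sketch of how ANSI mode could be pinned off for a plain 
> (non-Connect) PySpark test session. Only {{spark.sql.ansi.enabled}} is taken 
> from the error message above; the class, method, and query names are 
> hypothetical and this is not the actual change for this ticket, which simply 
> disables the test.
> {code:python}
> # Hypothetical sketch: start a local session with ANSI mode explicitly
> # disabled so that lenient casts return NULL instead of raising
> # CAST_INVALID_INPUT. The names below are illustrative only.
> import unittest
> 
> from pyspark.sql import SparkSession
> 
> 
> class NonAnsiCastTests(unittest.TestCase):
>     @classmethod
>     def setUpClass(cls):
>         cls.spark = (
>             SparkSession.builder.master("local[4]")
>             .config("spark.sql.ansi.enabled", "false")
>             .getOrCreate()
>         )
> 
>     @classmethod
>     def tearDownClass(cls):
>         cls.spark.stop()
> 
>     def test_lenient_cast(self):
>         # Under ANSI mode this cast fails; with ANSI disabled it yields NULL.
>         row = self.spark.sql("SELECT CAST('   ab   ' AS BIGINT) AS v").head()
>         self.assertIsNone(row.v)
> 
> 
> if __name__ == "__main__":
>     unittest.main()
> {code}
> For Spark Connect this approach does not currently help, because, as noted 
> above, Spark Connect does not respect the runtime configuration on the 
> server side; hence the test is disabled instead.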


