[ https://issues.apache.org/jira/browse/SPARK-41548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-41548:
------------------------------------

    Assignee:     (was: Apache Spark)

> Disable ANSI mode in pyspark.sql.tests.connect.test_connect_functions
> ---------------------------------------------------------------------
>
>                 Key: SPARK-41548
>                 URL: https://issues.apache.org/jira/browse/SPARK-41548
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect, Tests
>    Affects Versions: 3.4.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> There are failures in {{test_connect_functions}} with ANSI mode on
> (https://github.com/apache/spark/actions/runs/3709431687/jobs/6288067223).
> I tried to fix them, but they are tricky to fix because Spark Connect does
> not respect the runtime configuration on the server side.
> It is also tricky to make the tests pass with ANSI mode both on and off.
> Therefore, this change temporarily disables ANSI mode in these tests so that
> other tests can pass. Note that PySpark tests stop in the middle if one fails.
> {code:java}
> ======================================================================
> ERROR [0.264s]: test_date_ts_functions (pyspark.sql.tests.connect.test_connect_function.SparkConnectFunctionTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_function.py", line 1149, in test_date_ts_functions
>     cdf.select(cfunc(cdf.ts1)).toPandas(),
>   File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 1533, in toPandas
>     return self._session.client._to_pandas(query)
>   File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 333, in _to_pandas
>     return self._execute_and_fetch(req)
>   File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 418, in _execute_and_fetch
>     for b in self._stub.ExecutePlan(req, metadata=self._builder.metadata()):
>   File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 426, in __next__
>     return self._next()
>   File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 826, in _next
>     raise self
> grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
>     status = StatusCode.UNKNOWN
>     details = "[CAST_INVALID_INPUT] The value '1997/02/28 10:30:00' of the type "STRING" cannot be cast to "DATE" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error."
>     debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:15002 {grpc_message:"[CAST_INVALID_INPUT] The value \'1997/02/28 10:30:00\' of the type \"STRING\" cannot be cast to \"DATE\" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set \"spark.sql.ansi.enabled\" to \"false\" to bypass this error.", grpc_status:2, created_time:"2022-12-16T01:49:15.71844837+00:00"}"
> >
>
> ======================================================================
> ERROR [0.527s]: test_string_functions_one_arg (pyspark.sql.tests.connect.test_connect_function.SparkConnectFunctionTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/__w/spark/spark/python/pyspark/sql/tests/connect/test_connect_function.py", line 985, in test_string_functions_one_arg
>     cdf.select(cfunc("a"), cfunc(cdf.b)).toPandas(),
>   File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 1533, in toPandas
>     return self._session.client._to_pandas(query)
>   File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 333, in _to_pandas
>     return self._execute_and_fetch(req)
>   File "/__w/spark/spark/python/pyspark/sql/connect/client.py", line 418, in _execute_and_fetch
>     for b in self._stub.ExecutePlan(req, metadata=self._builder.metadata()):
>   File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 426, in __next__
>     return self._next()
>   File "/usr/local/lib/python3.9/dist-packages/grpc/_channel.py", line 826, in _next
>     raise self
> grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
>     status = StatusCode.UNKNOWN
>     details = "[CAST_INVALID_INPUT] The value ' ab ' of the type "STRING" cannot be cast to "BIGINT" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error."
>     debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:15002 {grpc_message:"[CAST_INVALID_INPUT] The value \' ab \' of the type \"STRING\" cannot be cast to \"BIGINT\" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set \"spark.sql.ansi.enabled\" to \"false\" to bypass this error.", grpc_status:2, created_time:"2022-12-16T01:49:25.529953492+00:00"}"
> >
>
> ----------------------------------------------------------------------
> Ran 14 tests in 40.832s
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
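
For readers unfamiliar with why the same test can pass with ANSI mode off and fail with it on, the following is a toy model of the cast behavior described in the CAST_INVALID_INPUT messages in the log. It is only a sketch: `cast_string_to_bigint` is a hypothetical helper written for this note, not part of Spark or PySpark, and it ignores details such as 64-bit overflow.

```python
import re
from typing import Optional

# Hypothetical, simplified stand-in for a STRING -> BIGINT cast.
# It is NOT Spark code; it only mimics the on/off behavior the error message describes.
_INTEGER = re.compile(r"[+-]?\d+")


def cast_string_to_bigint(value: str, ansi_enabled: bool) -> Optional[int]:
    """Return the integer value, or fail/return None for malformed input."""
    trimmed = value.strip()
    if _INTEGER.fullmatch(trimmed):
        return int(trimmed)
    if ansi_enabled:
        # ANSI mode on: malformed input is an error, as in the log.
        raise ValueError(
            f"[CAST_INVALID_INPUT] The value '{value}' of the type \"STRING\" "
            'cannot be cast to "BIGINT" because it is malformed.'
        )
    # ANSI mode off: malformed input quietly becomes NULL (None here).
    return None


if __name__ == "__main__":
    print(cast_string_to_bigint(" 42 ", ansi_enabled=True))
    print(cast_string_to_bigint(" ab ", ansi_enabled=False))
    try:
        cast_string_to_bigint(" ab ", ansi_enabled=True)
    except ValueError as err:
        print(err)
```

A test whose expected result contains a NULL for the value ' ab ' would therefore only pass when the server actually runs with ANSI mode off, which is the setting the issue says the Connect server did not pick up from the client's runtime configuration.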