Hyukjin Kwon created SPARK-48087:
------------------------------------
Summary: Python UDTF incompatibility in 3.5 client <> 4.0 server
Key: SPARK-48087
URL: https://issues.apache.org/jira/browse/SPARK-48087
Project: Spark
Issue Type: Sub-task
Components: Connect, PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
{code}
======================================================================
FAIL [0.103s]: test_udtf_init_with_additional_args
(pyspark.sql.tests.connect.test_parity_udtf.ArrowUDTFParityTests.test_udtf_init_with_additional_args)
----------------------------------------------------------------------
pyspark.errors.exceptions.connect.PythonException:
  An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
    func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
    self._check_result_or_exception(TestUDTF, ret_type, expected)
  File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 598, in _check_result_or_exception
    with self.assertRaisesRegex(err_type, expected):
AssertionError: "AttributeError" does not match "
An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1834, in main
    process()
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1826, in process
    serializer.dump_stream(out_iter, outfile)
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 224, in dump_stream
    self.serializer.dump_stream(self._batched(iterator), stream)
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 145, in dump_stream
    for obj in iterator:
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 213, in _batched
    for item in iterator:
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1391, in mapper
    yield eval(*[a[o] for o in args_kwargs_offsets])
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1371, in evaluate
    return tuple(map(verify_and_convert_result, res))
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1340, in verify_and_convert_result
    return toInternal(result)
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1291, in toInternal
    return tuple(
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1292, in <genexpr>
    f.toInternal(v) if c else v
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 907, in toInternal
    return self.dataType.toInternal(obj)
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 372, in toInternal
    calendar.timegm(dt.utctimetuple()) if dt.tzinfo else time.mktime(dt.timetuple())
..."
{code}
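For context on the first failure: the last frame of the worker traceback is the timestamp conversion in {{TimestampType.toInternal}} ({{pyspark/sql/types.py}}, line 372 in the trace). When a UDTF yields a value that is not a {{datetime}}, that conversion raises an {{AttributeError}} on the worker; over Spark Connect the 4.0 server surfaces it as a wrapped {{PythonException}}, so the 3.5 test's {{assertRaisesRegex(AttributeError, ...)}} no longer matches. A standalone sketch of that conversion (the helper name {{timestamp_to_internal}} is made up; the key expression mirrors the line shown in the traceback):

```python
import calendar
import time
from datetime import datetime, timezone

def timestamp_to_internal(dt):
    # Sketch of TimestampType.toInternal: convert a datetime to
    # microseconds since the epoch, mirroring the expression in the trace.
    if dt is None:
        return None
    seconds = (
        calendar.timegm(dt.utctimetuple()) if dt.tzinfo
        else time.mktime(dt.timetuple())
    )
    return int(seconds) * 1_000_000 + dt.microsecond

# A proper datetime converts cleanly.
timestamp_to_internal(datetime(2024, 1, 1, tzinfo=timezone.utc))

# A non-datetime result from a UDTF hits AttributeError on the worker;
# Spark Connect then delivers it to the client as a PythonException,
# not as the bare AttributeError the 3.5 test asserts on.
try:
    timestamp_to_internal("not a datetime")
except AttributeError as err:
    print(err)
```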
{code}
======================================================================
FAIL [0.096s]: test_udtf_init_with_additional_args
(pyspark.sql.tests.connect.test_parity_udtf.UDTFParityTests.test_udtf_init_with_additional_args)
----------------------------------------------------------------------
pyspark.errors.exceptions.connect.PythonException:
  An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
    func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 946, in read_udtf
    raise PySparkRuntimeError(
pyspark.errors.exceptions.base.PySparkRuntimeError: [UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD] Failed to evaluate the user-defined table function 'TestUDTF' because its constructor is invalid: the function does not implement the 'analyze' method, and its constructor has more than one argument (including the 'self' reference). Please update the table function so that its constructor accepts exactly one 'self' argument, and try the query again.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 274, in test_udtf_init_with_additional_args
    with self.assertRaisesRegex(
AssertionError: "__init__\(\) missing 1 required positional argument: 'a'" does not match "
An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
    func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 946, in read_udtf
    raise PySparkRuntimeError(
pyspark.errors.exceptions.base.PySparkRuntimeError: [UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD] Failed to evaluate the user-defined table function 'TestUDTF' because its constructor is invalid: the function does not implement the 'analyze' method, and its constructor has more than one argument (including the 'self' reference). Please update the table function so that its constructor accepts exactly one 'self' argument, and try the query again.
"
{code}
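The second failure comes from a validation the 4.0 worker performs eagerly in {{read_udtf}} ({{pyspark/worker.py}}, line 946 in the trace), before the client-side {{\_\_init\_\_}} {{TypeError}} the 3.5 test expects can ever occur. A plain-Python sketch of the rule the error message describes (the helper name {{check_udtf_constructor}} is hypothetical; the real check lives inside the worker):

```python
import inspect

def check_udtf_constructor(udtf_cls):
    # Sketch of the rule in the error message: without an 'analyze'
    # method, the UDTF constructor may accept only 'self'.
    if hasattr(udtf_cls, "analyze"):
        return  # an 'analyze' method lifts the restriction
    if udtf_cls.__init__ is object.__init__:
        return  # default constructor, effectively just 'self'
    params = inspect.signature(udtf_cls.__init__).parameters
    if len(params) > 1:
        raise RuntimeError(
            "[UDTF_CONSTRUCTOR_INVALID_NO_ANALYZE_METHOD] the constructor of "
            f"'{udtf_cls.__name__}' must accept exactly one 'self' argument"
        )

class ValidUDTF:
    def eval(self, a):
        yield (a,)

class InvalidUDTF:
    def __init__(self, a):  # extra argument: rejected by the 4.0 worker
        self._a = a
    def eval(self):
        yield (self._a,)

check_udtf_constructor(ValidUDTF)  # passes silently
try:
    check_udtf_constructor(InvalidUDTF)
except RuntimeError as err:
    print(err)
```

Because the 4.0 server raises this {{PySparkRuntimeError}} first, the 3.5 test's regex for the local {{\_\_init\_\_}} error never gets a chance to match.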
{code}
======================================================================
FAIL [0.087s]: test_udtf_with_wrong_num_input
(pyspark.sql.tests.connect.test_parity_udtf.UDTFParityTests.test_udtf_with_wrong_num_input)
----------------------------------------------------------------------
pyspark.errors.exceptions.connect.PythonException:
  An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
    func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1082, in read_udtf
    raise PySparkRuntimeError(
pyspark.errors.exceptions.base.PySparkRuntimeError: [UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE] Failed to evaluate the user-defined table function 'TestUDTF' because the function arguments did not match the expected signature of the 'eval' method (missing a required argument: 'a'). Please update the query so that this table function call provides arguments matching the expected signature, or else update the table function so that its 'eval' method accepts the provided arguments, and then try the query again.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/spark/spark-3.5/python/pyspark/sql/tests/test_udtf.py", line 255, in test_udtf_with_wrong_num_input
    with self.assertRaisesRegex(
AssertionError: "eval\(\) missing 1 required positional argument: 'a'" does not match "
An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1816, in main
    func, profiler, deserializer, serializer = read_udtf(pickleSer, infile, eval_type)
  File "/home/runner/work/spark/spark/python/lib/pyspark.zip/pyspark/worker.py", line 1082, in read_udtf
    raise PySparkRuntimeError(
pyspark.errors.exceptions.base.PySparkRuntimeError: [UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE] Failed to evaluate the user-defined table function 'TestUDTF' because the function arguments did not match the expected signature of the 'eval' method (missing a required argument: 'a'). Please update the query so that this table function call provides arguments matching the expected signature, or else update the table function so that its 'eval' method accepts the provided arguments, and then try the query again.
"
----------------------------------------------------------------------
{code}
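The third failure is the same pattern for argument arity: the 4.0 worker validates the call's arguments against the {{eval}} signature up front in {{read_udtf}} ({{pyspark/worker.py}}, line 1082 in the trace), so the local {{eval() missing 1 required positional argument}} error the 3.5 test expects never surfaces. A sketch of that kind of up-front check using {{inspect.Signature.bind}} (the helper name {{check_eval_arguments}} is hypothetical):

```python
import inspect

def check_eval_arguments(udtf_cls, *args, **kwargs):
    # Sketch: try to bind the call's arguments to the UDTF's 'eval'
    # signature before execution, as the 4.0 worker does.
    sig = inspect.signature(udtf_cls().eval)  # bound method: 'self' excluded
    try:
        sig.bind(*args, **kwargs)
    except TypeError as exc:
        raise RuntimeError(
            "[UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE] arguments "
            f"do not match the 'eval' signature of '{udtf_cls.__name__}' ({exc})"
        ) from exc

class TestUDTF:
    def eval(self, a):
        yield (a,)

check_eval_arguments(TestUDTF, 1)  # matches: one argument for 'a'
try:
    check_eval_arguments(TestUDTF)  # wrong arity, as in the failing test
except RuntimeError as err:
    print(err)
```

Note that {{Signature.bind}} reports "missing a required argument: 'a'", the same wording quoted in the 4.0 error message above, while a direct call raises "eval() missing 1 required positional argument: 'a'", the wording the 3.5 test asserts on; that phrasing difference is exactly why the regex fails to match.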
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]