Haejoon Lee created SPARK-42965:
-----------------------------------
Summary: metadata mismatch for StructField when running some tests.
Key: SPARK-42965
URL: https://issues.apache.org/jira/browse/SPARK-42965
Project: Spark
Issue Type: Sub-task
Components: Connect, Pandas API on Spark
Affects Versions: 3.5.0
Reporter: Haejoon Lee
For reasons not yet identified, the metadata of StructField differs in a few
tests when running under Spark Connect. The function under test still works
properly.
For example, running `python/run-tests --testnames
'pyspark.pandas.tests.connect.data_type_ops.test_parity_binary_ops
BinaryOpsParityTests.test_add'` fails with `AssertionError:
([InternalField(dtype=int64, struct_field=StructField('bool', LongType(),
False))], [StructField('bool', LongType(), False)])`. Both fields have the same
name, type, and nullability, so the function itself works correctly; only the
metadata differs.
Therefore, we have temporarily added a branch for Spark Connect in the code so
that InternalFrame can be created properly and more pandas APIs can be
supported in Spark Connect. Once a clear cause is found, we may need to revert
this branch to the original code path.
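
To illustrate why two fields that print identically can still fail an equality
assertion, here is a minimal sketch. It does not use PySpark: `FakeField` is a
hypothetical stand-in that only mirrors the attribute names of StructField
(`name`, `dataType`, `nullable`, `metadata`), and the metadata entry
`{"origin": "connect"}` is made up for the example. It assumes, as the issue
title suggests, that strict equality takes metadata into account while the
printed form in the error message does not show it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FakeField:
    """Hypothetical stand-in mirroring StructField's attribute names."""
    name: str
    dataType: str
    nullable: bool = True
    metadata: Dict[str, Any] = field(default_factory=dict)


def same_ignoring_metadata(a, b) -> bool:
    """Compare name, data type, and nullability only; metadata is skipped."""
    return (a.name, a.dataType, a.nullable) == (b.name, b.dataType, b.nullable)


expected = FakeField("bool", "bigint", False)
# The metadata content here is invented purely for illustration.
actual = FakeField("bool", "bigint", False, metadata={"origin": "connect"})

strict_equal = expected == actual                         # metadata included
relaxed_equal = same_ignoring_metadata(expected, actual)  # metadata ignored
print(strict_equal, relaxed_equal)
```

Because the helper only reads attributes, it should also accept real
StructField objects, which expose the same four attribute names. A comparison
of this kind is one possible shape for a Connect-specific branch; it is not
necessarily what the actual change does.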