Haejoon Lee created SPARK-42965:
-----------------------------------

             Summary: metadata mismatch for StructField when running some tests.
                 Key: SPARK-42965
                 URL: https://issues.apache.org/jira/browse/SPARK-42965
             Project: Spark
          Issue Type: Sub-task
          Components: Connect, Pandas API on Spark
    Affects Versions: 3.5.0
            Reporter: Haejoon Lee


In a few tests, the metadata of StructField differs when running under Spark 
Connect, and the cause is not yet known. The functions under test still work 
properly.

For example, running `python/run-tests --testnames 
'pyspark.pandas.tests.connect.data_type_ops.test_parity_binary_ops 
BinaryOpsParityTests.test_add'` fails with `AssertionError: 
([InternalField(dtype=int64, struct_field=StructField('bool', LongType(), 
False))], [StructField('bool', LongType(), False)])`, even though both fields 
have the same name, type and nullability, so the function itself works 
correctly.

As a temporary workaround, we have added a Spark Connect specific branch in 
the code so that InternalFrame can be created properly, which lets us provide 
more pandas APIs in Spark Connect. Once a clear cause is found, this branch 
may need to be reverted to the original code.
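
The mismatch can be illustrated with a minimal, self-contained sketch. It 
does not use pyspark: `FieldSketch` is a hypothetical stand-in for 
StructField, `same_ignoring_metadata` is a hypothetical helper, and the 
metadata value is invented purely for illustration (the actual differing 
metadata is not known here). It assumes that metadata takes part in equality 
but is not shown in the repr, which would explain why the two sides of the 
assertion message look identical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(repr=False)
class FieldSketch:
    """Hypothetical stand-in for a StructField; NOT the pyspark class."""

    name: str
    data_type: str
    nullable: bool
    metadata: Dict[str, Any] = field(default_factory=dict)

    def __repr__(self) -> str:
        # Assumption: metadata is omitted from the repr, as in the report.
        return f"StructField({self.name!r}, {self.data_type}(), {self.nullable})"


def same_ignoring_metadata(a: FieldSketch, b: FieldSketch) -> bool:
    """Compare only name, type and nullability (hypothetical relaxed check)."""
    return (a.name, a.data_type, a.nullable) == (b.name, b.data_type, b.nullable)


expected = FieldSketch("bool", "LongType", False)
# The metadata content below is made up for the example.
actual = FieldSketch("bool", "LongType", False, metadata={"example": "value"})

print(repr(expected) == repr(actual))            # identical printed form
print(expected == actual)                        # strict equality sees metadata
print(same_ignoring_metadata(expected, actual))  # relaxed check ignores it
```

A Spark Connect specific branch could use a relaxed comparison of this kind 
until the source of the extra metadata is understood. Whether the actual 
workaround takes this shape is an assumption.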



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
