Ben Wan created SPARK-39130:
-------------------------------
Summary: How do I read parquet with python object
Key: SPARK-39130
URL: https://issues.apache.org/jira/browse/SPARK-39130
Project: Spark
Issue Type: Question
Components: PySpark
Affects Versions: 2.4.5
Environment: pyspark 2.4.5
Reporter: Ben Wan
python:
import pandas as pd
# column 'b' holds a Python list in each row
a = pd.DataFrame([[1, [2.3, 1.2]]], columns=['a', 'b'])
a.to_parquet('a.parquet')
pyspark:
d2 = spark.read.parquet('a.parquet')
Showing the DataFrame then fails with this error:
An error was encountered: An error occurred while calling o277.showString. :
org.apache.spark.SparkException: Job aborted due to stage failure: Task 14 in
stage 9.0 failed 4 times, most recent failure: Lost task 14.2 in stage 9.0 (TID
63, 10.169.0.196, executor 15): java.lang.IllegalArgumentException: Illegal
Capacity: -221
How can I fix it?
Thanks.