[ https://issues.apache.org/jira/browse/SPARK-34219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271049#comment-17271049 ]

Hyukjin Kwon commented on SPARK-34219:
--------------------------------------

The error message says that something was too big while (de)serializing the 
function. You will have to reduce its size; the root cause is apparently that 
size limitation, which does not come from Spark itself.
For a question like this, it would be best to ask on the Spark mailing list 
before filing it as an issue.
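(For context: the limit in the trace comes from java.io.DataOutputStream.writeUTF, which writes the modified-UTF-8 byte length as an unsigned 16-bit value and therefore rejects any string whose encoding exceeds 65535 bytes; the Typesafe Config serializer being dragged into the Spark closure hits that cap. A minimal sketch reproducing the exception from the log; the 87824-byte size is taken from the error message, and the class name is purely illustrative:)

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.UTFDataFormatException;
import java.util.Arrays;

public class WriteUtfLimit {
    public static void main(String[] args) throws Exception {
        // writeUTF stores the encoded length in an unsigned 16-bit field,
        // so any string whose modified-UTF-8 form exceeds 65535 bytes fails.
        char[] big = new char[87824];   // same size as in the reported error
        Arrays.fill(big, 'a');          // ASCII chars encode to 1 byte each

        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(new String(big));
            System.out.println("no exception");
        } catch (UTFDataFormatException e) {
            // same message as in the attached log:
            // "encoded string too long: 87824 bytes"
            System.out.println("UTFDataFormatException: " + e.getMessage());
        }
    }
}
```

Since the limit lives in the JDK's DataOutput contract rather than in Spark, the fix is to shrink whatever config value ends up in the serialized closure, not to raise a Spark setting.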

> yarn-cluster java.io.UTFDataFormatException
> --------------------------------------------
>
>                 Key: SPARK-34219
>                 URL: https://issues.apache.org/jira/browse/SPARK-34219
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, YARN
>    Affects Versions: 2.4.0
>            Reporter: zhouliutao
>            Priority: Major
>
> The Spark program reports an error when writing a DataFrame to a MySQL 
> table; the error information is shown in the log below.
> ------------------------------------------------------------------------------------------------------------------
> Caused by: java.io.UTFDataFormatException: encoded string too long: 87824 bytes
>  at java.io.DataOutputStream.writeUTF(DataOutputStream.java:364)
>  at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
>  at com.typesafe.config.impl.SerializedConfigValue.writeValueData(SerializedConfigValue.java:314)
>  at com.typesafe.config.impl.SerializedConfigValue.writeValue(SerializedConfigValue.java:388)
>  at com.typesafe.config.impl.SerializedConfigValue.writeValueData(SerializedConfigValue.java:328)
>  at com.typesafe.config.impl.SerializedConfigValue.writeValue(SerializedConfigValue.java:388)
>  at com.typesafe.config.impl.SerializedConfigValue.writeValueData(SerializedConfigValue.java:328)
>  at com.typesafe.config.impl.SerializedConfigValue.writeValue(SerializedConfigValue.java:388)
>  at com.typesafe.config.impl.SerializedConfigValue.writeValueData(SerializedConfigValue.java:328)
>  at com.typesafe.config.impl.SerializedConfigValue.writeValue(SerializedConfigValue.java:388)
>  at com.typesafe.config.impl.SerializedConfigValue.writeExternal(SerializedConfigValue.java:454)
>  at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1459)
>  at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1430)
>  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>  at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>  at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>  at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378)
>  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>  at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>  at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>  at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
>  at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
>  at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:400)
>  
> --------------------------------------------------------------------------------------------------------------------
> I tested this and found that if "field A" is removed from the DataFrame, it 
> can be written to MySQL normally; if "field A" is included, the same error 
> is reported.
> The value of field A is approximately 1000 bytes long.
> If the running mode of the program in app-env.sh is changed to local or 
> client mode, writing to MySQL works normally; in yarn-cluster mode, the 
> error in the attached log is reported.
> Is this error related to the cluster version and configuration?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
