vladatcdlx opened a new issue, #9443:
URL: https://github.com/apache/hudi/issues/9443
**Describe the problem you faced**
I get the following error when running a Spark job which reads and
upserts a large amount of data into a Hudi table.
```
org.apache.hudi.com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
Serialization trace:
reserved (org.apache.avro.Schema$Field)
fieldMap (org.apache.avro.Schema$RecordSchema)
schema (org.apache.avro.generic.GenericData$Record)
maxValue (org.apache.hudi.avro.model.HoodieMetadataColumnStats)
columnStatMetadata (org.apache.hudi.metadata.HoodieMetadataPayload)
data (org.apache.hudi.common.model.HoodieAvroRecord)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144) ~[__app__.jar:?]
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543) ~[__app__.jar:?]
    at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813) ~[__app__.jar:?]
    ...
Caused by: java.lang.UnsupportedOperationException
    at java.util.Collections$UnmodifiableCollection.add(Collections.java:1057) ~[?:1.8.0_382]
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134) ~[__app__.jar:?]
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40) ~[__app__.jar:?]
    at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731) ~[__app__.jar:?]
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) ~[__app__.jar:?]
    ... 55 more
```
The error occurs at the following job stage:
```
Preparing compaction metadata: redactedtablename_metadata
collect at HoodieJavaRDD.java:155
```
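For readers of the trace: the `Caused by` frames show `CollectionSerializer.read` calling `add` on a `java.util.Collections$UnmodifiableCollection`, which always throws `UnsupportedOperationException`. Below is a minimal sketch of that failure mode in plain Java. It does not use Kryo, Avro, or Hudi; the class `UnmodifiableAddDemo` and the method `refill` are hypothetical names made up for illustration, and the "refill by calling `add`" loop is only an assumption about what a generic collection deserializer does, not Kryo's actual code.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: not Kryo's or Hudi's implementation.
public class UnmodifiableAddDemo {

    // Mimics a deserializer that repopulates a target collection by calling
    // add() for each decoded element. Returns true if every add() succeeded,
    // false if the target rejected the write.
    static boolean refill(Collection<String> target, Collection<String> decoded) {
        try {
            for (String element : decoded) {
                target.add(element);
            }
            return true;
        } catch (UnsupportedOperationException e) {
            // Same exception type as the "Caused by" in the trace above.
            return false;
        }
    }

    public static void main(String[] args) {
        Set<String> decoded = new HashSet<String>();
        decoded.add("doc");

        // A plain HashSet accepts the elements.
        Set<String> mutableTarget = new HashSet<String>();
        System.out.println(refill(mutableTarget, decoded)); // prints "true"

        // An unmodifiable wrapper rejects add(), so the refill fails.
        Set<String> readOnlyTarget = Collections.unmodifiableSet(new HashSet<String>());
        System.out.println(refill(readOnlyTarget, decoded)); // prints "false"
    }
}
```

If this is indeed the mechanism, the problem would be in how the unmodifiable collection reached by the serialization trace is read back, not in the data being upserted. That is a guess from the trace, not something confirmed here.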
**To Reproduce**
Reproduction may be difficult. The job ran successfully at least 7 times
before this error started to occur.
Steps to reproduce the behavior:
1. Create a Spark job which writes to the Hudi table
2. Execute the Spark job
**Expected behavior**
The Hudi write should execute successfully.
**Environment Description**
* Hudi version :
```
<hudi-aws.version>0.12.2</hudi-aws.version>
```
* Spark version :
```
<spark.major.version>3.3</spark.major.version>
<spark.build.version>3.3.0</spark.build.version>
```
* Hive version :
```
<scala.binary.version>2.12</scala.binary.version>
<artifactId>spark-hive_${scala.binary.version}</artifactId>
```
* Hadoop version :
```
<hadoop.common.version>3.3.3</hadoop.common.version>
```
* Storage (HDFS/S3/GCS..) :
S3
* Running on Docker? (yes/no) :
No, AWS EMR