Re: [I] [SUPPORT] org.apache.avro.SchemaParseException: Can't redefine: array When there are Top level variables , Struct and Array[struct] (no complex datatype within array[struct]) [hudi]

2024-04-11 Thread via GitHub


ad1happy2go commented on issue #7717:
URL: https://github.com/apache/hudi/issues/7717#issuecomment-2049464984

   @Jonathanrodrigr12 Did you also have multiple "value" columns across structs? 
If so, this may be the same issue that @junkri raised in 
https://github.com/apache/hudi/issues/10983, not this original issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [SUPPORT] org.apache.avro.SchemaParseException: Can't redefine: array When there are Top level variables , Struct and Array[struct] (no complex datatype within array[struct]) [hudi]

2024-04-09 Thread via GitHub


junkri commented on issue #7717:
URL: https://github.com/apache/hudi/issues/7717#issuecomment-2045350745

   @Jonathanrodrigr12 I think I ran into the same problem as you. Your 
screenshot shows a field called "value" defined multiple times, once as a 
`decimal` and once as a `struct`. I've just raised a separate issue for this; 
please check whether it covers your situation as well: 
https://github.com/apache/hudi/issues/10983
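
   To check a schema for this kind of clash, here is a minimal diagnostic 
sketch. In Avro, named types (`record`, `enum`, `fixed`) share one name table, 
so the same full name cannot be defined twice. The sketch walks an Avro schema 
that has already been parsed from JSON and reports every full name defined at 
more than one place. It is not Hudi or Avro code. The helper names and the 
example schema are hypothetical, and the assumption that a nested struct or a 
fixed-size decimal ends up as a named type called after its field may not hold 
for every Hudi version.

```python
from collections import defaultdict

# Avro types that carry a name and therefore share one name table.
NAMED_TYPES = {"record", "enum", "fixed"}


def collect_named_types(schema, namespace="", path="$", found=None):
    """Map each full type name to the paths where it is defined."""
    if found is None:
        found = defaultdict(list)
    if isinstance(schema, list):  # union: visit every branch
        for branch in schema:
            collect_named_types(branch, namespace, path, found)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind in NAMED_TYPES:
            name = schema["name"]
            if "." in name:  # already a full name
                namespace = name.rpartition(".")[0]
            else:  # inherit the enclosing namespace unless one is given
                namespace = schema.get("namespace", namespace)
                name = f"{namespace}.{name}" if namespace else name
            found[name].append(path)
            for field in schema.get("fields", []):
                collect_named_types(
                    field["type"], namespace, f"{path}.{field['name']}", found
                )
        elif kind == "array":
            collect_named_types(schema["items"], namespace, path + "[]", found)
        elif kind == "map":
            collect_named_types(schema["values"], namespace, path + "{}", found)
        elif isinstance(kind, (dict, list)):  # nested type definition
            collect_named_types(kind, namespace, path, found)
    return found


def redefined_names(schema):
    """Return only the full names that are defined more than once."""
    return {n: p for n, p in collect_named_types(schema).items() if len(p) > 1}


# Hypothetical schema: "value" is a decimal at the top level and a struct
# inside an array of structs, both without a distinguishing namespace.
EXAMPLE_SCHEMA = {
    "type": "record",
    "name": "root",
    "fields": [
        {"name": "value", "type": ["null", {
            "type": "fixed", "name": "value", "size": 16,
            "logicalType": "decimal", "precision": 38, "scale": 10}]},
        {"name": "items", "type": ["null", {
            "type": "array",
            "items": {"type": "record", "name": "items", "fields": [
                {"name": "value", "type": ["null", {
                    "type": "record", "name": "value",
                    "fields": [{"name": "amount", "type": "long"}]}]},
            ]}}]},
    ],
}

print(redefined_names(EXAMPLE_SCHEMA))
# {'value': ['$.value', '$.items[].value']}
```

   An empty result means no two definitions share a full name. Any reported 
name is a candidate for the "Can't redefine" error.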
   





Re: [I] [SUPPORT] org.apache.avro.SchemaParseException: Can't redefine: array When there are Top level variables , Struct and Array[struct] (no complex datatype within array[struct]) [hudi]

2024-02-28 Thread via GitHub


Jonathanrodrigr12 commented on issue #7717:
URL: https://github.com/apache/hudi/issues/7717#issuecomment-1969960684

   I am using Spark 3.4.1 and Hudi 0.14.0.





Re: [I] [SUPPORT] org.apache.avro.SchemaParseException: Can't redefine: array When there are Top level variables , Struct and Array[struct] (no complex datatype within array[struct]) [hudi]

2024-02-27 Thread via GitHub


ad1happy2go commented on issue #7717:
URL: https://github.com/apache/hudi/issues/7717#issuecomment-1966705910

   Which Hudi and Spark versions are you using, @Jonathanrodrigr12?





Re: [I] [SUPPORT] org.apache.avro.SchemaParseException: Can't redefine: array When there are Top level variables , Struct and Array[struct] (no complex datatype within array[struct]) [hudi]

2024-02-23 Thread via GitHub


Jonathanrodrigr12 commented on issue #7717:
URL: https://github.com/apache/hudi/issues/7717#issuecomment-1962262653

   Hi, I have the same problem, but I am using the HoodieMultiTableStreamer.
   **Description**
   I have many Parquet files, all with this structure:
   
![image](https://github.com/apache/hudi/assets/53848036/2c15084d-b17c-471f-8a5d-0b77391a7958)
   
   
   The first time I run the job on EMR Serverless, the data is saved, 
but on the second attempt I get the error shown under Stack Trace below.
   
   **Expected behavior**
   The second write succeeds.
   
   **Environment Description**
   Hudi hudi-utilities-bundle_2.12-0.14.0-amzn-0.jar
   Spark version : 3.4.1
   EMR: 6.15.0
   **Stack Trace**
   `org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleInsertPartition(BaseSparkCommitActionExecutor.java:348)
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:259)
at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:563)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1541)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:566)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieException: org.apache.avro.SchemaParseException: Can't redefine: value
at org.apache.hudi.table.action.commit.HoodieMergeHelper.runMerge(HoodieMergeHelper.java:149)
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpdateInternal(BaseSparkCommitActionExecutor.java:387)
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpdate(BaseSparkCommitActionExecutor.java:369)
at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
... 30 more
   Caused by: org.apache.avro.SchemaParseException: Can't redefine: value
at org.apache.avro.Schema$Names.put(Schema.java:1586)
at org.apache.avro.Schema$NamedSchema.writeNameRef(Schema.java:844)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:1011)
at org.apache.avro.Schema$UnionSchema.toJson(Schema.java:1278)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:1039)
at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:1023)
at org.apache.avro.Schema$ArraySchema.toJson(Schema.java:1173)
at org.apache.avro.Schema$UnionSchema.toJson(Schema.java:1278)
at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:1039)
at