aizain opened a new issue, #7375:
URL: https://github.com/apache/hudi/issues/7375
**Describe the problem you faced**
When I enable async clustering, Hudi writes `xxx.replacecommit.requested` as
an Avro file (it contains `avro.schema`). However, the `canSkipBatch` function
reads this file with a JSON reader, which throws
`Unrecognized token 'Obj^A^B^Vavro'`.
How can I fix this? I deleted the file, but the same error happened again on
the next replacecommit.
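The token in the error starts with `Obj^A`, which matches the 4-byte magic (`O`, `b`, `j`, `0x01`) that opens an Avro object container file. A minimal sketch of how one might confirm that a timeline file is Avro rather than JSON before parsing it; `AvroMagicCheck` and `looksLikeAvroContainer` are hypothetical names used for illustration, not part of Hudi:

```scala
import java.nio.charset.StandardCharsets
import java.util.Arrays

// Hypothetical helper (not a Hudi API): checks the leading bytes of a file's content.
object AvroMagicCheck {
  // Avro object container files begin with the bytes 'O', 'b', 'j', 0x01.
  private val AvroMagic: Array[Byte] =
    "Obj".getBytes(StandardCharsets.US_ASCII) :+ 1.toByte

  /** Returns true when `content` starts with the Avro container magic. */
  def looksLikeAvroContainer(content: Array[Byte]): Boolean =
    content.length >= AvroMagic.length &&
      Arrays.equals(content.take(AvroMagic.length), AvroMagic)
}
```

Passing the first few bytes of `xxx.replacecommit.requested` to `looksLikeAvroContainer` would distinguish it from a JSON commit file, which normally starts with `{`.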
**To Reproduce**
Steps to reproduce the behavior:
1. Run a Spark streaming job that sinks to Hudi:
    import org.apache.spark.sql.streaming.OutputMode

    spark
      .sql(conf.getSql)
      .na.fill("")
      .writeStream
      .format("hudi")
      .options(conf.getHudiConf)
      .option("checkpointLocation", conf.getCheckpointPath)
      .trigger(conf.getTrigger)
      .outputMode(OutputMode.Append())
      .start(conf.getOutputPath(conf.getHudiTableName))
2. When clustering runs => 20221204152715580__replacecommit__REQUESTED
=> 20221204150150328.replacecommit.requested
is an Avro file:
<img width="856" alt="image"
src="https://user-images.githubusercontent.com/17040353/205480214-149835a4-d256-4b0e-a373-4a846426d4a1.png">
3. Keep the stream running.
4. The next batch calls HoodieStreamingSink.addBatch => canSkipBatch =>
CommitUtils.getLatestCommitMetadataWithValidCheckpointInfo.
5. The following error is thrown:
Caused by: org.apache.hudi.exception.HoodieIOException: Failed to parse HoodieCommitMetadata for [==>20221204152715580__replacecommit__REQUESTED]
Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'Obj^A^B^Vavro': was expecting ('true', 'false' or 'null')
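The steps above suggest that the checkpoint lookup tries to parse a pending (REQUESTED) replacecommit as JSON commit metadata. One possible guard, sketched here on a simplified model, is to consider only completed instants when searching for the latest commit metadata. This assumes that only completed instants hold JSON metadata; `Instant`, `State`, and `latestCompleted` are hypothetical stand-ins and not Hudi's classes, the completed timestamp in the usage note is invented for illustration, and this is not necessarily how Hudi resolves the problem:

```scala
// Hypothetical, simplified model of a timeline; not Hudi's HoodieInstant or timeline API.
object InstantFilterSketch {
  sealed trait State
  case object Requested extends State
  case object Inflight extends State
  case object Completed extends State

  final case class Instant(timestamp: String, action: String, state: State)

  /** Latest completed instant by timestamp, or None if nothing has completed. */
  def latestCompleted(instants: Seq[Instant]): Option[Instant] =
    instants
      .filter(_.state == Completed) // skip REQUESTED and INFLIGHT instants
      .sortBy(_.timestamp)          // fixed-width timestamps sort lexicographically
      .lastOption
}
```

With a timeline holding a completed commit at a made-up time such as 20221204150000000 and the pending `20221204152715580__replacecommit__REQUESTED` from this report, `latestCompleted` would return the completed commit and never touch the Avro file.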
**Expected behavior**
**Environment Description**
* Hudi version : 0.12.1
* Spark version : 2.4.3.2
* Hive version : no
* Hadoop version :
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no
**Additional context**
**Stacktrace**
<img width="1432" alt="image"
src="https://user-images.githubusercontent.com/17040353/205480108-03422fb7-fbe7-40dd-93e7-75b85eecbe21.png">
<img width="548" alt="image"
src="https://user-images.githubusercontent.com/17040353/205480341-98de0cd8-6062-4f6f-8801-f380464d4e85.png">
<img width="1289" alt="image"
src="https://user-images.githubusercontent.com/17040353/205480351-131bab02-8710-414c-aa6c-c0c065562d56.png">