huangxiaopingRD opened a new pull request, #8179:
URL: https://github.com/apache/hudi/pull/8179
### Change Logs
In the existing implementation, if the preCombine field is not specified,
its default value ("ts") is used. In a full-record bootstrap, the source
records may not contain a "ts" field, so the field lookup fails and the
input records cannot be generated. This change therefore removes the need
to specify the preCombine field when executing bootstrap.
```
Caused by: org.apache.hudi.exception.HoodieException: ts(Part -ts) field not found in record. Acceptable fields were :[timestamp, _row_key, partition_path, rider, driver, begin_lat, begin_lon, end_lat, end_lon, fare, tip_history, _hoodie_is_deleted, datestr]
    at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:557)
    at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldValAsString(HoodieAvroUtils.java:535)
    at org.apache.hudi.bootstrap.SparkFullBootstrapDataProviderBase.lambda$generateInputRecords$cbf13809$1(SparkFullBootstrapDataProviderBase.java:87)
    at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:193)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
    at org.apache.spark.scheduler.Task.run(Task.scala:123)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
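
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not Hudi's actual code: a plain `Map` stands in for the Avro record, and the class and method names (`PreCombineSketch`, `strictFieldVal`, `orderingValue`) are hypothetical. The fallback to a constant ordering value when no preCombine field is configured is an assumption made for illustration, not necessarily what this PR implements.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class PreCombineSketch {

    // Strict lookup, similar in spirit to the call that fails in the stack
    // trace: it throws when the requested field is absent from the record.
    public static Object strictFieldVal(Map<String, Object> record, String field) {
        if (!record.containsKey(field)) {
            throw new IllegalStateException(
                field + " field not found in record. Acceptable fields were :" + record.keySet());
        }
        return record.get(field);
    }

    // Hypothetical lenient variant: the preCombine field is optional.
    // When it is not configured, a constant ordering value is returned
    // (assumption for illustration only).
    public static Object orderingValue(Map<String, Object> record, Optional<String> preCombineField) {
        if (preCombineField.isPresent()) {
            return strictFieldVal(record, preCombineField.get());
        }
        return 0;
    }

    public static void main(String[] args) {
        // A record shaped like the one in the error message: it has
        // "timestamp" but no "ts" field.
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("timestamp", 1678000000L);
        record.put("_row_key", "key-1");

        try {
            // Forcing the default "ts" reproduces the kind of failure shown above.
            strictFieldVal(record, "ts");
        } catch (IllegalStateException e) {
            System.out.println("Default preCombine lookup failed: " + e.getMessage());
        }

        // With the field left unspecified, no lookup is attempted.
        System.out.println("Ordering value without preCombine: "
            + orderingValue(record, Optional.empty()));
    }
}
```

The point of the sketch is only that an implicit default field name turns a missing configuration into a runtime lookup failure, whereas treating the field as optional avoids the lookup entirely.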
### Impact
Users do not need to specify preCombine when executing bootstrap.
### Risk level (write none, low, medium or high below)
None
### Documentation Update
### Contributor's checklist
- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed