[ https://issues.apache.org/jira/browse/HUDI-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-898:
--------------------------------
    Labels: pull-request-available  (was: )

> Need to add Schema parameter to HoodieRecordPayload::preCombine
> ---------------------------------------------------------------
>
>                 Key: HUDI-898
>                 URL: https://issues.apache.org/jira/browse/HUDI-898
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Common Core
>            Reporter: Yixue Zhu
>            Assignee: Balaji Varadarajan
>            Priority: Major
>              Labels: pull-request-available
>
> We are working on Mongo Oplog integration with Hudi, to stream Mongo updates
> to Hudi tables.
> There are 4 Mongo OpLog operations we need to handle: CRUD (create, read,
> update, delete).
> Currently Hudi handles create/read and delete, but the existing preCombine
> API in the HoodieRecordPayload class does not handle update well. In
> particular, an update operation contains a "patch" field, which is extended
> JSON describing the update for dot-separated field paths.
> We need to pass the Avro schema to the preCombine API for it to work.
> Although the BaseAvroPayload constructor accepts a GenericRecord, which has
> an Avro schema reference, it materializes the GenericRecord to bytes to
> support serialization/deserialization by ExternalSpillableMap, so the schema
> is no longer available to the payload when preCombine runs.
>
> Is there any concern or objection to this? In other words, have I overlooked
> something?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
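
To illustrate the request in the issue, here is a minimal, self-contained sketch of why a payload that keeps only bytes needs a schema passed into preCombine before it can apply a patch. It does not use Hudi or Avro. `SimpleSchema`, `OplogPayload`, `encode`, and `decode` are hypothetical stand-ins invented for this example, and the two-argument `preCombine` only mirrors the shape of the proposed change, not the actual Hudi signature. Nested fields are flattened into dot-separated names, and tab-separated text stands in for Avro binary encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PreCombineSketch {

    /** Hypothetical stand-in for an Avro schema: an ordered list of field names. */
    static final class SimpleSchema {
        final List<String> fields;
        SimpleSchema(String... fields) { this.fields = Arrays.asList(fields); }
    }

    /** Writes values in schema order without field names (loosely like Avro binary). */
    static byte[] encode(Map<String, String> record, SimpleSchema schema) {
        List<String> values = new ArrayList<>();
        for (String field : schema.fields) {
            values.add(record.getOrDefault(field, ""));
        }
        // Simplification: values are assumed not to contain tab characters.
        return String.join("\t", values).getBytes(StandardCharsets.UTF_8);
    }

    /** Field names are not in the bytes, so decoding is impossible without the schema. */
    static Map<String, String> decode(byte[] bytes, SimpleSchema schema) {
        String[] values = new String(bytes, StandardCharsets.UTF_8).split("\t", -1);
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < schema.fields.size(); i++) {
            record.put(schema.fields.get(i), values[i]);
        }
        return record;
    }

    /** Hypothetical payload: either a full record image (bytes) or an update patch. */
    static final class OplogPayload {
        final byte[] recordBytes;          // null when this payload is a patch
        final Map<String, String> patch;   // field path -> new value; null for a full image

        OplogPayload(byte[] recordBytes, Map<String, String> patch) {
            this.recordBytes = recordBytes;
            this.patch = patch;
        }

        /** Assumes `this` is the newer payload and `older` the one seen before it. */
        OplogPayload preCombine(OplogPayload older, SimpleSchema schema) {
            if (patch == null) {
                return this; // a newer full image replaces whatever came before
            }
            if (older.patch != null) {
                // Two patches: keep both, with the newer values winning on conflicts.
                Map<String, String> mergedPatch = new LinkedHashMap<>(older.patch);
                mergedPatch.putAll(patch);
                return new OplogPayload(null, mergedPatch);
            }
            // Patch over a full image: the schema is required to locate the patched fields.
            Map<String, String> merged = decode(older.recordBytes, schema);
            merged.putAll(patch);
            return new OplogPayload(encode(merged, schema), null);
        }
    }

    static Map<String, String> demo() {
        SimpleSchema schema = new SimpleSchema("_id", "name", "address.city");

        Map<String, String> inserted = new LinkedHashMap<>();
        inserted.put("_id", "1");
        inserted.put("name", "Ann");
        inserted.put("address.city", "Oslo");
        OplogPayload insert = new OplogPayload(encode(inserted, schema), null);

        Map<String, String> patch = new LinkedHashMap<>();
        patch.put("address.city", "Bergen");
        OplogPayload update = new OplogPayload(null, patch);

        OplogPayload combined = update.preCombine(insert, schema);
        return decode(combined.recordBytes, schema);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {_id=1, name=Ann, address.city=Bergen}
    }
}
```

In this sketch the patch can only be applied because `preCombine` receives the schema. With a one-argument `preCombine`, the payload would hold bytes it has no way to interpret.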