Hi Danny,

Thanks for summarizing the current progress towards the 0.10.0 release.
I'm good with the Nov 26th cutoff.

Regarding my blockers:
- [HUDI-2332] Implement scheduling of compaction/ clustering for Kafka
   Connect (Owner: Ethan Guo)
PR is up.  I'm addressing comments.

- [HUDI-2737] Use earliest instant by default for compaction and
   clustering job (Owner: Ethan Guo)
PR is up and approved.  It should land once the CI failures are fixed.

- [HUDI-2745] Record count does not match input after compaction is
   scheduled when running Hudi Kafka Connect sink (Owner: Ethan Guo)
HUDI-2745 is blocked on HUDI-2480, which should resolve this issue once
it is done.

- [HUDI-2735] Fix archival of commits in Java client for Kafka Connect
   (Owner: Ethan Guo)
This is pending and requires investigation into the archival logic, which is
not specific to Kafka Connect.

Best,
- Ethan


On Fri, Nov 19, 2021 at 4:41 PM Rajesh Mahindra <rmahin...@gmail.com> wrote:

> Hi Danny,
>
> I have the following blockers that have a PR up. I am working on a PR for
> the Debezium Source. I am fine with Nov 26th as cut off.
>
>    - [HUDI-2325] Implement and test Hive Sync support for Kafka Connect
>    (Owner: Rajesh Mahindra)
>    - [HUDI-2671] Fix record offset handling in Kafka connect transaction
>    participant (Owner: Rajesh Mahindra)
>    - [HUDI-2672] Avoid empty commits and rollbacks when there is no event
>    from the topic (Owner: Rajesh Mahindra)
>
> ** Pending
>    - [HUDI-1290] Implement Debezium avro source for Delta Streamer
>
> Thanks
> Rajesh
>
>
> On Fri, Nov 19, 2021 at 4:01 PM Udit Mehrotra <udi...@apache.org> wrote:
>
> > Hi Danny,
> >
> > I have a blocker as well
> > https://issues.apache.org/jira/browse/HUDI-2802. Nov 26th cut off date
> > works fine for me.
> >
> > Also, just an update on the above list: HUDI-2641, HUDI-2314,
> > HUDI-2362 are unblocked/merged. HUDI-2314 and HUDI-2362 can be marked
> > in the highlights section. We will work on getting some doc updates
> > for the same by next week.
> >
> > Thanks,
> > Udit
> >
> > On Fri, Nov 19, 2021 at 3:49 PM Vinoth Chandar <vin...@apache.org> wrote:
> > >
> > > Hi Danny,
> > >
> > > I have one blocker. I plan to complete it by end of next week. I am good
> > > with the prior Nov 26 cutoff.
> > > Does that work for everyone?
> > >
> > > Thanks
> > > Vinoth
> > >
> > > On Fri, Nov 19, 2021 at 12:12 AM Danny Chan <danny0...@apache.org> wrote:
> > >
> > > > Hi Community,
> > > >
> > > > As we draw close to doing Hudi 0.10.0 release, I am happy to share a
> > > > summary of the key features/improvements that would be going in the release
> > > > and the current blockers for everyone's visibility.
> > > >
> > > > *Highlights*
> > > >
> > > >    - [HUDI-1290] Implement Debezium avro source for Delta Streamer
> > > >    - [HUDI-1491] Support partition pruning for MOR snapshot query
> > > >    - [HUDI-1763] DefaultHoodieRecordPayload does not honor ordering value
> > > >    when records within multiple log files are merged
> > > >    - [HUDI-1827] Add ORC support in Bootstrap Op
> > > >    - [HUDI-1869] Upgrading Spark3 To 3.1
> > > >    - [HUDI-2101] support z-order for hudi
> > > >    - [HUDI-2276] Enable Metadata Table by default for both writers and
> > > >    readers
> > > >    - [HUDI-2581] Analyze metadata size estimate in hudi with Hfile for col
> > > >    stats partition
> > > >    - [HUDI-2634] Improve bootstrap performance for very large tables
> > > >    - [HUDI-2086] redo the logical of mor_incremental_view for hive
> > > >    - [HUDI-2191] Bump flink version to 1.13.1
> > > >    - [HUDI-2285] Metadata Table Synchronous Design
> > > >    - [HUDI-2316] Support Flink batch upsert
> > > >    - [HUDI-2371] Improve flink streaming reader
> > > >    - [HUDI-2394] [Kafka Connect Milestone 1] Implement kafka connect for
> > > >    immutable data
> > > >    - [HUDI-2449] Incremental read for Flink
> > > >    - [HUDI-2562] Embedded timeline server on JobManager
> > > >
> > > > *Current Blockers*
> > > >
> > > >    - [HUDI-1856] Upstream changes made in PrestoDB to eliminate file
> > > >    listing to Trino (Owner: Sagar Sumit)
> > > >    - [HUDI-1912] Presto defaults to GenericHiveRecordCursor for all Hudi
> > > >    tables (Owner: Sagar Sumit)
> > > >    - [HUDI-1932] Hive Sync should not always update last_commit_time_sync
> > > >    (Owner: Raymond Xu)
> > > >    - [HUDI-1937] When clustering fail, generating unfinished replacecommit
> > > >    timeline. (Owner: Sagar Sumit)
> > > >    - [HUDI-2077] Flaky test: TestHoodieDeltaStreamer (Owner: Sagar Sumit)
> > > >    - [HUDI-2314] Add DynamoDb based lock provider (Owner: Wenning Ding)
> > > >    - [HUDI-2325] Implement and test Hive Sync support for Kafka Connect
> > > >    (Owner: Rajesh Mahindra)
> > > >    - [HUDI-2332] Implement scheduling of compaction/ clustering for Kafka
> > > >    Connect (Owner: Ethan Guo)
> > > >    - [HUDI-2362] Hudi external configuration file support (Owner: Wenning
> > > >    Ding)
> > > >    - [HUDI-2409] Using HBase shaded jars in Hudi presto bundle (Owner:
> > > >    Sagar Sumit)
> > > >    - [HUDI-2443] KVComparator in HFile for metadata table is tied to HBase
> > > >    version and shading (Owner: Sagar Sumit)
> > > >    - [HUDI-2472] Tests failure follow up when metadata is enabled by
> > > >    default (Owner: Manoj Govindassamy)
> > > >    - [HUDI-2475] Rolling Upgrade downgrade story for 0.10 & enabling
> > > >    metadata (Owner: Manoj Govindassamy)
> > > >    - [HUDI-2478] Handle failure mid-way during init buckets (Owner: Vinoth
> > > >    Chandar)
> > > >    - [HUDI-2480] FileSlice after pending compaction-requested instant-time
> > > >    is ignored by MOR snapshot reader (Owner: Danny Chen)
> > > >    - [HUDI-2488] Support bootstrapping a single or more partitions in
> > > >    metadata table while regular writers and table services are in progress
> > > >    (Owner: Vinoth Chandar)
> > > >    - [HUDI-2527] Flaky test:
> > > >    TestHoodieClientMultiWriter.testMultiWriterWithAsyncTableServicesWithConflict
> > > >    (Owner: sivabalan narayanan)
> > > >    - [HUDI-2559] Ensure unique timestamps are generated for commit times
> > > >    with concurrent writers (Owner: sivabalan narayanan)
> > > >    - [HUDI-2593] Virtual keys support for metadata table (Owner: Manoj
> > > >    Govindassamy)
> > > >    - [HUDI-2599] [Performance] Lower parallelism with snapshot query on COW
> > > >    tables in Presto (Owner: Sagar Sumit)
> > > >    - [HUDI-2628] Fix Chinese Docs (Owner: Kyle Weller)
> > > >    - [HUDI-2636] Make release notes discoverable (Owner: Kyle Weller)
> > > >    - [HUDI-2637] Triage all bugs around Multi-writer and certify the tested
> > > >    flows (Owner: sivabalan narayanan)
> > > >    - [HUDI-2641] One inflight commit rolling back other concurrent inflight
> > > >    commits causing them to fail (Owner: Udit Mehrotra)
> > > >    - [HUDI-2649] Kick off all the Hive query issues for 0.10.0 (Owner:
> > > >    Sagar Sumit)
> > > >    - [HUDI-2666] async compaction failing with timeline mismatches between
> > > >    server and client when metadata is enabled (Owner: Manoj Govindassamy)
> > > >    - [HUDI-2667] Avoid fs.exists() and fs.mkdirs() call to partitions in
> > > >    AbstractTablefileSystemView (Owner: Sagar Sumit)
> > > >    - [HUDI-2671] Fix record offset handling in Kafka connect transaction
> > > >    participant (Owner: Rajesh Mahindra)
> > > >    - [HUDI-2672] Avoid empty commits and rollbacks when there is no event
> > > >    from the topic (Owner: Rajesh Mahindra)
> > > >    - [HUDI-2716] Fix InLineFS path conversions for S3FS paths (Owner: Manoj
> > > >    Govindassamy)
> > > >    - [HUDI-2725] Add precommit validators doc (Owner: Kyle Weller)
> > > >    - [HUDI-2731] Clustering should work regardless of whether there are
> > > >    base files (Owner: Sagar Sumit)
> > > >    - [HUDI-2734] Disable metadata by default for flink and java (Owner:
> > > >    sivabalan narayanan)
> > > >    - [HUDI-2735] Fix archival of commits in Java client for Kafka Connect
> > > >    (Owner: Ethan Guo)
> > > >    - [HUDI-2737] Use earliest instant by default for compaction and
> > > >    clustering job (Owner: Ethan Guo)
> > > >    - [HUDI-2741] Validate metadata config for all readers (Owner: Sagar
> > > >    Sumit)
> > > >    - [HUDI-2745] Record count does not match input after compaction is
> > > >    scheduled when running Hudi Kafka Connect sink (Owner: Ethan Guo)
> > > >    - [HUDI-2762] Ensure hive can query insert only logs in MOR (Owner:
> > > >    Sagar Sumit)
> > > >    - [HUDI-2763] Avoid persisting redundant key field in the Metadata table
> > > >    record payload (Owner: Manoj Govindassamy)
> > > >    - [HUDI-2764] Address test failures after enabling virtual keys support
> > > >    for the metadata table (Owner: Manoj Govindassamy)
> > > >    - [HUDI-2766] Enable marker based rollback by default (Owner: sivabalan
> > > >    narayanan)
> > > >    - [HUDI-2767] Enable timeline server based marker type as default
> > > >    (Owner: sivabalan narayanan)
> > > >    - [HUDI-2770] Update docs for HoodieCompactor (compaction) and
> > > >    HoodieClusteringJob (clustering) (Owner: Kyle Weller)
> > > >
> > > >
> > > > Please respond to the thread if you think that I have missed capturing any
> > > > of the highlights or blockers for Hudi 0.10.0 release. For the owners of
> > > > these release blockers, can you please provide a specific timeline you are
> > > > willing to commit to for finishing these so we can cut an RC?
> > > >
> > > > Thanks,
> > > > Danny
> > > >
> >
>
>
> --
> Take Care,
> Rajesh Mahindra
>
