cshuo commented on PR #12729:
URL: https://github.com/apache/hudi/pull/12729#issuecomment-2677323700

   > > > > @Alowator @cshuo @danny0405 great to see this effort. Do we agree on 
two separate rfcs?
   > > > > Let's make a call and either land this PR to claim 87 or add 
@Alowator to 88?
   > > > 
   > > > 
   > > > As discussed offline, it's ok to keep the Flink optimization work 
separate, so that the fix for the Avro performance issue can land in 
release 1.1. @Alowator Could you briefly outline the scope of RFC-87, so that 
optimization work on other aspects of the Flink integration, such as 
reading/compaction, can start simultaneously? Just make sure we are not doing 
the same things.
   > > 
   > > 
   > > @cshuo I agree with your point. To ensure we don't duplicate efforts, 
here’s the plan I propose:
   > > 
   > > 1. RFC-88 will focus solely on defining new abstractions with old avro 
writers.
   > > 2. RFC-87 will be dedicated to performance improvements that leverage 
those abstractions.
   > >    Since RFC-88 is big, I can assist by implementing the abstractions 
for the writer as part of that RFC. Then, I will move forward with implementing 
Avro elimination in RFC-87.
   > > 
   > > For RFC-87, I can start drafting the design for the performance 
improvements. However, the actual implementation of this design will require 
the abstractions from RFC-88 to be completed first.
   > > RFC-87 does not affect the reader's logic directly. However, 
conflicts could arise in the compactor logic for writers that use it. 
These potential conflicts should be clearly outlined in the design of RFC-87.
   > > TL;DR: It’s possible to start designing RFC-87 without RFC-88, but full 
implementation depends on completing the abstractions in RFC-88 first.
   > 
   > Hi @Alowator, after discussing with @danny0405, we think optimizing 
Flink reading/writing has a higher priority, at least for release 1.1. In 
summary, we have the following suggestions:
   > 
   > * Push RFC-87 forward without depending on RFC-88 at all, including the 
abstraction of writers/readers. The reader/writer abstraction in RFC-88 has 
other dependencies, e.g., Data Type, so let's keep RFC-87 orthogonal and 
make it possible to land without any blocker.
   > * As for the scope of this RFC, it's ok for it to focus on the writing 
part of the Flink integration. Design-wise, it'd be better to keep data as 
`RowData` all the way through the writing path without any conversions 
(e.g., to HoodieRecord), to achieve optimal write performance.
   
   @Alowator a kind reminder: we hope we can move fast on solving the Flink 
read/write performance issue.

