alexeykudinkin commented on PR #5629:
URL: https://github.com/apache/hudi/pull/5629#issuecomment-1195937698

   @danny0405 
   
   > I agree we should decouple Hudi from Avro, but that does not mean we
   > should lean back to engine-specific data structures which is very hard to
   > maintain as an engine-neutral project, see how hard it is for Hudi to
   > integrate with a new engine now :),
   > I kind of expect Hudi's own reader/writer/data structures, which is the
   > right direction we should elaborate with.
   
   I don't think we are aligned on this one: Hudi is, and will stay, an
   engine-neutral project. However, for top-of-the-line performance on *any*
   engine (to stay competitive with other formats) we *have to* use
   engine-specific representations (think `Row`, `RowData`, `ArrayWritable`,
   Arrow, etc.). There's just no other way: any intermediate representation is
   a tax on performance, and the general direction is that we want to provide
   the best possible performance in any supported workload, be it a read or a
   write.
   
   > And another concern I always have in my mind is Hudi needs a stable
   > release too much! We can not make huge changes to core reader/writers now at
   > this moment before we do enough tests/practice, and we should not rush in the
   > code for just the reason of code rebase effort.
   
   Totally agree with you there, and it's one of the reasons why we decided
   to take a more measured approach here and avoid pushing really hard (and
   compromising on quality testing) to meet the 0.12 deadline.

