[GitHub] [hudi] danny0405 commented on pull request #5629: [HUDI-3384][HUDI-3385] Spark specific file reader/writer.

GitBox Mon, 01 Aug 2022 23:11:38 -0700


danny0405 commented on PR #5629:
URL: https://github.com/apache/hudi/pull/5629#issuecomment-1202059975


   > I don't think we are aligned on this one: Hudi is and will be staying 
engine-neutral project. However for the top-of-the-line performance on _any_ 
engine (to stay competitive with other formats) we _have to_ use 
engine-specific representations (think `Row`, `RowData`, `ArrayWritable`, 
Arrow, etc). There's just no other way
   
   I see this somewhat differently. Performance takes first priority when it is 
critical, say when Hudi performs badly on some benchmark. But in the long run, I do 
think that as a storage system we should have our own data structures, 
readers/writers, and schema, like every storage engine does. Look at how easy it 
is for a new engine to integrate with Iceberg and how hard it is with Hudi. Ease 
of integration is important for the ecosystem, especially for a `format`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
