polyzos commented on PR #1992: URL: https://github.com/apache/fluss/pull/1992#issuecomment-3575819560
@leekeiabstraction Indeed, we are looking at a 2x performance penalty. I tested writing/scanning on both PK and Log tables with 10 million records, running 5 iterations of each and averaging the results.

This is basically a trade-off for users. Using InternalRow/GenericRow directly is far more efficient, but it can come with some extra complexity and boilerplate code. For this reason I want to give users flexibility: probably leave the docs as is, with GenericRow as the go-to approach, but also add a section showing that Pojos can be used directly, and maybe highlight this trade-off.

Moving forward, I'm thinking it might make sense to add some helper classes that also derive the table schema from a Pojo.

**Log Table**

<img width="726" height="1046" alt="Screenshot 2025-11-25 at 3 32 34 PM" src="https://github.com/user-attachments/assets/f955e85c-d5b6-4a1b-86ed-13f8113a68a8" />

**Primary Key Table**

<img width="749" height="1058" alt="Screenshot 2025-11-25 at 3 51 46 PM" src="https://github.com/user-attachments/assets/ba8a1ae3-29ac-4d1e-b2c0-922fd405c9a0" />
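
To make the helper-class idea a bit more concrete, deriving columns from a Pojo could start from plain reflection, roughly as below. This is only a minimal sketch and not part of this PR or the Fluss API: `PojoSchemaSketch`, `deriveColumns`, the `Order` class, and the type labels are all hypothetical placeholders. A real helper would presumably map to Fluss data types and also deal with primary keys, nullability, and nested types.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

public class PojoSchemaSketch {

    // Illustrative Java-type -> column-type labels. These strings are
    // placeholders chosen for the sketch, not Fluss DataTypes.
    private static final Map<Class<?>, String> TYPE_NAMES = Map.of(
            int.class, "INT", Integer.class, "INT",
            long.class, "BIGINT", Long.class, "BIGINT",
            double.class, "DOUBLE", Double.class, "DOUBLE",
            boolean.class, "BOOLEAN", Boolean.class, "BOOLEAN",
            String.class, "STRING");

    // Walks the declared fields of a Pojo class and returns a
    // column-name -> type-label map. Static, transient, and synthetic
    // fields are skipped; unsupported field types fail fast.
    static Map<String, String> deriveColumns(Class<?> pojo) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (Field field : pojo.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || field.isSynthetic()) {
                continue;
            }
            String type = TYPE_NAMES.get(field.getType());
            if (type == null) {
                throw new IllegalArgumentException(
                        "Unsupported field type: " + field.getName()
                                + " : " + field.getType().getName());
            }
            columns.put(field.getName(), type);
        }
        return columns;
    }

    // Hypothetical Pojo used only to exercise the sketch.
    static class Order {
        long orderId;
        String customer;
        double amount;
    }

    public static void main(String[] args) {
        // Prints the derived column map for the example Pojo.
        System.out.println(deriveColumns(Order.class));
    }
}
```

Reflection like this would run once at table-creation time, so it should not add to the per-record cost measured above.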
