bvaradar commented on issue #2066:
URL: https://github.com/apache/hudi/issues/2066#issuecomment-687274706


   In this case, you are using a complex key that combines 5 columns. From what I 
have seen in user deployments, this is very unusual (the most common case is 1 
or 2 columns). Having a materialized record key has its benefits. 
   
   That being said, the size difference could be because the individual columns 
are highly compressible on their own while their concatenation is not. The 
other factor is what proportion of the columns make up the record 
key. You can try using parquet-tools to look at column/block level stats for both 
the parquet and hudi files to get more insight. 
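
To illustrate the compressibility point, here is a rough sketch in Python. It is not Parquet's actual encoder: `dict_encoded_bytes` is a hypothetical helper that only approximates dictionary encoding, and the five columns (`c0` to `c4`), their values, and the key layout are made up for the example. The idea it shows is that five low-cardinality columns each need a tiny dictionary, while a key built by concatenating them is unique per row and gets almost no benefit.

```python
import math


def dict_encoded_bytes(values):
    """Rough size estimate for a dictionary-encoded column (simplified model)."""
    distinct = set(values)
    # Bytes needed to store each distinct value once.
    dictionary = sum(len(v.encode("utf-8")) for v in distinct)
    # Bits needed per row to point into the dictionary.
    bits_per_index = max(1, math.ceil(math.log2(len(distinct))))
    indices = math.ceil(len(values) * bits_per_index / 8)
    return dictionary + indices


n_rows = 100_000

# Five hypothetical key columns with only 10 distinct values each:
# column i holds the i-th decimal digit of the row number.
columns = [[str((row // 10**i) % 10) for row in range(n_rows)] for i in range(5)]

# Columns stored separately: each one has a 10-entry dictionary.
separate = sum(dict_encoded_bytes(col) for col in columns)

# Materialized key: all five values concatenated, so every key is unique.
keys = [
    ",".join(f"c{i}:{col[row]}" for i, col in enumerate(columns))
    for row in range(n_rows)
]
combined = dict_encoded_bytes(keys)

print(f"separate columns: {separate} bytes")
print(f"concatenated key: {combined} bytes")
print(f"ratio: {combined / separate:.1f}x")
```

Under this model the concatenated key comes out roughly ten times larger than the five columns stored separately. Real files will differ, which is why the column-level stats are worth checking.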
   
   BTW, if you follow the dev@ and user@ community mailing lists, there is work 
in progress on making the record_key virtual. 
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

