Jian Feng created HUDI-2414:
-------------------------------

             Summary: Enable hot and cold data separation when ingesting data
                 Key: HUDI-2414
                 URL: https://issues.apache.org/jira/browse/HUDI-2414
             Project: Apache Hudi
          Issue Type: Improvement
          Components: Writer Core
            Reporter: Jian Feng


When using Hudi to ingest an e-commerce company's item data, a massive number 
of updates go to old partitions. If a single record needs an update, the whole 
file it belongs to must be rewritten, so nearly every commit ends up rewriting 
almost the whole table.

I'm wondering whether Hudi could provide a tool for separating hot and cold 
data. It would use specific columns (such as create time and update time) to 
distinguish hot data from cold data, then rebuild the table so that the two 
are placed in different file groups. After the table is rebuilt, updates 
should touch far fewer files and performance should be much better.
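
To illustrate the idea, here is a minimal sketch in plain Python. It does not 
use any Hudi API: the record layout, the 30-day hot window, the file group 
size of 2, and the helper names (split_hot_cold, chunk, files_touched) are all 
assumptions made up for this example. It only shows how grouping records by 
update time could reduce the number of files that an update batch touches.

```python
from datetime import datetime, timedelta


def split_hot_cold(records, now, hot_window=timedelta(days=30)):
    """Split records into (hot, cold) by update_time (hypothetical rule)."""
    cutoff = now - hot_window
    hot = [r for r in records if r["update_time"] >= cutoff]
    cold = [r for r in records if r["update_time"] < cutoff]
    return hot, cold


def chunk(records, size):
    """Pack records into fixed-size lists standing in for file groups."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def files_touched(file_groups, updated_keys):
    """Count file groups that hold at least one updated record.

    Each such group would have to be rewritten if a single record
    update forces a rewrite of the whole file.
    """
    return sum(
        1 for group in file_groups
        if any(r["item_id"] in updated_keys for r in group)
    )


now = datetime(2021, 9, 10)
# Toy data: even item ids were updated yesterday, odd ids a year ago.
records = [
    {"item_id": i,
     "update_time": now - timedelta(days=1 if i % 2 == 0 else 365)}
    for i in range(8)
]
updated_keys = {0, 2, 4, 6}  # the next batch updates the hot items again

# Layout 1: hot and cold records mixed in the same file groups.
mixed = chunk(records, 2)

# Layout 2: hot and cold records rebuilt into separate file groups.
hot, cold = split_hot_cold(records, now)
separated = chunk(hot, 2) + chunk(cold, 2)

print(files_touched(mixed, updated_keys))      # 4 of 4 file groups
print(files_touched(separated, updated_keys))  # 2 of 4 file groups
```

In this toy case the mixed layout rewrites every file group, while the 
separated layout rewrites only the two that hold hot records. Real gains 
would depend on how well the chosen columns predict future updates.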



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
