Hi devs,

We would like to start a discussion about AIP-3: Event-Triggered Optimization 
of Iceberg Tables in Amoro [1].


Currently, Amoro uses a periodic scan mechanism to evaluate and optimize 
Iceberg tables. However, with the increasing size of tables and frequent data 
changes, this periodic approach can lead to inefficiencies, including redundant 
scans and resource wastage. These inefficiencies can ultimately cause system 
delays, reduced performance, and scalability limitations, negatively impacting 
the user experience.


In this proposal, we introduce an event-driven mechanism that triggers 
optimization evaluation based on loaded table metadata/metrics or on 
Iceberg commit operations.
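
To make the idea concrete for discussion, here is a minimal sketch of an event-triggered evaluator. It is only an illustration and not the design in the AIP: the names CommitEvent, EventTriggeredEvaluator, and small_file_threshold are hypothetical, none of them are Amoro or Iceberg APIs, and the trigger rule (queue a table once enough files have been added) is an assumed example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitEvent:
    """Hypothetical summary of one Iceberg commit (not an Amoro type)."""
    table: str
    added_data_files: int
    added_delete_files: int


class EventTriggeredEvaluator:
    """Queues a table for optimization evaluation only after relevant commits."""

    def __init__(self, small_file_threshold: int = 10):
        # Assumed trigger rule: evaluate once enough new files have accumulated.
        self.small_file_threshold = small_file_threshold
        self.pending_files = {}  # table -> files added since the last evaluation
        self.queue = deque()     # tables waiting for evaluation, in FIFO order

    def on_commit(self, event: CommitEvent) -> bool:
        """Record a commit; return True if the table was newly queued."""
        total = self.pending_files.get(event.table, 0)
        total += event.added_data_files + event.added_delete_files
        self.pending_files[event.table] = total
        if total >= self.small_file_threshold and event.table not in self.queue:
            self.queue.append(event.table)
            return True
        return False

    def next_table(self):
        """Pop the next table to evaluate and reset its counter; None if idle."""
        if not self.queue:
            return None
        table = self.queue.popleft()
        self.pending_files[table] = 0
        return table
```

In this sketch, tables that receive no commits never enter the queue, so they are never scanned, in contrast to a periodic scan that visits every table on each cycle.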


The key goals of this proposal include:

- Reducing unnecessary full-table scans.
- Improving scan efficiency by triggering scans only for tables that need 
  optimization, based on table metadata.
- Enhancing system performance and reducing resource consumption for 
  large-scale tables.



For ease of discussion, the Google Doc for this AIP is provided in [2]. We 
are excited to hear your thoughts, feedback, and any concerns you may have 
about this new approach.


Looking forward to your input!


[1] https://cwiki.apache.org/confluence/display/AMORO/AIP-3%3A+Event-Triggered+Optimization+of+Iceberg+Tables+in+Amoro
[2] https://docs.google.com/document/d/1vNdMPjnKZeGukEFBrJLoIdXILT7HXflkBYSJpEqzkM4/edit?usp=sharing


Best regards,
Zhuojun


Jzjsnow
5797...@qq.com
