zhangwl9 commented on PR #4238:
URL: https://github.com/apache/amoro/pull/4238#issuecomment-4627673199

   > **Overall: Good design direction, but suggest refining the category 
taxonomy before merging.**
   > 
   > The idea of separating table processes into different tabs is great. 
However, I think `MAINTENANCE` is too broad a category and we should consider a 
more precise taxonomy that better aligns with the nature of these operations.
   > 
   > ### Suggested Classification
   > Instead of a binary `OPTIMIZING` / `MAINTENANCE` split, I'd suggest a 
three-way classification:
   > 
   > | Category | Purpose | Current Operations | Future Extensions |
   > | --- | --- | --- | --- |
   > | **OPTIMIZING** | Performance optimization (data reorganization) | Minor / Major / Full Compaction | Clustering, Sort Rewrite |
   > | **CLEANUP** | Space reclamation & lifecycle management | Expire Snapshots, Expire Data, Clean Orphan Files, Clean Dangling Delete Files | VACUUM, Remove Old Metadata |
   > | **PROFILING** | Information enrichment & metadata augmentation | Auto Create Tags | Collect Statistics, Build Index |
   > 
   > Additionally, `Sync Hive Tables` is more of an internal implementation 
detail and probably should **not** be exposed to users in any tab.
   > 
   > ### Rationale
   > * "Maintenance" is too vague — compaction could also be considered 
"maintenance" in a broad sense.
   > * The operations currently under `MAINTENANCE` actually serve two distinct 
purposes: space reclamation (Expire/Clean) vs. metadata enrichment (Auto Create 
Tags). As we add more operations (e.g., statistics collection), this 
distinction will become more important.
   > * This three-way split aligns with industry conventions: Delta Lake has 
`OPTIMIZE` / `VACUUM`, Iceberg docs separate "rewrite" from "expire/remove".
   > 
   > ### Suggested Approach
   > For the **backend API**, I'd recommend defining three `processCategory` 
values upfront: `OPTIMIZING`, `CLEANUP`, `PROFILING`. This makes the API 
future-proof.
   > 
   > For the **frontend**, there are two pragmatic options:
   > 
   > 1. **Three tabs** (`Optimizing` / `Cleanup` / `Profiling`) — cleanest 
separation
   > 2. **Two tabs for now** (`Optimizing` / `Cleanup`) — merge Profiling into 
Cleanup since there's currently only one profiling operation (Auto Create 
Tags), and split it out later when more profiling operations are added
   > 
   > Either way, the endpoint could be generalized from `/maintenance-types` to 
something like `/process-types?category=CLEANUP` for better extensibility.
   > 
   > What do you think?
   
   The new PR addresses this.
   
   Changes:
   - Add ProcessCategory enum with OPTIMIZING, CLEANUP, PROFILING
   - Replace getTableMaintenanceTypes() with getTableProcessTypes(category)
   - Remove excludeTypes parameter from TableProcessMapper.listProcessMeta()
   - Merge /optimizing-types and /maintenance-types into /process-types endpoint
   - Split Maintenance.vue into Cleanup.vue and Profiling.vue
   - Update HudiTableDescriptor and PaimonTableDescriptor
   - Fix TestIcebergServerTableDescriptor to pass processCategory parameter
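
   As a rough illustration of the category split listed above, the sketch below
   shows one way an enum plus a per-category lookup could fit together. It is
   not the code from this PR: the `ProcessTypes` class, the string type names,
   and the exact signature of `getTableProcessTypes` are assumptions made for
   the example, with the grouping taken from the table in the quoted review.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real enum in the PR may carry more fields.
enum ProcessCategory {
  OPTIMIZING,
  CLEANUP,
  PROFILING
}

// Hypothetical holder class; the type names below are illustrative strings,
// grouped according to the classification proposed in the review.
final class ProcessTypes {
  private static final Map<ProcessCategory, List<String>> TYPES =
      new EnumMap<>(ProcessCategory.class);

  static {
    TYPES.put(ProcessCategory.OPTIMIZING, List.of("MINOR", "MAJOR", "FULL"));
    TYPES.put(
        ProcessCategory.CLEANUP,
        List.of(
            "EXPIRE_SNAPSHOTS",
            "EXPIRE_DATA",
            "CLEAN_ORPHAN_FILES",
            "CLEAN_DANGLING_DELETE_FILES"));
    TYPES.put(ProcessCategory.PROFILING, List.of("AUTO_CREATE_TAGS"));
  }

  private ProcessTypes() {}

  // Returns the process types registered for one category,
  // or an empty list when the category has none.
  static List<String> getTableProcessTypes(ProcessCategory category) {
    return TYPES.getOrDefault(category, List.of());
  }
}
```

   With a single lookup keyed by category, adding a future operation such as
   statistics collection would only require a new entry under `PROFILING`.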
   
   BREAKING CHANGE: Removed /optimizing-types and /maintenance-types REST
   endpoints. Use /process-types?processCategory=OPTIMIZING|CLEANUP|PROFILING.
   SPI method getTableMaintenanceTypes() replaced by getTableProcessTypes().
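
   For callers of the removed endpoints, the migration could look roughly like
   the sketch below. It assumes that what `/maintenance-types` used to return
   is now spread across the `CLEANUP` and `PROFILING` categories (suggested by
   the Maintenance.vue split, but not stated outright), and it leaves the URL
   prefix in front of these paths out entirely. `EndpointMigration` and
   `migrate` are hypothetical names.

```java
import java.util.List;

// Hypothetical helper mapping each removed endpoint path to the
// /process-types request(s) that would replace it.
final class EndpointMigration {
  private EndpointMigration() {}

  static List<String> migrate(String oldPath) {
    switch (oldPath) {
      case "/optimizing-types":
        return List.of("/process-types?processCategory=OPTIMIZING");
      case "/maintenance-types":
        // Assumption: former maintenance types now live in two categories,
        // so one old call becomes two new calls.
        return List.of(
            "/process-types?processCategory=CLEANUP",
            "/process-types?processCategory=PROFILING");
      default:
        throw new IllegalArgumentException("Not a removed endpoint: " + oldPath);
    }
  }
}
```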


