Hi Sanskar,

Thank you for proposing this optimization. While simplifying the schema is compelling, I have reservations about merging the celeborn_cluster_system_config and celeborn_cluster_tenant_config tables. Below are my key concerns and alternative suggestions:

1. Data Isolation and Accidental Override Risks
   Blurred Boundaries: System-level configurations (e.g., memory manager settings, global settings) and tenant-specific configurations (e.g., tenant settings) serve fundamentally different purposes. Merging them risks accidental overrides (e.g., a tenant config unintentionally modifying a global system parameter).

2. Maintainability and Scalability Challenges
   Implicit Coupling: Code would rely heavily on filters like WHERE level='SYSTEM' or tenant_id='', increasing the risk of bugs (e.g., a missing filter leading to unintended data exposure or performance issues).
   Schema Rigidity: Any future system-specific field (e.g., cluster_version) would force tenant records to carry redundant columns, whereas separate tables allow independent schema changes.

3. Performance Trade-offs
   Index Efficiency: System configs are typically small but queried frequently, while tenant configs are larger and require tenant-specific indexing. A merged table may degrade query performance for both use cases.

4. Alternative Approaches
   Unified View: Create a database view (e.g., system_and_tenant_config_unified) that combines data from both tables for querying, avoiding physical consolidation.

Before proceeding, there is one more point that needs to be addressed so we can evaluate the proposal more thoroughly:

1. This change will introduce incompatible behavior between different server versions. We'll need an approach for migrating data to the new table.

I’m open to further discussion and to exploring solutions that balance simplicity with long-term maintainability.

Best regards,
Ethan Feng

Sanskar Modi <sanskarmod...@gmail.com> wrote on Tue, Mar 4, 2025 at 02:52:
>
> Hi Celeborn Community,
>
> I wanted to get an opinion about merging the SystemConfigs table
> (`celeborn_cluster_system_config`) and the TenantConfig table
> (`celeborn_cluster_tenant_config`) for the DB config service, as they
> essentially represent similar data with some additional metadata like
> level, tenant_id and name.
>
> We can use the same schema as `celeborn_cluster_tenant_config` for the
> new table. For SystemConfigs, `level` can be defined as `SYSTEM`, and
> tenant_id and user can be '' (empty string).
>
> Pros:
> - A config's precedence can be seen with a single query. Currently users
>   have to check both tables for a config override.
> - Fewer DB queries when refreshing the cache.
> - All configs will live in the same table, so only a single table needs
>   to be maintained and the code around it shrinks.
>
> Thanks
> Sanskar Modi
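
A minimal sketch of the unified-view alternative from point 4 of the reply, using Python's built-in sqlite3 module purely for illustration. The column lists are simplified assumptions and not the actual Celeborn schema, and the config key `example.key` and tenant `tenant_a` are hypothetical. Only the two table names and the view name come from the thread. The idea is that a single query against the view shows every override for a key, while the two physical tables stay separate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, assumed column sets; the real tables carry more columns.
conn.executescript("""
CREATE TABLE celeborn_cluster_system_config (
    cluster_id   INTEGER NOT NULL,
    config_key   TEXT NOT NULL,
    config_value TEXT NOT NULL
);
CREATE TABLE celeborn_cluster_tenant_config (
    cluster_id   INTEGER NOT NULL,
    tenant_id    TEXT NOT NULL,
    level        TEXT NOT NULL,
    name         TEXT NOT NULL,
    config_key   TEXT NOT NULL,
    config_value TEXT NOT NULL
);
-- Read-only view: system rows are tagged with level 'SYSTEM' and empty
-- tenant_id/name, mirroring the convention proposed in the thread.
CREATE VIEW system_and_tenant_config_unified AS
    SELECT cluster_id, 'SYSTEM' AS level, '' AS tenant_id, '' AS name,
           config_key, config_value
    FROM celeborn_cluster_system_config
    UNION ALL
    SELECT cluster_id, level, tenant_id, name, config_key, config_value
    FROM celeborn_cluster_tenant_config;
""")

# Hypothetical sample data: one system default and one tenant override.
conn.execute(
    "INSERT INTO celeborn_cluster_system_config VALUES (?, ?, ?)",
    (1, "example.key", "100"),
)
conn.execute(
    "INSERT INTO celeborn_cluster_tenant_config VALUES (?, ?, ?, ?, ?, ?)",
    (1, "tenant_a", "TENANT", "", "example.key", "200"),
)


def config_overrides(conn, cluster_id, config_key):
    """Return (level, tenant_id, value) rows for a key, system row first."""
    return conn.execute(
        """
        SELECT level, tenant_id, config_value
        FROM system_and_tenant_config_unified
        WHERE cluster_id = ? AND config_key = ?
        ORDER BY CASE level WHEN 'SYSTEM' THEN 0 ELSE 1 END, tenant_id
        """,
        (cluster_id, config_key),
    ).fetchall()


overrides = config_overrides(conn, 1, "example.key")
for level, tenant_id, value in overrides:
    print(level, repr(tenant_id), value)
```

Because writes still go to the individual tables, a tenant-level write path in this sketch has no way to touch a system row, which is the isolation property raised in point 1.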