xiearthur opened a new issue, #13447: URL: https://github.com/apache/hudi/issues/13447
### **Describe the problem you faced**

We are trying to use the CONSISTENT_HASHING bucket index with COW (Copy-on-Write) tables but are encountering runtime failures. The current implementation appears to support only MOR tables, which limits our architecture choices for workloads that prefer COW semantics.

### **To Reproduce**

Steps to reproduce the behavior:

1. Create a COW table with a bucket index configuration
2. Set `hoodie.index.bucket.engine=CONSISTENT_HASHING`
3. Attempt to perform insert/upsert operations
4. Observe a runtime failure with `HoodieUpsertException`

**Configuration used:**

```properties
hoodie.table.type=COPY_ON_WRITE
hoodie.index.type=BUCKET
hoodie.index.bucket.engine=CONSISTENT_HASHING
hoodie.bucket.index.num.buckets=4
```

### **Expected behavior**

COW tables should support the CONSISTENT_HASHING bucket index just as MOR tables do, allowing for:

- Dynamic bucket resizing based on data volume
- Better data distribution compared to the simple bucket index
- Consistent write performance across varying data sizes

### **Environment Description**

* **Hudi version**: 0.14.0+
* **Flink version**: 1.13
* **Storage**: S3/HDFS
* **Running on Docker**: No

### **Additional context**

**Business impact:**

- Prevents the optimal indexing strategy for COW-based workloads
- Forces a choice between table type preference and indexing capabilities
- The simple bucket index doesn't scale well with varying data volumes

**Questions for the community:**

1. Are there plans to support CONSISTENT_HASHING for COW tables?
2. What are the technical barriers preventing this support?
3. Would the community be open to contributions implementing this feature?
4. Are there alternative indexing strategies that provide similar benefits for COW tables?
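For reference, the reproduction steps above can be sketched as a Flink SQL DDL. This is a minimal sketch, not a verified setup: the table name, columns, and path are illustrative, and the option keys simply mirror the properties listed above and may need adapting to the exact Flink/Hudi versions in use.

```sql
-- Hypothetical reproduction DDL; table/column names and path are illustrative.
-- Option keys mirror the properties from the issue body.
CREATE TABLE hudi_cow_bucket (
  id   STRING PRIMARY KEY NOT ENFORCED,
  name STRING,
  ts   TIMESTAMP(3)
) WITH (
  'connector' = 'hudi',
  'path' = 's3://my-bucket/warehouse/hudi_cow_bucket',
  'table.type' = 'COPY_ON_WRITE',
  'index.type' = 'BUCKET',
  'hoodie.index.bucket.engine' = 'CONSISTENT_HASHING',
  'hoodie.bucket.index.num.buckets' = '4'
);

-- Step 3: any subsequent write then fails at runtime with HoodieUpsertException
INSERT INTO hudi_cow_bucket
VALUES ('1', 'a', TIMESTAMP '2024-01-01 00:00:00');
```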
