Hi all:
I'd like to propose issue 1291 (https://github.com/apache/incubator-doris/issues/1291). This proposal would refactor the logic for creating and managing colocated tables. The original design can be found here: [ISSUE 245](https://github.com/apache/incubator-doris/issues/245)

The original design has some problems that I would like to fix:

1. All colocated tables attach to one table, called the PARENT TABLE, and all other tables attached to it are called CHILD TABLEs. As a result, when the PARENT TABLE is changed (dropped, renamed, its colocate properties modified, etc.), every CHILD TABLE must be modified along with it. This tightly coupled design is inflexible and makes the code hard to maintain.
2. Currently, the repair and balance of tablets of colocated tables are not managed by the newly implemented tablet scheduler. These colocated tablets still use the old CloneJob for balance and repair. I want to unify this.

So I want to make the following changes:

1. Introduce a new concept, COLOCATION GROUP. Every colocated table will attach to exactly one COLOCATION GROUP. The GROUP will describe all colocation properties, such as distribution column types, bucket number, replication number, and data distribution info. Users can modify a table's colocation group flexibly without touching the other tables in the group.
2. Use an independent balance manager to handle repair and balance of colocated tables' tablets; the actual repair tasks are created by the new Tablet Scheduler.

Also, I will add documentation for the Colocation Join feature.

--
Best Regards
陈明雨 Mingyu Chen
Email: [email protected]
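
P.S. To make the COLOCATION GROUP idea more concrete, here is a rough sketch of the intended relationship: each table attaches to exactly one group, the group owns the colocation properties, and a table can be attached or detached without touching its siblings. This is only an illustration in Python. The class, field, and method names (`ColocationGroup`, `ColocationIndex`, `attach`, `detach`) are hypothetical and are not the actual Doris classes or the final design.

```python
from dataclasses import dataclass, field


@dataclass
class ColocationGroup:
    """Hypothetical model of a COLOCATION GROUP (not the real Doris class)."""
    name: str
    distribution_col_types: tuple  # e.g. ("INT", "VARCHAR")
    bucket_num: int
    replication_num: int
    tables: set = field(default_factory=set)

    def accepts(self, col_types, bucket_num, replication_num):
        # A table fits only if its colocation properties match the group's.
        return (tuple(col_types) == self.distribution_col_types
                and bucket_num == self.bucket_num
                and replication_num == self.replication_num)


class ColocationIndex:
    """Keeps every table attached to at most one group (illustrative only)."""

    def __init__(self):
        self.groups = {}          # group name -> ColocationGroup
        self.table_to_group = {}  # table name -> group name

    def add_group(self, group):
        self.groups[group.name] = group

    def attach(self, table, group_name, col_types, bucket_num, replication_num):
        group = self.groups[group_name]
        # Validate first, so a rejected table keeps its current group.
        if not group.accepts(col_types, bucket_num, replication_num):
            raise ValueError(f"{table} does not match group {group_name}")
        self.detach(table)  # leave the previous group, if any
        group.tables.add(table)
        self.table_to_group[table] = group_name

    def detach(self, table):
        # Only this table's membership changes; sibling tables are untouched.
        old = self.table_to_group.pop(table, None)
        if old is not None:
            self.groups[old].tables.discard(table)
```

In this sketch, dropping or re-grouping one table only updates that table's entry, which is the decoupling the PARENT/CHILD design lacks.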
