Hi all:

I'd like to propose [ISSUE 1291](https://github.com/apache/incubator-doris/issues/1291).


This proposal would refactor the logic of creating and managing colocated
tables.
The original design can be found here:
[ISSUE 245](https://github.com/apache/incubator-doris/issues/245)


The original design has some problems which I would like to fix:


1. All colocated tables attach to one table, called the PARENT TABLE, and all
other tables attached to it are called CHILD TABLEs.
As a result, whenever the PARENT TABLE is changed (dropped, renamed, its
colocate properties modified, etc.), every CHILD TABLE must be modified along
with it. This tightly coupled design is inflexible and makes the code hard to
maintain.


2. Currently, the repair and balance of tablets of colocated tables are not
managed by the newly implemented tablet scheduler.
These colocated tablets still use the old CloneJob for balance and repair. I
want to unify them under the tablet scheduler.


So I want to make the following changes:


1. Introduce a new concept, COLOCATION GROUP. Every colocated table will
attach to exactly one COLOCATION GROUP.
The GROUP will describe all colocation properties, such as distribution column
types, bucket number, replication number and data distribution info.
Users can change a table's colocation group flexibly without touching the other
tables in that group.


2. Use an independent balance manager to handle tablet repair and balance for
colocated tables. The actual repair tasks are created by the new
Tablet Scheduler.
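
To make the COLOCATION GROUP idea in change 1 more concrete, here is a minimal
sketch in Python. It is only an illustration, not the actual Doris code. All
names (`ColocationGroup`, `ColocationIndex`, `attach`, `detach`, the field
names) are hypothetical, and the rule that a table must match the group's
properties before attaching is my assumption about how the group would be used.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ColocationGroup:
    """Hypothetical holder of the properties shared by one colocation group."""
    name: str
    distribution_column_types: Tuple[str, ...]
    buckets_num: int
    replication_num: int
    # Data distribution info: bucket index -> ids of backends holding replicas.
    bucket_to_backends: List[List[int]] = field(default_factory=list)

    def accepts(self, column_types, buckets_num, replication_num) -> bool:
        # Assumption: a table may join only if its properties match the group.
        return (tuple(column_types) == self.distribution_column_types
                and buckets_num == self.buckets_num
                and replication_num == self.replication_num)


class ColocationIndex:
    """Hypothetical registry mapping each table to exactly one group."""

    def __init__(self) -> None:
        self.groups: Dict[str, ColocationGroup] = {}
        self.table_to_group: Dict[str, str] = {}

    def add_group(self, group: ColocationGroup) -> None:
        self.groups[group.name] = group

    def attach(self, table, group_name, column_types,
               buckets_num, replication_num) -> None:
        group = self.groups[group_name]
        if not group.accepts(column_types, buckets_num, replication_num):
            raise ValueError(f"{table} does not match group {group_name}")
        # A dict entry per table means a table is in at most one group.
        self.table_to_group[table] = group_name

    def detach(self, table) -> None:
        # Only this table's entry changes; other tables are not touched.
        self.table_to_group.pop(table, None)

    def tables_in(self, group_name) -> List[str]:
        return sorted(t for t, g in self.table_to_group.items()
                      if g == group_name)
```

The point of the sketch is that no table plays the PARENT role: dropping or
moving one table only changes its own entry, and the group keeps the shared
properties.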


I will also add documentation for the Colocation Join feature.







--
Best Regards
陈明雨 Mingyu Chen

Email:
[email protected]
