[jira] [Updated] (PHOENIX-7500) Add PARENT_PARTITION_ID to SYSTEM.CDC_STREAM table's composite pk

Viraj Jasani (Jira) Tue, 07 Jan 2025 15:23:04 -0800


     [ 
https://issues.apache.org/jira/browse/PHOENIX-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Viraj Jasani updated PHOENIX-7500:
----------------------------------
    Description: 
There are two ways to capture a CDC stream's parent/child relationship among
the data table's merged regions:
 # In the current schema of the SYSTEM.CDC_STREAM table, store comma-separated
parent regions in the PARENT_PARTITION_ID column.
 # Use one row for each merged parent region. Each row represents one
child-to-parent relationship.

Any CDC consumer can continue consuming partition records from parent to child
partitions. Since any number of regions can be merged simultaneously, it can be
expensive to write a query that uses an IN clause to check whether one of the
merged parent regions is the same as the region (partition) currently being
consumed by the client.

Using one parent partition id per row is the more efficient solution. To
achieve this, we need to add PARENT_PARTITION_ID to the SYSTEM.CDC_STREAM
table's composite pk. This is needed because the child partition id remains the
same across the rows for its different merged parent regions.

  was:
There are two ways to capture CDC Stream's parent/child relationship among data 
table's merged regions:
 # In the current schema of SYSTEM.CDC_STREAM table, provide comma separated 
parent regions in the PARENT_PARTITION_ID column.
 # Use one row for each merged parent region. Each row represents one child to 
parent relationship.

Any CDC Consumer can continue consuming partition records from parent to child 
partitions. Since any num of regions can be merged simultaneously, it can be 
expensive to write a query that uses IN clause to check whether one of the 
merged parent region is same as current region (partition) being consumed by 
the client.

Using one parent partition id for each row is efficient solution. In order to 
achieve this, we need to remove PARTITION_ID from SYSTEM.CDC_STREAM table's 
composite pk. This is needed because the child partition id remains same for 
different merged parent regions.


> Add PARENT_PARTITION_ID to SYSTEM.CDC_STREAM table's composite pk
> -----------------------------------------------------------------
>
>                 Key: PHOENIX-7500
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7500
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Viraj Jasani
>            Assignee: Palash Chauhan
>            Priority: Major
>
> There are two ways to capture a CDC stream's parent/child relationship among
> the data table's merged regions:
>  # In the current schema of the SYSTEM.CDC_STREAM table, store comma-separated
> parent regions in the PARENT_PARTITION_ID column.
>  # Use one row for each merged parent region. Each row represents one
> child-to-parent relationship.
>
> Any CDC consumer can continue consuming partition records from parent to
> child partitions. Since any number of regions can be merged simultaneously, it
> can be expensive to write a query that uses an IN clause to check whether one
> of the merged parent regions is the same as the region (partition) currently
> being consumed by the client.
>
> Using one parent partition id per row is the more efficient solution. To
> achieve this, we need to add PARENT_PARTITION_ID to the SYSTEM.CDC_STREAM
> table's composite pk. This is needed because the child partition id remains
> the same across the rows for its different merged parent regions.
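
The reason the key has to include the parent can be illustrated with a small
in-memory sketch. This is not Phoenix code and does not reflect the actual
SYSTEM.CDC_STREAM schema: only the PARTITION_ID and PARENT_PARTITION_ID column
names come from the issue. The StreamRow type, the helper functions, the
partition ids, and the omission of any other key columns are assumptions made
for illustration.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class StreamRow(NamedTuple):
    """Hypothetical, simplified row: other columns are omitted."""
    partition_id: str         # child partition (region) id
    parent_partition_id: str  # exactly one merged parent region per row


def key_without_parent(row: StreamRow) -> Tuple[str, ...]:
    # Key that identifies a row by the child partition only.
    return (row.partition_id,)


def key_with_parent(row: StreamRow) -> Tuple[str, ...]:
    # Key that also includes the parent, as the issue proposes.
    return (row.partition_id, row.parent_partition_id)


def upsert_all(rows: List[StreamRow],
               key_fn: Callable[[StreamRow], Tuple[str, ...]]
               ) -> Dict[Tuple[str, ...], StreamRow]:
    # Rows with an identical key overwrite each other, like an upsert.
    table: Dict[Tuple[str, ...], StreamRow] = {}
    for row in rows:
        table[key_fn(row)] = row
    return table


def children_of(table: Dict[Tuple[str, ...], StreamRow],
                parent_id: str) -> List[str]:
    # Plain equality on the parent column; no IN list over merged parents.
    return sorted({r.partition_id for r in table.values()
                   if r.parent_partition_id == parent_id})


# Assumed scenario: regions p1, p2 and p3 are merged into child region c1.
merged = [StreamRow("c1", "p1"), StreamRow("c1", "p2"), StreamRow("c1", "p3")]

collapsed = upsert_all(merged, key_without_parent)  # rows overwrite each other
distinct = upsert_all(merged, key_with_parent)      # one row per parent kept
```

In this sketch, a consumer that has finished the hypothetical partition p2
would call children_of(distinct, "p2") to find the child to continue with.
With the child-only key, the three rows share one key and only the last
written row remains.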



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
