[ 
https://issues.apache.org/jira/browse/FLINK-27002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-27002:
---------------------------------
    Description: 
By default, the batch sink should sort its input by partition and sequence 
field to avoid generating a large number of small files. Too many small files 
cause poor performance, especially on object storage.

We cannot implement `SupportsPartitioning.requiresPartitionGrouping`. We need 
sequence.field to sort; otherwise we cannot determine which record is the last 
one.

  was: We can implement `SupportsPartitioning.requiresPartitionGrouping`. Write 
to table_store after the planner has ordered the input by partition, to avoid 
OOM caused by writing to too many partitions at the same time.


> Optimize batch multiple partitions inserting
> --------------------------------------------
>
>                 Key: FLINK-27002
>                 URL: https://issues.apache.org/jira/browse/FLINK-27002
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table Store
>            Reporter: Jingsong Lee
>            Priority: Minor
>             Fix For: table-store-0.3.0
>
>
> By default, the batch sink should sort its input by partition and sequence 
> field to avoid generating a large number of small files. Too many small files 
> cause poor performance, especially on object storage.
> We cannot implement `SupportsPartitioning.requiresPartitionGrouping`. We need 
> sequence.field to sort; otherwise we cannot determine which record is the 
> last one.
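
A minimal sketch of the idea described above, not the table store 
implementation: sorting a batch by partition and then by sequence number keeps 
each partition contiguous and puts the last record for each key at the end of 
its run. The `Row` class, its fields, and all method names are hypothetical. 
The sketch also assumes a simplified writer that keeps one partition open at a 
time, so every partition switch stands in for a new file being opened.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortedBatchWriteSketch {

    /** Hypothetical record: partition value, primary key, sequence number, payload. */
    public static final class Row {
        public final String partition;
        public final String key;
        public final long sequence;
        public final String value;

        public Row(String partition, String key, long sequence, String value) {
            this.partition = partition;
            this.key = key;
            this.sequence = sequence;
            this.value = value;
        }
    }

    /** Returns a copy sorted by partition first, then by sequence number. */
    public static List<Row> sortForWrite(List<Row> input) {
        List<Row> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing((Row r) -> r.partition)
                .thenComparingLong(r -> r.sequence));
        return sorted;
    }

    /** Counts how often consecutive rows change partition (a proxy for files opened). */
    public static int countPartitionSwitches(List<Row> rows) {
        int switches = 0;
        String current = null;
        for (Row r : rows) {
            if (!r.partition.equals(current)) {
                switches++;
                current = r.partition;
            }
        }
        return switches;
    }

    /**
     * Keeps the last row seen per (partition, key). On input sorted by
     * sequence within a partition, that is the row with the highest sequence.
     */
    public static Map<String, Row> lastRecordPerKey(List<Row> sorted) {
        Map<String, Row> latest = new LinkedHashMap<>();
        for (Row r : sorted) {
            latest.put(r.partition + "/" + r.key, r);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Row> input = new ArrayList<>();
        input.add(new Row("p2", "k1", 1L, "a"));
        input.add(new Row("p1", "k1", 1L, "b"));
        input.add(new Row("p2", "k1", 2L, "c"));
        input.add(new Row("p1", "k2", 1L, "d"));

        List<Row> sorted = sortForWrite(input);
        System.out.println("switches unsorted: " + countPartitionSwitches(input));
        System.out.println("switches sorted:   " + countPartitionSwitches(sorted));
        System.out.println("last p2/k1 value:  " + lastRecordPerKey(sorted).get("p2/k1").value);
    }
}
```

Sorting by partition alone would group the rows but leave the order of 
records within a key undefined, which is why the sequence number is part of 
the sort key in this sketch.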



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
