[ 
https://issues.apache.org/jira/browse/FLINK-35701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18033029#comment-18033029
 ] 

Edward Zhang commented on FLINK-35701:
--------------------------------------

Hi, could you provide more information about your environment, including the 
versions of SQL Server, Flink CDC, and Flink?
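
For reference, the expected chunk count quoted in the report below can be checked with a quick calculation. This is only a minimal sketch: it assumes chunks are sized purely by row count, which may not match how the connector actually splits a non-numeric key such as uniqueidentifier, and `expected_chunks` is a hypothetical helper, not part of Flink CDC.

```python
import math


def expected_chunks(row_count: int, chunk_size: int) -> int:
    """Rough number of snapshot chunks, assuming every chunk holds
    chunk_size rows except possibly the last one (an assumption)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return math.ceil(row_count / chunk_size)


if __name__ == "__main__":
    # Values taken from the report: 1,000,000 rows, default chunk size 8096.
    print(expected_chunks(1_000_000, 8096))
```

With the reported numbers this matches the 124 chunks the reporter expected, compared with the single chunk shown in the log.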

> SqlServer the primary key type is uniqueidentifier, the 
> scan.incremental.snapshot.chunk.size parameter does not take effect during 
> split chunk
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-35701
>                 URL: https://issues.apache.org/jira/browse/FLINK-35701
>             Project: Flink
>          Issue Type: Bug
>          Components: Flink CDC
>            Reporter: yangxiao
>            Priority: Major
>
> 1. The source table in the SQL Server database contains 1,000,000 existing 
> records. The default value of scan.incremental.snapshot.chunk.size is 8096.
> 2. The table is split into only one chunk; about 124 chunks are expected.
>  
> Problem reproduction:
> 1. Create a test table in SQL Server and insert data.
>  
> BEGIN TRANSACTION
> USE [testdb];
> DROP TABLE [dbo].[testtable];
> CREATE TABLE [dbo].[testtable] (
>   [TestId] varchar(64),
>   [CustomerId] varchar(64),
>   [Id] uniqueidentifier NOT NULL,
> PRIMARY KEY CLUSTERED ([Id])
> );
> ALTER TABLE [dbo].[testtable] SET (LOCK_ESCALATION = TABLE);
> COMMIT
>  
>  
> declare @Id int;
> set @Id=1;
> while @Id<=1000000
> begin
> insert into testtable values(NEWID(), NEWID(), NEWID());
>     set @Id=@Id+1;
> end;
>  
> 2. Use the Flink CDC SQL Server connector to read the data.
> CREATE TABLE testtable (
>   TestId STRING,
>   CustomerId STRING,
>   Id STRING,
>   PRIMARY KEY (Id) NOT ENFORCED
> ) WITH (
>   'connector' = 'sqlserver-cdc',
>   'hostname' = '',
>   'port' = '1433',
>   'username' = '',
>   'password' = '',
>   'database-name' = 'testdb',
>   'table-name' = 'dbo.testtable'
> );
>  
> 3. Log:
> 2024-06-26 10:04:43,377 | INFO  | [SourceCoordinator-Source: testtable[1]] | 
> Use unevenly-sized chunks for table cdm.dbo.CustomerVehicle, the chunk size 
> is 8096 | 
> com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.splitUnevenlySizedChunks(SqlServerChunkSplitter.java:268)
> 2024-06-26 10:04:43,385 | INFO  | [SourceCoordinator-Source: testtable[1]] | 
> Split table cdm.dbo.CustomerVehicle into 1 chunks, time cost: 144ms. | 
> com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.generateSplits(SqlServerChunkSplitter.java:117)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
