[ https://issues.apache.org/jira/browse/FLINK-35701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18033029#comment-18033029 ]
Edward Zhang commented on FLINK-35701:
--------------------------------------
Hi, could you provide more information about your environment, including the
versions of SQL Server, Flink CDC, and Flink?
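
For the SQL Server side, a query along these lines should be enough to report
the version (only a sketch; the Flink and Flink CDC versions would have to be
taken from the deployment itself):

```sql
-- Full version banner of the connected SQL Server instance.
SELECT @@VERSION;

-- The same information as separate fields.
SELECT SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('Edition')        AS edition;
```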
> SqlServer the primary key type is uniqueidentifier, the
> scan.incremental.snapshot.chunk.size parameter does not take effect during
> split chunk
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-35701
> URL: https://issues.apache.org/jira/browse/FLINK-35701
> Project: Flink
> Issue Type: Bug
> Components: Flink CDC
> Reporter: yangxiao
> Priority: Major
>
> 1. The source table in the SQL Server database contains 1000000 rows of
> existing data. The default value of scan.incremental.snapshot.chunk.size is
> 8096.
> 2. The table is split into only one chunk, whereas 124 chunks are expected.
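>
> The expected figure can be checked with a one-line query (a sketch, assuming
> the chunk count is simply the row count divided by the chunk size, rounded
> up):
>
> ```sql
> -- 1000000 rows / 8096 rows per chunk = 123.5..., rounded up to 124
> SELECT CEILING(1000000 / 8096.0) AS expected_chunks;
> ```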
>
> Problem reproduction:
> 1. Create a test table in SQL Server and insert the test data.
>
> BEGIN TRANSACTION
> USE [testdb];
> DROP TABLE [dbo].[testtable];
> CREATE TABLE [dbo].[testtable] (
> [TestId] varchar(64),
> [CustomerId] varchar(64),
> [Id] uniqueidentifier NOT NULL,
> PRIMARY KEY CLUSTERED ([Id])
> );
> ALTER TABLE [dbo].[testtable] SET (LOCK_ESCALATION = TABLE);
> COMMIT
>
>
> declare @Id int;
> set @Id=1;
> while @Id<=1000000
> begin
> insert into testtable values(NEWID(), NEWID(), NEWID());
> set @Id=@Id+1;
> end;
>
> 2. Use the Flink CDC SQL Server connector to read the data.
> CREATE TABLE testtable (
> TestId STRING,
> CustomerId STRING,
> Id STRING,
> PRIMARY KEY (Id) NOT ENFORCED
> ) WITH (
> 'connector' = 'sqlserver-cdc',
> 'hostname' = '',
> 'port' = '1433',
> 'username' = '',
> 'password' = '',
> 'database-name' = 'testdb',
> 'table-name' = 'dbo.testtable'
> );
>
> 3. Log output:
> 2024-06-26 10:04:43,377 | INFO | [SourceCoordinator-Source: testtable[1]] |
> Use unevenly-sized chunks for table cdm.dbo.CustomerVehicle, the chunk size
> is 8096 |
> com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.splitUnevenlySizedChunks(SqlServerChunkSplitter.java:268)
> 2024-06-26 10:04:43,385 | INFO | [SourceCoordinator-Source: testtable[1]] |
> Split table cdm.dbo.CustomerVehicle into 1 chunks, time cost: 144ms. |
> com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.generateSplits(SqlServerChunkSplitter.java:117)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)