yangxiao created FLINK-35701:
--------------------------------
Summary: SqlServer: when the primary key type is uniqueidentifier, the
scan.incremental.snapshot.chunk.size parameter does not take effect during
chunk splitting
Key: FLINK-35701
URL: https://issues.apache.org/jira/browse/FLINK-35701
Project: Flink
Issue Type: Bug
Components: Flink CDC
Reporter: yangxiao
1. The source table in the SQL Server database contains 1,000,000 inventory
records. The default value of scan.incremental.snapshot.chunk.size is 8096.
2. The table is split into only one chunk, but it should be split into roughly
124 chunks (1,000,000 / 8096 ≈ 123.5).
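The expected chunk count follows from simple arithmetic (a sketch, assuming each chunk holds at most the configured number of rows):

```python
import math

row_count = 1_000_000   # rows inserted into dbo.testtable
chunk_size = 8096       # default scan.incremental.snapshot.chunk.size

# The last chunk may be partially filled, so round up.
expected_chunks = math.ceil(row_count / chunk_size)
print(expected_chunks)  # -> 124
```

The connector instead produces a single chunk, which forces the whole snapshot through one split.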
Problem reproduction:
1. Create a test table in SQL Server and import data.
BEGIN TRANSACTION
USE [testdb];
DROP TABLE IF EXISTS [dbo].[testtable];  -- IF EXISTS requires SQL Server 2016+
CREATE TABLE [dbo].[testtable] (
  [TestId] varchar(64),
  [CustomerId] varchar(64),
  [Id] uniqueidentifier NOT NULL,
  PRIMARY KEY CLUSTERED ([Id])
);
ALTER TABLE [dbo].[testtable] SET (LOCK_ESCALATION = TABLE);
COMMIT

-- Insert 1,000,000 rows; every column gets a random GUID from NEWID().
DECLARE @Id int;
SET @Id = 1;
WHILE @Id <= 1000000
BEGIN
  INSERT INTO testtable VALUES (NEWID(), NEWID(), NEWID());
  SET @Id = @Id + 1;
END;
2. Use the Flink CDC SQL Server connector to capture the data.
CREATE TABLE testtable (
  TestId STRING,
  CustomerId STRING,
  Id STRING,
  PRIMARY KEY (Id) NOT ENFORCED
) WITH (
  'connector' = 'sqlserver-cdc',
  'hostname' = '',
  'port' = '1433',
  'username' = '',
  'password' = '',
  'database-name' = 'testdb',
  'table-name' = 'dbo.testtable'
);
3. Log output
2024-06-26 10:04:43,377 | INFO | [SourceCoordinator-Source: testtable[1]] |
Use unevenly-sized chunks for table cdm.dbo.CustomerVehicle, the chunk size is
8096 |
com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.splitUnevenlySizedChunks(SqlServerChunkSplitter.java:268)
2024-06-26 10:04:43,385 | INFO | [SourceCoordinator-Source: testtable[1]] |
Split table cdm.dbo.CustomerVehicle into 1 chunks, time cost: 144ms. |
com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.generateSplits(SqlServerChunkSplitter.java:117)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)