yangxiao created FLINK-35701:
--------------------------------
Summary: SqlServer: when the primary key type is uniqueidentifier, the
scan.incremental.snapshot.chunk.size parameter does not take effect during
chunk splitting
Key: FLINK-35701
URL: https://issues.apache.org/jira/browse/FLINK-35701
Project: Flink
Issue Type: Bug
Components: Flink CDC
Reporter: yangxiao
1. The source table in the SQL Server database contains 1,000,000 inventory
records. The default value of scan.incremental.snapshot.chunk.size is 8096.
2. The table is split into only one chunk, but it should be split into roughly
124 chunks (1,000,000 / 8096 ≈ 123.5).
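The expected chunk count follows from simple arithmetic (a sketch, assuming each chunk holds at most the configured number of rows):

```python
import math

row_count = 1_000_000   # rows inserted into dbo.testtable
chunk_size = 8096       # default scan.incremental.snapshot.chunk.size

# The last chunk may be partially filled, so round up.
expected_chunks = math.ceil(row_count / chunk_size)
print(expected_chunks)  # -> 124
```

The connector instead produces a single chunk, which forces the whole snapshot through one split.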
Problem reproduction:
1. Create a test table in SQL Server and import data.
BEGIN TRANSACTION
USE [testdb];
DROP TABLE IF EXISTS [dbo].[testtable];  -- IF EXISTS requires SQL Server 2016+
CREATE TABLE [dbo].[testtable] (
  [TestId] varchar(64),
  [CustomerId] varchar(64),
  [Id] uniqueidentifier NOT NULL,
  PRIMARY KEY CLUSTERED ([Id])
);
ALTER TABLE [dbo].[testtable] SET (LOCK_ESCALATION = TABLE);
COMMIT

-- Insert 1,000,000 rows; every column gets a random GUID from NEWID().
DECLARE @Id int;
SET @Id = 1;
WHILE @Id <= 1000000
BEGIN
  INSERT INTO testtable VALUES (NEWID(), NEWID(), NEWID());
  SET @Id = @Id + 1;
END;
2. Use the Flink CDC SQL Server connector to capture the data.
CREATE TABLE testtable (
  TestId STRING,
  CustomerId STRING,
  Id STRING,
  PRIMARY KEY (Id) NOT ENFORCED
) WITH (
  'connector' = 'sqlserver-cdc',
  'hostname' = '',
  'port' = '1433',
  'username' = '',
  'password' = '',
  'database-name' = 'testdb',
  'table-name' = 'dbo.testtable'
);
3. Log output
2024-06-26 10:04:43,377 | INFO | [SourceCoordinator-Source: testtable[1]] |
Use unevenly-sized chunks for table cdm.dbo.CustomerVehicle, the chunk size is
8096 |
com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.splitUnevenlySizedChunks(SqlServerChunkSplitter.java:268)
2024-06-26 10:04:43,385 | INFO | [SourceCoordinator-Source: testtable[1]] |
Split table cdm.dbo.CustomerVehicle into 1 chunks, time cost: 144ms. |
com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.generateSplits(SqlServerChunkSplitter.java:117)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)