[GitHub] [seatunnel] cs3163077 opened a new issue, #5003: SqlServer-CDC 没有数据

via GitHub Fri, 30 Jun 2023 00:32:32 -0700


cs3163077 opened a new issue, #5003:
URL: https://github.com/apache/seatunnel/issues/5003


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22)
 and found no similar issues.
   
   
   ### What happened
   
   SqlServer-CDC 没有数据,也没报错
   
   ### SeaTunnel Version
   
   2.3.2
   
   ### SeaTunnel Config
   
   ```conf
   env {
     job.mode = "STREAMING"
   }
   source {
     SqlServer-CDC {
       result_table_name = "area"    
        username = "**"
        password = "****"
        database-names = ["BI"]
        table-names = ["BI.dbo.area"]
       base-url="jdbc:sqlserver://******:**;databaseName=BI"
     }
   }
   transform {
   }
   sink {
        Console {
       }
   }
   ```
   
   
   ### Running Command
   
   ```shell
   ./start-seatunnel-spark-2-connector-v2.sh --config 
/home/apache-seatunnel-incubating-2.3.1/config/cdc.conf
   ```
   
   
   ### Error Exception
   
   ```log
   23/06/30 15:24:25 INFO spark.SparkContext: Added JAR 
file:/home/apache-seatunnel-incubating-2.3.1/starter/seatunnel-spark-2-starter.jar
 at spark://hadoop3:37437/jars/seatunnel-spark-2-starter.jar with timestamp 
1688109865948
   23/06/30 15:24:26 INFO executor.Executor: Starting executor ID driver on 
host localhost
   23/06/30 15:24:26 INFO util.Utils: Successfully started service 
'org.apache.spark.network.netty.NettyBlockTransferService' on port 41040.
   23/06/30 15:24:26 INFO netty.NettyBlockTransferService: Server created on 
hadoop3:41040
   23/06/30 15:24:26 INFO storage.BlockManager: Using 
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication 
policy
   23/06/30 15:24:26 INFO storage.BlockManagerMaster: Registering BlockManager 
BlockManagerId(driver, hadoop3, 41040, None)
   23/06/30 15:24:26 INFO storage.BlockManagerMasterEndpoint: Registering block 
manager hadoop3:41040 with 366.3 MB RAM, BlockManagerId(driver, hadoop3, 41040, 
None)
   23/06/30 15:24:26 INFO storage.BlockManagerMaster: Registered BlockManager 
BlockManagerId(driver, hadoop3, 41040, None)
   23/06/30 15:24:26 INFO storage.BlockManager: external shuffle service port = 
7337
   23/06/30 15:24:26 INFO storage.BlockManager: Initialized BlockManager: 
BlockManagerId(driver, hadoop3, 41040, None)
   23/06/30 15:24:26 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@41b1f51e{/metrics/json,null,AVAILABLE,@Spark}
   23/06/30 15:24:27 INFO scheduler.EventLoggingListener: Logging events to 
hdfs://hadoop1:8020/user/spark/applicationHistory/local-1688109866035
   23/06/30 15:24:27 WARN lineage.LineageWriter: Lineage directory 
/var/log/spark/lineage doesn't exist or is not writable. Lineage for this 
application will be disabled.
   23/06/30 15:24:27 INFO util.Utils: Extension 
com.cloudera.spark.lineage.NavigatorAppListener not being initialized.
   23/06/30 15:24:27 INFO logging.DriverLogger$DfsAsyncWriter: Started driver 
log file sync to: /user/spark/driverLogs/local-1688109866035_driver.log
   23/06/30 15:24:28 INFO discovery.AbstractPluginDiscovery: Load 
SeaTunnelSource Plugin from 
/home/apache-seatunnel-incubating-2.3.1/connectors/seatunnel
   23/06/30 15:24:28 INFO discovery.AbstractPluginDiscovery: Discovery plugin 
jar: SqlServer-CDC at: 
file:/home/apache-seatunnel-incubating-2.3.1/connectors/seatunnel/connector-cdc-sqlserver-2.3.2.jar
   23/06/30 15:24:28 INFO discovery.AbstractPluginDiscovery: Load plugin: 
PluginIdentifier{engineType='seatunnel', pluginType='source', 
pluginName='SqlServer-CDC'} from classpath
   23/06/30 15:24:28 WARN sqlserver.SqlServerConnection: The 'server.timezone' 
option should be specified to avoid incorrect timestamp values in case of 
different timezones between the database server and this connector's JVM.
   23/06/30 15:24:28 INFO utils.TableDiscoveryUtils: Read list of available 
databases
   23/06/30 15:24:29 INFO utils.TableDiscoveryUtils:     list of available 
databases is: [master, tempdb, model, msdb, BI, BIDEMO, SYSD_BI_DEMO, 
SYSD_BI_DEV_JAVA, *****, SY_BI_COST, SY_BI_1141, TEST1, SYSD_BI_TEST_20211026, 
SYSD_BI_DEV_JAVA1, test123, testdatabig, test12134, SYBI_Test_TW, zhuhm_BIDW, 
bigdatatest, zhuhm_BIDW_E, zhuhm_BIDW1, *****, test_yx, S*****Test, 
SY_BI_CNCEC, BIDW5, *****, S*****, SY_BI_LSC]
   23/06/30 15:24:29 INFO utils.TableDiscoveryUtils: Read list of available 
tables in each database
   23/06/30 15:24:29 INFO utils.TableDiscoveryUtils:     including 
'BI.dbo.area' for further processing
   23/06/30 15:24:29 WARN utils.TableDiscoveryUtils:     skipping database 
'*****' due to error reading tables: 由于数据库 '*****' 离线，无法打开该数据库。
   23/06/30 15:24:29 WARN utils.TableDiscoveryUtils:     skipping database 
'*****' due to error reading tables: 由于数据库 '*****' 离线，无法打开该数据库。
   23/06/30 15:24:29 WARN utils.TableDiscoveryUtils:     skipping database 
'*****' due to error reading tables: 由于数据库 '*****' 离线，无法打开该数据库。
   23/06/30 15:24:29 WARN utils.TableDiscoveryUtils:     skipping database 
'S*****' due to error reading tables: 由于数据库 'S*****' 离线，无法打开该数据库。
   23/06/30 15:24:29 INFO jdbc.JdbcConnection: Connection gracefully closed
   23/06/30 15:24:29 WARN sqlserver.SqlServerConnection: The 'server.timezone' 
option should be specified to avoid incorrect timestamp values in case of 
different timezones between the database server and this connector's JVM.
   23/06/30 15:24:29 WARN sqlserver.SqlServerDefaultValueConverter: Cannot 
parse column default value '(getdate())' to type 'datetime'. Expression 
evaluation is not supported.
   23/06/30 15:24:29 WARN sqlserver.SqlServerDefaultValueConverter: Cannot 
parse column default value '(getdate())' to type 'datetime'. Expression 
evaluation is not supported.
   23/06/30 15:24:29 WARN sqlserver.SqlServerDefaultValueConverter: Cannot 
parse column default value '(getdate())' to type 'datetime'. Expression 
evaluation is not supported.
   23/06/30 15:24:30 INFO jdbc.JdbcConnection: Connection gracefully closed
   23/06/30 15:24:30 INFO execution.SparkRuntimeEnvironment: register plugins 
:[file:/home/apache-seatunnel-incubating-2.3.1/connectors/seatunnel/connector-cdc-sqlserver-2.3.2.jar]
   23/06/30 15:24:30 INFO discovery.AbstractPluginDiscovery: Load 
SeaTunnelTransform Plugin from /home/apache-seatunnel-incubating-2.3.1/lib
   23/06/30 15:24:30 INFO execution.SparkRuntimeEnvironment: register plugins 
:[]
   23/06/30 15:24:30 INFO discovery.AbstractPluginDiscovery: Load SeaTunnelSink 
Plugin from /home/apache-seatunnel-incubating-2.3.1/connectors/seatunnel
   23/06/30 15:24:30 INFO discovery.AbstractPluginDiscovery: Discovery plugin 
jar: Console at: 
file:/home/apache-seatunnel-incubating-2.3.1/connectors/seatunnel/connector-console-2.3.1.jar
   23/06/30 15:24:30 INFO discovery.AbstractPluginDiscovery: Load plugin: 
PluginIdentifier{engineType='seatunnel', pluginType='sink', 
pluginName='Console'} from classpath
   23/06/30 15:24:30 INFO execution.SparkRuntimeEnvironment: register plugins 
:[file:/home/apache-seatunnel-incubating-2.3.1/connectors/seatunnel/connector-console-2.3.1.jar]
   23/06/30 15:24:30 INFO internal.SharedState: loading hive config file: 
file:/etc/hive/conf.cloudera.hive/hive-site.xml
   23/06/30 15:24:30 INFO internal.SharedState: spark.sql.warehouse.dir is not 
set, but hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir 
to the value of hive.metastore.warehouse.dir ('/user/hive/warehouse').
   23/06/30 15:24:30 INFO internal.SharedState: Warehouse path is 
'/user/hive/warehouse'.
   23/06/30 15:24:30 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@4966bab1{/SQL,null,AVAILABLE,@Spark}
   23/06/30 15:24:30 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@7f0f84d4{/SQL/json,null,AVAILABLE,@Spark}
   23/06/30 15:24:30 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@590d6c76{/SQL/execution,null,AVAILABLE,@Spark}
   23/06/30 15:24:30 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@25791d40{/SQL/execution/json,null,AVAILABLE,@Spark}
   23/06/30 15:24:30 INFO handler.ContextHandler: Started 
o.s.j.s.ServletContextHandler@5a05dd30{/static/sql,null,AVAILABLE,@Spark}
   23/06/30 15:24:31 INFO state.StateStoreCoordinatorRef: Registered 
StateStoreCoordinator endpoint
   23/06/30 15:24:31 WARN lineage.LineageWriter: Lineage directory 
/var/log/spark/lineage doesn't exist or is not writable. Lineage for this 
application will be disabled.
   23/06/30 15:24:31 INFO util.Utils: Extension 
com.cloudera.spark.lineage.NavigatorQueryListener not being initialized.
   23/06/30 15:24:34 INFO v2.DataSourceV2Strategy: 
   Pushing operators to class 
org.apache.seatunnel.translation.spark.source.SeaTunnelSourceSupport
   Pushed Filters: 
   Post-Scan Filters: 
   Output: Areaid#0, Areaname#1
            
   23/06/30 15:24:34 INFO codegen.CodeGenerator: Code generated in 199.1582 ms
   23/06/30 15:24:34 INFO v2.WriteToDataSourceV2Exec: Start processing data 
source writer: 
org.apache.seatunnel.translation.spark.sink.writer.SparkDataSourceWriter@7b8d6c66.
 The input RDD has 1 partitions.
   23/06/30 15:24:34 INFO spark.SparkContext: Starting job: save at 
SinkExecuteProcessor.java:123
   23/06/30 15:24:34 INFO scheduler.DAGScheduler: Got job 0 (save at 
SinkExecuteProcessor.java:123) with 1 output partitions
   23/06/30 15:24:34 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 
(save at SinkExecuteProcessor.java:123)
   23/06/30 15:24:34 INFO scheduler.DAGScheduler: Parents of final stage: List()
   23/06/30 15:24:34 INFO scheduler.DAGScheduler: Missing parents: List()
   23/06/30 15:24:34 INFO scheduler.DAGScheduler: Submitting ResultStage 0 
(MapPartitionsRDD[1] at save at SinkExecuteProcessor.java:123), which has no 
missing parents
   23/06/30 15:24:34 INFO memory.MemoryStore: Block broadcast_0 stored as 
values in memory (estimated size 5.9 KB, free 366.3 MB)
   23/06/30 15:24:35 INFO memory.MemoryStore: Block broadcast_0_piece0 stored 
as bytes in memory (estimated size 3.1 KB, free 366.3 MB)
   23/06/30 15:24:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in 
memory on hadoop3:41040 (size: 3.1 KB, free: 366.3 MB)
   23/06/30 15:24:35 INFO spark.SparkContext: Created broadcast 0 from 
broadcast at DAGScheduler.scala:1164
   23/06/30 15:24:35 INFO scheduler.DAGScheduler: Submitting 1 missing tasks 
from ResultStage 0 (MapPartitionsRDD[1] at save at 
SinkExecuteProcessor.java:123) (first 15 tasks are for partitions Vector(0))
   23/06/30 15:24:35 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 
1 tasks
   23/06/30 15:24:35 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 16026 bytes)
   23/06/30 15:24:35 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 
0)
   23/06/30 15:24:35 INFO executor.Executor: Fetching 
spark://hadoop3:37437/jars/connector-cdc-sqlserver-2.3.2.jar with timestamp 
1688109865947
   23/06/30 15:24:35 INFO client.TransportClientFactory: Successfully created 
connection to hadoop3/192.168.0.213:37437 after 75 ms (0 ms spent in bootstraps)
   23/06/30 15:24:35 INFO util.Utils: Fetching 
spark://hadoop3:37437/jars/connector-cdc-sqlserver-2.3.2.jar to 
/tmp/spark-3382b920-11e6-4daf-ad0c-a523df4ead04/userFiles-a4ca48a0-10f3-468f-b5e3-b953bfa97755/fetchFileTemp8739054726675673564.tmp
   23/06/30 15:24:35 INFO executor.Executor: Adding 
file:/tmp/spark-3382b920-11e6-4daf-ad0c-a523df4ead04/userFiles-a4ca48a0-10f3-468f-b5e3-b953bfa97755/connector-cdc-sqlserver-2.3.2.jar
 to class loader
   23/06/30 15:24:35 INFO executor.Executor: Fetching 
spark://hadoop3:37437/jars/seatunnel-spark-2-starter.jar with timestamp 
1688109865948
   23/06/30 15:24:35 INFO util.Utils: Fetching 
spark://hadoop3:37437/jars/seatunnel-spark-2-starter.jar to 
/tmp/spark-3382b920-11e6-4daf-ad0c-a523df4ead04/userFiles-a4ca48a0-10f3-468f-b5e3-b953bfa97755/fetchFileTemp4417905759660334726.tmp
   23/06/30 15:24:36 INFO executor.Executor: Adding 
file:/tmp/spark-3382b920-11e6-4daf-ad0c-a523df4ead04/userFiles-a4ca48a0-10f3-468f-b5e3-b953bfa97755/seatunnel-spark-2-starter.jar
 to class loader
   23/06/30 15:24:36 INFO executor.Executor: Fetching 
spark://hadoop3:37437/jars/connector-console-2.3.1.jar with timestamp 
1688109865947
   23/06/30 15:24:36 INFO util.Utils: Fetching 
spark://hadoop3:37437/jars/connector-console-2.3.1.jar to 
/tmp/spark-3382b920-11e6-4daf-ad0c-a523df4ead04/userFiles-a4ca48a0-10f3-468f-b5e3-b953bfa97755/fetchFileTemp8783962742653812302.tmp
   23/06/30 15:24:36 INFO executor.Executor: Adding 
file:/tmp/spark-3382b920-11e6-4daf-ad0c-a523df4ead04/userFiles-a4ca48a0-10f3-468f-b5e3-b953bfa97755/connector-console-2.3.1.jar
 to class loader
   23/06/30 15:24:36 INFO executor.Executor: Fetching 
spark://hadoop3:37437/jars/seatunnel-transforms-v2.jar with timestamp 
1688109865945
   23/06/30 15:24:36 INFO util.Utils: Fetching 
spark://hadoop3:37437/jars/seatunnel-transforms-v2.jar to 
/tmp/spark-3382b920-11e6-4daf-ad0c-a523df4ead04/userFiles-a4ca48a0-10f3-468f-b5e3-b953bfa97755/fetchFileTemp6075022225571179099.tmp
   23/06/30 15:24:36 INFO executor.Executor: Adding 
file:/tmp/spark-3382b920-11e6-4daf-ad0c-a523df4ead04/userFiles-a4ca48a0-10f3-468f-b5e3-b953bfa97755/seatunnel-transforms-v2.jar
 to class loader
   23/06/30 15:24:36 INFO sink.ConsoleSinkWriter: output rowType: 
Areaid<STRING>, Areaname<STRING>
   23/06/30 15:24:36 WARN sqlserver.SqlServerConnection: The 'server.timezone' 
option should be specified to avoid incorrect timestamp values in case of 
different timezones between the database server and this connector's JVM.
   23/06/30 15:24:36 INFO utils.TableDiscoveryUtils: Read list of available 
databases
   23/06/30 15:24:36 INFO utils.TableDiscoveryUtils:     list of available 
databases is: [BI]
   23/06/30 15:24:36 INFO utils.TableDiscoveryUtils: Read list of available 
tables in each database
   23/06/30 15:24:36 INFO utils.TableDiscoveryUtils:     including 
'BI.dbo.area' for further processing
   23/06/30 15:24:36 WARN utils.TableDiscoveryUtils:     skipping database 
'*****' due to error reading tables: 由于数据库 '*****' 离线，无法打开该数据库。
   23/06/30 15:24:36 WARN utils.TableDiscoveryUtils:     skipping database 
'*****' due to error reading tables: 由于数据库 '*****' 离线，无法打开该数据库。
   23/06/30 15:24:36 WARN utils.TableDiscoveryUtils:     skipping database 
'*****' due to error reading tables: 由于数据库 '*****' 离线，无法打开该数据库。
   23/06/30 15:24:36 WARN utils.TableDiscoveryUtils:     skipping database 
'S*****' due to error reading tables: 由于数据库 'S*****' 离线，无法打开该数据库。
   23/06/30 15:24:36 INFO jdbc.JdbcConnection: Connection gracefully closed
   23/06/30 15:24:36 INFO reader.SourceReaderBase: Open Source Reader.
   23/06/30 15:24:36 WARN sqlserver.SqlServerConnection: The 'server.timezone' 
option should be specified to avoid incorrect timestamp values in case of 
different timezones between the database server and this connector's JVM.
   23/06/30 15:24:36 INFO eumerator.SqlServerChunkSplitter: Start splitting 
table BI.dbo.area into chunks...
   23/06/30 15:24:37 WARN sqlserver.SqlServerDefaultValueConverter: Cannot 
parse column default value '(getdate())' to type 'datetime'. Expression 
evaluation is not supported.
   23/06/30 15:24:37 WARN sqlserver.SqlServerDefaultValueConverter: Cannot 
parse column default value '(getdate())' to type 'datetime'. Expression 
evaluation is not supported.
   23/06/30 15:24:37 WARN sqlserver.SqlServerDefaultValueConverter: Cannot 
parse column default value '(getdate())' to type 'datetime'. Expression 
evaluation is not supported.
   23/06/30 15:24:37 INFO eumerator.SqlServerChunkSplitter: Use unevenly-sized 
chunks for table BI.dbo.area, the chunk size is 8096
   23/06/30 15:24:37 INFO eumerator.SqlServerChunkSplitter: Split table 
BI.dbo.area into 1 chunks, time cost: 971ms.
   23/06/30 15:24:37 INFO jdbc.JdbcConnection: Connection gracefully closed
   23/06/30 15:24:37 INFO fetcher.SplitFetcher: Starting split fetcher 0
   23/06/30 15:24:37 WARN sqlserver.SqlServerConnection: The 'server.timezone' 
option should be specified to avoid incorrect timestamp values in case of 
different timezones between the database server and this connector's JVM.
   23/06/30 15:24:37 WARN sqlserver.SqlServerConnection: The 'server.timezone' 
option should be specified to avoid incorrect timestamp values in case of 
different timezones between the database server and this connector's JVM.
   23/06/30 15:24:37 INFO history.DatabaseHistoryMetrics: Finished database 
history recovery of 0 change(s) in 1 ms
   23/06/30 15:24:38 INFO fetcher.SplitFetcher: Finished reading from splits 
[BI.dbo.area:0]
   ```
   
   
   ### Flink or Spark Version
   
   spark：2.4.0
   
   ### Java or Scala Version
   
   java：8
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [seatunnel] cs3163077 opened a new issue, #5003: SqlServer-CDC 没有数据

Reply via email to