[GitHub] [incubator-seatunnel] wineternity opened a new issue, #4003: [Bug] [Connector-V2] Clickhouse File Connector

via GitHub Mon, 30 Jan 2023 00:43:14 -0800


wineternity opened a new issue, #4003:
URL: https://github.com/apache/incubator-seatunnel/issues/4003


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22)
 and found no similar issues.
   
   
   ### What happened
   
   I am using the ClickhouseFile sink with Spark 2.4.x, and the ClickHouse version is 
21.8.15.7. The table on the server is defined with a storage_policy for an SSD disk, so 
the clickhouse-local command tries to create the local table with the same 
storage_policy. In compatible mode the config file is generated 
automatically, so the storage_policy definition cannot be added to it, and 
clickhouse-local fails with "Unknown storage policy `disk_ssd`".
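
One possible workaround, shown here only as a rough sketch and not as the connector's actual code, is to drop the `storage_policy` entry from the `SETTINGS` clause before the DDL is handed to clickhouse-local. The helper name `strip_storage_policy` is hypothetical, and the sketch assumes the DDL ends with its `SETTINGS` clause (no trailing `;`) and that no setting value contains a comma.

```python
import re


def strip_storage_policy(ddl: str) -> str:
    """Return ddl without a storage_policy entry in its trailing SETTINGS clause."""
    match = re.search(r"\bSETTINGS\b(.*)$", ddl, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        # No SETTINGS clause, nothing to remove.
        return ddl
    head = ddl[:match.start()].rstrip()
    # Assumption: setting values contain no commas, so a plain split is enough.
    settings = [s.strip() for s in match.group(1).split(",")]
    kept = [
        s for s in settings
        if s and not re.match(r"storage_policy\s*=", s, flags=re.IGNORECASE)
    ]
    if not kept:
        # storage_policy was the only setting, so drop the clause entirely.
        return head
    return f"{head} SETTINGS {', '.join(kept)}"


ddl = (
    "CREATE TABLE test01 (name String DEFAULT '', age Int64) "
    "ENGINE = MergeTree() ORDER BY name "
    "SETTINGS index_granularity = 8192, storage_policy = 'disk_ssd'"
)
print(strip_storage_policy(ddl))
```

Whether it is acceptable to generate the local parts without the server-side storage policy is a separate question that this sketch does not answer.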
   
   ### SeaTunnel Version
   
   2.3.0
   
   ### SeaTunnel Config
   
   ```conf
   env {
     execution.parallelism = 1
     job.mode = "BATCH"
   }
   
   source {
       FakeSource {
         result_table_name = "fake"
         row.num = 16
         schema = {
           fields {
             name = "string"
             age = "int"
           }
         }
       }
   }
   
   transform {
   
   }
   
   sink {
     Console {}
     ClickhouseFile {
       host = "qabb-qa-clickhouse101:8123,qabb-qa-clickhouse102:8123"
       database = "test"
       table = "test02_rand_dist"
       username = "default"
       password = "clickhouse"
       clickhouse_local_path = "/usr/bin/clickhouse-local"
       node_free_password = true
       node_pass = []
       compatible_mode = true
     }
   }
   ```
   
   
   ### Running Command
   
   ```shell
   apache-seatunnel-incubating-2.3.0/bin/start-seatunnel-spark-connector-v2.sh --master local --deploy-mode client --config seatunnel_test/job1.conf
   ```
   
   
   ### Error Exception
   
   ```log
   23/01/30 16:41:44 INFO ClickhouseFileSinkWriter: Generate clickhouse local 
file command: /usr/bin/clickhouse-local local --file 
/tmp/seatunnel/clickhouse-local/file/98a0f464_b/local_data.log 
--format_csv_delimiter "      " -S "name String,age Int64" -N 
"temp_table98a0f464_b" -q "CREATE TABLE test01 (name String DEFAULT '', age 
Int64) ENGINE = MergeTree() ORDER BY name SETTINGS index_granularity = 8192, 
storage_policy = 'disk_ssd'; INSERT INTO TABLE test01 SELECT name,age FROM 
temp_table98a0f464_b;" --config-file 
"/tmp/seatunnel/clickhouse-local/file/98a0f464_b/config.xml"
   23/01/30 16:41:44 ERROR ClickhouseFileSinkWriter: Processing configuration 
file '/tmp/seatunnel/clickhouse-local/file/98a0f464_b/config.xml'.
   23/01/30 16:41:44 ERROR ClickhouseFileSinkWriter: Saved preprocessed 
configuration to ' /tmp/seatunnel/clickhouse-local/file/98a0f464_b 
/preprocessed_configs/config.xml'.
   23/01/30 16:41:44 ERROR ClickhouseFileSinkWriter: Code: 478, e.displayText() 
= DB::Exception: Unknown storage policy `disk_ssd` (version 21.8.15.7)
   23/01/30 16:41:44 ERROR Utils: Aborting task
   
org.apache.seatunnel.connectors.seatunnel.clickhouse.exception.ClickhouseConnectorException:
 ErrorCode:[COMMON-10], ErrorDescription:[Flush data operation that in sink 
connector failed] - Flush data into clickhouse file error
        at 
org.apache.seatunnel.connectors.seatunnel.clickhouse.sink.file.ClickhouseFileSinkWriter.lambda$prepareCommit$3(ClickhouseFileSinkWriter.java:139)
        at java.util.HashMap.forEach(HashMap.java:1289)
        at 
org.apache.seatunnel.connectors.seatunnel.clickhouse.sink.file.ClickhouseFileSinkWriter.prepareCommit(ClickhouseFileSinkWriter.java:131)
        at 
org.apache.seatunnel.translation.spark.sink.SparkDataWriter.commit(SparkDataWriter.java:69)
        at 
org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:127)
        at 
org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
        at 
org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at 
org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
        at 
org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
        at 
org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   ```
   
   
   ### Flink or Spark Version
   
   Spark 2.4.8
   
   ### Java or Scala Version
   
   2.11
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
