YuriyGavrilov commented on issue #4406:
URL: https://github.com/apache/seatunnel/issues/4406#issuecomment-1852601049
I figured out how to load data from S3 to a local file, but I get an error
when I try to copy from S3 to S3 (same config, different folder) while
trimming columns with sink_columns = ["a","b"]. The first error is:
```
2023-12-12 21:20:20 2023-12-12 18:20:20,797 WARN
org.apache.seatunnel.engine.server.TaskExecutionService - [localhost]:5801
[seatunnel-86182] [5.1] Exception in
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask@6668d0db
2023-12-12 21:20:20 java.lang.NoSuchMethodError:
org.apache.hadoop.util.SemaphoredDelegatingExecutor.<init>(Ljava/util/concurrent/ExecutorService;IZ)V
2023-12-12 21:20:20 at
org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:824)
~[hadoop-aws-3.2.4.jar:?]
2023-12-12 21:20:20 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
~[seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar:2.3.1]
2023-12-12 21:20:20 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
~[seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar:2.3.1]
2023-12-12 21:20:20 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1057)
~[seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar:2.3.1]
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.getOutputStream(FileSystemUtils.java:108)
~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.getOrCreateOutputStream(TextWriteStrategy.java:138)
~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.write(TextWriteStrategy.java:81)
~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:126)
~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:43)
~[connector-file-s3-2.3.3.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:227)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:61)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:76)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:51)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:52)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
~[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:613)
[seatunnel-starter.jar:2.3.3]
2023-12-12 21:20:20 at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[?:1.8.0_342]
2023-12-12 21:20:20 at
java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_342]
2023-12-12 21:20:20 at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_342]
2023-12-12 21:20:20 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_342]
2023-12-12 21:20:20 at java.lang.Thread.run(Thread.java:750)
[?:1.8.0_342]
```
The next error:
```
2023-12-12 21:20:20 2023-12-12 18:20:20,809 ERROR
org.apache.seatunnel.engine.server.dag.physical.PhysicalVertex - Job
SeaTunnel_Job (787020937437380609), Pipeline: [(1/1)], task: [pipeline-1
[Source[0]-S3File-default-identifier]-SourceTask (1/1)] end with state FAILED
and Exception: java.lang.NoSuchMethodError:
org.apache.hadoop.util.SemaphoredDelegatingExecutor.<init>(Ljava/util/concurrent/ExecutorService;IZ)V
2023-12-12 21:20:20 at
org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:824)
2023-12-12 21:20:20 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
2023-12-12 21:20:20 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
2023-12-12 21:20:20 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1057)
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.util.FileSystemUtils.getOutputStream(FileSystemUtils.java:108)
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.getOrCreateOutputStream(TextWriteStrategy.java:138)
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.writer.TextWriteStrategy.write(TextWriteStrategy.java:81)
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:126)
2023-12-12 21:20:20 at
org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:43)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:227)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:61)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:76)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:51)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:52)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
2023-12-12 21:20:20 at
org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:613)
2023-12-12 21:20:20 at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
2023-12-12 21:20:20 at
java.util.concurrent.FutureTask.run(FutureTask.java:266)
2023-12-12 21:20:20 at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
2023-12-12 21:20:20 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
2023-12-12 21:20:20 at java.lang.Thread.run(Thread.java:750)
2023-12-12 21:20:20
2023-12-12 21:20:20 2023-12-12 18:20:20,809 ERROR
org.apache.seatunnel.engine.server.dag.physical.SubPlan - Task
TaskGroupLocation{jobId=787020937437380609, pipelineId=1, taskGroupId=50000}
Failed in Job SeaTunnel_Job (787020937437380609), Pipeline: [(1/1)], Begin to
cancel other tasks in this pipeline.
```
These errors occur with this config:
```
2023-12-12 21:20:13 "env" : {
2023-12-12 21:20:13 "execution.parallelism" : 1,
2023-12-12 21:20:13 "job.mode" : "BATCH"
2023-12-12 21:20:13 },
2023-12-12 21:20:13 "source" : [
2023-12-12 21:20:13 {
2023-12-12 21:20:13 "bucket" : "s3a://test",
2023-12-12 21:20:13 "path" : "/seatunnel/",
2023-12-12 21:20:13 "secret_key" : "XXXXXX",
2023-12-12 21:20:13 "file_format_type" : "parquet",
2023-12-12 21:20:13 "access_key" : "XXXXXXX",
2023-12-12 21:20:13 "fs.s3a.aws.credentials.provider" :
"org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
2023-12-12 21:20:13 "plugin_name" : "S3File",
2023-12-12 21:20:13 "fs.s3a.endpoint" : "gateway.storjshare.io"
2023-12-12 21:20:13 }
2023-12-12 21:20:13 ],
2023-12-12 21:20:13 "transform" : [],
2023-12-12 21:20:13 "sink" : [
2023-12-12 21:20:13 {
2023-12-12 21:20:13 "bucket" : "s3a://test",
2023-12-12 21:20:13 "path" : "/seatunnel2/",
2023-12-12 21:20:13 "secret_key" : "XXXXX",
2023-12-12 21:20:13 "access_key" : "XXXXX",
2023-12-12 21:20:13 "fs.s3a.aws.credentials.provider" :
"org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
2023-12-12 21:20:13 "plugin_name" : "S3File",
2023-12-12 21:20:13 "fs.s3a.endpoint" : "gateway.storjshare.io",
2023-12-12 21:20:13 "sink_columns" : [
2023-12-12 21:20:13 "a",
2023-12-12 21:20:13 "b"
2023-12-12 21:20:13 ]
2023-12-12 21:20:13 }
2023-12-12 21:20:13 ]
2023-12-12 21:20:13 }
```
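One thing that stands out in the traces above: the failing frame comes from hadoop-aws-3.2.4.jar, while the FileSystem frames come from seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar. A NoSuchMethodError on a constructor usually means two jars were built against different versions of the same library, so a Hadoop version mismatch is a plausible cause here, though I have not confirmed it. Below is a minimal sketch for pulling the jar names out of such a trace and listing the Hadoop versions they carry. The helper names (`jars_in_trace`, `hadoop_versions`) and the shortened sample trace are my own, for illustration only.

```python
import re

# Matches the jar annotation at the end of a stack frame,
# e.g. "~[hadoop-aws-3.2.4.jar:?]". Frames without a jar name
# (such as "[?:1.8.0_342]") are ignored.
JAR_PATTERN = re.compile(r"\[([A-Za-z0-9_.\-]+\.jar):")


def jars_in_trace(trace):
    """Return the distinct jar file names mentioned in a stack trace."""
    return set(JAR_PATTERN.findall(trace))


def hadoop_versions(jars):
    """Map each Hadoop-related jar to the first x.y.z version in its name."""
    versions = {}
    for jar in jars:
        if "hadoop" in jar:
            match = re.search(r"\d+\.\d+\.\d+", jar)
            if match:
                versions[jar] = match.group(0)
    return versions


# Shortened sample built from frames in the trace above (line wraps joined).
sample = """
at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:824) ~[hadoop-aws-3.2.4.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118) ~[seatunnel-hadoop3-3.1.4-uber-2.3.1-optional.jar:2.3.1]
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:126) ~[connector-file-s3-2.3.3.jar:2.3.3]
at java.lang.Thread.run(Thread.java:750) [?:1.8.0_342]
"""

found = hadoop_versions(jars_in_trace(sample))
for jar, version in sorted(found.items()):
    print(f"{jar}: Hadoop {version}")

if len(set(found.values())) > 1:
    print("Hadoop jars with different versions are on the classpath")
```

If the versions differ, as they appear to here (3.2.4 vs 3.1.4), aligning the hadoop-aws jar with the Hadoop version bundled in the uber jar would be the first thing to try.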