hewanghw opened a new issue, #7252:
URL: https://github.com/apache/hudi/issues/7252

   
   **Describe the problem you faced**
   
   I'm trying to write a Hudi table to a MinIO S3 bucket via Flink SQL, but it fails.
   The Hudi table is created, but it only contains the metadata directory `.hoodie`.
   The directory tree is as follows:
   ```
   myminio/flink-hudi
   └─ t1
      └─ .hoodie
         ├─ .aux
         │  ├─ .bootstrap
         │  │  ├─ .fileids
         │  │  └─ .partitions
         │  └─ ckp_meta
         ├─ .schema
         ├─ .temp
         └─ archived
   ```
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Create a Flink Hudi table
   ```
   CREATE TABLE t1(
     uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
     name VARCHAR(10),
     age INT,
     ts TIMESTAMP(3),
     `partition` VARCHAR(20)
   )
   PARTITIONED BY (`partition`)
   WITH (
     'connector' = 'hudi',
     'path' = 's3a://flink-hudi/t1',
     'table.type' = 'MERGE_ON_READ'
   );
   ```
   2. Insert data into the Hudi table
   ```
   INSERT INTO t1 VALUES ('id1','Danny',23,TIMESTAMP '1970-01-01 00:00:01','par1');
   ```
   
   **Expected behavior**
   
   The Hudi table is written to the S3 bucket successfully.
   
   **Environment Description**
   
   * Hudi version : 0.12.0
   
   * Hadoop version : 3.2.4
   
   * Flink version: 1.15.2
   
   * Storage (HDFS/S3/GCS..) : MinIO S3
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   Added dependencies:
   - hadoop-aws-3.2.4.jar
   - aws-java-sdk-bundle-1.11.901.jar
   - flink-s3-fs-hadoop-1.15.2.jar
   
   Properties in Hadoop core-site.xml:
   ```
   <property>
     <name>fs.s3a.access.key</name>
     <value>xxx</value>
   </property>
   <property>
     <name>fs.s3a.secret.key</name>
     <value>xxx</value>
   </property>
   <property>
     <name>fs.s3a.endpoint</name>
     <value>xxx</value>
   </property>
   <property>
     <name>fs.s3a.path.style.access</name>
     <value>true</value>
   </property>
   <property>
     <name>fs.s3.impl</name>
     <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
   </property>
   ```
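   A note on the last property above (my reading of the config, not something confirmed by the stack trace): since the table path uses the `s3a://` scheme, Hadoop resolves the filesystem through `fs.s3a.impl`, which already defaults to `org.apache.hadoop.fs.s3a.S3AFileSystem`; the `fs.s3.impl` entry only covers `s3://` paths. If pinning it explicitly, the `s3a` equivalent would be:
   ```
   <property>
     <name>fs.s3a.impl</name>
     <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
   </property>
   ```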
   flink-conf.yaml:
   ```
   taskmanager.numberOfTaskSlots: 4
   
   s3a.endpoint: xxx
   s3a.access-key: xxx
   s3a.secret-key: xxx
   s3a.path.style.access: true
   
   fs.hdfs.hadoopconf: /export/servers/hadoop-3.2.4/etc/hadoop
   
   state.backend: rocksdb
   state.backend.incremental: true
   state.checkpoints.dir: s3a://flink-state/checkpoint
   execution.checkpointing.interval: 30000
   
   classloader.check-leaked-classloader: false
   ```
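   One deployment detail worth double-checking (an assumption about the setup, not visible in the configs above): Flink 1.15 loads its bundled filesystems such as flink-s3-fs-hadoop through the plugins mechanism, i.e. the jar goes into its own subdirectory under `plugins/` rather than `lib/`. Purely as an illustration of the expected layout (using a scratch directory here, not a real install):
   ```
   # Illustrative layout only: FLINK_HOME points at a scratch directory here.
   FLINK_HOME=$(mktemp -d)
   mkdir -p "$FLINK_HOME/plugins/s3-fs-hadoop"
   # In a real deployment this would be the actual connector jar, not an empty file.
   touch "$FLINK_HOME/plugins/s3-fs-hadoop/flink-s3-fs-hadoop-1.15.2.jar"
   ls "$FLINK_HOME/plugins/s3-fs-hadoop"
   ```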
   Execute Flink:
   ```
   export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
   ./bin/start-cluster.sh
   ./bin/sql-client.sh embedded -j /opt/flink/jars/hudi-flink1.15-bundle-0.12.0.jar shell
   ```
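   For completeness, the Hadoop configuration directory (where the core-site.xml above lives) can also be exported before starting the cluster so it is picked up consistently; a sketch using the same path as `fs.hdfs.hadoopconf` in flink-conf.yaml (adjust for your install):
   ```
   # Path taken from fs.hdfs.hadoopconf above; adjust for your environment.
   export HADOOP_CONF_DIR=/export/servers/hadoop-3.2.4/etc/hadoop
   echo "$HADOOP_CONF_DIR"
   ```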
   **Stacktrace**
   
   ```
   
   org.apache.hudi.exception.HoodieException: Exception while scanning the checkpoint meta files under path: s3a://flink-hudi/t1/.hoodie/.aux/ckp_meta
        at org.apache.hudi.sink.meta.CkpMetadata.load(CkpMetadata.java:169)
        at org.apache.hudi.sink.meta.CkpMetadata.lastPendingInstant(CkpMetadata.java:175)
        at org.apache.hudi.sink.common.AbstractStreamWriteFunction.lastPendingInstant(AbstractStreamWriteFunction.java:243)
        at org.apache.hudi.sink.common.AbstractStreamWriteFunction.initializeState(AbstractStreamWriteFunction.java:151)
        at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:189)
        at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:171)
        at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:94)
        at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:122)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:286)
        at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:700)
        at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:676)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:643)
        at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
        at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:917)
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: java.io.FileNotFoundException: No such file or directory: s3a://flink-hudi/t1/.hoodie/.aux/ckp_meta
        at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2344)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2226)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2160)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1961)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$9(S3AFileSystem.java:1940)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1940)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$listStatus$15(HoodieWrapperFileSystem.java:365)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:106)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.listStatus(HoodieWrapperFileSystem.java:364)
        at org.apache.hudi.sink.meta.CkpMetadata.scanCkpMetadata(CkpMetadata.java:216)
        at org.apache.hudi.sink.meta.CkpMetadata.load(CkpMetadata.java:167)
        ... 18 more
   ```
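   The path the scan fails on is the checkpoint metadata directory that the Flink writer keeps under the table base path. A minimal sketch of how it relates to the `'path'` option in the CREATE TABLE above (`ckp_meta_path` is a hypothetical helper for illustration, not a Hudi API):
   ```
   def ckp_meta_path(base_path: str) -> str:
       """Checkpoint metadata directory the Flink writer scans (illustrative)."""
       return base_path.rstrip("/") + "/.hoodie/.aux/ckp_meta"

   print(ckp_meta_path("s3a://flink-hudi/t1"))
   # s3a://flink-hudi/t1/.hoodie/.aux/ckp_meta
   ```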
   
   

