[GitHub] [hudi] sunneebaby opened a new issue, #9073: [SUPPORT]

via GitHub Tue, 27 Jun 2023 18:54:57 -0700


sunneebaby opened a new issue, #9073:
URL: https://github.com/apache/hudi/issues/9073


   **_Tips before filing an issue_**
   Spark-3.2 Insert Into Hudi Table UnsupportedOperationException: S3A streams are not Syncable
   **Describe the problem you faced**
   I am using hudi-0.12.3 (recompiled against my Hadoop cluster) with the following components:
   hudi-0.12.3
   hadoop-3.1.1.3.1.4.0-315
   hive-3.1.0.3.1.4.0-315
   spark-3.2.2
   flink-1.15.1
   
   When I use spark-shell or spark-sql to insert into a Hudi table, the write fails. The same insert executes successfully through Flink.
   Is there a configuration parameter that solves this problem, or is it a bug?
   
   The error message is as follows:
   java.lang.UnsupportedOperationException: S3A streams are not Syncable. See HADOOP-17597.
     at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.handleSyncableInvocation(S3ABlockOutputStream.java:656)
     at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.hsync(S3ABlockOutputStream.java:649)
     at org.apache.hadoop.fs.FSDataOutputStream.hsync(FSDataOutputStream.java:145)
     at org.apache.hadoop.fs.FSDataOutputStream.hsync(FSDataOutputStream.java:145)
     at org.apache.hudi.common.table.log.HoodieLogFormatWriter.flush(HoodieLogFormatWriter.java:261)
     at org.apache.hudi.common.table.log.HoodieLogFormatWriter.appendBlocks(HoodieLogFormatWriter.java:194)
     at org.apache.hudi.common.table.log.HoodieLogFormatWriter.appendBlock(HoodieLogFormatWriter.java:135)
     at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.initializeFileGroups(HoodieBackedTableMetadataWriter.java:728)
     at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.initializeEnabledFileGroups(HoodieBackedTableMetadataWriter.java:683)
     at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.initializeFromFilesystem(HoodieBackedTableMetadataWriter.java:561)
     at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.initializeIfNeeded(HoodieBackedTableMetadataWriter.java:395)
     at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.initialize(SparkHoodieBackedTableMetadataWriter.java:121)
     at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.<init>(HoodieBackedTableMetadataWriter.java:175)
     at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.<init>(SparkHoodieBackedTableMetadataWriter.java:90)
     at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.create(SparkHoodieBackedTableMetadataWriter.java:76)
     at org.apache.hudi.client.SparkRDDWriteClient.initializeMetadataTable(SparkRDDWriteClient.java:458)
     at org.apache.hudi.client.SparkRDDWriteClient.initMetadataTable(SparkRDDWriteClient.java:447)
     at org.apache.hudi.client.BaseHoodieWriteClient.doInitTable(BaseHoodieWriteClient.java:1458)
     at org.apache.hudi.client.BaseHoodieWriteClient.initTable(BaseHoodieWriteClient.java:1494)
     at org.apache.hudi.client.BaseHoodieWriteClient.initTable(BaseHoodieWriteClient.java:1524)
     at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:161)
     at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:206)
     at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:340)
     at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:145)
     at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
     at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
     at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97)
     at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
     at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
     at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
     at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
     at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97)
     at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93)
     at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
     at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
     at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
     at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
     at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
     at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
     at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
     at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
     at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457)
     at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:93)
     at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:80)
     at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:78)
     at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:115)
     at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:848)
     at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:382)
     at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:355)
     at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)
     ... 66 elided
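
   The error message points at HADOOP-17597, which, as far as I can tell, added the S3A option `fs.s3a.downgrade.syncable.exceptions` to turn this exception into a warning. The frames above show the `hsync` call coming from Hudi's metadata table initialization, so this option looks like one candidate to try. I have not verified that it resolves the failure with this Hudi build. Below is a minimal sketch of how the option could be passed through Spark, assuming the `hadoop-aws` jar on the Spark classpath supports it. The helper names `spark_hadoop_conf` and `as_cli_flags` are hypothetical and only illustrate the `spark.hadoop.` prefix convention.

   ```python
   # Hypothetical helpers (not part of Spark or Hudi): build the Spark settings
   # needed to forward a Hadoop filesystem option to the S3A connector.

   # Option associated with HADOOP-17597; assumed to be supported by the
   # hadoop-aws version on the classpath.
   S3A_OPTIONS = {"fs.s3a.downgrade.syncable.exceptions": "true"}


   def spark_hadoop_conf(hadoop_options):
       """Prefix each Hadoop option with 'spark.hadoop.' so that Spark copies
       it into the Hadoop Configuration used by the job."""
       return {f"spark.hadoop.{key}": value for key, value in hadoop_options.items()}


   def as_cli_flags(spark_conf):
       """Render the settings as '--conf key=value' arguments, sorted by key."""
       flags = []
       for key, value in sorted(spark_conf.items()):
           flags += ["--conf", f"{key}={value}"]
       return flags


   if __name__ == "__main__":
       # Prints a spark-shell command line carrying the option.
       print(" ".join(["spark-shell"] + as_cli_flags(spark_hadoop_conf(S3A_OPTIONS))))
   ```

   The same `--conf` arguments would apply to spark-sql. Note that downgrading the exception only silences the failure: data written to S3A is still not durable until the stream is closed.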
   

