matthijseikelenboom opened a new issue, #11170:
URL: https://github.com/apache/hudi/issues/11170

   **Describe the problem you faced**
   
   For work, we need concurrent read/write support for our data 
lake, which uses Spark. We were noticing some inconsistencies, so we wrote a 
test that verifies whether a system such as Hudi adheres to ACID. We 
found that Hudi fails this test.
   
   It may be that we've misconfigured Hudi or that there is a 
mistake in the test code.
   
   Could someone take a look at it and perhaps 
explain what is going wrong here?
   
   **To Reproduce**
   
   How to run the test and its findings are described in the README of the 
repository, but here is a short rundown.
   
   Steps to reproduce the behavior:
   
   1. Check out repo: 
[hudi-acid-verification](https://github.com/matthijseikelenboom/hudi-acid-verification)
   2. Start Docker if not already running
   3. Run the test 
[TransactionManagerTest.java](https://github.com/matthijseikelenboom/hudi-acid-verification/blob/master/src/test/java/org/example/writer/TransactionManagerTest.java)
   4. Observe that the writers break down and that very few transactions have 
been processed.
   
   **Expected behavior**
   
   1. I expect the writers not to break down
   2. I expect the full number of transactions to be executed
   
   **Environment Description**
   
   * Hudi version : 0.14.1
   
   * Spark version : 3.4.2
   
   * Hive version : 4.0.0-beta-1
   
   * Hadoop version : 3.2.2
   
   * Storage (HDFS/S3/GCS..) : NTFS(Windows), APFS(macOS) & HDFS
   
   * Running on Docker? (yes/no) : No
   
   **Additional context**
   It's worth noting that other solutions, Iceberg and Delta Lake, have also 
been tested this way. Iceberg also didn't pass this test. Delta Lake did pass 
the test.
   
   **Stacktrace**
   
   ```
   24/05/07 21:49:38 ERROR TransactionWriter: Exception in writer.
   org.example.writer.TransactionFailedException: org.apache.hudi.exception.HoodieRollbackException: Failed to rollback file:/tmp/lakehouse/concurrencytestdb.db/acid_verification commits 20240507214932607
        at org.example.writer.TransactionWriter.wrapOrRethrowException(TransactionWriter.java:190)
        at org.example.writer.TransactionWriter.tryTransaction(TransactionWriter.java:184)
        at org.example.writer.TransactionWriter.updateTransaction(TransactionWriter.java:143)
        at org.example.writer.TransactionWriter.lambda$handleTransaction$0(TransactionWriter.java:89)
        at org.example.writer.TransactionWriter.withRetryOnException(TransactionWriter.java:109)
        at org.example.writer.TransactionWriter.handleTransaction(TransactionWriter.java:83)
        at org.example.writer.TransactionWriter.run(TransactionWriter.java:70)
   Caused by: org.apache.hudi.exception.HoodieRollbackException: Failed to rollback file:/tmp/lakehouse/concurrencytestdb.db/acid_verification commits 20240507214932607
        at org.apache.hudi.client.BaseHoodieTableServiceClient.rollback(BaseHoodieTableServiceClient.java:1065)
        at org.apache.hudi.client.BaseHoodieTableServiceClient.rollback(BaseHoodieTableServiceClient.java:1012)
        at org.apache.hudi.client.BaseHoodieTableServiceClient.rollbackFailedWrites(BaseHoodieTableServiceClient.java:940)
        at org.apache.hudi.client.BaseHoodieTableServiceClient.rollbackFailedWrites(BaseHoodieTableServiceClient.java:922)
        at org.apache.hudi.client.BaseHoodieTableServiceClient.rollbackFailedWrites(BaseHoodieTableServiceClient.java:917)
        at org.apache.hudi.client.BaseHoodieWriteClient.lambda$startCommitWithTime$97cdbdca$1(BaseHoodieWriteClient.java:941)
        at org.apache.hudi.common.util.CleanerUtils.rollbackFailedWrites(CleanerUtils.java:222)
        at org.apache.hudi.client.BaseHoodieWriteClient.startCommitWithTime(BaseHoodieWriteClient.java:940)
        at org.apache.hudi.client.BaseHoodieWriteClient.startCommitWithTime(BaseHoodieWriteClient.java:933)
        at org.apache.hudi.HoodieSparkSqlWriterInternal.writeInternal(HoodieSparkSqlWriter.scala:501)
        at org.apache.hudi.HoodieSparkSqlWriterInternal.write(HoodieSparkSqlWriter.scala:204)
        at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:121)
        at org.apache.spark.sql.hudi.command.MergeIntoHoodieTableCommand.executeUpsert(MergeIntoHoodieTableCommand.scala:439)
        at org.apache.spark.sql.hudi.command.MergeIntoHoodieTableCommand.run(MergeIntoHoodieTableCommand.scala:282)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
        at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98)
        at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
        at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
        at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
        at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
        at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
        at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:512)
        at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:104)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:512)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:31)
        at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
        at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:488)
        at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
        at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
        at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
        at org.apache.spark.sql.Dataset.<init>(Dataset.scala:218)
        at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
        at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:95)
        at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:640)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
        at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:630)
        at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:671)
        at org.example.writer.TransactionWriter.lambda$updateTransaction$2(TransactionWriter.java:160)
        at org.example.writer.TransactionWriter.tryTransaction(TransactionWriter.java:181)
        ... 5 more
   Caused by: org.apache.hudi.exception.HoodieRollbackException: Found commits after time :20240507214932607, please rollback greater commits first
        at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.validateRollbackCommitSequence(BaseRollbackActionExecutor.java:179)
        at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.doRollbackAndGetStats(BaseRollbackActionExecutor.java:218)
        at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.runRollback(BaseRollbackActionExecutor.java:111)
        at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.execute(BaseRollbackActionExecutor.java:138)
        at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.rollback(HoodieSparkCopyOnWriteTable.java:298)
        at org.apache.hudi.client.BaseHoodieTableServiceClient.rollback(BaseHoodieTableServiceClient.java:1048)
        ... 51 more
   24/05/07 21:49:38 INFO TransactionWriter: acid-writer-1 finished.
   ```
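
In the trace above, the rollback is triggered from `startCommitWithTime` via `rollbackFailedWrites`. That is consistent with eager cleanup of failed writes, where a writer starting a commit tries to roll back pending commits that may in fact belong to another writer still in flight. Whether that is what happens here is an assumption, not something the trace proves. For reference, here is a minimal sketch of the multi-writer settings that Hudi documents for concurrent writers. The class `HudiMultiWriterOptions` and its `build` method are hypothetical helpers and not part of the test repository. The lock provider is passed in as a parameter because the right choice depends on whether the writers share one JVM.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not from the test repository) that collects the
 * Hudi options commonly needed when several writers target one table.
 */
public final class HudiMultiWriterOptions {

    private HudiMultiWriterOptions() {}

    /**
     * @param lockProviderClass fully qualified lock provider class name, e.g.
     *        org.apache.hudi.client.transaction.lock.InProcessLockProvider
     *        when all writers run in the same JVM (an assumption about the test setup)
     */
    public static Map<String, String> build(String lockProviderClass) {
        Map<String, String> options = new LinkedHashMap<>();
        // Enable optimistic concurrency control instead of the single-writer default.
        options.put("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
        // Defer cleanup of failed writes so a writer does not roll back
        // another writer's in-flight commit when it starts its own.
        options.put("hoodie.cleaner.policy.failed.writes", "LAZY");
        // Writers need a shared lock provider to serialize commits.
        options.put("hoodie.write.lock.provider", lockProviderClass);
        return options;
    }

    public static void main(String[] args) {
        build("org.apache.hudi.client.transaction.lock.InProcessLockProvider")
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The resulting map can be passed to a DataFrame writer through `options(...)`. For SQL statements such as `MERGE INTO`, the same keys would need to be supplied through the session or table configuration instead.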
   
   

