Looks more like a problem with stage retries than eventual consistency.
For that, please ensure
http://hudi.apache.org/configurations.html#withConsistencyCheckEnabled is
on. (we should probably auto turn for s3?)

So, this is the PR. https://github.com/apache/incubator-hudi/pull/651  We
are planning on landing this week. You can try out sometime this week
if you have time :)



On Tue, May 14, 2019 at 1:03 AM Jun Zhu <[email protected]> wrote:

> Hi Balaji,
> Glad to hear back. That plan looks great, really looking forward to it.
>
>
> On Tue, May 14, 2019 at 3:07 PM Balaji Varadarajan
> <[email protected]> wrote:
>
> >  Hi Jun,
> > Good catch. We do have a cleanup mechanism to remove these partially
> > written files before inserting to duplicate files but that itself could
> > fail because of eventual consistency.
> > We had reworked handling of these failure scenarios and eventual
> > consistency in this PR. :
> > https://github.com/apache/incubator-hudi/pull/651
> > The initial motivation was to completely avoid temp file writing and
> > renaming files which are costly operations in cloud. As part of this
> > change, eventual consistency handling is also redone. This change should
> > handle eventual consistency correctly with fine-granular consistency
> guards
> > and using optimistic approach to handle duplicate file generation.
> > This change is scheduled to be available in 0.4.7 and we have been
> vetting
> > this change by running large-scale testing.
> >
> > Balaji.V    On Monday, May 13, 2019, 8:41:37 PM PDT, Jun Zhu
> > <[email protected]> wrote:
> >
> >  Hi team,
> > Feedback for eventually consistency problem in s3.
> >
> > *Scenario*:
> > Found files with same `bucket`(variable in hudi code) number in same
> > partition:
> >
> > 2019-05-07 20:21:39  11993262 0806a716-54ee-4343-bc0e-ca26a4cbbbce_*7*
> > _20190507122110.parquet
> >
> > 2019-05-07 20:21:34  11983784 c3790f3b-5a0e-4f2b-b934-3875175f6f9a_*7*
> > _20190507122110.parquet
> >
> > *Exception in spark log*:
> >
> > > 19/05/07 12:21:34 WARN TaskSetManager: Lost task 7.0 in stage 1709.0
> (TID
> > > 289978, ip-172-19-111-50, executor 7): java.lang.RuntimeException:
> > > com.uber.hoodie.exception.HoodieException:
> > > com.uber.hoodie.exception.HoodieException:
> > > java.util.concurrent.ExecutionException:
> > > com.uber.hoodie.exception.HoodieInsertException: Failed to close the
> > Insert
> > > Handle for path
> > >
> >
> s3a://vungle2-dataeng/jun-test/stagebugfix20190507/2019-05-07_12/c3790f3b-5a0e-4f2b-b934-3875175f6f9a_7_20190507122110.parquet
> > > at
> > >
> >
> com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:121)
> > > at
> > >
> >
> scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43)
> > > at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
> > > at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
> > > at
> > >
> >
> org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:378)
> > > at
> > >
> >
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1109)
> > > at
> > >
> >
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1083)
> > > at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1018)
> > > at
> > >
> >
> org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1083)
> > > at
> > >
> >
> org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:809)
> > > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
> > > at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
> > > at
> > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> > > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> > > at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> > > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> > > at org.apache.spark.scheduler.Task.run(Task.scala:109)
> > > at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
> > > at
> > >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > > at
> > >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > > at java.lang.Thread.run(Thread.java:748)
> > > Caused by: com.uber.hoodie.exception.HoodieException:
> > > com.uber.hoodie.exception.HoodieException:
> > > java.util.concurrent.ExecutionException:
> > > com.uber.hoodie.exception.HoodieInsertException: Failed to close the
> > Insert
> > > Handle for path
> > >
> >
> s3a://vungle2-dataeng/jun-test/stagebugfix20190507/2019-05-07_12/c3790f3b-5a0e-4f2b-b934-3875175f6f9a_7_20190507122110.parquet
> > > at
> > >
> >
> com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:106)
> > > at
> > >
> >
> com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:45)
> > > at
> > >
> >
> com.uber.hoodie.func.LazyIterableIterator.next(LazyIterableIterator.java:119)
> > > ... 20 more
> > > Caused by: com.uber.hoodie.exception.HoodieException:
> > > java.util.concurrent.ExecutionException:
> > > com.uber.hoodie.exception.HoodieInsertException: Failed to close the
> > Insert
> > > Handle for path
> > >
> >
> s3a://vungle2-dataeng/jun-test/stagebugfix20190507/2019-05-07_12/c3790f3b-5a0e-4f2b-b934-3875175f6f9a_7_20190507122110.parquet
> > > at
> > >
> >
> com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:146)
> > > at
> > >
> >
> com.uber.hoodie.func.CopyOnWriteLazyInsertIterable.computeNext(CopyOnWriteLazyInsertIterable.java:102)
> > > ... 22 more
> > > Caused by: java.util.concurrent.ExecutionException:
> > > com.uber.hoodie.exception.HoodieInsertException: Failed to close the
> > Insert
> > > Handle for path
> > >
> >
> s3a://vungle2-dataeng/jun-test/stagebugfix20190507/2019-05-07_12/c3790f3b-5a0e-4f2b-b934-3875175f6f9a_7_20190507122110.parquet
> > > at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> > > at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> > > at
> > >
> >
> com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.execute(BoundedInMemoryExecutor.java:144)
> > > ... 23 more
> > > Caused by: com.uber.hoodie.exception.HoodieInsertException: Failed to
> > > close the Insert Handle for path
> > >
> >
> s3a://vungle2-dataeng/jun-test/stagebugfix20190507/2019-05-07_12/c3790f3b-5a0e-4f2b-b934-3875175f6f9a_7_20190507122110.parquet
> > > at com.uber.hoodie.io
> > .HoodieCreateHandle.close(HoodieCreateHandle.java:177)
> > > at
> > >
> >
> com.uber.hoodie.func.CopyOnWriteLazyInsertIterable$CopyOnWriteInsertHandler.finish(CopyOnWriteLazyInsertIterable.java:168)
> > > at
> > >
> >
> com.uber.hoodie.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:42)
> > > at
> > >
> >
> com.uber.hoodie.common.util.queue.BoundedInMemoryExecutor.lambda$null$77(BoundedInMemoryExecutor.java:124)
> > > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > > ... 3 more
> > > Caused by: java.io.FileNotFoundException: No such file or directory:
> > >
> >
> s3a://vungle2-dataeng/jun-test/stagebugfix20190507/2019-05-07_12/c3790f3b-5a0e-4f2b-b934-3875175f6f9a_7_20190507122110.parquet
> > > at
> > >
> >
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:993)
> > > at
> > >
> >
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77)
> > > at com.uber.hoodie.common.util.FSUtils.getFileSize(FSUtils.java:126)
> > > at com.uber.hoodie.io
> > .HoodieCreateHandle.close(HoodieCreateHandle.java:168)
> > > ... 7 more
> >
> >
> > *Cause*:
> > After write files, FSUtils failed to find files in this line:
> >
> >
> https://github.com/apache/incubator-hudi/blob/master/hoodie-client/src/main/java/com/uber/hoodie/io/HoodieCreateHandle.java#L168
> >
> > So it throw exception out, and write another bucket 7, which exactly same
> > with "failed" one, and cause duplications in this partition.
> >
> > Any thing we can do to avoid this case?
> >
> > Thanks,
> > Jun
> > --
> > [image: vshapesaqua11553186012.gif] <https://vungle.com/>  *Jun Zhu*
> > Sr. Engineer I, Data
> > +86 18565739171
> >
> > [image: in1552694272.png] <https://www.linkedin.com/company/vungle>
> > [image:
> > fb1552694203.png] <https://facebook.com/vungle>      [image:
> > tw1552694330.png] <https://twitter.com/vungle>      [image:
> > ig1552694392.png] <https://www.instagram.com/vungle>
> > Units 3801, 3804, 38F, C Block, Beijing Yintai Center, Beijing, China
>
>
>
> --
> [image: vshapesaqua11553186012.gif] <https://vungle.com/>   *Jun Zhu*
> Sr. Engineer I, Data
> +86 18565739171
>
> [image: in1552694272.png] <https://www.linkedin.com/company/vungle>
> [image:
> fb1552694203.png] <https://facebook.com/vungle>      [image:
> tw1552694330.png] <https://twitter.com/vungle>      [image:
> ig1552694392.png] <https://www.instagram.com/vungle>
> Units 3801, 3804, 38F, C Block, Beijing Yintai Center, Beijing, China
>

Reply via email to