[ https://issues.apache.org/jira/browse/HUDI-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921922#comment-16921922 ]
leesf commented on HUDI-148:
----------------------------

Fixed via master: a1483f2c5f3b921d1117d31f75453e45e5717259

> Small File selection logic for MOR must skip fileIds selected for pending
> compaction correctly
> ----------------------------------------------------------------------------------------------
>
>                 Key: HUDI-148
>                 URL: https://issues.apache.org/jira/browse/HUDI-148
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>          Components: Common Core
>            Reporter: BALAJI VARADARAJAN
>            Assignee: BALAJI VARADARAJAN
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.5.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current small-file selection logic can sometimes throw an exception
> similar to:
>
> com.uber.hoodie.exception.HoodieUpsertException: Failed to upsert for commit time 20190608082556
>     at com.uber.hoodie.HoodieWriteClient.upsert(HoodieWriteClient.java:175)
>     ......
>     ......
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.NoSuchElementException: No value present
>     at java.util.Optional.get(Optional.java:135)
>     at com.uber.hoodie.table.HoodieMergeOnReadTable$MergeOnReadUpsertPartitioner.lambda$getSmallFiles$0(HoodieMergeOnReadTable.java:417)
>     at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
>     ....
>     at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1359)
>     at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
>     at com.uber.hoodie.table.HoodieMergeOnReadTable$MergeOnReadUpsertPartitioner.getSmallFiles(HoodieMergeOnReadTable.java:421)
>     at com.uber.hoodie.table.HoodieCopyOnWriteTable$UpsertPartitioner.assignInserts(HoodieCopyOnWriteTable.java:635)
>     at com.uber.hoodie.table.HoodieCopyOnWriteTable$UpsertPartitioner.<init>(HoodieCopyOnWriteTable.java:598)
>     at com.uber.hoodie.table.HoodieMergeOnReadTable$MergeOnReadUpsertPartitioner.<init>(HoodieMergeOnReadTable.java:387)
>     at com.uber.hoodie.table.HoodieMergeOnReadTable.getUpsertPartitioner(HoodieMergeOnReadTable.java:101)
>     at com.uber.hoodie.HoodieWriteClient.getPartitioner(HoodieWriteClient.java:469)
>     at com.uber.hoodie.HoodieWriteClient.upsertRecordsInternal(HoodieWriteClient.java:453)
>     at com.uber.hoodie.HoodieWriteClient.upsert(HoodieWriteClient.java:170)
>     ... 10 more
>
> =========
>
> This happens because the initial stream of file slices considered for small
> files also includes those in pending compaction.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
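The failure mode in the description is that a file slice selected for pending compaction may have no base file yet, so calling `Optional.get()` on it during small-file selection throws `NoSuchElementException`. A minimal, self-contained Java sketch of the filtering idea follows; the class, record-like types, and method names here are illustrative stand-ins, not Hudi's actual API:

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical sketch: skip file slices whose fileId is selected for pending
// compaction BEFORE measuring small files, so Optional.get() is never called
// on a slice that has no base file. Names are illustrative, not Hudi's API.
public class SmallFileSelectionSketch {

    // Minimal stand-in for a Hudi file slice: a fileId plus an optional base file.
    static class FileSlice {
        final String fileId;
        final Optional<String> baseFile;
        FileSlice(String fileId, Optional<String> baseFile) {
            this.fileId = fileId;
            this.baseFile = baseFile;
        }
    }

    static List<FileSlice> eligibleSmallFiles(List<FileSlice> latestSlices,
                                              Set<String> pendingCompactionFileIds) {
        return latestSlices.stream()
            // Slices under pending compaction may lack a base file;
            // including them is what triggered NoSuchElementException.
            .filter(fs -> !pendingCompactionFileIds.contains(fs.fileId))
            // Only slices with an actual base file can be measured as "small".
            .filter(fs -> fs.baseFile.isPresent())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FileSlice> slices = Arrays.asList(
            new FileSlice("f1", Optional.of("f1.parquet")),
            new FileSlice("f2", Optional.empty()),   // pending compaction, no base file yet
            new FileSlice("f3", Optional.of("f3.parquet")));
        Set<String> pending = Collections.singleton("f2");
        System.out.println(eligibleSmallFiles(slices, pending).size()); // prints 2
    }
}
```

Filtering on the pending-compaction set first means the later `Optional` access only runs on slices guaranteed to carry a base file, which matches the fix's intent of skipping fileIds selected for pending compaction.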