[ https://issues.apache.org/jira/browse/HUDI-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921922#comment-16921922 ]
leesf commented on HUDI-148:
----------------------------
Fixed via master: a1483f2c5f3b921d1117d31f75453e45e5717259
> Small File selection logic for MOR must skip fileIds selected for pending
> compaction correctly
> ----------------------------------------------------------------------------------------------
>
> Key: HUDI-148
> URL: https://issues.apache.org/jira/browse/HUDI-148
> Project: Apache Hudi (incubating)
> Issue Type: Bug
> Components: Common Core
> Reporter: BALAJI VARADARAJAN
> Assignee: BALAJI VARADARAJAN
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The current small-file selection logic can sometimes throw an exception
> similar to:
> com.uber.hoodie.exception.HoodieUpsertException: Failed to upsert for commit time 20190608082556
>     at com.uber.hoodie.HoodieWriteClient.upsert(HoodieWriteClient.java:175)
>     ......
>     ......
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.NoSuchElementException: No value present
>     at java.util.Optional.get(Optional.java:135)
>     at com.uber.hoodie.table.HoodieMergeOnReadTable$MergeOnReadUpsertPartitioner.lambda$getSmallFiles$0(HoodieMergeOnReadTable.java:417)
>     at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
>     ....
>     at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1359)
>     at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
>     at com.uber.hoodie.table.HoodieMergeOnReadTable$MergeOnReadUpsertPartitioner.getSmallFiles(HoodieMergeOnReadTable.java:421)
>     at com.uber.hoodie.table.HoodieCopyOnWriteTable$UpsertPartitioner.assignInserts(HoodieCopyOnWriteTable.java:635)
>     at com.uber.hoodie.table.HoodieCopyOnWriteTable$UpsertPartitioner.<init>(HoodieCopyOnWriteTable.java:598)
>     at com.uber.hoodie.table.HoodieMergeOnReadTable$MergeOnReadUpsertPartitioner.<init>(HoodieMergeOnReadTable.java:387)
>     at com.uber.hoodie.table.HoodieMergeOnReadTable.getUpsertPartitioner(HoodieMergeOnReadTable.java:101)
>     at com.uber.hoodie.HoodieWriteClient.getPartitioner(HoodieWriteClient.java:469)
>     at com.uber.hoodie.HoodieWriteClient.upsertRecordsInternal(HoodieWriteClient.java:453)
>     at com.uber.hoodie.HoodieWriteClient.upsert(HoodieWriteClient.java:170)
>     ... 10 more
> =========
>
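> This happens because the initial stream of file slices considered for
> small-file selection also includes those in pending compaction. Judging by
> the trace, such a slice can lack the value that getSmallFiles unwraps with
> Optional.get(), which then throws NoSuchElementException.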
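
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the actual Hudi fix or API: SliceSketch, SmallFileSketch, selectSmallFiles, and the size values are hypothetical stand-ins, and the sketch assumes that a slice under pending compaction may have no base file yet. It excludes pending-compaction file ids before touching the Optional and never calls get() on an empty value.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for a file slice; NOT Hudi's actual class.
final class SliceSketch {
    final String fileId;
    // Assumption: empty when the slice has no base file yet.
    final Optional<Long> baseFileSizeBytes;

    SliceSketch(String fileId, Optional<Long> baseFileSizeBytes) {
        this.fileId = fileId;
        this.baseFileSizeBytes = baseFileSizeBytes;
    }
}

final class SmallFileSketch {
    // Returns the ids of slices whose base file is smaller than the limit,
    // skipping any file id that is part of a pending compaction.
    static List<String> selectSmallFiles(List<SliceSketch> slices,
                                         Set<String> pendingCompactionFileIds,
                                         long smallFileLimitBytes) {
        return slices.stream()
            // Exclude pending-compaction slices up front.
            .filter(s -> !pendingCompactionFileIds.contains(s.fileId))
            // Guard the Optional instead of calling get() unconditionally.
            .filter(s -> s.baseFileSizeBytes.isPresent()
                      && s.baseFileSizeBytes.get() < smallFileLimitBytes)
            .map(s -> s.fileId)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<SliceSketch> slices = Arrays.asList(
            new SliceSketch("f1", Optional.of(10L)),   // small base file
            new SliceSketch("f2", Optional.empty()),   // pending compaction, no base file
            new SliceSketch("f3", Optional.of(200L))); // large base file
        Set<String> pending = new HashSet<>(Arrays.asList("f2"));
        System.out.println(selectSmallFiles(slices, pending, 100L)); // prints [f1]
    }
}
```

Without the first filter and the isPresent() guard, unwrapping the empty Optional for "f2" would throw java.util.NoSuchElementException: No value present, matching the "Caused by" line in the trace.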