[
https://issues.apache.org/jira/browse/HUDI-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Volodymyr Burenin updated HUDI-4845:
------------------------------------
Description:
When I try to recreate a table in the metastore that has around 4k partitions, I
get the following exception during the
org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths call.
{code:java}
Caused by: java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
	at java.base/java.util.concurrent.ForkJoinPool.tryCompensate(ForkJoinPool.java:1575)
	at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3115)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1823)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
	at org.apache.hadoop.util.functional.FutureIO.awaitFuture(FutureIO.java:73)
	at org.apache.hadoop.fs.impl.FutureIOSupport.awaitFuture(FutureIOSupport.java:65)
	at org.apache.hadoop.fs.s3a.Listing$ObjectListingIterator.next(Listing.java:821)
	at org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.requestNextBatch(Listing.java:612)
	at org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.<init>(Listing.java:536)
	at org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:173)
	at org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:148)
	at org.apache.hadoop.fs.s3a.Listing.getFileStatusesAssumingNonEmptyDir(Listing.java:414){code}
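For reference, this failure mode can be reproduced outside Hudi: when a task running inside a ForkJoinPool blocks on CompletableFuture.get(), the pool tries to spawn a compensation thread to keep its target parallelism, and once the pool's maximum thread count is reached, tryCompensate throws exactly this RejectedExecutionException. A minimal standalone sketch (the pool sizes here are deliberately tiny and illustrative, not Hudi's actual configuration):

```java
import java.util.concurrent.*;

// Sketch of the failure mode from the stack trace above: blocking on a
// CompletableFuture inside a ForkJoinPool worker triggers compensation, and a
// hard thread cap makes that compensation fail with
// "Thread limit exceeded replacing blocked worker".
public class CompensationDemo {

    // Returns true if blocking the pool's only worker raised the exception.
    static boolean blockingTriggersThreadLimit() throws Exception {
        // Pool hard-capped at 1 thread total (parallelism 1, maximumPoolSize 1,
        // minimumRunnable 1): any managed blocking needs a compensation thread
        // that can never be created.
        ForkJoinPool pool = new ForkJoinPool(
                1, ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, false,
                1, 1, 1, null, 60, TimeUnit.SECONDS);
        CompletableFuture<Void> neverDone = new CompletableFuture<>();
        ForkJoinTask<Boolean> task = pool.submit(() -> {
            try {
                // Managed blocking inside the only worker thread.
                neverDone.get(2, TimeUnit.SECONDS);
                return false;
            } catch (RejectedExecutionException e) {
                // "Thread limit exceeded replacing blocked worker"
                return true;
            } catch (Exception other) {
                return false;
            }
        });
        boolean hit = task.get();
        pool.shutdown();
        return hit;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("hit thread limit: " + blockingTriggersThreadLimit());
    }
}
```

With 1500 listing tasks all blocking on S3 futures at once, the same thing presumably happens in the (much larger, but still bounded) common pool.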
I tracked the problem down to here:
[https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java]
Apparently this value is too high:
{code:java}
private static final int DEFAULT_LISTING_PARALLELISM = 1500;
{code}
Presumably, with that many listing tasks blocking on S3 futures at once, the ForkJoinPool exhausts its thread limit while spawning compensation threads.
After I dropped this value to 500, the problem was resolved.
I am not sure this is the best fix; however, this value definitely has to be
configurable, and even the related TODO note in the code says so.
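For illustration only, a minimal sketch of what "configurable" could look like. This is not actual Hudi code: the class, method, and the idea of capping by partition-path count are hypothetical; the point is just that the constant becomes a fallback rather than a hard-coded value.

```java
// Hypothetical sketch (not Hudi's API): resolve the listing parallelism from
// an optional configured value, falling back to the current default, and cap
// it by the number of paths so small tables never over-provision tasks.
public class ListingParallelism {
    static final int DEFAULT_LISTING_PARALLELISM = 1500;

    // Configured value if present, else the default; never more than the
    // number of partition paths to list, and never less than 1.
    static int effectiveParallelism(Integer configured, int numPaths) {
        int base = (configured != null) ? configured : DEFAULT_LISTING_PARALLELISM;
        return Math.max(1, Math.min(base, numPaths));
    }

    public static void main(String[] args) {
        System.out.println(effectiveParallelism(null, 4000)); // default applies: 1500
        System.out.println(effectiveParallelism(500, 4000));  // configured wins: 500
        System.out.println(effectiveParallelism(null, 10));   // capped by path count: 10
    }
}
```

A user hitting the thread-limit exception could then lower the value (as I did with 500) without patching and rebuilding hudi-common.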
> HiveSync fails while scanning a table with a large number of partitions
> -----------------------------------------------------------------------
>
> Key: HUDI-4845
> URL: https://issues.apache.org/jira/browse/HUDI-4845
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Volodymyr Burenin
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)