[ https://issues.apache.org/jira/browse/MAPREDUCE-7465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801918#comment-17801918 ]
Steve Loughran commented on MAPREDUCE-7465:
-------------------------------------------

bq. I understand that developers are reluctant to integrate the PR, but it does solve my problem correctly. I would appreciate it being disabled by default but configurable with a property, so I can enable it and still use the official version without a local patch applied to my jars.

I really, really, really don't want to do this, as (a) it is a critical piece of code and (b) we would be committing to maintaining it.

bq. There is no problem of "controlling the throttling" in the Hadoop Azure ABFS code... it is already very bad at using the legacy class java.net.HttpURLConnection, establishing very slow HTTPS connections one by one without keeping TCP sockets alive.

bq. We do have throttling, however: not from a single JVM, but because we have so many (>= 1000) Spark applications running concurrently, and so many useless "prefetching threads"! Trying to control throttling in a single JVM is, in my opinion, useless. Azure ABFS can support 20 million operations per hour (and per storage account), and Microsoft Azure was even able to increase it further.

bq. Trying to control throttling in a single JVM is, in my opinion, useless.

We do see scale issues in job commits from renames... if you aren't seeing them, it is because of our attempts to handle this (HADOOP-18002 and related). ABFS also supports 100-continue, self-throttling and other mechanisms for cross-JVM throttling; it is better than the S3A code by a large margin, where even large directory deletions can blow up and make a mess of retries within the AWS SDK itself.

It is precisely because of rename scale issues during job commit that the manifest committer was written; it is used in production ABFS deployments. If you are targeting ABFS storage, it is the way to get robust job commit even on heavily loaded stores, and it is shipping in current releases. If you are having problems getting it working, I am happy to assist you in getting set up.
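As a rough sketch of the setup (the property and class names here are taken from the manifest committer documentation, so treat them as assumptions and verify against the Hadoop release actually deployed), binding abfs:// output to the manifest committer is essentially a one-property change through the per-scheme factory lookup of PathOutputCommitterFactory:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class ManifestCommitterSetup {

  /**
   * Illustrative sketch only: bind abfs:// output paths to the manifest
   * committer via the per-scheme committer factory property. Property and
   * class names are assumptions from the manifest committer docs; verify
   * them against the deployed Hadoop version.
   */
  public static Configuration bindManifestCommitter(Configuration conf) {
    conf.set("mapreduce.outputcommitter.factory.scheme.abfs",
        "org.apache.hadoop.mapreduce.lib.output.committer.manifest"
            + ".ManifestCommitterFactory");
    return conf;
  }
}
{code}

For Spark SQL writing Parquet, the factory setting alone is usually not enough; the spark-hadoop-cloud bindings (PathOutputCommitProtocol and BindingParquetOutputCommitter) generally need to be configured as well, which is the area the PARQUET-2416 work touches.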
> performance problem in FileOutputCommitter for big list processed by single thread
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7465
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7465
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance
>    Affects Versions: 3.2.3, 3.3.2, 3.2.4, 3.3.5, 3.3.3, 3.3.4, 3.3.6
>            Reporter: Arnaud Nauwynck
>            Priority: Minor
>              Labels: pull-request-available
>
> When committing a big Hadoop job (for example via Spark) with many partitions, the FileOutputCommitter class renames thousands of directories/files with a single thread. This is a performance issue, caused by many waits on FileSystem storage operations.
> I propose that above a configurable threshold (default=3, configurable via the property 'mapreduce.fileoutputcommitter.parallel.threshold'), FileOutputCommitter renames the list of files using parallel threads on the default JVM ExecutorService (ForkJoinPool.commonPool()).
> See pull request: https://github.com/apache/hadoop/pull/6378
> Notice that sub-class instances of FileOutputCommitter are supposed to be created at runtime depending on a configurable property (PathOutputCommitterFactory.java: https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/PathOutputCommitterFactory.java).
> But, for example, in Parquet + Spark this is buggy and cannot be changed at runtime. There is an ongoing Jira and PR to fix it in Parquet + Spark: https://issues.apache.org/jira/browse/PARQUET-2416
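For reference, a minimal sketch of the parallel-rename idea described in the quoted issue, assuming the proposed 'mapreduce.fileoutputcommitter.parallel.threshold' property and ForkJoinPool.commonPool(). This is not the code from https://github.com/apache/hadoop/pull/6378, only an outline of the technique under discussion:

{code:java}
import java.io.IOException;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Illustrative sketch only, not the pull request code: rename a batch of
 * task output paths into a destination directory, switching to parallel
 * execution once the list exceeds a configurable threshold.
 */
public class ParallelRenameSketch {

  static final String PARALLEL_THRESHOLD =
      "mapreduce.fileoutputcommitter.parallel.threshold";

  static void renameAll(FileSystem fs, Configuration conf,
      List<Path> sources, Path destDir) throws IOException {
    int threshold = conf.getInt(PARALLEL_THRESHOLD, 3);
    if (sources.size() < threshold) {
      // Small lists: keep the existing sequential behaviour.
      for (Path src : sources) {
        renameOne(fs, src, new Path(destDir, src.getName()));
      }
      return;
    }
    try {
      // ForkJoinPool.commonPool() is the JVM-wide default pool; a dedicated,
      // bounded executor may be preferable so commit concurrency is not
      // coupled to everything else running in the JVM.
      ForkJoinPool.commonPool().submit(() ->
          sources.parallelStream().forEach(src -> {
            try {
              renameOne(fs, src, new Path(destDir, src.getName()));
            } catch (IOException e) {
              throw new RuntimeException(e);
            }
          })).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted while renaming", e);
    } catch (ExecutionException e) {
      throw new IOException("Parallel rename failed", e.getCause());
    }
  }

  private static void renameOne(FileSystem fs, Path src, Path dst)
      throws IOException {
    if (!fs.rename(src, dst)) {
      throw new IOException("Failed to rename " + src + " to " + dst);
    }
  }
}
{code}

One reason a change like this is hard to accept into FileOutputCommitter itself is visible in the sketch: failures inside the shared pool must still surface as the IOException the commit protocol expects, and using the common pool ties commit concurrency to whatever else is running in the JVM.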