[ https://issues.apache.org/jira/browse/SPARK-20107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943305#comment-15943305 ]
Sean Owen commented on SPARK-20107:
-----------------------------------

11 minutes versus what? Does it work on stock Hadoop 2.6 (sounds like it)? I'd not open a JIRA until you're ready to explain the change.

> Speed up FileOutputCommitter#commitJob for many output files
> ------------------------------------------------------------
>
>                 Key: SPARK-20107
>                 URL: https://issues.apache.org/jira/browse/SPARK-20107
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Yuming Wang
>
> It can speed up the commit by {{11 minutes}} for 216869 output files.
> This improvement can affect all of Cloudera's Hadoop versions from cdh5-2.6.0_5.4.0 onward
> (see:
> https://github.com/cloudera/hadoop-common/commit/1c1236182304d4075276c00c4592358f428bc433
> and
> https://github.com/cloudera/hadoop-common/commit/16b2de27321db7ce2395c08baccfdec5562017f0)
> and Apache Hadoop 2.7.0 and later.