Apache Hadoop qbt Report: trunk+JDK8 on Windows/x64

2018-03-22 Thread Apache Jenkins Server
For more details, see https://builds.apache.org/job/hadoop-trunk-win/414/

[Mar 21, 2018 8:53:35 PM] (xyao) HDFS-11043. TestWebHdfsTimeouts fails. Contributed by Xiaoyu Yao and
[Mar 21, 2018 10:19:20 PM] (jlowe) YARN-8054. Improve robustness of the LocalDirsHandlerService
[Mar 21, 2018 11:46:03 PM] (shv) HDFS-12884. BlockUnderConstructionFeature.truncateBlock should be of




-1 overall


The following subsystems voted -1:
unit


The following subsystems are considered long running:
(runtime bigger than 1h 00m 00s)
unit


Specific tests:

Failed CTEST tests :

   test_test_libhdfs_threaded_hdfs_static 

Failed junit tests :

   hadoop.crypto.TestCryptoStreamsWithOpensslAesCtrCryptoCodec 
   hadoop.fs.contract.rawlocal.TestRawlocalContractAppend 
   hadoop.fs.TestFileUtil 
   hadoop.fs.TestFsShellCopy 
   hadoop.fs.TestFsShellList 
   hadoop.fs.TestLocalFileSystem 
   hadoop.fs.TestRawLocalFileSystemContract 
   hadoop.fs.TestTrash 
   hadoop.http.TestHttpServer 
   hadoop.http.TestHttpServerLogs 
   hadoop.io.compress.TestCodec 
   hadoop.io.nativeio.TestNativeIO 
   hadoop.ipc.TestIPC 
   hadoop.ipc.TestSocketFactory 
   hadoop.metrics2.impl.TestStatsDMetrics 
   hadoop.metrics2.sink.TestRollingFileSystemSinkWithLocal 
   hadoop.security.TestSecurityUtil 
   hadoop.security.TestShellBasedUnixGroupsMapping 
   hadoop.security.token.TestDtUtilShell 
   hadoop.util.TestNativeCodeLoader 
   hadoop.fs.TestResolveHdfsSymlink 
   hadoop.hdfs.client.impl.TestBlockReaderLocalLegacy 
   hadoop.hdfs.crypto.TestHdfsCryptoStreams 
   hadoop.hdfs.qjournal.client.TestQuorumJournalManager 
   hadoop.hdfs.qjournal.server.TestJournalNode 
   hadoop.hdfs.qjournal.server.TestJournalNodeSync 
   hadoop.hdfs.security.TestDelegationTokenForProxyUser 
   hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks 
   hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages 
   hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestProvidedImpl 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica 
   hadoop.hdfs.server.datanode.TestBlockPoolSliceStorage 
   hadoop.hdfs.server.datanode.TestBlockRecovery 
   hadoop.hdfs.server.datanode.TestBlockScanner 
   hadoop.hdfs.server.datanode.TestDataNodeFaultInjector 
   hadoop.hdfs.server.datanode.TestDataNodeLifeline 
   hadoop.hdfs.server.datanode.TestDataNodeMetrics 
   hadoop.hdfs.server.datanode.TestDataNodeUUID 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.hdfs.server.datanode.TestHSync 
   hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport 
   hadoop.hdfs.server.datanode.web.TestDatanodeHttpXFrame 
   hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand 
   hadoop.hdfs.server.diskbalancer.TestDiskBalancerRPC 
   hadoop.hdfs.server.mover.TestMover 
   hadoop.hdfs.server.mover.TestStorageMover 
   hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA 
   hadoop.hdfs.server.namenode.ha.TestHAAppend 
   hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA 
   hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics 
   hadoop.hdfs.server.namenode.snapshot.TestINodeFileUnderConstructionWithSnapshot 
   hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot 
   hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots 
   hadoop.hdfs.server.namenode.snapshot.TestSnapRootDescendantDiff 
   hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport 
   hadoop.hdfs.server.namenode.TestAddBlock 
   hadoop.hdfs.server.namenode.TestAuditLoggerWithCommands 
   hadoop.hdfs.server.namenode.TestCheckpoint 
   hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate 
   hadoop.hdfs.server.namenode.TestEditLogRace 
   hadoop.hdfs.server.namenode.TestFileTruncate 
   hadoop.hdfs.server.namenode.TestFsck 
   hadoop.hdfs.server.namenode.TestFSImage 
   hadoop.hdfs.server.namenode.TestFSImageWithSnapshot 
   hadoop.hdfs.server.namenode.TestNamenodeCapacityReport 
   hadoop.hdfs.server.namenode.TestNameNodeMXBean 
   

Re: [VOTE] Release Apache Hadoop 3.0.1 (RC1)

2018-03-22 Thread Elek, Marton


+1 (non binding)

I did a full build from source, created a Docker container, and ran 
various basic tests using Robot Framework-based automation and 
docker-compose-based pseudo clusters [1].


Including:

* HDFS federation smoke test
* Basic ViewFS configuration
* YARN example jobs
* Spark example jobs (with and without YARN)
* Simple Hive table creation

Marton


[1]: https://github.com/flokkr/runtime-compose

On 03/18/2018 05:11 AM, Lei Xu wrote:

Hi, all

I've created release candidate RC-1 for Apache Hadoop 3.0.1

Apache Hadoop 3.0.1 will be the first bug fix release for the Apache
Hadoop 3.0 release line. It includes 49 bug fixes and security fixes, of
which 12 are blockers and 17 are critical.

Please note:
* HDFS-12990. Change default NameNode RPC port back to 8020. It makes
incompatible changes to Hadoop 3.0.0.  After 3.0.1 releases, Apache
Hadoop 3.0.0 will be deprecated due to this change.

The release page is:
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0+Release

New RC is available at: http://home.apache.org/~lei/hadoop-3.0.1-RC1/

The git tag is release-3.0.1-RC1, and the latest commit is
496dc57cc2e4f4da117f7a8e3840aaeac0c1d2d0

The maven artifacts are available at:
https://repository.apache.org/content/repositories/orgapachehadoop-1081/

Please try the release and vote; the vote will run for the usual 5
days, ending on 3/22/2018 6pm PST.

Thanks!

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.1.0 (RC0)

2018-03-22 Thread Wangda Tan
Thanks @Bharat for the quick check. The previously staged repository had
some issues, so I re-deployed the jars to Nexus.

Here's the new repo (1087):

https://repository.apache.org/content/repositories/orgapachehadoop-1087/

The other artifacts remain the same; there are no additional code changes.

On Wed, Mar 21, 2018 at 11:54 PM, Bharat Viswanadham <
bviswanad...@hortonworks.com> wrote:

> Hi Wangda,
> The Maven artifact repository does not have all the Hadoop jars. (It is
> missing many, such as hadoop-hdfs and hadoop-client.)
> https://repository.apache.org/content/repositories/orgapachehadoop-1086/
>
>
> Thanks,
> Bharat
>
>
> On 3/21/18, 11:44 PM, "Wangda Tan"  wrote:
>
> Hi folks,
>
> Thanks to the many who helped with this release since Dec 2017 [1]. We've
> created RC0 for Apache Hadoop 3.1.0. The artifacts are available here:
>
> http://people.apache.org/~wangda/hadoop-3.1.0-RC0/
>
> The RC tag in git is release-3.1.0-RC0.
>
> The maven artifacts are available via repository.apache.org at
> https://repository.apache.org/content/repositories/orgapachehadoop-1086/
>
> This vote will run 7 days (5 weekdays), ending on Mar 28 at 11:59 pm
> Pacific.
>
> 3.1.0 contains 727 [2] fixed JIRA issues since 3.0.0. Notable additions
> include the first class GPU/FPGA support on YARN, Native services, Support
> rich placement constraints in YARN, S3-related enhancements, allow HDFS
> block replicas to be provided by an external storage system, etc.
>
> We’d like to use this as a starting release for 3.1.x [1], depending on how
> it goes, get it stabilized and potentially use a 3.1.1 in several weeks as
> the stable release.
>
> We have done testing with a pseudo cluster and distributed shell job. My +1
> to start.
>
> Best,
> Wangda/Vinod
>
> [1]
> https://lists.apache.org/thread.html/b3fb3b6da8b6357a68513a6dfd104bc9e19e559aedc5ebedb4ca08c8@%3Cyarn-dev.hadoop.apache.org%3E
> [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.0)
> AND fixVersion not in (3.0.0, 3.0.0-beta1) AND status = Resolved ORDER BY
> fixVersion ASC
>
>
>


[jira] [Created] (MAPREDUCE-7068) Fix Reduce Exception was overwrited by ReduceTask

2018-03-22 Thread tartarus (JIRA)
tartarus created MAPREDUCE-7068:
---

         Summary: Fix Reduce Exception was overwrited by ReduceTask
             Key: MAPREDUCE-7068
             URL: https://issues.apache.org/jira/browse/MAPREDUCE-7068
         Project: Hadoop Map/Reduce
      Issue Type: Bug
      Components: mrv1
Affects Versions: 2.7.1
     Environment: CentOS 7, Hadoop-2.7.1, Hive-1.2.1
        Reporter: tartarus


 
{code:java}
try {
  //increment processed counter only if skipping feature is enabled
  boolean incrProcCount = SkipBadRecords.getReducerMaxSkipGroups(job)>0 &&
      SkipBadRecords.getAutoIncrReducerProcCount(job);

  ReduceValuesIterator values = isSkipping() ?
      new SkippingReduceValuesIterator(rIter,
          comparator, keyClass, valueClass,
          job, reporter, umbilical) :
      new ReduceValuesIterator(rIter,
          comparator, keyClass, valueClass,
          job, reporter);
  values.informReduceProgress();
  while (values.more()) {
    reduceInputKeyCounter.increment(1);
    reducer.reduce(values.getKey(), values, collector, reporter);
    if(incrProcCount) {
      reporter.incrCounter(SkipBadRecords.COUNTER_GROUP,
          SkipBadRecords.COUNTER_REDUCE_PROCESSED_GROUPS, 1);
    }
    values.nextKey();
    values.informReduceProgress();
  }

  // if close() throws here, the next line never runs and reducer stays non-null
  reducer.close();
  reducer = null;

  out.close(reporter);
  out = null;
} finally {
  // reached with a non-null reducer when reducer.close() above has thrown
  IOUtils.cleanupWithLogger(LOG, reducer);
  closeQuietly(out, reporter);
}
{code}
If {{reducer.close();}} throws an exception, {{reducer = null;}} will not run.
The finally block then calls {{IOUtils.cleanupWithLogger(LOG, reducer);}} on the
same reducer, which will throw an exception and overwrite the exception from
{{reducer.close();}}.

So we should catch the original exception and log it, to help track down the
underlying issue.
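
The masking effect described above can be reproduced outside Hadoop. The following is a minimal, self-contained sketch, not the ReduceTask code and not the proposed patch: FailingResource, maskedClose and preservedClose are hypothetical names, and the resource is assumed to throw an unchecked exception on every close() call. The first method mirrors the reported pattern, where the exception from the second close wins; the second keeps the original exception and attaches the later one as suppressed.

```java
public class CloseMaskingSketch {

    /** Hypothetical stand-in for a reducer whose close() fails on every call. */
    static final class FailingResource implements AutoCloseable {
        private int closeCalls = 0;

        @Override
        public void close() {
            closeCalls++;
            throw new IllegalStateException("close failed, call " + closeCalls);
        }
    }

    /** Reported pattern: the finally block closes again and its exception replaces the first. */
    static void maskedClose(FailingResource resource) {
        FailingResource open = resource;
        try {
            open.close();
            open = null; // skipped when close() throws
        } finally {
            if (open != null) {
                open.close(); // throws again; the first exception is lost
            }
        }
    }

    /** One possible remedy: remember the first exception and keep it as the primary one. */
    static void preservedClose(FailingResource resource) {
        FailingResource open = resource;
        RuntimeException primary = null;
        try {
            open.close();
            open = null;
        } catch (RuntimeException e) {
            primary = e;
            throw e;
        } finally {
            if (open != null) {
                try {
                    open.close();
                } catch (RuntimeException secondary) {
                    if (primary != null) {
                        primary.addSuppressed(secondary); // keep both, first one wins
                    } else {
                        throw secondary;
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        try {
            maskedClose(new FailingResource());
        } catch (IllegalStateException e) {
            System.out.println("masked:    " + e.getMessage());
        }
        try {
            preservedClose(new FailingResource());
        } catch (IllegalStateException e) {
            System.out.println("preserved: " + e.getMessage()
                + " (suppressed: " + e.getSuppressed().length + ")");
        }
    }
}
```

In this sketch the caller of maskedClose only ever sees the failure from the second close, while the caller of preservedClose sees the first failure with the second one attached. Logging the first exception, as the report suggests, is a simpler alternative with the same goal.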

 

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-03-22 Thread Apache Jenkins Server
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/

[Mar 21, 2018 2:41:19 AM] (rohithsharmaks) YARN-7581. HBase filters are not constructed correctly in ATSv2.
[Mar 21, 2018 2:51:35 AM] (yqlin) HDFS-13307. RBF: Improve the use of setQuota command. Contributed by
[Mar 21, 2018 3:32:05 AM] (yqlin) HDFS-13250. RBF: Router to manage requests across multiple subclusters.
[Mar 21, 2018 4:12:20 AM] (mackrorysd) HADOOP-15332. Fix typos in hadoop-aws markdown docs. Contributed by
[Mar 21, 2018 5:01:26 AM] (aajisaka) HADOOP-15330. Remove jdk1.7 profile from hadoop-annotations module
[Mar 21, 2018 6:04:05 AM] (yzhang) HDFS-13315. Add a test for the issue reported in HDFS-11481 which is
[Mar 21, 2018 8:53:35 PM] (xyao) HDFS-11043. TestWebHdfsTimeouts fails. Contributed by Xiaoyu Yao and
[Mar 21, 2018 10:19:20 PM] (jlowe) YARN-8054. Improve robustness of the LocalDirsHandlerService
[Mar 21, 2018 11:46:03 PM] (shv) HDFS-12884. BlockUnderConstructionFeature.truncateBlock should be of




-1 overall


The following subsystems voted -1:
findbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs-rbf 
   org.apache.hadoop.hdfs.server.federation.resolver.NamenodePriorityComparator implements Comparator but not Serializable At NamenodePriorityComparator.java:[lines 33-61] 
   org.apache.hadoop.hdfs.server.federation.router.RemoteMethod.getTypes() may expose internal representation by returning RemoteMethod.types At RemoteMethod.java:[line 114] 
   new org.apache.hadoop.hdfs.server.federation.router.RemoteMethod(String, Class[], Object[]) may expose internal representation by storing an externally mutable object into RemoteMethod.types At RemoteMethod.java:[line 79] 

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
   org.apache.hadoop.yarn.api.records.Resource.getResources() may expose internal representation by returning Resource.resources At Resource.java:[line 234] 
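
For readers unfamiliar with the two warning kinds listed above, here is a small illustrative sketch. It is not the Hadoop code and not a proposed fix: MethodSpec and PriorityComparator are hypothetical classes that only mimic the reported shapes. It shows defensive array copies as one common way to avoid exposing an internal array, and a comparator that also implements Serializable.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.Comparator;

public class FindBugsPatternsSketch {

    /** Hypothetical holder of a type array; copies the array on the way in and out. */
    static final class MethodSpec {
        private final Class<?>[] types;

        MethodSpec(Class<?>[] types) {
            // Copy on store, so later changes to the caller's array do not leak in.
            this.types = types.clone();
        }

        Class<?>[] getTypes() {
            // Copy on return, so callers cannot modify the internal array.
            return types.clone();
        }
    }

    /** Hypothetical comparator that is also Serializable, which avoids the first warning kind. */
    static final class PriorityComparator implements Comparator<Integer>, Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public int compare(Integer a, Integer b) {
            return Integer.compare(a, b);
        }
    }

    public static void main(String[] args) {
        Class<?>[] input = {String.class, Integer.class};
        MethodSpec spec = new MethodSpec(input);

        input[0] = Object.class;           // mutate the caller's array
        spec.getTypes()[1] = Object.class; // mutate a returned copy

        // The internal state is unaffected by both mutations.
        System.out.println(Arrays.toString(spec.getTypes()));
        System.out.println(new PriorityComparator().compare(1, 2) < 0);
    }
}
```

Whether copying is appropriate for the classes in the report depends on how hot those code paths are; projects sometimes suppress this warning instead when the array is treated as read-only by convention.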

Failed junit tests :

   hadoop.fs.shell.TestCopyPreserveFlag 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
  

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/diff-compile-javac-root.txt  [288K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/diff-checkstyle-root.txt  [17M]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/diff-patch-pylint.txt  [24K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/diff-patch-shellcheck.txt  [20K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/diff-patch-shelldocs.txt  [12K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/whitespace-eol.txt  [9.2M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/whitespace-tabs.txt  [288K]

   xml:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/xml.txt  [4.0K]

   findbugs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html  [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-warnings.html  [8.0K]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/diff-javadoc-javadoc-root.txt  [760K]

   unit:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt  [188K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/728/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [300K]
   

Re: [VOTE] Release Apache Hadoop 3.0.1 (RC1)

2018-03-22 Thread Akira Ajisaka

+1 (binding)

- verified signatures and checksums
- built from source and set up a 4-node cluster
- ran some example jobs
- verified some HDFS operations

Thanks,
Akira

On 2018/03/22 10:08, Robert Kanter wrote:

+1 (binding)

- verified signatures and checksums
- set up a pseudo cluster using Fair Scheduler from the binary tarball
- ran some of the example jobs, clicked around the UI a bit

By the way, ASF recommends using a 4096 bit key for signing, though 2048 is
fine for now.
https://www.apache.org/dev/release-signing.html#key-length


thanks
- Robert




On Wed, Mar 21, 2018 at 2:21 AM, Brahma Reddy Battula <
brahmareddy.batt...@huawei.com> wrote:


Thanks Lei Xu for driving this.

+1(binding)


--Built from the source
--Installed the HA Cluster
--Verified hdfs operations
--Ran sample jobs
--Verified the UIs



--Brahma Reddy Battula


-Original Message-
From: Lei Xu [mailto:l...@apache.org]
Sent: 18 March 2018 12:11
To: Hadoop Common ; Hdfs-dev <
hdfs-...@hadoop.apache.org>; mapreduce-dev@hadoop.apache.org;
yarn-...@hadoop.apache.org
Subject: [VOTE] Release Apache Hadoop 3.0.1 (RC1)

Hi, all

I've created release candidate RC-1 for Apache Hadoop 3.0.1

Apache Hadoop 3.0.1 will be the first bug fix release for the Apache Hadoop
3.0 release line. It includes 49 bug fixes and security fixes, of which 12
are blockers and 17 are critical.

Please note:
* HDFS-12990. Change default NameNode RPC port back to 8020. It makes
incompatible changes to Hadoop 3.0.0.  After 3.0.1 releases, Apache Hadoop
3.0.0 will be deprecated due to this change.

The release page is:
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0+Release

New RC is available at: http://home.apache.org/~lei/hadoop-3.0.1-RC1/

The git tag is release-3.0.1-RC1, and the latest commit is
496dc57cc2e4f4da117f7a8e3840aaeac0c1d2d0

The maven artifacts are available at:
https://repository.apache.org/content/repositories/orgapachehadoop-1081/

Please try the release and vote; the vote will run for the usual 5 days,
ending on 3/22/2018 6pm PST.

Thanks!

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org









[VOTE] Release Apache Hadoop 3.1.0 (RC0)

2018-03-22 Thread Wangda Tan
Hi folks,

Thanks to the many who helped with this release since Dec 2017 [1]. We've
created RC0 for Apache Hadoop 3.1.0. The artifacts are available here:

http://people.apache.org/~wangda/hadoop-3.1.0-RC0/

The RC tag in git is release-3.1.0-RC0.

The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1086/

This vote will run 7 days (5 weekdays), ending on Mar 28 at 11:59 pm
Pacific.

3.1.0 contains 727 [2] fixed JIRA issues since 3.0.0. Notable additions
include the first class GPU/FPGA support on YARN, Native services, Support
rich placement constraints in YARN, S3-related enhancements, allow HDFS
block replicas to be provided by an external storage system, etc.

We’d like to use this as a starting release for 3.1.x [1], depending on how
it goes, get it stabilized and potentially use a 3.1.1 in several weeks as
the stable release.

We have done testing with a pseudo cluster and distributed shell job. My +1
to start.

Best,
Wangda/Vinod

[1]
https://lists.apache.org/thread.html/b3fb3b6da8b6357a68513a6dfd104bc9e19e559aedc5ebedb4ca08c8@%3Cyarn-dev.hadoop.apache.org%3E
[2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.0)
AND fixVersion not in (3.0.0, 3.0.0-beta1) AND status = Resolved ORDER BY
fixVersion ASC


[jira] [Reopened] (MAPREDUCE-6823) FileOutputFormat to support configurable PathOutputCommitter factory

2018-03-22 Thread Vinod Kumar Vavilapalli (JIRA)

 [ https://issues.apache.org/jira/browse/MAPREDUCE-6823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli reopened MAPREDUCE-6823:


I'm guessing that this is a dup of HADOOP-13786, like the others I just 
closed as dups.

Reopening and closing this as a dup. [~ste...@apache.org], please revert 
if this is incorrect.

> FileOutputFormat to support configurable PathOutputCommitter factory
> 
>
>                 Key: MAPREDUCE-6823
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6823
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 3.0.0-alpha2
>         Environment: Targeting S3 as the output of work
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-13786-HADOOP-13345-001.patch, 
> MAPREDUCE-6823-002.patch, MAPREDUCE-6823-002.patch, MAPREDUCE-6823-004.patch
>
>
> In HADOOP-13786 I'm adding a custom subclass for FileOutputFormat, one which 
> can talk directly to the S3A Filesystem for more efficient operations, better 
> failure modes, and, most critically, as part of HADOOP-13345, atomic commit 
> of output. The normal committer relies on directory rename() being atomic for 
> this; for S3 we don't have that luxury.
> To support a custom committer, we need to be able to tell FileOutputFormat 
> (and implicitly, all subclasses which don't have their own custom committer) 
> to use our new {{S3AOutputCommitter}}.
> I propose: 
> # {{FileOutputFormat}} takes a factory to create committers.
> # The factory takes a URI and a {{TaskAttemptContext}} and returns a committer.
> # The default implementation always returns a {{FileOutputCommitter}}.
> # A configuration option allows a new factory to be named.
> # An {{S3AOutputCommitterFactory}} returns a {{FileOutputCommitter}} or a 
> new {{S3AOutputCommitter}}, depending upon the URI of the destination.
> Note that MRv1 already supports configurable committers; this is only for the V2 
> API.
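
The factory arrangement proposed in the quoted description can be sketched roughly as follows. This is an illustration under stated assumptions, not the HADOOP-13786 implementation: every type here (Committer, CommitterFactory, the two committers, the two factories) and the configuration key sketch.committer.factory are hypothetical, a plain Map stands in for the job configuration, and the TaskAttemptContext argument from the proposal is left out to keep the example self-contained.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class CommitterFactorySketch {

    /** Hypothetical stand-in for an output committer; not a Hadoop type. */
    interface Committer {
        String describe();
    }

    /** Hypothetical factory: chooses a committer for a destination URI. */
    interface CommitterFactory {
        Committer create(URI destination);
    }

    static final class FileCommitter implements Committer {
        @Override
        public String describe() {
            return "file committer";
        }
    }

    static final class ObjectStoreCommitter implements Committer {
        @Override
        public String describe() {
            return "object-store committer";
        }
    }

    /** Default behaviour: always the file committer, whatever the destination. */
    static final class DefaultFactory implements CommitterFactory {
        @Override
        public Committer create(URI destination) {
            return new FileCommitter();
        }
    }

    /** Chooses by URI scheme: the object-store committer for s3a, otherwise the file committer. */
    static final class SchemeAwareFactory implements CommitterFactory {
        @Override
        public Committer create(URI destination) {
            if ("s3a".equals(destination.getScheme())) {
                return new ObjectStoreCommitter();
            }
            return new FileCommitter();
        }
    }

    /** Hypothetical configuration key naming the factory to use. */
    static final String FACTORY_KEY = "sketch.committer.factory";

    /** Resolves the factory named in the configuration, falling back to the default. */
    static CommitterFactory factoryFrom(Map<String, String> conf) {
        String name = conf.getOrDefault(FACTORY_KEY, "default");
        switch (name) {
            case "default":
                return new DefaultFactory();
            case "scheme-aware":
                return new SchemeAwareFactory();
            default:
                throw new IllegalArgumentException("Unknown committer factory: " + name);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FACTORY_KEY, "scheme-aware");
        CommitterFactory factory = factoryFrom(conf);

        System.out.println(factory.create(URI.create("s3a://bucket/out")).describe());
        System.out.println(factory.create(URI.create("hdfs://nn/out")).describe());
    }
}
```

The point of the indirection is that the output format never names a concrete committer: with no option set, behaviour stays as before, and a destination-aware factory can be switched in through configuration alone.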






[jira] [Resolved] (MAPREDUCE-6823) FileOutputFormat to support configurable PathOutputCommitter factory

2018-03-22 Thread Vinod Kumar Vavilapalli (JIRA)

 [ https://issues.apache.org/jira/browse/MAPREDUCE-6823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-6823.

   Resolution: Duplicate
Fix Version/s: (was: 3.1.0)
