[jira] [Created] (MAPREDUCE-6423) MapOutput Sampler
Ram Manohar Bheemana created MAPREDUCE-6423:
-----------------------------------------------

             Summary: MapOutput Sampler
                 Key: MAPREDUCE-6423
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6423
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
            Reporter: Ram Manohar Bheemana
            Priority: Minor

Need a sampler based on the map output keys. The current InputSampler implementation has a major drawback: it assumes a mapper's input and output keys are the same, which is generally not the case.

Approach:
1. Create a sampler which samples the data based on the input.
2. Run a small MapReduce job in uber-task mode, using the original job's mapper and an identity reducer, to generate the required sample of map output keys.
3. Optionally, allow specifying which input files to sample. For example, given input files A and B, we should be able to specify that only file A be used for sampling.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
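[Editor's note] The gap described above can be illustrated without Hadoop: the key point is that samples must be drawn from keys *after* the map-side transform runs, not from the raw input. Below is a minimal, Hadoop-free sketch of the idea using reservoir sampling; all names (`MapOutputSampler`, `sample`, `mapKeyFn`) are hypothetical and not part of any proposed patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

public class MapOutputSampler {
    // Reservoir-sample up to numSamples keys from the *mapped* records, so the
    // sample reflects the mapper's output key space rather than its input.
    static <I, K> List<K> sample(Iterable<I> input, Function<I, K> mapKeyFn,
                                 int numSamples, long seed) {
        List<K> reservoir = new ArrayList<>(numSamples);
        Random rnd = new Random(seed);
        int seen = 0;
        for (I record : input) {
            K key = mapKeyFn.apply(record); // apply the job's map-side key transform
            seen++;
            if (reservoir.size() < numSamples) {
                reservoir.add(key);
            } else {
                // Algorithm R: keep this key with probability numSamples / seen
                int j = rnd.nextInt(seen);
                if (j < numSamples) {
                    reservoir.set(j, key);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        // Input lines; the "mapper" keys each line by its first word's length.
        List<String> lines = List.of("alpha one", "be two", "gamma three", "delta four");
        List<Integer> keys = sample(lines, l -> l.split(" ")[0].length(), 2, 42L);
        System.out.println(keys.size()); // prints 2
    }
}
```

In the actual proposal, the `mapKeyFn` role would be played by running the job's real mapper (step 2 above, as an uber-mode job with an identity reducer), and the sampled keys would then feed a total-order partition boundary computation.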
[jira] [Created] (MAPREDUCE-6422) Add REST API for getting all attempts for all the tasks
Lavkesh Lahngir created MAPREDUCE-6422:
-----------------------------------------------

             Summary: Add REST API for getting all attempts for all the tasks
                 Key: MAPREDUCE-6422
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6422
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
            Reporter: Lavkesh Lahngir
            Assignee: Lavkesh Lahngir

The web UI has a feature where one can get all attempts for all map tasks or reduce tasks. The REST API seems to be missing it. Should we add this in both HsWebService and AMWebService?

{code}
@GET
@Path("/mapreduce/jobs/{jobid}/tasks/attempts")
@Produces({ MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML })
public JobTaskAttemptsInfo getAllJobTaskAttempts(@Context HttpServletRequest hsr,
    @PathParam("jobid") String jid, @QueryParam("type") String type) {
}
{code}

We might also add a query param on state, to filter by succeeded attempts etc. Thoughts?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Hadoop-Mapreduce-trunk - Build # 2191 - Failure
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2191/

###################################################################################
########################## LAST 60 LINES OF THE CONSOLE ###########################
###################################################################################
[...truncated 32616 lines...]
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop MapReduce Client .................... SUCCESS [  2.748 s]
[INFO] Apache Hadoop MapReduce Core ...................... SUCCESS [01:31 min]
[INFO] Apache Hadoop MapReduce Common .................... SUCCESS [ 28.753 s]
[INFO] Apache Hadoop MapReduce Shuffle ................... SUCCESS [  4.807 s]
[INFO] Apache Hadoop MapReduce App ....................... SUCCESS [08:38 min]
[INFO] Apache Hadoop MapReduce HistoryServer ............. SUCCESS [05:35 min]
[INFO] Apache Hadoop MapReduce JobClient ................. FAILURE [01:42 h]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ..... SKIPPED
[INFO] Apache Hadoop MapReduce NativeTask ................ SKIPPED
[INFO] Apache Hadoop MapReduce Examples .................. SKIPPED
[INFO] Apache Hadoop MapReduce ........................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:59 h
[INFO] Finished at: 2015-07-01T16:03:50+00:00
[INFO] Final Memory: 34M/751M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on project hadoop-mapreduce-client-jobclient: There are test failures.
[ERROR]
[ERROR] Please refer to /home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Sending artifact delta relative to Hadoop-Mapreduce-trunk #2190
Archived 1 artifacts
Archive block size is 32768
Received 1 blocks and 20415997 bytes
Compression is 0.2%
Took 7.2 sec
Recording test results
Updating HADOOP-12124
Updating HADOOP-12116
Updating HADOOP-10798
Updating YARN-3827
Updating MAPREDUCE-6121
Updating HDFS-8635
Updating MAPREDUCE-6384
Updating YARN-3768
Updating HADOOP-12164
Updating YARN-3823
Updating HADOOP-12149
Updating MAPREDUCE-6407
Updating HADOOP-12159
Updating HADOOP-12158
Email was triggered for: Failure
Sending email for trigger: Failure

###################################################################################
############################# FAILED TESTS (if any) ##############################
1 tests failed.
REGRESSION:  org.apache.hadoop.mapred.TestLazyOutput.testLazyOutput

Error Message:
java.io.IOException: ResourceManager failed to start. Final state is STOPPED

Stack Trace:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: ResourceManager failed to start. Final state is STOPPED
	at org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:329)
	at org.apache.hadoop.yarn.server.MiniYARNCluster.access$500(MiniYARNCluster.java:98)
	at org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:455)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	at org.apache.hadoop.mapred.MiniMRClientClusterFactory.create(MiniMRClientClusterFactory.java:80)
	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:187)
	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:175)
	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:167)
	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:159)
	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:152)
	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:145)
	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:138)
	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:133)
	at
Re: Planning Hadoop 2.6.1 release
Any update on a release plan for 2.6.1?

On Wed, Jun 10, 2015 at 1:25 AM, Brahma Reddy Battula <brahmareddy.batt...@huawei.com> wrote:

HI vinod, any update on this..? Are we planning to give 2.6.1? Or can we make 2.7.1 the stable release..?

Thanks & Regards
Brahma Reddy Battula

________________________________________
From: Zhihai Xu [z...@cloudera.com]
Sent: Wednesday, May 13, 2015 12:04 PM
To: mapreduce-dev@hadoop.apache.org
Cc: common-...@hadoop.apache.org; yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org
Subject: Re: Planning Hadoop 2.6.1 release

Hi Akira,

Can we also include YARN-3242? YARN-3242 fixed a critical ZKRMStateStore bug. It will work better with YARN-2992.

thanks
zhihai

On Tue, May 12, 2015 at 10:38 PM, Akira AJISAKA <ajisa...@oss.nttdata.co.jp> wrote:

Thanks all for collecting jiras for the 2.6.1 release. In addition, I'd like to include the following:

* HADOOP-11343. Overflow is not properly handled in calculating final iv for AES CTR
* YARN-2874. Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps
* YARN-2992. ZKRMStateStore crashes due to session expiry
* YARN-3013. AMRMClientImpl does not update AMRM token properly
* YARN-3369. Missing NullPointer check in AppSchedulingInfo causes RM to die
* MAPREDUCE-6303. Read timeout when retrying a fetch error can be fatal to a reducer

All of these are marked as blocker bugs for 2.7.0 but not fixed in 2.6.0.

Regards,
Akira

On 5/4/15 11:15, Brahma Reddy Battula wrote:

Hello Vinod,

I am thinking, can we include HADOOP-11491 also..? Without this jira, harfs will not be usable when the cluster is installed in HA mode and we try to get a FileContext like below:

Path path = new Path("har:///archivedLogs/application_1428917727658_0005-application_1428917727658_0008-1428927448352.har");
FileSystem fs = path.getFileSystem(new Configuration());
path = fs.makeQualified(path);
FileContext fc = FileContext.getFileContext(path.toUri(), new Configuration());

Thanks & Regards
Brahma Reddy Battula

________________________________________
From: Chris Nauroth [cnaur...@hortonworks.com]
Sent: Friday, May 01, 2015 4:32 AM
To: mapreduce-dev@hadoop.apache.org; common-...@hadoop.apache.org; yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org
Subject: Re: Planning Hadoop 2.6.1 release

Thank you, Arpit. In addition, I suggest we include the following:

HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full
HADOOP-11604. Prevent ConcurrentModificationException while closing domain sockets during shutdown of DomainSocketWatcher thread.
HADOOP-11648. Set DomainSocketWatcher thread name explicitly
HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm

HADOOP-11604 and 11648 are not critical by themselves, but they are pre-requisites to getting a clean cherry-pick of 11802, which we believe finally fixes the root cause of this issue.

--Chris Nauroth

On 4/30/15, 3:55 PM, Arpit Agarwal <aagar...@hortonworks.com> wrote:

HDFS candidates for back-porting to Hadoop 2.6.1. The first two were requested in [1].

HADOOP-11674. oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static
HADOOP-11710. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
HDFS-7009. Active NN and standby NN have different live nodes.
HDFS-7035. Make adding a new data directory to the DataNode an atomic operation and improve error handling
HDFS-7425. NameNode block deletion logging uses incorrect appender.
HDFS-7443. Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume.
HDFS-7489. Incorrect locking in FsVolumeList#checkDirs can hang datanodes
HDFS-7503. Namenode restart after large deletions can cause slow processReport.
HDFS-7575. Upgrade should generate a unique storage ID for each volume.
HDFS-7579. Improve log reporting during block report rpc failure.
HDFS-7587. Edit log corruption can happen if append fails with a quota violation.
HDFS-7596. NameNode should prune dead storages from storageMap.
HDFS-7611. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
HDFS-7714. Simultaneous restart of HA NameNodes and DataNode can cause DataNode to register successfully with only one NameNode.
HDFS-7733. NFS: readdir/readdirplus return null directory attribute on failure.
HDFS-7831. Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks().
HDFS-7885. Datanode should not trust the generation stamp provided by client.
HDFS-7960. The full block report should prune zombie storages even if they're not empty.
HDFS-8072. Reserved RBW space is not released if client terminates while writing block.
HDFS-8127. NameNode
RE: [VOTE] Release Apache Hadoop 2.7.1 RC0
+1 (non-binding)

Built from source, deployed on a 4 node cluster in both secure and non-secure mode. Tested with Spark and MapReduce applications for RM HA, RM work-preserving restart, and NM work-preserving restart.

- Rohith Sharma K S

-----Original Message-----
From: Mit Desai [mailto:mitdesa...@gmail.com]
Sent: 30 June 2015 23:33
To: hdfs-...@hadoop.apache.org
Cc: common-...@hadoop.apache.org; yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [VOTE] Release Apache Hadoop 2.7.1 RC0

+1 (non-binding)
+ Built from source
+ Verified signatures
+ Deployed on a single node cluster
+ Ran some sample jobs to successful completion

Thanks for driving the release Vinod!

-Mit Desai

On Tue, Jun 30, 2015 at 12:51 PM, Varun Vasudev <vvasu...@apache.org> wrote:

+1 (non-binding)

Built from source, deployed in a single node cluster and ran some test jobs.

-Varun

On 6/30/15, 9:58 AM, Zhijie Shen <zs...@hortonworks.com> wrote:

+1 (binding)

Built from source, deployed a single node cluster and tried some MR jobs.

- Zhijie

________________________________________
From: Devaraj K <deva...@apache.org>
Sent: Monday, June 29, 2015 9:24 PM
To: common-...@hadoop.apache.org
Cc: hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [VOTE] Release Apache Hadoop 2.7.1 RC0

+1 (non-binding)

Deployed in a 3 node cluster and ran some Yarn apps and MR examples; works fine.

On Tue, Jun 30, 2015 at 1:46 AM, Xuan Gong <xg...@hortonworks.com> wrote:

+1 (non-binding)

Compiled and deployed a single node cluster, ran all the tests.

Xuan Gong

On 6/29/15, 1:03 PM, Arpit Gupta <ar...@hortonworks.com> wrote:

+1 (non binding)

We have been testing rolling upgrades and downgrades from 2.6 to this release and have had successful runs.

--
Arpit Gupta
Hortonworks Inc.
http://hortonworks.com/

On Jun 29, 2015, at 12:45 PM, Lei Xu <l...@cloudera.com> wrote:

+1 binding

Downloaded src and bin distributions, verified md5, sha1 and sha256 checksums of both tar files.
Built src using mvn package.
Ran a pseudo HDFS cluster.
Ran dfs -put with some files, and checked the files on NN's web interface.

On Mon, Jun 29, 2015 at 11:54 AM, Wangda Tan <wheele...@gmail.com> wrote:

+1 (non-binding)

Compiled and deployed a single node cluster, tried to change node labels and ran distributed_shell with a node label specified.

On Mon, Jun 29, 2015 at 10:30 AM, Ted Yu <yuzhih...@gmail.com> wrote:

+1 (non-binding)

Compiled hbase branch-1 with Java 1.8.0_45.
Ran the unit test suite, which passed.

On Mon, Jun 29, 2015 at 7:22 AM, Steve Loughran <ste...@hortonworks.com> wrote:

+1 binding from me.

Tests: rebuilt Slider with hadoop.version=2.7.1; ran all the tests, including against a secure cluster. Repeated for Windows running Java 8. All tests passed.

On 29 Jun 2015, at 09:45, Vinod Kumar Vavilapalli <vino...@apache.org> wrote:

Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.1.

As discussed before, this is the next stable release to follow up 2.6.0, and the first stable one in the 2.7.x line.

The RC is available for validation at: http://people.apache.org/~vinodkv/hadoop-2.7.1-RC0/

The RC tag in git is: release-2.7.1-RC0

The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1019/

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

PS: It took 2 months instead of the planned [1] 2 weeks in getting this release out: post-mortem in a separate thread.

[1]: A 2.7.1 release to follow up 2.7.0 http://markmail.org/thread/zwzze6cqqgwq4rmw

--
Lei (Eddy) Xu
Software Engineer, Cloudera

--
Thanks
Devaraj K
Re: [VOTE] Release Apache Hadoop 2.7.1 RC0
Vinod, thanks for putting together this release.

+1 (binding)

- Verified signatures
- Installed binary release on CentOS 6 pseudo cluster
  * Copied files in and out of HDFS using the shell
  * Mounted HDFS via NFS and copied a 10GB file in and out over NFS
  * Ran example MapReduce jobs
- Deployed pseudo cluster from sources on CentOS 6, verified native bits
- Deployed pseudo cluster from sources on Windows 2008 R2, verified native bits and ran example MR jobs

Arpit

On 6/29/15, 1:45 AM, Vinod Kumar Vavilapalli <vino...@apache.org> wrote:

Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.1.

As discussed before, this is the next stable release to follow up 2.6.0, and the first stable one in the 2.7.x line.

The RC is available for validation at: http://people.apache.org/~vinodkv/hadoop-2.7.1-RC0/

The RC tag in git is: release-2.7.1-RC0

The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1019/

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

PS: It took 2 months instead of the planned [1] 2 weeks in getting this release out: post-mortem in a separate thread.

[1]: A 2.7.1 release to follow up 2.7.0 http://markmail.org/thread/zwzze6cqqgwq4rmw