from:"Jason Lowe"

Re: [VOTE] Release Apache Hadoop 3.2.0 - RC0

2018-11-28 Thread Jason Lowe

Thanks for driving this release, Sunil!

+1 (binding)

- Verified signatures and digests
- Successfully performed a native build
- Deployed a single-node cluster
- Ran some sample jobs

Jason

On Fri, Nov 23, 2018 at 6:07 AM Sunil G  wrote:

> Hi folks,
>
>
>
> Thanks to all contributors who helped in this release [1]. I have created the
> first release candidate (RC0) for Apache Hadoop 3.2.0.
>
>
> Artifacts for this RC are available here:
>
> http://home.apache.org/~sunilg/hadoop-3.2.0-RC0/
>
>
>
> RC tag in git is release-3.2.0-RC0.
>
>
>
> The maven artifacts are available via repository.apache.org at
>
> https://repository.apache.org/content/repositories/orgapachehadoop-1174/
>
>
> This vote will run 7 days (5 weekdays), ending on Nov 30 at 11:59 pm PST.
>
>
>
> 3.2.0 contains 1079 [2] fixed JIRA issues since 3.1.0. The feature additions
> below are the highlights of this release.
>
> 1. Node Attributes Support in YARN
>
> 2. Hadoop Submarine project for running Deep Learning workloads on YARN
>
> 3. Support service upgrade via YARN Service API and CLI
>
> 4. HDFS Storage Policy Satisfier
>
> 5. Support Windows Azure Storage - Blob file system in Hadoop
>
> 6. Phase 3 improvements for S3Guard and Phase 5 improvements for S3A
>
> 7. Improvements in Router-based HDFS federation
>
>
>
> Thanks to Wangda, Vinod, and Marton for helping me prepare the release.
>
> I have done some testing with my pseudo cluster. My +1 to start.
>
>
>
> Regards,
>
> Sunil
>
>
>
> [1] https://lists.apache.org/thread.html/68c1745dcb65602aecce6f7e6b7f0af3d974b1bf0048e7823e58b06f@%3Cyarn-dev.hadoop.apache.org%3E
>
> [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.2.0)
> AND fixVersion not in (3.1.0, 3.0.0, 3.0.0-beta1) AND status = Resolved
> ORDER BY fixVersion ASC
>

Re: [VOTE] Release Apache Hadoop 2.9.2 (RC0)

2018-11-19 Thread Jason Lowe

Thanks for driving this release, Akira!

+1 (binding)

- Verified signatures and digests
- Successfully performed native build from source
- Deployed a single-node cluster and ran some sample jobs

Jason

On Tue, Nov 13, 2018 at 7:02 PM Akira Ajisaka  wrote:

> Hi folks,
>
> I have put together a release candidate (RC0) for Hadoop 2.9.2. It
> includes 204 bug fixes and improvements since 2.9.1. [1]
>
> The RC is available at http://home.apache.org/~aajisaka/hadoop-2.9.2-RC0/
> Git signed tag is release-2.9.2-RC0 and the checksum is
> 826afbeae31ca687bc2f8471dc841b66ed2c6704
> The maven artifacts are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1166/
>
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Please try the release and vote. The vote will run for 5 days.
>
> [1] https://s.apache.org/2.9.2-fixed-jiras
>
> Thanks,
> Akira
>
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>

[jira] [Resolved] (MAPREDUCE-6440) Duplicate Key in Json Output for Job details

2018-09-13 Thread Jason Lowe (JIRA)



 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6440.
---
  Resolution: Duplicate
Target Version/s:   (was: )

This has been fixed by MAPREDUCE-7133.

> Duplicate Key in Json Output for Job details
> 
>
> Key: MAPREDUCE-6440
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6440
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Anushri
>Priority: Minor
>
> The JSON output for job details contains a duplicate key for the URL:
> http://:/ws/v1/history/mapreduce/jobs/job_id/tasks/task_id/attempts
> If the task type is "REDUCE", the JSON output for this URL contains a
> duplicate "type" key.
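
A duplicate key like this is easy to miss from the client side, because most JSON parsers silently keep only the last value for a repeated key. The sketch below uses Python's standard `json` module to surface repeats. The helper name `duplicate_keys` and the sample payload are hypothetical and are not taken from the actual history server response.

```python
import json
from collections import Counter


def duplicate_keys(text):
    """Return the sorted key names that occur more than once within a single JSON object."""
    found = set()

    def check_pairs(pairs):
        # The json module calls this hook once per object with its (key, value) pairs.
        counts = Counter(key for key, _ in pairs)
        found.update(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(text, object_pairs_hook=check_pairs)
    return sorted(found)


# Hypothetical payload shaped like a task attempt entry with a repeated "type" key.
sample = '{"taskAttempt": [{"type": "reduceTaskAttemptInfo", "id": "attempt_0", "type": "REDUCE"}]}'
print(duplicate_keys(sample))
```

A plain `json.loads(sample)` would succeed and keep only one of the two "type" values, which is why the hook is needed to see the problem at all.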



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org

Re: [VOTE] Release Apache Hadoop 2.8.5 (RC0)

2018-09-10 Thread Jason Lowe

Thanks for driving the release, Junping!

+1 (binding)

- Verified signatures and digests
- Successfully performed a native build from source
- Successfully deployed a single-node cluster with the timeline server
- Ran some sample jobs and examined the web UI and job logs

Jason

On Mon, Sep 10, 2018 at 7:00 AM, 俊平堵  wrote:

> Hi all,
>
>  I've created the first release candidate (RC0) for Apache
> Hadoop 2.8.5. This is our next point release to follow up 2.8.4. It
> includes 33 important fixes and improvements.
>
>
> The RC artifacts are available at:
> http://home.apache.org/~junping_du/hadoop-2.8.5-RC0
>
>
> The RC tag in git is: release-2.8.5-RC0
>
>
>
> The maven artifacts are available via repository.apache.org at:
>
> https://repository.apache.org/content/repositories/orgapachehadoop-1140
>
>
> Please try the release and vote; the vote will run for the usual 5 working
> days, ending on 9/15/2018 PST time.
>
>
> Thanks,
>
>
> Junping
>

[jira] [Resolved] (MAPREDUCE-6948) TestJobImpl.testUnusableNodeTransition failed

2018-07-17 Thread Jason Lowe (JIRA)



 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6948.
---
Resolution: Cannot Reproduce

I agree as well.  I have not seen any recent precommit failures on 3.x releases 
for this unit test.

> TestJobImpl.testUnusableNodeTransition failed
> -
>
> Key: MAPREDUCE-6948
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6948
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Jim Brennan
>Priority: Major
>  Labels: unit-test
>
> *Error Message*
> expected: but was:
> *Stacktrace*
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:1041)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition(TestJobImpl.java:615)
> *Standard out*
> {code}
> 2017-08-30 10:12:21,928 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2017-08-30 10:12:21,939 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$StubbedJob
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.jobhistory.EventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,941 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1534)) - Adding job token for job_123456789_0001 to 
> jobTokenSecretManager
> 2017-08-30 10:12:21,941 WARN  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1540)) - Shuffle secret key missing from job credentials. 
> Using job token secret as shuffle secret.
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:makeUberDecision(1305)) - Not uberizing job_123456789_0001 
> because: not enabled;
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createMapTasks(1562)) - Input size for job 
> job_123456789_0001 = 0. Number of splits = 2
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createReduceTasks(1579)) - Number of reduces for job 
> job_123456789_0001 = 1
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from NEW 
> to INITED
> 2017-08-30 10:12:21,946 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> INITED to SETUP
> 2017-08-30 10:12:21,954 INFO  [CommitterEvent Processor #0] 
> commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - 
> Processing the event EventType: JOB_SETUP
> 2017-08-30 10:12:21,978 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> SETUP to RUNNING
> 2017-08-30 10:12:21,983 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$5
> 2017-08-30 10:12:22,000 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 1
> 2017-08-30 10:12:22,029 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 2
> 2017-08-30 1

[jira] [Created] (MAPREDUCE-7118) Distributed cache conflicts break backwards compatibility

2018-07-03 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-7118:
-

 Summary: Distributed cache conflicts break backwards compatibility
 Key: MAPREDUCE-7118
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7118
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.1.0, 3.0.0, 3.2.0
Reporter: Jason Lowe
Assignee: Jason Lowe


MAPREDUCE-4503 made distributed cache conflicts break job submission, but this 
was quickly downgraded to a warning in MAPREDUCE-4549.  Unfortunately the 
latter did not go into trunk, so the fix is only in 0.23 and 2.x.  When Oozie, 
Pig, and other downstream projects that can occasionally generate distributed 
cache conflicts move to Hadoop 3.x the workflows that used to work on 0.23 and 
2.x no longer function.
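
To make the behavior difference concrete, here is a minimal sketch in Python rather than the actual Hadoop code. It assumes a conflict means two cache entries that would be linked under the same name but point at different sources. The function name, the `strict` flag, the keep-the-first-entry choice, and the sample URIs are all hypothetical.

```python
import warnings


def resolve_cache_links(entries, strict):
    """Map link name -> source URI for (link_name, source_uri) pairs.

    strict=True fails on a conflict (the 3.x behavior described above).
    strict=False warns and keeps the first entry (assumed 0.23/2.x behavior).
    """
    links = {}
    for link_name, source in entries:
        previous = links.get(link_name)
        if previous is not None and previous != source:
            message = f"cache link {link_name!r} conflicts: {previous} vs {source}"
            if strict:
                raise ValueError(message)
            warnings.warn(message)
            continue  # keep the first entry and ignore the conflicting one
        links[link_name] = source
    return links


entries = [("lib.jar", "hdfs:///apps/a/lib.jar"), ("lib.jar", "hdfs:///apps/b/lib.jar")]
try:
    resolve_cache_links(entries, strict=True)
except ValueError as err:
    print("job submission would fail:", err)
```

With `strict=False` the same input returns a mapping for the first entry and only emits a warning, which is the tolerance that existing workflows would be relying on.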




[jira] [Resolved] (MAPREDUCE-7080) Default speculator won't speculate the last several submitted reduce tasks if the total task num is large

2018-04-17 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-7080.
---
Resolution: Duplicate

Closing as a duplicate of MAPREDUCE-7081.

> Default speculator won't speculate the last several submitted reduce tasks if 
> the total task num is large
> -
>
> Key: MAPREDUCE-7080
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7080
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 2.7.5
>Reporter: Zhizhen Hou
>Priority: Major
>
> DefaultSpeculator speculates a task at most once. 
> By default, the number of tasks that may be speculated is 
> max(max(10, 0.01 * tasks.size), 0.1 * running tasks).
> I set mapreduce.job.reduce.slowstart.completedmaps = 1 so that reduces start 
> only after all the map tasks have finished.
> The cluster has 1000 vcores, and the job has 5000 reduce tasks.
> At first, 1000 reduce tasks can run simultaneously, so the speculator can 
> speculate at most 0.1 * 1000 = 100 tasks. Reduce tasks with less data finish 
> quickly, and by default the speculator speculates one task per second. A task 
> may be chosen for speculative execution simply because it has more data to 
> process. The speculator will therefore speculate 100 tasks within 100 seconds.
> When 4900 reduces have finished, if a reduce that has a lot of data to 
> process is placed on a slow machine, the speculation opportunities have 
> already run out and it will not be speculated. This can increase the 
> execution time of the job significantly.
> In short, the speculation opportunities may be wasted early on, only because 
> the execution time of reduces with less data is used as the average time. At 
> the end of the job there are no speculation opportunities left, especially 
> for the last several running tasks, because the limit is judged by the number 
> of running tasks.
>  
> In my opinion, the number of tasks that can be speculated should be scaled by 
> the square of the finished-task fraction. For example, if ninety percent of 
> the tasks are finished, only 0.9 * 0.9 = 0.81 of the speculation 
> opportunities can be used. This leaves enough opportunities for later tasks.
>  
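
The arithmetic in the description can be sketched as follows. This is only an illustration of the formula and the proposal as quoted in the issue, not the DefaultSpeculator code itself. The function names are hypothetical, and reading the proposal as "cap multiplied by the squared finished fraction" is an assumption.

```python
def speculation_cap(total_tasks, running_tasks):
    """Cap quoted in the issue: max(max(10, 0.01 * total), 0.1 * running)."""
    return max(max(10, 0.01 * total_tasks), 0.1 * running_tasks)


def proposed_cap(total_tasks, running_tasks, finished_tasks):
    """Reporter's suggestion: scale the cap by the square of the finished fraction."""
    finished_fraction = finished_tasks / total_tasks
    return speculation_cap(total_tasks, running_tasks) * finished_fraction ** 2


# Scenario from the issue: 5000 reduce tasks, 1000 of them running at once.
print(speculation_cap(5000, 1000))      # cap at the start of the reduce phase
print(proposed_cap(5000, 1000, 500))    # proposal with 10% of the tasks finished
print(proposed_cap(5000, 1000, 4500))   # proposal with 90% of the tasks finished
```

Under this reading, only a small share of the cap is usable early in the job and most of it is held back for the tail, which is the effect the reporter is asking for.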




Re: [VOTE] Release Apache Hadoop 2.7.6 (RC0)

2018-04-16 Thread Jason Lowe

Thanks for driving the release, Konstantin!

+1 (binding)

- Verified signatures and digests
- Completed a native build from source
- Deployed a single-node cluster
- Ran some sample jobs

Jason

On Mon, Apr 9, 2018 at 6:14 PM, Konstantin Shvachko
 wrote:
> Hi everybody,
>
> This is the next dot release of the Apache Hadoop 2.7 line. The previous one,
> 2.7.5, was released on December 14, 2017.
> Release 2.7.6 includes critical bug fixes and optimizations. See more
> details in Release Note:
> http://home.apache.org/~shv/hadoop-2.7.6-RC0/releasenotes.html
>
> The RC0 is available at: http://home.apache.org/~shv/hadoop-2.7.6-RC0/
>
> Please give it a try and vote on this thread. The vote will run for 5 days
> ending 04/16/2018.
>
> My up to date public key is available from:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Thanks,
> --Konstantin


[jira] [Created] (MAPREDUCE-7079) TestMRIntermediateDataEncryption is failing in precommit builds

2018-04-12 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-7079:
-

 Summary: TestMRIntermediateDataEncryption is failing in precommit 
builds
 Key: MAPREDUCE-7079
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7079
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jason Lowe


TestMRIntermediateDataEncryption is either timing out or tearing down the JVM 
which causes the unit tests in jobclient to not pass cleanly during precommit 
builds. From sample precommit console output, note the lack of a test results 
line when the test is run:
{noformat}
[INFO] Running org.apache.hadoop.mapred.TestSequenceFileInputFormat
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.976 s 
- in org.apache.hadoop.mapred.TestSequenceFileInputFormat
[INFO] Running org.apache.hadoop.mapred.TestMRIntermediateDataEncryption
[INFO] Running org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.659 s 
- in org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
[...]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:14 h
[INFO] Finished at: 2018-04-12T04:27:06+00:00
[INFO] Final Memory: 24M/594M
[INFO] 
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "native" could not be activated because it does 
not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
{noformat}





[jira] [Created] (MAPREDUCE-7078) TestPipeApplication is failing in precommit builds

2018-04-12 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-7078:
-

 Summary: TestPipeApplication is failing in precommit builds
 Key: MAPREDUCE-7078
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7078
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jason Lowe


TestPipeApplication is either timing out or tearing down the JVM which causes 
the unit tests in jobclient to not pass cleanly during precommit builds.  From 
sample precommit console output, note the lack of a test results line when the 
test is run:
{noformat}
[INFO] Running org.apache.hadoop.mapred.TestIFile
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1 s - in 
org.apache.hadoop.mapred.TestIFile
[INFO] Running org.apache.hadoop.mapred.pipes.TestPipeApplication
[INFO] Running org.apache.hadoop.mapred.pipes.TestPipesNonJavaInputFormat
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.02 s - 
in org.apache.hadoop.mapred.pipes.TestPipesNonJavaInputFormat
[...]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:14 h
[INFO] Finished at: 2018-04-12T04:27:06+00:00
[INFO] Final Memory: 24M/594M
[INFO] 
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "native" could not be activated because it does 
not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
{noformat}





[jira] [Created] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump

2018-02-14 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-7053:
-

 Summary: Timed out tasks can fail to produce thread dump
 Key: MAPREDUCE-7053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
Reporter: Jason Lowe


TestMRJobs#testThreadDumpOnTaskTimeout has been failing sporadically recently.  
When the AM times out a task it immediately removes it from the list of known 
tasks and then connects to the NM to request a thread dump followed by a kill.  
If the task heartbeats in after the task has been removed from the list of 
known tasks but before the thread dump signal arrives then the task can exit 
with a "org.apache.hadoop.mapred.Task: Parent died." message and no thread dump.
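
The ordering problem can be illustrated with a toy event replay. This is a deliberately simplified model of the sequence described above, not the MapReduce AM or task code. The event names and the function are hypothetical.

```python
def task_outcome(events):
    """Replay AM/task events in order and report how the timed-out task ends."""
    known_to_am = True
    for event in events:
        if event == "am_timeout":
            # The AM removes the task from its list of known tasks right away.
            known_to_am = False
        elif event == "heartbeat" and not known_to_am:
            # The task is no longer known, so it logs "Parent died." and exits.
            return "exited without thread dump"
        elif event == "dump_signal":
            # The thread dump request reached a task that was still alive.
            return "thread dump produced"
    return "still running"


print(task_outcome(["heartbeat", "am_timeout", "dump_signal"]))
print(task_outcome(["am_timeout", "heartbeat", "dump_signal"]))
```

The two calls differ only in whether a heartbeat lands between the timeout and the dump signal, which matches the sporadic nature of the test failure.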




[jira] [Resolved] (MAPREDUCE-7049) Testcase TestMRJobs#testJobClassloaderWithCustomClasses fails

2018-02-06 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-7049.
---
Resolution: Duplicate

> Testcase TestMRJobs#testJobClassloaderWithCustomClasses fails 
> --
>
> Key: MAPREDUCE-7049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7049
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>
> The testcase TestMRJobs#testJobClassloaderWithCustomClasses fails 
> consistently with this error:
> {noformat}
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hadoop.mapreduce.v2.TestMRJobs
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 54.325 s <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs
> [ERROR] 
> testJobClassloaderWithCustomClasses(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   Time elapsed: 10.531 s  <<< FAILURE!
> java.lang.AssertionError: 
> Job status: Application application_1517928628935_0001 failed 2 times due to 
> AM Container for appattempt_1517928628935_0001_02 exited with  exitCode: 1
> Failing this attempt.Diagnostics: [2018-02-06 15:50:38.688]Exception from 
> container-launch.
> Container id: container_1517928628935_0001_02_01
> Exit code: 1
> [2018-02-06 15:50:38.693]Container exited with a non-zero exit code 1. Error 
> file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> [2018-02-06 15:50:38.694]Container exited with a non-zero exit code 1. Error 
> file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> For more detailed output, check the application tracking page: 
> http://ubuntu:46235/cluster/app/application_1517928628935_0001 Then click on 
> links to logs of each attempt.
> . Failing the application.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobClassloader(TestMRJobs.java:529)
>   at 
> org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobClassloaderWithCustomClasses(TestMRJobs.java:477)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
> Today I found the offending commit with {{git bisect}} and this failure is 
> caused by {{YARN-2185}}.
> The application master fails because of the following error:
> {noformat}
> 2018-02-05 17:15:18,530 DEBUG [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1694)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>

[jira] [Created] (MAPREDUCE-7033) Map outputs implicitly rely on permissive umask for shuffle

2018-01-11 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-7033:
-

 Summary: Map outputs implicitly rely on permissive umask for 
shuffle
 Key: MAPREDUCE-7033
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7033
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Jason Lowe


Map tasks do not explicitly set the permissions of their output files for 
shuffle.  In a secure cluster the shuffle service is running as a different 
user than the map task, so the output files require group readability in order 
to serve up the data during the shuffle phase.  If the user's UNIX umask is too 
restrictive (e.g.: 077) then the map task's file.out and file.out.index 
permissions can be too restrictive to allow the shuffle handler to access them.
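
The effect of the umask on a newly created file can be worked out with plain bit arithmetic. The sketch below assumes the file is created with a requested mode of 0666, which is a common default but is an assumption here, and the helper names are hypothetical.

```python
import stat


def effective_mode(requested_mode, umask):
    """Permission bits a newly created file gets: requested bits with the umask bits cleared."""
    return requested_mode & ~umask


def group_can_read(mode):
    """True if the group-read bit is set in the given permission bits."""
    return bool(mode & stat.S_IRGRP)


for umask in (0o022, 0o027, 0o077):
    mode = effective_mode(0o666, umask)
    print(f"umask {umask:03o} -> mode {mode:03o}, group readable: {group_can_read(mode)}")
```

With a umask of 077 the group-read bit is cleared, so a shuffle service running as a different user in the same group could not read the file unless the permissions are set explicitly after creation.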



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Apache Hadoop 3.0.1 Release plan

2018-01-09 Thread Jason Lowe

Is it necessary to cut the branch so far ahead of the release?  branch-3.0
is already a maintenance line for 3.0.x releases.  Is there a known
feature/improvement planned to go into branch-3.0 that is not desirable for
the 3.0.1 release?

I have found in the past that branching so early leads to many useful fixes
being unnecessarily postponed to future releases because committers forget
to pick to the new, relatively long-lived patch branch.  This becomes
especially true if blockers end up dragging out the ultimate release date,
which has historically been quite common.  My preference would be to cut
this branch as close to the RC as possible.

Jason

On Tue, Jan 9, 2018 at 1:17 PM, Lei Xu  wrote:

> Hi, All
>
> We have released Apache Hadoop 3.0.0 in December [1]. To further
> improve the quality of release, we plan to cut branch-3.0.1 branch
> tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
> of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
> [2].  No new features and improvement should be included.
>
> We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
> 1st, targeting for Feb 9th release.
>
> Please feel free to share your insights.
>
> [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> [2] https://issues.apache.org/jira/issues/?filter=12342842
>
> Best,
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera
>
>
>

Re: [VOTE] Release Apache Hadoop 2.7.5 (RC1)

2017-12-12 Thread Jason Lowe

Thanks for driving the release, Konstantin!

+1 (binding)

- Verified signatures and digests
- Successfully performed a native build from source
- Deployed a single-node cluster
- Ran some sample jobs and checked the logs

Jason


On Thu, Dec 7, 2017 at 9:22 PM, Konstantin Shvachko 
wrote:

> Hi everybody,
>
> I updated CHANGES.txt and fixed documentation links.
> Also committed  MAPREDUCE-6165, which fixes a consistently failing test.
>
> This is RC1 for the next dot release of the Apache Hadoop 2.7 line. The
> previous one, 2.7.4, was released on August 4, 2017.
> Release 2.7.5 includes critical bug fixes and optimizations. See more
> details in Release Note:
> http://home.apache.org/~shv/hadoop-2.7.5-RC1/releasenotes.html
>
> The RC1 is available at: http://home.apache.org/~shv/hadoop-2.7.5-RC1/
>
> Please give it a try and vote on this thread. The vote will run for 5 days
> ending 12/13/2017.
>
> My up to date public key is available from:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Thanks,
> --Konstantin
>

Re: [VOTE] Release Apache Hadoop 2.8.3 (RC0)

2017-12-12 Thread Jason Lowe

Thanks for driving this release, Junping!

+1 (binding)

- Verified signatures and digests
- Successfully performed native build from source
- Deployed a single-node cluster
- Ran some test jobs and examined the logs

Jason

On Tue, Dec 5, 2017 at 3:58 AM, Junping Du  wrote:

> Hi all,
>  I've created the first release candidate (RC0) for Apache Hadoop
> 2.8.3. This is our next maint release to follow up 2.8.2. It includes 79
> important fixes and improvements.
>
>   The RC artifacts are available at:
> http://home.apache.org/~junping_du/hadoop-2.8.3-RC0
>
>   The RC tag in git is: release-2.8.3-RC0
>
>   The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1072
>
>   Please try the release and vote; the vote will run for the usual 5
> working days, ending on 12/12/2017 PST time.
>
> Thanks,
>
> Junping
>

[jira] [Resolved] (MAPREDUCE-7019) java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2

2017-12-08 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-7019.
---
Resolution: Invalid

Closing this since I believe the error is coming from the program being 
launched by the streaming job rather than an issue with the streaming framework 
code.  If this is incorrect, please provide details showing where the streaming 
framework code is going awry.


> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed 
> with code 2
> -
>
> Key: MAPREDUCE-7019
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7019
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: shrutika sarda
>





Re: [VOTE] Release Apache Hadoop 2.9.0 (RC3)

2017-11-17 Thread Jason Lowe

Thanks for putting this release together!

+1 (binding)

- Verified signatures and digests
- Successfully built from source including native
- Deployed to single-node cluster and ran some test jobs

Jason


On Mon, Nov 13, 2017 at 6:10 PM, Arun Suresh  wrote:

> Hi Folks,
>
> Apache Hadoop 2.9.0 is the first release of the Hadoop 2.9 line and will be the
> starting release for the Apache Hadoop 2.9.x line. It includes 30 new features
> with 500+ subtasks, 407 improvements, and 790 bug fixes newly fixed since
> 2.8.2.
>
> More information about the 2.9.0 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
>
> New RC is available at: https://home.apache.org/~asuresh/hadoop-2.9.0-RC3/
>
> The RC tag in git is: release-2.9.0-RC3, and the latest commit id is:
> 756ebc8394e473ac25feac05fa493f6d612e6c50.
>
> The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1068/
>
> We are carrying over the votes from the previous RC given that the delta is
> the license fix.
>
> Given the above - we are also going to stick with the original deadline for
> the vote : ending on Friday 17th November 2017 2pm PT time.
>
> Thanks,
> -Arun/Subru
>

Re: [VOTE] Release Apache Hadoop 2.8.2 (RC1)

2017-10-23 Thread Jason Lowe

My apologies, false alarm on the CHANGES.md and RELEASENOTES.md.  I was in
the process of reviewing the release and was interrupted, and when I
resumed I thought I had already downloaded the CHANGES and RELEASENOTES,
but in fact they were the old versions from a prior review of 2.8.0.  I
reviewed both of them for 2.8.2 (for real this time!) and they look
correct.  Again my apologies for the confusion.

Jason

On Mon, Oct 23, 2017 at 3:26 PM, Jason Lowe <jl...@oath.com> wrote:

> +1 (binding)
>
> - Verified signatures and digests
> - Performed a native build from source
> - Deployed to a single-node cluster
> - Ran some sample jobs
>
> The CHANGES.md and RELEASENOTES.md both refer to release 2.8.0 instead of
> 2.8.2, and I do not see the list of JIRAs in CHANGES.md that have been
> committed since 2.8.1.  Since we're voting on the source bits rather than
> the change log I kept my vote as a +1 as I do see the 2.8.2 changes in the
> source code.
>
> Jason
>
>
> On Thu, Oct 19, 2017 at 7:42 PM, Junping Du <j...@hortonworks.com> wrote:
>
>> Hi folks,
>>  I've created our new release candidate (RC1) for Apache Hadoop 2.8.2.
>>
>>  Apache Hadoop 2.8.2 is the first stable release of Hadoop 2.8 line
>> and will be the latest stable/production release for Apache Hadoop - it
>> includes 315 new fixed issues since 2.8.1 and 69 fixes are marked as
>> blocker/critical issues.
>>
>>   More information about the 2.8.2 release plan can be found here:
>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>>
>>   New RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.2-RC1
>>
>>   The RC tag in git is: release-2.8.2-RC1, and the latest commit id
>> is: 66c47f2a01ad9637879e95f80c41f798373828fb
>>
>>   The maven artifacts are available via repository.apache.org at:
>> https://repository.apache.org/content/repositories/orgapachehadoop-1064
>>
>>   Please try the release and vote; the vote will run for the usual 5
>> days, ending on 10/24/2017 6pm PST time.
>>
>> Thanks,
>>
>> Junping
>>
>>
>

Re: [VOTE] Release Apache Hadoop 2.8.2 (RC1)

2017-10-23 Thread Jason Lowe

+1 (binding)

- Verified signatures and digests
- Performed a native build from source
- Deployed to a single-node cluster
- Ran some sample jobs

The CHANGES.md and RELEASENOTES.md both refer to release 2.8.0 instead of
2.8.2, and I do not see the list of JIRAs in CHANGES.md that have been
committed since 2.8.1.  Since we're voting on the source bits rather than
the change log I kept my vote as a +1 as I do see the 2.8.2 changes in the
source code.

Jason


On Thu, Oct 19, 2017 at 7:42 PM, Junping Du  wrote:

> Hi folks,
>  I've created our new release candidate (RC1) for Apache Hadoop 2.8.2.
>
>  Apache Hadoop 2.8.2 is the first stable release of Hadoop 2.8 line
> and will be the latest stable/production release for Apache Hadoop - it
> includes 315 new fixed issues since 2.8.1 and 69 fixes are marked as
> blocker/critical issues.
>
>   More information about the 2.8.2 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>
>   New RC is available at: http://home.apache.org/~junping_du/hadoop-2.8.2-RC1
>
>   The RC tag in git is: release-2.8.2-RC1, and the latest commit id
> is: 66c47f2a01ad9637879e95f80c41f798373828fb
>
>   The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1064
>
>   Please try the release and vote; the vote will run for the usual 5
> days, ending on 10/24/2017 6pm PST time.
>
> Thanks,
>
> Junping
>
>

[jira] [Created] (MAPREDUCE-6969) TestHSWebApp is failing

2017-09-26 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6969:
-

 Summary: TestHSWebApp is failing
 Key: MAPREDUCE-6969
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6969
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe


TestHSWebApp has been failing recently:
{noformat}
Running org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp
Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.57 sec <<< 
FAILURE! - in org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp
testLogsViewBadStartEnd(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp)  
Time elapsed: 0.076 sec  <<< FAILURE!
org.mockito.exceptions.verification.junit.ArgumentsAreDifferent: 
Argument(s) are different! Wanted:
printWriter.write(
"Invalid log end value: bar"
);
-> at 
org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp.testLogsViewBadStartEnd(TestHSWebApp.java:261)
Actual invocation has different arguments:
printWriter.write(
"http://www.w3.org/TR/html4/strict.dtd;>"
);
-> at 
org.apache.hadoop.yarn.webapp.view.TextView.echoWithoutEscapeHtml(TextView.java:62)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at 
org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp.testLogsViewBadStartEnd(TestHSWebApp.java:261)
{noformat}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org

[jira] [Created] (MAPREDUCE-6968) Staging directory erasure coding config property has a typo

2017-09-26 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6968:
-

 Summary: Staging directory erasure coding config property has a 
typo
 Key: MAPREDUCE-6968
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6968
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 3.0.0-beta1
Reporter: Jason Lowe
Assignee: Jason Lowe


TestMapreduceConfigFields has been failing since MAPREDUCE-6954. 
MRJobConfig#MR_AM_STAGING_DIR_ERASURECODING_ENABLED is defined as 
"yarn.app.mapreduce.am.staging-direrasurecoding.enabled"  but the property is 
listed as "yarn.app.mapreduce.am.staging-dir.erasurecoding.enabled" in 
mapred-default.xml.
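
For illustration only, here is a minimal sketch of the kind of mismatch described above. It uses the two property names exactly as quoted in this issue; the class and the `missingFrom` helper are hypothetical and are not the actual TestMapreduceConfigFields logic.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative sketch only; names other than the two keys are hypothetical. */
final class ConfigKeyMismatch {
    // Value of MRJobConfig#MR_AM_STAGING_DIR_ERASURECODING_ENABLED as quoted in this issue.
    static final String KEY_IN_CODE =
        "yarn.app.mapreduce.am.staging-direrasurecoding.enabled";
    // Property name listed in mapred-default.xml as quoted in this issue.
    static final String KEY_IN_XML =
        "yarn.app.mapreduce.am.staging-dir.erasurecoding.enabled";

    /** Returns the keys in 'declared' that have no exact match in 'documented'. */
    static Set<String> missingFrom(Set<String> declared, Set<String> documented) {
        Set<String> missing = new LinkedHashSet<>(declared);
        missing.removeAll(documented);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> code = Collections.singleton(KEY_IN_CODE);
        Set<String> xml = Collections.singleton(KEY_IN_XML);
        // Each side reports the other's spelling as missing, since the keys differ by one '.'.
        System.out.println("In code but not in XML: " + missingFrom(code, xml));
        System.out.println("In XML but not in code: " + missingFrom(xml, code));
    }
}
```

Because the two strings differ by a single '.', a comparison in either direction reports one unmatched key.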




[jira] [Resolved] (MAPREDUCE-6959) Understanding on process to start contribution

2017-09-18 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6959.
---
Resolution: Invalid

JIRA is for tracking features and bugs in Hadoop and not for general support.  
Questions such as these can be directed to the [mailing 
lists|http://hadoop.apache.org/mailing_lists.html].  Specifically if you're 
interested in contributing I highly recommend checking out the [How To 
Contribute|https://wiki.apache.org/hadoop/HowToContribute] wiki page.

Note that https://github.com/apache/hadoop-mapreduce is a mirror of just the 
MapReduce code from what looks like Hadoop 1.x or even earlier code that is no 
longer supported.  All active development is on Hadoop 2.x and Hadoop 3.x.


> Understanding on process to start contribution
> --
>
> Key: MAPREDUCE-6959
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6959
> Project: Hadoop Map/Reduce
>  Issue Type: Wish
>Reporter: Mehul
>Priority: Trivial
>
> I was trying to find process/steps to start with contribution into following 
> repo i.e. https://github.com/apache/hadoop-mapreduce. Can someone please help 
> with the detail so that I can create appropriate git/jira issue and start 
> working on it?
> Any direction would be really appreciated!
> Thanks,
> Mehul




[jira] [Created] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-14 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6958:
-

 Summary: Shuffle audit logger should log size of shuffle transfer
 Key: MAPREDUCE-6958
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Minor


The shuffle audit logger currently logs the job ID and reducer ID but nothing 
about the size of the requested transfer.  It calculates this as part of the 
HTTP response headers, so it would be trivial to log the response size.  This 
would be very valuable for debugging network traffic storms from the shuffle 
handler.





[jira] [Created] (MAPREDUCE-6952) Using DistributedCache.addFileToClasspath with a rename fragment fails during job submit

2017-09-07 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6952:
-

 Summary: Using DistributedCache.addFileToClasspath with a rename 
fragment fails during job submit
 Key: MAPREDUCE-6952
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6952
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.8.1, 2.7.4
Reporter: Jason Lowe


Calling DistributedCache.addFileToClasspath with a Path that specifies a URI 
fragment, used to rename the file during localization, causes job submission to 
fail with a FileNotFoundException.
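
For illustration only, the sketch below shows what a rename fragment looks like using nothing but `java.net.URI`. It does not call Hadoop and does not attempt to reproduce the failure; the host name, paths, class, and helper methods are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustrative sketch only; does not use any Hadoop API. */
final class FragmentRename {
    /** Name the file would take after localization: the fragment if present, else the file name. */
    static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null) {
            return fragment;
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** The same URI with the fragment removed, i.e. the location of the file that actually exists. */
    static URI withoutFragment(URI uri) {
        try {
            return new URI(uri.getScheme(), uri.getAuthority(), uri.getPath(), uri.getQuery(), null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical file; "#renamed.jar" asks for the file to appear as renamed.jar.
        URI withRename = URI.create("hdfs://nn.example.com:8020/libs/original.jar#renamed.jar");
        System.out.println("file to look up: " + withoutFragment(withRename));
        System.out.println("localized name:  " + linkName(withRename));
    }
}
```

The point of the sketch is that the file to look up and the localized name are two separate pieces of the same URI, so any existence check has to be made against the fragment-free form.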




[jira] [Reopened] (MAPREDUCE-6641) TestTaskAttempt fails in trunk

2017-08-29 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened MAPREDUCE-6641:
---

Seeing this fail the same way in 2.8 builds as well.  Unfortunately since the 
fix uses lambdas I can't just cherry-pick the fix down to other branches.  
Reopening so Jenkins can comment on a branch-2 version of the patch.
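
For illustration only, here is a sketch of why a lambda-based fix needs rework for an older branch, assuming that branch must still compile with a pre-Java-8 compiler. The `Check` interface and both methods are hypothetical and are not taken from the actual patch.

```java
/** Illustrative sketch only; not the code from the MAPREDUCE-6641 patch. */
final class LambdaBackport {
    /** Hypothetical single-method interface standing in for whatever the real patch uses. */
    interface Check {
        boolean get();
    }

    /** Polls the check up to 'attempts' times and reports whether it ever passed. */
    static boolean eventually(Check check, int attempts) {
        for (int i = 0; i < attempts; i++) {
            if (check.get()) {
                return true;
            }
        }
        return false;
    }

    /** Java 8 form: compiles only with a compiler that supports lambdas. */
    static boolean java8Style(final int[] counter) {
        return eventually(() -> ++counter[0] >= 3, 5);
    }

    /** Equivalent pre-Java-8 form: the lambda becomes an anonymous inner class. */
    static boolean java7Style(final int[] counter) {
        return eventually(new Check() {
            @Override
            public boolean get() {
                return ++counter[0] >= 3;
            }
        }, 5);
    }

    public static void main(String[] args) {
        System.out.println(java8Style(new int[1]) + " " + java7Style(new int[1]));
    }
}
```

Both forms behave identically; only the syntax differs, which is why a separate branch-2 version of the patch is needed rather than a straight cherry-pick.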

> TestTaskAttempt fails in trunk
> --
>
> Key: MAPREDUCE-6641
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6641
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Tsuyoshi Ozawa
>Assignee: Haibo Chen
> Fix For: 3.0.0-alpha1
>
> Attachments: mapreduce6641.001.patch, mapreduce6641.002.patch, 
> MAPREDUCE-6641-branch-2.002.patch, 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt-output.txt
>
>
> {code}
> Running org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt
> Tests run: 23, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.917 sec 
> <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt
> testMRAppHistoryForTAFailedInAssigned(org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt)
>   Time elapsed: 12.732 sec  <<< FAILURE!
> java.lang.AssertionError: No Ta Started JH Event
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTaskAttemptAssignedKilledHistory(TestTaskAttempt.java:388)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned(TestTaskAttempt.java:177)
> {code}




Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-29 Thread Jason Lowe

+1 (binding)

I participated in the review for the reader authorization and verified that
ATSv2 has no significant impact when disabled.  Looking forward to seeing
the next increment in functionality in a release.  A big thank you to
everyone involved in this effort!

Jason


On Tue, Aug 22, 2017 at 1:32 AM, Vrushali Channapattan <
vrushalic2...@gmail.com> wrote:

> Hi folks,
>
> Per earlier discussion [1], I'd like to start a formal vote to merge
> feature branch YARN-5355 [2] (Timeline Service v.2) to trunk. The vote will
> run for 7 days, and will end August 29 11:00 PM PDT.
>
> We have previously completed one merge onto trunk [3] and Timeline Service
> v2 has been part of Hadoop release 3.0.0-alpha1.
>
> Since then, we have been working on extending the capabilities of Timeline
> Service v2 in a feature branch [2] for a while, and we are reasonably
> confident that the state of the feature meets the criteria to be merged
> onto trunk and we'd love folks to get their hands on it in a test capacity
> and provide valuable feedback so that we can make it production-ready.
>
> In a nutshell, Timeline Service v.2 delivers significant scalability and
> usability improvements based on a new architecture. What we would like to
> merge to trunk is termed "alpha 2" (milestone 2). The feature has a
> complete end-to-end read/write flow with security and read level
> authorization via whitelists. You should be able to start setting it up and
> testing it.
>
> At a high level, the following are the key features that have been
> implemented since alpha1:
> - Security via Kerberos Authentication and delegation tokens
> - Read side simple authorization via whitelist
> - Client configurable entity sort ordering
> - Richer REST APIs for apps, app attempts, containers, fetching metrics by
> timerange, pagination, sub-app entities
> - Support for storing sub-application entities (entities that exist outside
> the scope of an application)
> - Configurable TTLs (time-to-live) for tables, configurable table prefixes,
> configurable hbase cluster
> - Flow level aggregations done as dynamic (table level) coprocessors
> - Uses latest stable HBase release 1.2.6
>
> There are a total of 82 subtasks that were completed as part of this
> effort.
>
> We paid close attention to ensure that Timeline Service v.2 does not impact
> existing functionality when disabled (the default).
>
> Special thanks to a team of folks who worked hard and contributed towards
> this effort with patches, reviews and guidance: Rohith Sharma K S, Varun
> Saxena, Haibo Chen, Sangjin Lee, Li Lu, Vinod Kumar Vavilapalli, Joep
> Rottinghuis, Jason Lowe, Jian He, Robert Kanter, Michael Stack.
>
> Regards,
> Vrushali
>
> [1] http://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27383.html
> [2] https://issues.apache.org/jira/browse/YARN-5355
> [3] https://issues.apache.org/jira/browse/YARN-2928
> [4] https://github.com/apache/hadoop/commits/YARN-5355
>

Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-28 Thread Jason Lowe

Allen Wittenauer wrote:

> > On Aug 25, 2017, at 1:23 PM, Jason Lowe <jl...@oath.com> wrote:
> >
> > Allen Wittenauer wrote:
> >
> > > Doesn't this place an undue burden on the contributor with the first
> incompatible patch to prove worthiness?  What happens if it is decided that
> it's not good enough?
> >
> > It is a burden for that first, "this can't go anywhere else but 4.x"
> change, but arguably that should not be a change done lightly anyway.  (Or
> any other backwards-incompatible change for that matter.)  If it's worth
> committing then I think it's perfectly reasonable to send out the dev
> announce that there's reason for trunk to diverge from 3.x, cut branch-3,
> and move on.  This is no different than Andrew's recent announcement that
> there's now a need for separating trunk and the 3.0 line based on what's
> about to go in.
>
> So, by this definition as soon as a patch comes in to remove
> deprecated bits there will be no issue with a branch-3 getting created,
> correct?
>

I think this gets back to the "if it's worth committing" part.  I feel the
community should collectively decide when it's worth taking the hit to
maintain the separate code line.  IMHO removing deprecated bits alone is
not reason enough to diverge the code base and the additional maintenance
that comes along with the extra code line.  A new feature is traditionally
the reason to diverge because that's something users would actually care
enough about to take the compatibility hit when moving to the version that
has it.  That also helps drive a timely release of the new code line
because users want the feature that went into it.

> >  Otherwise if past trunk behavior is any indication, it ends up mostly
> enabling people to commit to just trunk, forgetting that the thing they are
> committing is perfectly valid for branch-3.
>
> I'm not sure there was any "forgetting" involved.  We likely
> wouldn't be talking about 3.x at all if it wasn't for the code diverging
> enough.
>

I don't think it was the myriad of small patches that went only into trunk
over the last 6 years that drove this.  Instead I think it was simply that
an "important enough" feature went in, like erasure coding, that gathered
momentum behind this release.  Trunk sat ignored for basically 5+ years,
and plenty of patches went into just trunk that should have gone into at
least branch-2 as well.  I don't think we as a community did the
contributors any favors by putting their changes into a code line that
didn't see a release for a very long time.  Yes 3.x could have released
sooner to help solve that issue, but given the complete lack of excitement
around 3.x until just recently is there any reason this won't happen again
with 4.x?  Seems to me 4.x will need to have something "interesting enough"
to drive people to release it relative to 3.x, which to me indicates we
shouldn't commit things only to there until we have an interest to do so.

> > Given the number of committers that openly ignore discussions like
> this, who is going to verify that incompatible changes don't get in?
> >
> > The same entities who are verifying other bugs don't get in, i.e.: the
> committers and the Hadoop QA bot running the tests.
> >  Yes, I know that means it's inevitable that compatibility breakages
> will happen, and we can and should improve the automation around
> compatibility testing when possible.
>
> The automation only goes so far.  At least while investigating
> Yetus bugs, I've seen more than enough blatant and purposeful ignored
> errors and warnings that I'm not convinced it will be effective. ("That
> javadoc compile failure didn't come from my patch!"  Um, yes, yes it did.)
> PR for features has greatly trumped code correctness for a few years now.
>

I totally agree here.  We can and should do better about this outside of
automation.  I brought up automation since I see it as a useful part of the
total solution along with better developer education, oversight, etc.  I'm
thinking specifically about tools that can report on public API signature
changes, but that's just one aspect of compatibility.  Semantic behavior is
not something a static analysis tool can automatically detect, and the only
way to automate some of that is something like end-to-end compatibility
testing.  Bigtop may cover some of this with testing of older versions of
downstream projects like HBase, Hive, Oozie, etc., and we could setup some
tests that standup two different Hadoop clusters and run tests that verify
interop between them.  But the tests will never be exhaustive and we will
still need educated committers and oversight to fill in the gaps.

>  But I don't think there's a magic bullet for preventing all
> compatibi

Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-25 Thread Jason Lowe

Allen Wittenauer wrote:

> Doesn't this place an undue burden on the contributor with the first
> incompatible patch to prove worthiness?  What happens if it is decided that
> it's not good enough?

It is a burden for that first, "this can't go anywhere else but 4.x"
change, but arguably that should not be a change done lightly anyway.  (Or
any other backwards-incompatible change for that matter.)  If it's worth
committing then I think it's perfectly reasonable to send out the dev
announce that there's reason for trunk to diverge from 3.x, cut branch-3,
and move on.  This is no different than Andrew's recent announcement that
there's now a need for separating trunk and the 3.0 line based on what's
about to go in.

I do not think it makes sense to pay for the maintenance overhead of two
nearly-identical lines with no backwards-incompatible changes between them
until we have the need.  Otherwise if past trunk behavior is any
indication, it ends up mostly enabling people to commit to just trunk,
forgetting that the thing they are committing is perfectly valid for
branch-3.  If we can agree that trunk and branch-3 should be equivalent
until an incompatible change goes into trunk, why pay for the commit
overhead and potential for accidentally missed commits until it is really
necessary?

> How many will it take before the dam will break?  Or is there a timeline
> going to be given before trunk gets set to 4.x?

I think the threshold count for the dam should be 1.  As soon as we have a
JIRA that needs to be committed to move the project forward and we cannot
ship it in a 3.x release then we create branch-3 and move trunk to 4.x.
As for a timeline going to 4.x, again I don't see it so much as a "baking
period" as a "when we need it" criteria.  If we need it in a week then we
should cut it in a week.  Or a year then a year.  It all depends upon when
that 4.x-only change is ready to go in.

> Given the number of committers that openly ignore discussions like this,
> who is going to verify that incompatible changes don't get in?
>

The same entities who are verifying other bugs don't get in, i.e.: the
committers and the Hadoop QA bot running the tests.  Yes, I know that means
it's inevitable that compatibility breakages will happen, and we can and
should improve the automation around compatibility testing when possible.
But I don't think there's a magic bullet for preventing all compatibility
bugs from being introduced, just like there isn't one for preventing
general bugs.  Does having a trunk branch separate but essentially similar
to branch-3 make this any better?

> Longer term:  what is the PMC doing to make sure we start doing major
> releases in a timely fashion again?  In other words, is this really an
> issue if we shoot for another major in (throws dart) 2 years?
>

If we're trying to do semantic versioning then we shouldn't have a regular
cadence for major releases unless we have a regular cadence of changes that
break compatibility.  I'd hope that's not something we would strive
towards.  I do agree that we should try to be better about shipping
releases, major or minor, in a more timely manner, but I don't agree that
we should cut 4.0 simply based on a duration since the last major release.
The release contents and community's desire for those contents should
dictate the release numbering and schedule, respectively.

Jason

On Fri, Aug 25, 2017 at 2:16 PM, Allen Wittenauer 
wrote:

>
> > On Aug 25, 2017, at 10:36 AM, Andrew Wang 
> wrote:
>
> > Until we need to make incompatible changes, there's no need for
> > a Hadoop 4.0 version.
>
> Some questions:
>
> Doesn't this place an undue burden on the contributor with the
> first incompatible patch to prove worthiness?  What happens if it is
> decided that it's not good enough?
>
> How many will it take before the dam will break?  Or is there a
> timeline going to be given before trunk gets set to 4.x?
>
> Given the number of committers that openly ignore discussions like
> this, who is going to verify that incompatible changes don't get in?
>
> Longer term:  what is the PMC doing to make sure we start doing
> major releases in a timely fashion again?  In other words, is this really
> an issue if we shoot for another major in (throws dart) 2 years?
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>

Re: Branch merges and 3.0.0-beta1 scope

2017-08-25 Thread Jason Lowe

Andrew Wang wrote:

> This means I'll cut branch-3 and
> branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will open
> up development for Hadoop 3.1.0 and 4.0.0.

I can see a need for branch-3.0, but please do not create branch-3.  Doing
so will relegate trunk back to the "patch purgatory" branch, a place where
patches won't see a release for years.  Unless something is imminently
going in that will break backwards compatibility and warrant a new 4.x
release, I don't see the need to distinguish trunk from the 3.x line.
Leaving trunk as the 3.x line means fewer branches to commit patches through
and more testing of every patch since trunk would remain an active area for
testing and releasing.  If we separate trunk and branch-3 then it's almost
certain only-trunk patches will start to accumulate and never get any
"real" testing until someone eventually decides it's time to go to Hadoop
4.x.  Looking back at trunk-as-3.x for an example, patches committed there
in the early days after branch-2 was cut didn't see a release for almost 6
years.

My apologies if I've missed a feature that is just going to miss the 3.0
release and will break compatibility when it goes in.  If so then we need
to cut branch-3, but if not then here's my plea to hold off until we do
need it.

Jason

On Thu, Aug 24, 2017 at 3:33 PM, Andrew Wang 
wrote:

> Glad to see the discussion continued in my absence :)
>
> From a release management perspective, it's *extremely* reasonable to block
> the inclusion of new features a month from the planned release date. A
> typical software development lifecycle includes weeks of feature freeze and
> weeks of code freeze. It is no knock on any developer or any feature to say
> that we should not include something in 3.0.0.
>
> I've been very open and clear about the goals, schedule, and scope of 3.0.0
> over the last year plus. The point of the extended alpha process was to get
> all our features in during alpha, and the alpha merge window has been open
> for a year. I'm unmoved by arguments about how long a feature has been
> worked on. None of these were part of the original 3.0.0 scope, and our
> users have been waiting even longer for big-ticket 3.0 items like JDK8 and
> HDFS EC that were part of the discussed scope.
>
> I see that two VOTEs have gone out since I was out. I still plan to follow
> the proposal in my original email. This means I'll cut branch-3 and
> branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will open
> up development for Hadoop 3.1.0 and 4.0.0.
>
> I'm reaching out to the lead contributor of each of these features
> individually to discuss. We need to close on this quickly, and email is too
> low bandwidth at this stage.
>
> Best,
> Andrew
>

[jira] [Resolved] (MAPREDUCE-6933) Invalid event: TA_CONTAINER_LAUNCH_FAILED at KILLED

2017-08-04 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6933.
---
Resolution: Duplicate

> Invalid event: TA_CONTAINER_LAUNCH_FAILED at KILLED
> ---
>
> Key: MAPREDUCE-6933
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6933
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.1, 3.0.0-alpha4
>Reporter: lujie
>
> When I ran a job on 0.23.1, I hit an InvalidStateTransitonException:
> {code:java}
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_CONTAINER_LAUNCH_FAILED at KILLED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:926)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:135)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:870)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:862)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:82)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> After manually analysing the 3.0.0 code, I think this error may still
> exist.




Re: [VOTE] Release Apache Hadoop 2.7.4 (RC0)

2017-08-02 Thread Jason Lowe

Thanks for driving the 2.7.4 release!
+1 (binding)

- Verified signatures and digests
- Successfully built from source including native
- Deployed to a single-node cluster and ran sample MapReduce jobs

Jason

On Saturday, July 29, 2017 6:29 PM, Konstantin Shvachko 
 wrote:
 

 Hi everybody,

Here is the next release of the Apache Hadoop 2.7 line. The previous stable
release, 2.7.3, has been available since 25 August 2016.
Release 2.7.4 includes 264 issues fixed after release 2.7.3, which are
critical bug fixes and major optimizations. See more details in Release
Note:
http://home.apache.org/~shv/hadoop-2.7.4-RC0/releasenotes.html

The RC0 is available at: http://home.apache.org/~shv/hadoop-2.7.4-RC0/

Please give it a try and vote on this thread. The vote will run for 5 days
ending 08/04/2017.

Please note that my up-to-date public key is available from:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
Please don't forget to refresh the page if you've been there recently.
There are other places on Apache sites which may contain my outdated key.

Thanks,
--Konstantin

Re: Apache Hadoop 2.8.2 Release Plan

2017-07-21 Thread Jason Lowe

+1 to base the 2.8.2 release off of the more recent activity on branch-2.8.  
Because branch-2.8.2 was cut so long ago it is missing a lot of fixes that are 
in branch-2.8.  There also are a lot of JIRAs that claim they are fixed in 
2.8.2 but are not in branch-2.8.2.  Having the 2.8.2 release be based on recent 
activity in branch-2.8 would solve both of these issues, and we'd only need to 
move the handful of JIRAs that have marked themselves correctly as fixed in 
2.8.3 to be fixed in 2.8.2.

Jason
 

On Friday, July 21, 2017 10:01 AM, Kihwal Lee 
 wrote:
 

 Thanks for driving the next 2.8 release, Junping. While I was committing a 
blocker for 2.7.4, I noticed some of the jiras are back-ported to 2.7, but 
missing in branch-2.8.2.  Perhaps it is safer and easier to simply rebranch 
2.8.2.
Thanks,
Kihwal

On Thursday, July 20, 2017, 3:32:16 PM CDT, Junping Du  
wrote:

Hi all,
    Per Vinod's previous email, we just announced that Apache Hadoop 2.8.1, a
special security release, was released today. Now we should work towards the
2.8.2 release, which aims for production deployment. The focus is obviously on
fixing blocker/critical issues [1] and bugs, with *no* features or improvements. We
currently have 13 blocker/critical issues, and 10 of them are Patch Available.

  I plan to cut an RC in a month, targeting a release before the end of August, to
give enough time for outstanding blocker/critical issues. I will start moving
out any tickets that are not blockers and/or won't fit the timeline. For
progress on the release effort, please refer to our release wiki [2].

  Please share thoughts if you have any. Thanks!

Thanks,

Junping

[1] 2.8.2 release Blockers/Criticals: https://s.apache.org/JM5x
[2] 2.8 Release wiki: 
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release


From: Vinod Kumar Vavilapalli 
Sent: Thursday, July 20, 2017 1:05 PM
To: gene...@hadoop.apache.org
Subject: [ANNOUNCE] Apache Hadoop 2.8.1 is released

Hi all,

The Apache Hadoop PMC has released version 2.8.1. You can get it from this 
page: http://hadoop.apache.org/releases.html#Download
This is a security release in the 2.8.0 release line. It consists of 2.8.0 plus 
security fixes. Users on 2.8.0 are encouraged to upgrade to 2.8.1.

Please note that the 2.8.x release line is still not ready for 
production use. Critical issues are being ironed out via testing and downstream 
adoption. Production users should wait for a subsequent release in the 2.8.x 
line.

Thanks
+Vinod


-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org

[jira] [Created] (MAPREDUCE-6916) History server scheduling tasks at fixed rate can be problematic when those tasks are slow

2017-07-18 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6916:
-

 Summary: History server scheduling tasks at fixed rate can be 
problematic when those tasks are slow
 Key: MAPREDUCE-6916
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6916
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.7.4
Reporter: Jason Lowe


The job history server currently schedules both the task of moving jobs from 
intermediate to done and the task of cleaning jobs at a fixed rate.  If those 
tasks take longer than the rate period to execute then a backlog of 
to-be-scheduled tasks can build up and cause a long storm of them to execute 
later when the blockage clears.
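
The backlog effect described above can be reproduced with plain java.util.concurrent. This is only a sketch and not the history server's code: the class name FixedRateBacklogDemo, the 50 ms period and the 300 ms slow pass are made up for illustration. It compares scheduleAtFixedRate with scheduleWithFixedDelay when one pass of the task runs much longer than the period.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not code from the job history server.
class FixedRateBacklogDemo {
  static final long PERIOD_MS = 50;     // scheduling period (or delay)
  static final long SLOW_RUN_MS = 300;  // the first pass "blocks" this long

  /** Runs the task under one policy and returns the smallest gap (ms) between run starts. */
  static long minStartGapMillis(boolean fixedRate) {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    List<Long> starts = new CopyOnWriteArrayList<>();
    AtomicBoolean firstRun = new AtomicBoolean(true);
    Runnable task = () -> {
      starts.add(System.nanoTime());
      if (firstRun.getAndSet(false)) {
        sleepQuietly(SLOW_RUN_MS);      // simulate one slow pass
      }
    };
    if (fixedRate) {
      // Overdue executions run back-to-back once the slow pass ends.
      pool.scheduleAtFixedRate(task, 0, PERIOD_MS, TimeUnit.MILLISECONDS);
    } else {
      // The next run starts PERIOD_MS after the previous run finishes.
      pool.scheduleWithFixedDelay(task, 0, PERIOD_MS, TimeUnit.MILLISECONDS);
    }
    sleepQuietly(SLOW_RUN_MS + 4 * PERIOD_MS);
    pool.shutdownNow();
    try {
      pool.awaitTermination(1, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    long min = Long.MAX_VALUE;
    for (int i = 1; i < starts.size(); i++) {
      min = Math.min(min, starts.get(i) - starts.get(i - 1));
    }
    return TimeUnit.NANOSECONDS.toMillis(min);
  }

  static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("fixed rate,  min gap ms: " + minStartGapMillis(true));
    System.out.println("fixed delay, min gap ms: " + minStartGapMillis(false));
  }
}
```

With the fixed-rate policy the smallest gap between starts should be far below the period, which is the "storm" of catch-up runs. With the fixed-delay policy the gap never drops below the configured delay.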



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (MAPREDUCE-6909) LocalJobRunner fails when run on a node from multiple users

2017-06-30 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6909:
-

 Summary: LocalJobRunner fails when run on a node from multiple 
users
 Key: MAPREDUCE-6909
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6909
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.8.1
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker


MAPREDUCE-5762 removed mapreduce.jobtracker.staging.root.dir from 
mapred-default.xml but the property is still being used by LocalJobRunner and 
the code default value does *not* match the value that was removed from 
mapred-default.xml.  This broke the use case where multiple users are running 
local mode jobs on the same node, since they now default to the same directory 
in /tmp.
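
A toy illustration of the collision, assuming made-up paths: the directory names below are placeholders and not the actual Hadoop defaults, and StagingDirSketch is a hypothetical class. The point is only that a default which ignores the user name sends every local-mode user to the same directory, while a default derived from the user name does not.

```java
// Hypothetical sketch; the paths are illustrative placeholders, not Hadoop's values.
class StagingDirSketch {
  // A code default that does not depend on the user: all users share it.
  static String sharedDefault(String user) {
    return "/tmp/example/staging";
  }

  // A default derived from the user name: each user gets a separate root.
  static String perUserDefault(String user) {
    return "/tmp/example-" + user + "/staging";
  }

  public static void main(String[] args) {
    System.out.println(sharedDefault("alice") + " vs " + sharedDefault("bob"));
    System.out.println(perUserDefault("alice") + " vs " + perUserDefault("bob"));
  }
}
```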




[jira] [Resolved] (MAPREDUCE-6898) TestKill.testKillTask is flaky

2017-06-16 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6898.
---
   Resolution: Duplicate
Fix Version/s: (was: 2.8.2)
   (was: 3.0.0-alpha4)
   (was: 2.9.0)

> TestKill.testKillTask is flaky
> --
>
> Key: MAPREDUCE-6898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6898
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6898-001.patch
>
>
> TestKill.testKillTask() can fail if the async dispatcher thread is slower 
> than the test's thread.
> {noformat}
> 2017-05-26 11:43:26,532 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1006)) - job_0_Job Transitioned from INITED to SETUP
> Job State is : RUNNING
> Job State is : RUNNING Waiting for state : SUCCEEDED   map progress : 0.0   
> reduce progress : 0.0
> 2017-05-26 11:43:26,538 INFO  [CommitterEvent Processor #0] 
> commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - 
> Processing the event EventType: JOB_SETUP
> 2017-05-26 11:43:26,540 INFO  [AsyncDispatcher event handler] impl.TaskImpl 
> (TaskImpl.java:handle(661)) - task_0__m_00 Task Transitioned from NEW 
> to KILLED
> 2017-05-26 11:43:26,540 ERROR [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(998)) - Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> JOB_TASK_COMPLETED at SETUP
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1366)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1362)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> 2017-05-26 11:43:26,541 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1006)) - job_0_Job Transitioned from SETUP to ERROR
> 2017-05-26 11:43:26,542 INFO  [AsyncDispatcher event handler] app.MRAppMaster 
> (MRAppMaster.java:serviceStop(978)) - Skipping cleaning up the staging dir. 
> assuming AM will be retried.
> {noformat}
> We have to wait until the job's internal state is 
> {{JobInternalState.RUNNING}} and not {{JobInternalState.SETUP}}.
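
The fix idea in the description (wait for the internal state to reach RUNNING before acting) can be sketched as a polling helper. This is a generic illustration and not the attached patch: WaitForStateSketch, its State enum and the timings are hypothetical, and a background thread stands in for the async dispatcher.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "wait for a state before proceeding" in a test.
class WaitForStateSketch {
  enum State { INITED, SETUP, RUNNING }

  /** Polls the condition until it holds or the timeout expires. */
  static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;                       // timed out
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  /** A slow "dispatcher" moves SETUP -> RUNNING; the caller waits for it. */
  static State runDemo() {
    AtomicReference<State> state = new AtomicReference<>(State.SETUP);
    Thread dispatcher = new Thread(() -> {
      try {
        Thread.sleep(100);                  // dispatcher lags behind the test thread
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      state.set(State.RUNNING);
    });
    dispatcher.start();
    // Only proceed (for example, to kill a task) once the state is RUNNING.
    boolean reached = waitFor(() -> state.get() == State.RUNNING, 5000, 10);
    return reached ? state.get() : null;
  }

  public static void main(String[] args) {
    System.out.println("state after wait: " + runDemo());
  }
}
```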




[jira] [Reopened] (MAPREDUCE-6898) TestKill.testKillTask is flaky

2017-06-16 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened MAPREDUCE-6898:
---

No worries, I'll revert and mark this as a duplicate of MAPREDUCE-6815.

> TestKill.testKillTask is flaky
> --
>
> Key: MAPREDUCE-6898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6898
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
> Attachments: MAPREDUCE-6898-001.patch
>
>




[jira] [Resolved] (MAPREDUCE-6869) org.apache.hadoop.mapred.ShuffleHandler: Shuffle error in populating headers :

2017-03-28 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6869.
---
Resolution: Not A Bug

Closing this since it does not appear to be a problem in Hadoop.  Please reopen 
with additional evidence if you find otherwise.

> org.apache.hadoop.mapred.ShuffleHandler: Shuffle error in populating headers :
> --
>
> Key: MAPREDUCE-6869
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6869
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, yarn
>Affects Versions: 2.6.0
> Environment: hadoop 2.6.0-cdh5.8.2
>Reporter: 翟玉勇
>Priority: Minor
>
> nodemanager log
> 2017-03-25 21:07:03,071 ERROR org.apache.hadoop.mapred.ShuffleHandler: 
> Shuffle error in populating headers :
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find 
> usercache/master/appcache/application_1489067586592_930490/output/attempt_1489067586592_930490_m_002811_0/file.out.index
>  in any of the configured local directories
> at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:488)
> at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:165)
> at 
> org.apache.hadoop.mapred.ShuffleHandler$Shuffle.getMapOutputInfo(ShuffleHandler.java:1000)
> at 
> org.apache.hadoop.mapred.ShuffleHandler$Shuffle.populateHeaders(ShuffleHandler.java:1022)
> at 
> org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:908)
> at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
> at 
> org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
> at 
> org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
> at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
> at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
> at 
> org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
> at 
> org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
> at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
> at 
> org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
> at 
> org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
> at 
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> at 
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
>

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)

2017-03-17 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Performed a native build from the release tag
- Deployed to a single node cluster
- Ran some sample jobs
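
For readers unfamiliar with the "digests" step, here is a minimal sketch of a digest comparison using only the JDK. It assumes SHA-256 and a hex-encoded published value; the digest algorithms and file layout actually published with a given RC may differ, and DigestCheckSketch is a hypothetical name. Signature verification is a separate step and is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical sketch: compare a computed SHA-256 digest with a published hex string.
class DigestCheckSketch {
  static String sha256Hex(byte[] data) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-256");
      StringBuilder hex = new StringBuilder();
      for (byte b : md.digest(data)) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);   // SHA-256 is required on every JDK
    }
  }

  /** Published digests are often wrapped or upper-cased, so normalize before comparing. */
  static boolean matches(byte[] data, String publishedHex) {
    String normalized = publishedHex.replaceAll("\\s", "").toLowerCase(Locale.ROOT);
    return MessageDigest.isEqual(
        sha256Hex(data).getBytes(StandardCharsets.US_ASCII),
        normalized.getBytes(StandardCharsets.US_ASCII));
  }

  public static void main(String[] args) {
    byte[] artifact = "abc".getBytes(StandardCharsets.UTF_8); // stand-in for file bytes
    System.out.println(sha256Hex(artifact));
  }
}
```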
Jason
 

On Friday, March 17, 2017 4:18 AM, Junping Du  wrote:
 

 Hi all,
    With the fix for HDFS-11431 in, I've created a new release candidate (RC3) 
for Apache Hadoop 2.8.0.

    This is the next minor release after 2.7.0, which was released more than a 
year ago. It comprises 2,900+ fixes, improvements, and new 
features. Most of these commits are released for the first time in branch-2.

      More information about the 2.8.0 release plan can be found here: 
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release

      New RC is available at: 
http://home.apache.org/~junping_du/hadoop-2.8.0-RC3

      The RC tag in git is: release-2.8.0-RC3, and the latest commit id is: 
91f2b7a13d1e97be65db92ddabc627cc29ac0009

      The maven artifacts are available via repository.apache.org at: 
https://repository.apache.org/content/repositories/orgapachehadoop-1057

      Please try the release and vote; the vote will run for the usual 5 days, 
ending on 03/22/2017 PDT time.

Thanks,

Junping

Re: Updated 2.8.0-SNAPSHOT artifact

2016-11-04 Thread Jason Lowe

At this point my preference would be to do the most expeditious thing to 
release 2.8, whether that's sticking with the branch-2.8 we have today or 
re-cutting it on branch-2.  A quick JIRA query shows almost 2,400 
JIRAs resolved in 2.8.0 (1).  For many of them, it's well past time they saw a 
release vehicle.  If re-cutting the branch means we have to wrap up a few extra 
things that are still in-progress on branch-2 or add a few more blockers to the 
list before we release then I'd rather stay where we're at and ship it ASAP.

Jason
(1) 
https://issues.apache.org/jira/issues/?jql=project%20in%20%28hadoop%2C%20yarn%2C%20mapreduce%2C%20hdfs%29%20and%20resolution%20%3D%20Fixed%20and%20fixVersion%20%3D%202.8.0





On Tuesday, October 25, 2016 5:31 PM, Karthik Kambatla  
wrote:
 

 Is there value in releasing current branch-2.8? Aren't we better off
re-cutting the branch off of branch-2?

On Tue, Oct 25, 2016 at 12:20 AM, Akira Ajisaka 
wrote:

> It's been almost a year since branch-2.8 was cut.
> I'm thinking we need to release 2.8.0 ASAP.
>
> According to the following list, there are 5 blocker and 6 critical issues.
> https://issues.apache.org/jira/issues/?filter=12334985
>
> Regards,
> Akira
>
>
> On 10/18/16 10:47, Brahma Reddy Battula wrote:
>
>> Hi Vinod,
>>
>> Any plan for a first RC for branch-2.8? I think it has been a long time.
>>
>>
>>
>>
>> --Brahma Reddy Battula
>>
>> -Original Message-
>> From: Vinod Kumar Vavilapalli [mailto:vino...@apache.org]
>> Sent: 20 August 2016 00:56
>> To: Jonathan Eagles
>> Cc: common-...@hadoop.apache.org
>> Subject: Re: Updated 2.8.0-SNAPSHOT artifact
>>
>> Jon,
>>
>> That is around the time when I branched 2.8, so I guess you were getting
>> SNAPSHOT artifacts till then from the branch-2 nightly builds.
>>
>> If you need it, we can set up SNAPSHOT builds. Or just wait for the first
>> RC, which is around the corner.
>>
>> +Vinod
>>
>> On Jul 28, 2016, at 4:27 PM, Jonathan Eagles  wrote:
>>>
>>> The latest snapshot was uploaded in Nov 2015, but checkins are still coming
>>> in quite frequently.
>>> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-yarn-api/
>>>
>>> Are there any plans to start producing updated SNAPSHOT artifacts for
>>> current hadoop development lines?
>>>

Re: [VOTE] Release Apache Hadoop 2.6.5 (RC1)

2016-10-10 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Built native from source
- Deployed to a single-node cluster and ran some sample jobs
Jason
 

On Sunday, October 2, 2016 7:13 PM, Sangjin Lee  wrote:
 

 Hi folks,

I have pushed a new release candidate (RC1) for the Apache Hadoop 2.6.5
release (the next maintenance release in the 2.6.x release line). RC1
contains fixes to CHANGES.txt, and is otherwise identical to RC0.

Below are the details of this release candidate:

The RC is available for validation at:
http://home.apache.org/~sjlee/hadoop-2.6.5-RC1/.

The RC tag in git is release-2.6.5-RC1 and its git commit is
e8c9fe0b4c252caf2ebf1464220599650f119997.

The maven artifacts are staged via repository.apache.org at:
https://repository.apache.org/content/repositories/orgapachehadoop-1050/.

You can find my public key at
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS.

Please try the release and vote. The vote will run for the usual 5 days. I
would greatly appreciate your timely vote. Thanks!

Regards,
Sangjin

Re: [VOTE] Release Apache Hadoop 2.7.3 RC2

2016-08-22 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Successfully built from source with native support
- Deployed a single-node cluster
- Ran some sample jobs successfully

Jason

  From: Vinod Kumar Vavilapalli 
 To: "common-...@hadoop.apache.org" ; 
hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; 
"mapreduce-dev@hadoop.apache.org"  
Cc: Vinod Kumar Vavilapalli 
 Sent: Wednesday, August 17, 2016 9:05 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.3 RC2
   
Hi all,

I've created a new release candidate RC2 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/ 


The RC tag in git is: release-2.7.3-RC2

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1046 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/releasenotes.html for 
your quick perusal.

As you may have noted,
 - a few issues with RC0 forced an RC1 [1]
 - a few more issues with RC1 forced an RC2 [2]
 - a very long fix-cycle for the License & Notice issues (HADOOP-12893) caused 
2.7.3 (along with every other Hadoop release) to slip by quite a bit. This 
release's related discussion thread is linked below: [3].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1] [VOTE] Release Apache Hadoop 2.7.3 RC0: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/index.html#26106 

[2] [VOTE] Release Apache Hadoop 2.7.3 RC1: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg26336.html 

[3] 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html

[jira] [Created] (MAPREDUCE-6763) Shuffle server listen queue is too small

2016-08-19 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6763:
-

 Summary: Shuffle server listen queue is too small
 Key: MAPREDUCE-6763
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6763
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe


ShuffleHandler doesn't specify a listen queue length for the server port, so it 
ends up getting the default listen queue length of 50.  This is too small to 
handle bursts of shuffle traffic on large clusters.  It's also inconsistent 
with the default Hadoop uses for RPC servers (default=128).
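
For reference, the same default shows up in plain java.net: a ServerSocket created without a backlog argument gets a queue length of 50. The sketch below only shows how an explicit backlog is passed with the JDK API. It is not the ShuffleHandler change itself, and ListenBacklogSketch is a hypothetical name; the value 128 is taken from the RPC default mentioned above. The operating system may still cap the effective queue length.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical sketch: open a listening socket with an explicit backlog.
class ListenBacklogSketch {
  static final int BACKLOG = 128;  // value quoted above for Hadoop RPC servers

  /** Binds an ephemeral loopback port with the given backlog and returns the port. */
  static int boundPort(int backlog) {
    // Port 0 asks the OS for any free port; the second argument is the backlog.
    try (ServerSocket server =
             new ServerSocket(0, backlog, InetAddress.getLoopbackAddress())) {
      return server.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("bound port with backlog " + BACKLOG + ": " + boundPort(BACKLOG));
  }
}
```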



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-15 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Built from source with native support
- Deployed a pseudo-distributed cluster
- Ran some sample jobs
Jason

  From: Vinod Kumar Vavilapalli 
 To: "common-...@hadoop.apache.org" ; 
hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; 
"mapreduce-dev@hadoop.apache.org"  
Cc: Vinod Kumar Vavilapalli 
 Sent: Friday, August 12, 2016 11:45 AM
 Subject: [VOTE] Release Apache Hadoop 2.7.3 RC1
   
Hi all,

I've created a release candidate RC1 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC1/ 


The RC tag in git is: release-2.7.3-RC1

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1045/ 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at home.apache.org/~vinodkv/hadoop-2.7.3-RC1/releasenotes.html for your 
quick perusal.

As you may have noted,
 - a few issues with RC0 forced an RC1 [1]
 - a very long fix-cycle for the License & Notice issues (HADOOP-12893) caused 
2.7.3 (along with every other Hadoop release) to slip by quite a bit. This 
release's related discussion thread is linked below: [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1] [VOTE] Release Apache Hadoop 2.7.3 RC0: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/index.html#26106 

[2]: 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html

Re: [Release thread] 2.6.5 release activities

2016-08-10 Thread Jason Lowe

Thanks for organizing this, Chris!
I don't believe HADOOP-13362 is needed since it's related to ContainerMetrics.
ContainerMetrics weren't added until 2.7 by YARN-2984.
YARN-4794 looks applicable to 2.6. The change drops right in except it has
JDK7-isms (multi-catch clause), so it needs a slight change.
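
As an illustration of the "slight change" a multi-catch clause needs for a pre-JDK7 branch, here is a sketch with invented code. The exception types and method names are hypothetical and are not taken from the YARN-4794 patch. The JDK 7 form handles both exceptions in one clause; the JDK 6 compatible form repeats the catch and delegates to a shared handler.

```java
// Hypothetical sketch; not the YARN-4794 code.
class MultiCatchSketch {
  // An operation that can fail in two unrelated ways.
  static int parsePositive(String s) {
    int n = Integer.parseInt(s);            // may throw NumberFormatException
    if (n <= 0) {
      throw new IllegalStateException("not positive: " + n);
    }
    return n;
  }

  // JDK 7+ style: a single multi-catch clause.
  static String describeJdk7(String s) {
    try {
      return "ok " + parsePositive(s);
    } catch (NumberFormatException | IllegalStateException e) {
      return handle(e);
    }
  }

  // JDK 6 compatible style: one catch per type, sharing the handler.
  static String describeJdk6(String s) {
    try {
      return "ok " + parsePositive(s);
    } catch (NumberFormatException e) {
      return handle(e);
    } catch (IllegalStateException e) {
      return handle(e);
    }
  }

  static String handle(RuntimeException e) {
    return "failed: " + e.getClass().getSimpleName();
  }

  public static void main(String[] args) {
    for (String input : new String[] {"5", "-1", "x"}) {
      System.out.println(describeJdk7(input) + " / " + describeJdk6(input));
    }
  }
}
```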

Jason

From: Chris Trezzo
To: "common-...@hadoop.apache.org" ;
hdfs-...@hadoop.apache.org; "mapreduce-dev@hadoop.apache.org"
; "yarn-...@hadoop.apache.org"

Sent: Tuesday, August 9, 2016 7:32 PM
Subject: [Release thread] 2.6.5 release activities

Based on the sentiment in the "[DISCUSS] 2.6.x line releases" thread, I
have moved forward with some of the initial effort in creating a 2.6.5
release. I am forking this thread so we have a dedicated 2.6.5 release
thread.

I have gone through the git logs and gathered a list of JIRAs that are in
branch-2.7 but are missing from branch-2.6. I limited the diff to issues
with a commit date after 1/26/2016. I did this because 2.6.4 was cut from
branch-2.6 around that date (http://markmail.org/message/xmy7ebs6l3643o5e)
and presumably issues that were committed to branch-2.7 before then were
already looked at as part of 2.6.4.

I have collected these issues in a spreadsheet and have given them an
initial triage on whether they are candidates for a backport to 2.6.5. The
spreadsheet is sorted by the status of the issues with the potential
backport candidates at the top. Here is a link to the spreadsheet:
https://docs.google.com/spreadsheets/d/1lfG2CYQ7W4q3olWpOCo6EBAey1WYC8hTRUemHvYPPzY/edit?usp=sharing

As of now, I have identified 16 potential backport candidates. Please take
a look at the list and let me know if there are any that you think should
not be on the list, or ones that you think I have missed. This was just an
initial high-level triage, so there could definitely be issues that are
mislabeled.

As a side note: we still need to look at the pre-commit build for 2.6 and
follow up with an addendum for HADOOP-12800.

Thanks everyone!
Chris Trezzo

Re: [VOTE] Release Apache Hadoop 2.7.3 RC0

2016-08-05 Thread Jason Lowe

Both sound like real problems to me, and I think it's appropriate to file JIRAs 
to track them.
Jason


  From: Andrew Wang 
 To: Karthik Kambatla  
Cc: larry mccay ; Vinod Kumar Vavilapalli 
; "common-...@hadoop.apache.org" 
; "hdfs-...@hadoop.apache.org" 
; "yarn-...@hadoop.apache.org" 
; "mapreduce-dev@hadoop.apache.org" 

 Sent: Thursday, August 4, 2016 5:56 PM
 Subject: Re: [VOTE] Release Apache Hadoop 2.7.3 RC0
   
Could a YARN person please comment on these two issues, one of which Vinay
also hit? If someone already triaged or filed JIRAs, I missed it.

On Mon, Jul 25, 2016 at 11:52 AM, Andrew Wang 
wrote:

> I'll also add that, as a YARN newbie, I did hit two usability issues.
> These are very unlikely to be regressions, and I can file JIRAs if they
> seem fixable.
>
> * I didn't have SSH to localhost set up (new laptop), and when I tried to
> run the Pi job, it'd exit my window manager session. I feel there must be a
> more developer-friendly solution here.
> * If you start the NodeManager and not the RM, the NM has a handler for
> SIGTERM and SIGINT that blocked my Ctrl-C and kill attempts during startup.
> I had to kill -9 it.
>
> On Mon, Jul 25, 2016 at 11:44 AM, Andrew Wang 
> wrote:
>
>> I got asked this off-list, so as a reminder, only PMC votes are binding
>> on releases. Everyone is encouraged to vote on releases though!
>>
>> +1 (binding)
>>
>> * Downloaded source, built
>> * Started up HDFS and YARN
>> * Ran Pi job which as usual returned 4, and a little teragen
>>
>> On Mon, Jul 25, 2016 at 11:08 AM, Karthik Kambatla 
>> wrote:
>>
>>> +1 (binding)
>>>
>>> * Downloaded and built from source
>>> * Checked LICENSE and NOTICE
>>> * Pseudo-distributed cluster with FairScheduler
>>> * Ran MR and HDFS tests
>>> * Verified basic UI
>>>
>>> On Sun, Jul 24, 2016 at 1:07 PM, larry mccay  wrote:
>>>
>>> > +1 binding
>>> >
>>> > * downloaded and built from source
>>> > * checked LICENSE and NOTICE files
>>> > * verified signatures
>>> > * ran standalone tests
>>> > * installed pseudo-distributed instance on my mac
>>> > * ran through HDFS and mapreduce tests
>>> > * tested credential command
>>> > * tested webhdfs access through Apache Knox
>>> >
>>> >
>>> > On Fri, Jul 22, 2016 at 10:15 PM, Vinod Kumar Vavilapalli <
>>> > vino...@apache.org> wrote:
>>> >
>>> > > Hi all,
>>> > >
>>> > > I've created a release candidate RC0 for Apache Hadoop 2.7.3.
>>> > >
>>> > > As discussed before, this is the next maintenance release to follow
>>> up
>>> > > 2.7.2.
>>> > >
>>> > > The RC is available for validation at:
>>> > > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/ <
>>> > > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/>
>>> > >
>>> > > The RC tag in git is: release-2.7.3-RC0
>>> > >
>>> > > The maven artifacts are available via repository.apache.org <
>>> > > http://repository.apache.org/> at
>>> > > https://repository.apache.org/content/repositories/
>>> orgapachehadoop-1040/
>>> > <
>>> > > https://repository.apache.org/content/repositories/
>>> orgapachehadoop-1040/
>>> > >
>>> > >
>>> > > The release-notes are inside the tar-balls at location
>>> > > hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html.
>>> I
>>> > > hosted this at
>>> > > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/releasenotes.html <
>>> > > http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/releasenotes.html
>>> >
>>> > for
>>> > > your quick perusal.
>>> > >
>>> > > As you may have noted, a very long fix-cycle for the License & Notice
>>> > > issues (HADOOP-12893) caused 2.7.3 (along with every other Hadoop
>>> > release)
>>> > > to slip by quite a bit. This release's related discussion thread is
>>> > linked
>>> > > below: [1].
>>> > >
>>> > > Please try the release and vote; the vote will run for the usual 5
>>> days.
>>> > >
>>> > > Thanks,
>>> > > Vinod
>>> > >
>>> > > [1]: 2.7.3 release plan:
>>> > > https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/
>>> msg24439.html
>>> > <
>>> > > http://markmail.org/thread/6yv2fyrs4jlepmmr>
>>> >
>>>
>>
>>
>

Re: [VOTE] Release Apache Hadoop 2.7.3 RC0

2016-07-25 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Built from source with native support
- Deployed a pseudo-distributed cluster
- Ran some sample jobs
Jason

  From: Vinod Kumar Vavilapalli 
 To: "common-...@hadoop.apache.org" ; 
hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; 
"mapreduce-dev@hadoop.apache.org"  
Cc: Vinod Kumar Vavilapalli 
 Sent: Friday, July 22, 2016 9:15 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.3 RC0

Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/ 

The RC tag in git is: release-2.7.3-RC0

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1040/ 

The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/releasenotes.html 
 for your 
quick perusal.

As you may have noted, a very long fix-cycle for the License & Notice issues 
(HADOOP-12893) caused 2.7.3 (along with every other Hadoop release) to slip by 
quite a bit. This release's related discussion thread is linked below: [1].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html

[jira] [Resolved] (MAPREDUCE-3294) Log the reason for killing a task during speculative execution

2016-06-20 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-3294.
---
Resolution: Duplicate

This was fixed by MAPREDUCE-5692.

> Log the reason for killing a task during speculative execution
> --
>
> Key: MAPREDUCE-3294
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3294
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Ramya Sunil
>
> The reason for killing a speculated task has to be logged. Currently, a 
> speculated task is killed with a note of "Container killed by the 
> ApplicationMaster. Container killed on request. Exit code is 137" which is 
> not very useful. Better logging of this message stating the task was killed 
> due to completion of its speculative task would be useful.
> Also, this message is lost once the app is moved to history. All we are left 
> with is a list of killed tasks without a reason being notified to the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org

[jira] [Resolved] (MAPREDUCE-4758) jobhistory web ui not showing correct # failed reducers

2016-05-12 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-4758.
---
Resolution: Duplicate

This is a duplicate of MAPREDUCE-5982 which was fixed in 2.7.2 and 2.6.4.

> jobhistory web ui not showing correct # failed reducers
> ---
>
> Key: MAPREDUCE-4758
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4758
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, webapps
>Affects Versions: 0.23.4
>Reporter: Thomas Graves
>
> We had a job fail due to a reducer failing 4 times.  Unfortunately the job 
> history UI didn't show this particular failed reducer, which led to 
> confusion as to why the job failed. 
> This reducer failed to launch all 4 task attempts with a Token Expiration 
> error and the jobhistory file only gets an event when the task attempt 
> transitions to launched.  The webapp JobInfo object only counts the task 
> attempts in the jobhistory file to display under the "Attempt Type" table, so 
> since this task didn't have an attempt with it, it didn't show up on the UI.
> We need to reconcile the task list with the task attempts or also show more 
> stats for the tasks vs task attempts.




Re: [VOTE] Release Apache Hadoop 2.6.4 RC0

2016-02-08 Thread Jason Lowe

+1 (binding)
- verified signatures and digests
- built native from source
- deployed a single-node cluster and ran some sample MapReduce jobs
Jason

  From: Junping Du 
 To: "hdfs-...@hadoop.apache.org" ; 
"yarn-...@hadoop.apache.org" ; 
"mapreduce-dev@hadoop.apache.org" ; 
"common-...@hadoop.apache.org"  
 Sent: Wednesday, February 3, 2016 1:01 AM
 Subject: [VOTE] Release Apache Hadoop 2.6.4 RC0

Hi community folks,
  I've created a release candidate RC0 for Apache Hadoop 2.6.4 (the next 
maintenance release to follow up 2.6.3), according to the email thread on the 
2.6.4 release plan [1]. Below are the details of this release candidate:

The RC is available for validation at:
http://people.apache.org/~junping_du/hadoop-2.6.4-RC0/

The RC tag in git is: release-2.6.4-RC0

The maven artifacts are staged via repository.apache.org at:
https://repository.apache.org/content/repositories/orgapachehadoop-1028/?

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the usual 5 days.

Thanks!

Cheers,

Junping

[1]: 2.6.4 release plan: http://markmail.org/message/fk3ud3c665lscvx5?

[jira] [Created] (MAPREDUCE-6625) TestCLI#testGetJob fails occasionally

2016-02-02 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6625:
-

 Summary: TestCLI#testGetJob fails occasionally
 Key: MAPREDUCE-6625
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6625
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe


Lately TestCLI has been failing sometimes in precommit builds:
{noformat}
Running org.apache.hadoop.mapreduce.tools.TestCLI
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.883 sec <<< 
FAILURE! - in org.apache.hadoop.mapreduce.tools.TestCLI
testGetJob(org.apache.hadoop.mapreduce.tools.TestCLI)  Time elapsed: 0.037 sec  
<<< FAILURE!
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.mapreduce.tools.TestCLI.testGetJob(TestCLI.java:175)
{noformat}





[jira] [Resolved] (MAPREDUCE-6623) TestRMNMInfo and TestNetworkedJob fails in trunk

2016-02-01 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6623.
---
Resolution: Duplicate

Resolving as a duplicate per the previous comment.

> TestRMNMInfo and TestNetworkedJob fails in trunk
> 
>
> Key: MAPREDUCE-6623
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6623
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Eric Badger
>
> TestRMNMInfo:
> {code}
> Running org.apache.hadoop.mapreduce.v2.TestRMNMInfo
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 32.347 sec 
> <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestRMNMInfo
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo)  Time elapsed: 
> 1.572 sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but 
> was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {code}
> TestNetworkedJob
> {code}
> testNetworkedJob:174 expected:<[[Thu Jan 28 22:41:20 + 2016] Application 
> is Activated, waiting for resources to be assigned for AM.  Details : AM 
> Partition =  ; Partition Resource = <memory:8192, 
> vCores:16> ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used 
> capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> but was:<[]>
>   TestRMNMInfo.testRMNMInfo:111 Unexpected number of live nodes: expected:<4> 
> but was:<0>
> {code}
> JDK version: JDK v1.8.0_66




Re: [VOTE] Release Apache Hadoop 2.7.2 RC2

2016-01-19 Thread Jason Lowe

That's reasonable, especially if we don't take nearly as long for 2.7.3.  Note 
that there are almost 50 JIRAs already committed to 2.7.3, so hopefully we'll 
have a plan for that soon.
+1 (binding) for 2.7.2 RC2.
Jason


  From: Vinod Kumar Vavilapalli <vino...@apache.org>
 To: mapreduce-dev@hadoop.apache.org; Jason Lowe <jl...@yahoo-inc.com> 
Cc: Hadoop Common <common-...@hadoop.apache.org>; "hdfs-...@hadoop.apache.org" 
<hdfs-...@hadoop.apache.org>; "yarn-...@hadoop.apache.org" 
<yarn-...@hadoop.apache.org>
 Sent: Tuesday, January 19, 2016 5:25 PM
 Subject: Re: [VOTE] Release Apache Hadoop 2.7.2 RC2
   
The JIRA YARN-4610 links YARN-3434 as the one causing the breakage, and 
YARN-3434 already exists in 2.7.1 itself. That categorizes the new issue as an 
existing bug.
If you agree with that sentiment, and given that there is a clear work-around, 
in the interest of progress of 2.7.2 (we have spent > 2 months on this now), 
I’d like to move forward.
Please LMK what you think.
Thanks
+Vinod


On Jan 19, 2016, at 3:13 PM, Jason Lowe <jl...@yahoo-inc.com.INVALID> wrote:
-1 (binding)
We have been running a release derived from 2.7 on some of our clusters, and we 
recently hit a bug where an application making large container requests can 
drastically slow down container allocations for other users in the same queue.  
See YARN-4610 for details.  Since 
yarn.scheduler.capacity.reservations-continue-look-all-nodes is on by default, 
I think we should fix this.  If we decide to ship 2.7.2 without that fix then 
the release notes should call out that JIRA and mention the workaround of 
setting yarn.scheduler.capacity.reservations-continue-look-all-nodes to false.
Jason


  From: Vinod Kumar Vavilapalli <vino...@apache.org>
 To: Hadoop Common <common-...@hadoop.apache.org>; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
 Sent: Thursday, January 14, 2016 10:57 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC2

Hi all,

I've created an updated release candidate RC2 for Apache Hadoop 2.7.2.

As discussed before, this is the next maintenance release to follow up 2.7.1.

The RC is available for validation at: 
http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/

The RC tag in git is: release-2.7.2-RC2

The maven artifacts are available via repository.apache.org 
<http://repository.apache.org/> at 
https://repository.apache.org/content/repositories/orgapachehadoop-1027 
<https://repository.apache.org/content/repositories/orgapachehadoop-1027>

The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/releasenotes.html 
<http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/releasenotes.html> for your 
quick perusal.

As you may have noted,
 - I terminated the RC1 related voting thread after finding out that we didn’t 
have a bunch of patches that are already in the released 2.6.3 version. After a 
brief discussion, we decided to keep the parallel 2.6.x and 2.7.x releases 
incremental, see [4] for this discussion.
 - The RC0 related voting thread got halted due to some critical issues. It 
took a while again for getting all those blockers out of the way. See the 
previous voting thread [3] for details.
 - Before RC0, an unusually long 2.6.3 release caused 2.7.2 to slip by quite a 
bit. This release's related discussion threads are linked below: [1] and [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes 
<http://markmail.org/message/oozq3gvd4nhzsaes>
[2]: Planning Apache Hadoop 2.7.2 http://markmail.org/message/iktqss2qdeykgpqk 
<http://markmail.org/message/iktqss2qdeykgpqk>
[3]: [VOTE] Release Apache Hadoop 2.7.2 RC0: 
http://markmail.org/message/5txhvr2qdiqglrwc 
<http://markmail.org/message/5txhvr2qdiqglrwc>
[4] Retracted [VOTE] Release Apache Hadoop 2.7.2 RC1: 
http://markmail.org/thread/n7ljbsnquihn3wlw

Re: [VOTE] Release Apache Hadoop 2.7.2 RC2

2016-01-19 Thread Jason Lowe

-1 (binding)
We have been running a release derived from 2.7 on some of our clusters, and we 
recently hit a bug where an application making large container requests can 
drastically slow down container allocations for other users in the same queue.  
See YARN-4610 for details.  Since 
yarn.scheduler.capacity.reservations-continue-look-all-nodes is on by default, 
I think we should fix this.  If we decide to ship 2.7.2 without that fix then 
the release notes should call out that JIRA and mention the workaround of 
setting yarn.scheduler.capacity.reservations-continue-look-all-nodes to false.
Jason


  From: Vinod Kumar Vavilapalli 
 To: Hadoop Common ; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
 Sent: Thursday, January 14, 2016 10:57 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC2
   
Hi all,

I've created an updated release candidate RC2 for Apache Hadoop 2.7.2.

As discussed before, this is the next maintenance release to follow up 2.7.1.

The RC is available for validation at: 
http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/

The RC tag in git is: release-2.7.2-RC2

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1027 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/releasenotes.html 
for your quick perusal.

As you may have noted,
 - I terminated the RC1 related voting thread after finding out that we didn’t 
have a bunch of patches that are already in the released 2.6.3 version. After a 
brief discussion, we decided to keep the parallel 2.6.x and 2.7.x releases 
incremental, see [4] for this discussion.
 - The RC0 related voting thread got halted due to some critical issues. It 
took a while again for getting all those blockers out of the way. See the 
previous voting thread [3] for details.
 - Before RC0, an unusually long 2.6.3 release caused 2.7.2 to slip by quite a 
bit. This release's related discussion threads are linked below: [1] and [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes 

[2]: Planning Apache Hadoop 2.7.2 http://markmail.org/message/iktqss2qdeykgpqk 

[3]: [VOTE] Release Apache Hadoop 2.7.2 RC0: 
http://markmail.org/message/5txhvr2qdiqglrwc 

[4] Retracted [VOTE] Release Apache Hadoop 2.7.2 RC1: 
http://markmail.org/thread/n7ljbsnquihn3wlw

[jira] [Created] (MAPREDUCE-6599) ResourceManager crash due to scheduling opportunity overflow

2016-01-05 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6599:
-

 Summary: ResourceManager crash due to scheduling opportunity 
overflow
 Key: MAPREDUCE-6599
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6599
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.1
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical


If a resource request lingers long enough unsatisfied then the scheduling 
opportunities count for the request can overflow and cause an RM crash.
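
As an illustration of the failure mode described above, a Java int counter that is incremented without bound wraps to a negative value once it passes Integer.MAX_VALUE. The sketch below is not the ResourceManager code: the class and method names are hypothetical, and how the wrapped value leads to the crash is not stated in this report. It only shows the wraparound itself and one possible guard, a saturating increment.

```java
// Hypothetical illustration; not the actual ResourceManager scheduler code.
final class SchedulingOpportunityCounter {
    private int opportunities;

    SchedulingOpportunityCounter(int initial) {
        this.opportunities = initial;
    }

    // Naive increment: wraps to Integer.MIN_VALUE after Integer.MAX_VALUE.
    int incrementUnchecked() {
        return ++opportunities;
    }

    // Saturating increment: stays at Integer.MAX_VALUE instead of wrapping.
    int incrementSaturating() {
        if (opportunities < Integer.MAX_VALUE) {
            opportunities++;
        }
        return opportunities;
    }
}
```

Starting both variants at Integer.MAX_VALUE shows the difference: the unchecked call returns a negative number, while the saturating call keeps returning Integer.MAX_VALUE.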




Re: [VOTE] Release Apache Hadoop 2.7.2 RC1

2015-12-18 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Spot checked CHANGES.txt files
- Successfully performed a native build from source
- Deployed to a single node cluster and ran sample jobs
We have been running with the fix for YARN-4354 on two of our clusters for some 
time with no issues, so I feel confident that prior blocker is now fixed.
Jason
 

  From: Vinod Kumar Vavilapalli 
 To: Hadoop Common ; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: Vinod Kumar Vavilapalli 
 Sent: Wednesday, December 16, 2015 8:49 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC1
   
Hi all,

I've created a release candidate RC1 for Apache Hadoop 2.7.2.

As discussed before, this is the next maintenance release to follow up 2.7.1.

The RC is available for validation at: 
http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/ 


The RC tag in git is: release-2.7.2-RC1

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1026/ 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/releasenotes.html 
for quick perusal.

As you may have noted,
 - The RC0 related voting thread got halted due to some critical issues. It 
took a while again for getting all those blockers out of the way. See the 
previous voting thread [3] for details.
 - Before RC0, an unusually long 2.6.3 release caused 2.7.2 to slip by quite a 
bit. This release's related discussion threads are linked below: [1] and [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes 

[2]: Planning Apache Hadoop 2.7.2 http://markmail.org/message/iktqss2qdeykgpqk 

[3]: [VOTE] Release Apache Hadoop 2.7.2 RC0: 
http://markmail.org/message/5txhvr2qdiqglrwc

Re: [VOTE] Release Apache Hadoop 2.6.3 RC0

2015-12-16 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Successfully built from source with native code support
- Deployed to a single-node cluster and ran some test jobs
Jason

  From: Junping Du 
 To: Hadoop Common ; "hdfs-...@hadoop.apache.org" 
; "mapreduce-dev@hadoop.apache.org" 
; "yarn-...@hadoop.apache.org" 

Cc: "junping...@apache.org" 
 Sent: Friday, December 11, 2015 6:16 PM
 Subject: [VOTE] Release Apache Hadoop 2.6.3 RC0

Hi all developers in hadoop community,
  I've created a release candidate RC0 for Apache Hadoop 2.6.3 (the next 
maintenance release to follow up 2.6.2), according to the email thread on the 
2.6.3 release plan [1]. Sorry for this RC coming a bit late, as several blocker 
issues were getting committed until yesterday. Below are the details:

The RC is available for validation at:
http://people.apache.org/~junping_du/hadoop-2.6.3-RC0/

The RC tag in git is: release-2.6.3-RC0

The maven artifacts are staged via repository.apache.org at:
https://repository.apache.org/content/repositories/orgapachehadoop-1025/?

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the usual 5 days.

Thanks and happy weekend!

Cheers,

Junping

[1]: 2.6.3 release plan: http://markmail.org/thread/nc2jogbgni37vu6y

Re: [VOTE] Release Apache Hadoop 2.7.2 RC0

2015-11-13 Thread Jason Lowe

-1 (binding)
Ran into public localization issues and filed YARN-4354. We need that resolved 
before the release is ready.  We will either need a timely fix or may have to 
revert YARN-2902 to unblock the release if my root-cause analysis is correct.  
I'll dig into this more today.

Jason

  From: Vinod Kumar Vavilapalli 
 To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: vino...@apache.org 
 Sent: Wednesday, November 11, 2015 10:31 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC0
   
Hi all,


I've created a release candidate RC0 for Apache Hadoop 2.7.2.


As discussed before, this is the next maintenance release to follow up
2.7.1.


The RC is available for validation at:

http://people.apache.org/~vinodkv/hadoop-2.7.2-RC0/


The RC tag in git is: release-2.7.2-RC0


The maven artifacts are available via repository.apache.org at

https://repository.apache.org/content/repositories/orgapachehadoop-1023/


As you may have noted, an unusually long 2.6.3 release caused 2.7.2 to slip
by quite a bit. This release's related discussion threads are linked below:
[1] and [2].


Please try the release and vote; the vote will run for the usual 5 days.


Thanks,

Vinod


[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes

[2]: Planning Apache Hadoop 2.7.2
http://markmail.org/message/iktqss2qdeykgpqk

Re: [VOTE] Release Apache Hadoop 2.6.2

2015-10-26 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Performed native build from source
- Deployed a single-node cluster and ran some test jobs

Jason
  From: Sangjin Lee 
 To: "common-...@hadoop.apache.org" ; 
"yarn-...@hadoop.apache.org" ; 
"hdfs-...@hadoop.apache.org" ; 
"mapreduce-dev@hadoop.apache.org"  
Cc: Vinod Kumar Vavilapalli  
 Sent: Thursday, October 22, 2015 4:14 PM
 Subject: [VOTE] Release Apache Hadoop 2.6.2
   
Hi all,

I have created a release candidate (RC0) for Hadoop 2.6.2.

The RC is available at: http://people.apache.org/~sjlee/hadoop-2.6.2-RC0/

The RC tag in git is: release-2.6.2-RC0

The list of JIRAs committed for 2.6.2:
https://issues.apache.org/jira/browse/YARN-4101?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20fixVersion%20%3D%202.6.2

The maven artifacts are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1022/

Please try out the release candidate and vote. The vote will run for 5 days.

Thanks,
Sangjin

[jira] [Resolved] (MAPREDUCE-4938) Job submission to unknown queue can leave staging directory behind

2015-10-15 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-4938.
---
Resolution: Duplicate

> Job submission to unknown queue can leave staging directory behind
> --
>
> Key: MAPREDUCE-4938
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4938
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.3-alpha, 0.23.5
>    Reporter: Jason Lowe
>
> There is a race where submitting a job to an unknown queue can appear to 
> succeed to the client and then subsequently fail later.  Since there was no 
> AM ever launched, there was nothing left to cleanup the staging directory.  
> At that point the client is the only thing that can cleanup the staging 
> directory.




[jira] [Created] (MAPREDUCE-6472) MapReduce AM should have java.io.tmpdir=./tmp to be consistent with tasks

2015-09-08 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6472:
-

 Summary: MapReduce AM should have java.io.tmpdir=./tmp to be 
consistent with tasks
 Key: MAPREDUCE-6472
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6472
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.6.0
Reporter: Jason Lowe


MapReduceChildJVM.getVMCommand ensures that all tasks have 
-Djava.io.tmpdir=./tmp set as part of the task command-line, but this is only 
used for tasks.  The AM itself does not have a corresponding java.io.tmpdir 
setting.  It should also use the same tmpdir setting to avoid cases where the 
AM JVM wants to place files in /tmp by default.
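
A minimal sketch of the idea, assuming the AM launch command is assembled as a list of JVM arguments. The helper below is hypothetical and is not MapReduceChildJVM.getVMCommand or any existing Hadoop method; it only shows appending the same -Djava.io.tmpdir=./tmp option that tasks receive, unless the caller has already set one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Hadoop.
final class TmpDirOption {
    static final String TMPDIR_FLAG = "-Djava.io.tmpdir=";

    // Returns a copy of jvmArgs that is guaranteed to carry a java.io.tmpdir setting.
    static List<String> withContainerTmpDir(List<String> jvmArgs) {
        List<String> result = new ArrayList<>(jvmArgs);
        boolean alreadySet = false;
        for (String arg : result) {
            if (arg.startsWith(TMPDIR_FLAG)) {
                alreadySet = true;
                break;
            }
        }
        if (!alreadySet) {
            // Relative path, resolved against the container working directory.
            result.add(TMPDIR_FLAG + "./tmp");
        }
        return result;
    }
}
```

Leaving an existing setting untouched is a design choice in this sketch, so that an explicit user override would still win.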




[jira] [Created] (MAPREDUCE-6413) TestLocalJobSubmission is failing

2015-06-23 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6413:
-

 Summary: TestLocalJobSubmission is failing
 Key: MAPREDUCE-6413
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6413
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.1
Reporter: Jason Lowe


TestLocalJobSubmission.testLocalJobLibjarsOption is failing with 
java.net.UnknownHostException: testcluster




[jira] [Created] (MAPREDUCE-6355) 2.5 client cannot communicate with 2.5 job on 2.6 cluster

2015-05-04 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6355:
-

 Summary: 2.5 client cannot communicate with 2.5 job on 2.6 cluster
 Key: MAPREDUCE-6355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6355
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe


Trying to run a job on a Hadoop 2.6 cluster from a Hadoop 2.5 client submitting 
a job that uses Hadoop 2.5 jars results in a job that succeeds but the client 
cannot communicate with the AM while the job is running.




[jira] [Created] (MAPREDUCE-6324) Uber jobs fail to update AMRM token when it rolls over

2015-04-21 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6324:
-

 Summary: Uber jobs fail to update AMRM token when it rolls over
 Key: MAPREDUCE-6324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker


When the RM rolls a new AMRM master key the AMs are supposed to receive a new 
AMRM token on subsequent heartbeats between the time when the new key is rolled 
and when it is activated.  This is not occurring for uber jobs.  If the 
connection to the RM needs to be re-established after the new key is activated 
(e.g.: RM restart or network hiccup) then the uber job AM will be unable to 
reconnect to the RM.




Re: [VOTE] Release Apache Hadoop 2.7.0 RC0

2015-04-14 Thread Jason Lowe

+1 (binding)
- Verified signatures and digests
- Built from source with native support
- Deployed to a single-node cluster and ran sample jobs
Jason

  From: Vinod Kumar Vavilapalli vino...@apache.org
 To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: vino...@apache.org 
 Sent: Friday, April 10, 2015 6:44 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.0 RC0
   
Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.0.

 The RC is available at: http://people.apache.org/~vinodkv/hadoop-2.7.0-RC0/

The RC tag in git is: release-2.7.0-RC0

 The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1017/

As discussed before
 - This release will only work with JDK 1.7 and above
 - I’d like to use this as a starting release for 2.7.x [1], depending on
how it goes, get it stabilized and potentially use a 2.7.1 in a few
weeks as the stable release.

 Please try the release and vote; the vote will run for the usual 5 days.

 Thanks,
 Vinod

 [1]: A 2.7.1 release to follow up 2.7.0
http://markmail.org/thread/zwzze6cqqgwq4rmw

[jira] [Created] (MAPREDUCE-6303) Read timeout when retrying a fetch error can be fatal to a reducer

2015-04-01 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6303:
-

 Summary: Read timeout when retrying a fetch error can be fatal to 
a reducer
 Key: MAPREDUCE-6303
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6303
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Priority: Blocker


If a reducer encounters an error trying to fetch from a node then encounters a 
read timeout when trying to re-establish the connection then the reducer can 
fail.  The read timeout exception can leak to the top of the Fetcher thread 
which will cause the reduce task to teardown.  This type of error can repeat 
across reducer attempts causing jobs to fail due to a single bad node.
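
The general shape of a fix can be sketched as a retry loop that treats a timeout during reconnection as one more failed attempt rather than letting it escape the thread. This is an assumption about the approach, not the actual Fetcher change; the Connector interface and method names are hypothetical.

```java
import java.io.IOException;

// Hypothetical sketch; not the real shuffle Fetcher code.
final class FetchRetry {
    // Stand-in for whatever opens the connection to the remote node.
    interface Connector {
        void connect() throws IOException;
    }

    // Returns true once a connection succeeds, false if every attempt fails.
    static boolean connectWithRetry(Connector connector, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                connector.connect();
                return true;
            } catch (IOException e) {
                // java.net.SocketTimeoutException is an IOException, so a read
                // timeout on reconnect is absorbed here as a failed attempt.
            }
        }
        return false;
    }
}
```

A caller that gets false back can mark the node as bad and move on to other map outputs, so a single unresponsive node no longer tears down the reduce task.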




[jira] [Created] (MAPREDUCE-6279) AM should explicity exit JVM after all services have stopped

2015-03-18 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6279:
-

 Summary: AM should explicity exit JVM after all services have 
stopped
 Key: MAPREDUCE-6279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.5.0
Reporter: Jason Lowe


Occasionally the MapReduce AM can get stuck trying to shut down.  
MAPREDUCE-6049 and MAPREDUCE-5888 were specific instances that have been fixed, 
but this can also occur with uber jobs if the task code inadvertently leaves 
non-daemon threads lingering.

We should explicitly shutdown the JVM after the MapReduce AM has unregistered 
and all services have been stopped.
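
For background, a JVM does not exit on its own while any non-daemon thread is still alive. The sketch below uses a hypothetical helper, not AM code, to list the non-daemon threads that would keep the process running. A process that wants to terminate regardless could call System.exit once its own shutdown work is done, which is presumably what an explicit exit in the AM would amount to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of the MapReduce AM.
final class LingeringThreads {
    // Names of live non-daemon threads other than the calling thread.
    static List<String> nonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
                names.add(t.getName());
            }
        }
        return names;
    }
}
```

Logging this list just before exiting could also help identify which task code left threads behind.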




Re: Looking to a Hadoop 3 release

2015-03-05 Thread Jason Lowe

I'm OK with a 3.0.0 release as long as we are minimizing the pain of 
maintaining yet another release line and conscious of the incompatibilities 
going into that release line.
For the former, I would really rather not see a branch-3 cut so soon.  It's yet 
another line onto which to cherry-pick, and I don't see why we need to add this 
overhead at such an early phase.  We should only create branch-3 when there's 
an incompatible change that the community wants and it should _not_ go into the 
next major release (i.e.: it's for Hadoop 4.0).  We can develop 3.0 alphas and 
betas on trunk and release from trunk in the interim.  IMHO we need to stop 
treating trunk as a place to exile patches.

For the latter, I think as a community we need to evaluate the benefits of 
breaking compatibility against the costs of migrating.  Each time we break 
compatibility we create a hurdle for people to jump when they move to the new 
release, and we should make those hurdles worth their time.  For example, 
wire-compatibility has been mentioned as part of this.  Any feature that breaks 
wire compatibility better be absolutely amazing, as it creates a huge hurdle 
for people to jump.
To summarize:
+1 for a community-discussed roadmap of what we're breaking in Hadoop 3 and why 
it's worth it for users
-1 for creating branch-3 now, we can release from trunk until the next 
incompatibility for Hadoop 4 arrives
+1 for baking classpath isolation as opt-in on 2.x and eventually default on in 
3.0
Jason
  From: Andrew Wang andrew.w...@cloudera.com
 To: hdfs-...@hadoop.apache.org hdfs-...@hadoop.apache.org 
Cc: common-...@hadoop.apache.org common-...@hadoop.apache.org; 
mapreduce-dev@hadoop.apache.org mapreduce-dev@hadoop.apache.org; 
yarn-...@hadoop.apache.org yarn-...@hadoop.apache.org 
 Sent: Wednesday, March 4, 2015 12:15 PM
 Subject: Re: Looking to a Hadoop 3 release
   
Let's not dismiss this quite so handily.

Sean, Jason, and Stack replied on HADOOP-11656 pointing out that while we
could make classpath isolation opt-in via configuration, what we really
want longer term is to have it on by default (or just always on). Stack in
particular points out the practical difficulties in using an opt-in method
in 2.x from a downstream project perspective. It's not pretty.

The plan that both Sean and Jason propose (which I support) is to have an
opt-in solution in 2.x, bake it there, then turn it on by default
(incompatible) in a new major release. I think this lines up well with my
proposal of some alphas and betas leading up to a GA 3.x. I'm also willing
to help with 2.x release management if that would help with testing this
feature.

Even setting aside classpath isolation, a new major release is still
justified by JDK8. Somehow this is being ignored in the discussion. Allen,
historically the voice of the user in our community, just highlighted it as
a major compatibility issue, and myself and Tucu have also expressed our
very strong concerns about bumping this in a minor release. 2.7's bump is a
unique exception, but this is not something to be cited as precedent or
policy.

Where does this resistance to a new major release stem from? As I've
described from the beginning, this will look basically like a 2.x release,
except for the inclusion of classpath isolation by default and target
version JDK8. I've expressed my desire to maintain API and wire
compatibility, and we can audit the set of incompatible changes in trunk to
ensure this. My proposal for doing alpha and beta releases leading up to GA
also gives downstreams a nice amount of time for testing and validation.

Regards,
Andrew



On Tue, Mar 3, 2015 at 2:32 PM, Arun Murthy a...@hortonworks.com wrote:

 Awesome, looks like we can just do this in a compatible manner - nothing
 else on the list seems like it warrants a (premature) major release.

 Thanks Vinod.

 Arun

 
 From: Vinod Kumar Vavilapalli vino...@hortonworks.com
 Sent: Tuesday, March 03, 2015 2:30 PM
 To: common-...@hadoop.apache.org
 Cc: hdfs-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org;
 yarn-...@hadoop.apache.org
 Subject: Re: Looking to a Hadoop 3 release

 I started pitching in more on that JIRA.

 To add, I think we can and should strive for doing this in a compatible
 manner, whatever the approach. Marking and calling it incompatible before
 we see proposal/patch seems premature to me. Commented the same on JIRA:
 https://issues.apache.org/jira/browse/HADOOP-11656?focusedCommentId=14345875&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14345875
 .

 Thanks
 +Vinod

 On Mar 2, 2015, at 8:08 PM, Andrew Wang andrew.w...@cloudera.com wrote:

 Regarding classpath isolation, based on what I hear from our customers,
 it's still a big problem (even after the MR classloader work). The latest
 Jackson version bump was quite painful for our downstream projects, and the
 HDFS client still leaks a lot

[jira] [Created] (MAPREDUCE-6263) Large jobs can lose history when killed due to brief client timeout

2015-02-18 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6263:
-

 Summary: Large jobs can lose history when killed due to brief 
client timeout
 Key: MAPREDUCE-6263
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6263
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.6.0
Reporter: Jason Lowe


YARNRunner connects to the AM to send the kill job command then waits a 
hardcoded 10 seconds for the job to enter a terminal state.  If the job fails 
to enter a terminal state in that time then YARNRunner will tell YARN to kill 
the application forcefully.  The latter type of kill usually results in no job 
history, since the AM process is killed forcefully.

Ten seconds can be too short for large jobs in a large cluster, as it takes 
time to connect to all the nodemanagers, process the state machine events, and 
copy a large jhist file.  The timeout should be more lenient or configurable.
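
A minimal sketch of what a configurable wait could look like: poll for a terminal state until a caller-supplied deadline, and only then fall back to the forceful kill. The class KillWait, the BooleanSupplier standing in for the job-state check, and the timeout parameters are illustrative assumptions, not YARNRunner's actual code or an existing Hadoop property.

```java
import java.util.function.BooleanSupplier;

final class KillWait {
    /**
     * Polls isTerminal until it returns true or timeoutMs elapses.
     * Returns true if a terminal state was seen in time; false means the
     * caller would fall back to the forceful application kill.
     */
    static boolean waitForTerminalState(BooleanSupplier isTerminal,
                                        long timeoutMs, long pollIntervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!isTerminal.getAsBoolean()) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false; // out of time
            }
            try {
                Thread.sleep(Math.min(pollIntervalMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical job that reports a terminal state on the third poll.
        final int[] polls = {0};
        boolean graceful = waitForTerminalState(() -> ++polls[0] >= 3, 1_000L, 10L);
        System.out.println(graceful ? "terminal state reached" : "timed out, force kill");
    }
}
```

With the timeout passed in rather than hardcoded, a large job could be given more than ten seconds without changing the polling logic.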



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (MAPREDUCE-6261) NullPointerException if MapOutputBuffer.flush invoked twice

2015-02-13 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6261:
-

 Summary: NullPointerException if MapOutputBuffer.flush invoked 
twice
 Key: MAPREDUCE-6261
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6261
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.5.0
Reporter: Jason Lowe


MapOutputBuffer.flush will throw an NPE if it is invoked twice, since it 
blindly assumes kvbuffer is not null yet sets kvbuffer to null towards the end 
of the method.
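
A minimal sketch of the kind of guard that would make a second call harmless. SketchBuffer is a hypothetical, heavily simplified stand-in and not the real MapOutputBuffer; only the kvbuffer field name is taken from the description above.

```java
final class SketchBuffer {
    private byte[] kvbuffer = new byte[16];
    private int flushCount = 0;

    /** Safe to call more than once: later calls see kvbuffer == null and return. */
    void flush() {
        if (kvbuffer == null) {
            return; // already flushed and released
        }
        flushCount++;    // stands in for spilling the buffered records
        kvbuffer = null; // release the buffer at the end, as described above
    }

    int flushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        SketchBuffer b = new SketchBuffer();
        b.flush();
        b.flush(); // no NullPointerException on the second call
        System.out.println("flushes performed: " + b.flushCount());
    }
}
```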




[jira] [Resolved] (MAPREDUCE-5727) History server web page can filter without showing filter keyword

2015-02-11 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5727.
---
Resolution: Duplicate

This is the same issue as described in YARN-2238, and there's more discussion 
there.

 History server web page can filter without showing filter keyword
 -

 Key: MAPREDUCE-5727
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5727
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, webapps
Affects Versions: 2.3.0
Reporter: Jason Lowe

 I loaded up a job conf page on the history server and used one of the search 
 boxes to narrow the results.  I then navigated to other pages (e.g.: map 
 tasks, logs, etc.) then navigated back to the job conf page using the job 
 configuration link on the left side of the page.  When I arrived it promptly 
 showed me just a few conf entries (the ones I had searched for earlier) but 
 my search term was missing.  At first glance it looked like those were the 
 only entries in the entire job conf, which can be very confusing.  Somehow 
 the search term is being remembered but not replotted when the configuration 
 page is revisited.




[jira] [Created] (MAPREDUCE-6230) MR AM does not survive RM restart if RM activated a new AMRM secret key

2015-01-27 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6230:
-

 Summary: MR AM does not survive RM restart if RM activated a new 
AMRM secret key
 Key: MAPREDUCE-6230
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6230
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker


A MapReduce AM will fail to reconnect to an RM that performed restart in the 
following scenario:

# MapReduce job launched with AMRM token generated from AMRM secret X
# RM rolls new AMRM secret Y and activates the new key
# RM performs a work-preserving restart
# MapReduce job AM now unable to connect to RM with Invalid AMRMToken 
exception




[jira] [Created] (MAPREDUCE-6225) Fix new findbug warnings in hadoop-mapreduce-client-core

2015-01-26 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6225:
-

 Summary: Fix new findbug warnings in hadoop-mapreduce-client-core
 Key: MAPREDUCE-6225
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6225
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Jason Lowe


Recent precommit builds in hadoop-mapreduce-client-core are flagging findbug 
warnings that appear to be new with the recent findbugs upgrade.  These need to 
be cleaned up.




[jira] [Created] (MAPREDUCE-6219) Reduce memory required for FileInputFormat located status optimization

2015-01-20 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6219:
-

 Summary: Reduce memory required for FileInputFormat located status 
optimization
 Key: MAPREDUCE-6219
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6219
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Priority: Minor


MAPREDUCE-1981 introduced an optimization to drastically reduce the number of 
namenode operations required to compute input splits when processing a 
directory.  However it requires more memory to perform this optimization as it 
retains the full LocatedFileStatus object for all input files while computing 
the splits.  This can lead to odd situations for users where using a directory 
as input can run the job client out of heap space but using directory/* as the 
input spec allows it to run within the original heap space.




[jira] [Created] (MAPREDUCE-6172) TestDbClasses timeouts are too aggressive

2014-11-24 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6172:
-

 Summary: TestDbClasses timeouts are too aggressive
 Key: MAPREDUCE-6172
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6172
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Jason Lowe
Priority: Minor


Some of the TestDbClasses test timeouts are only 1 second, and some of those 
tests perform disk I/O which could easily exceed the test timeout if the disk 
is busy or there's some other hiccup on the system at the time.  We should 
increase these timeouts to something more reasonable (i.e.: 10 or 20 seconds).




Re: [VOTE] Release Apache Hadoop 2.6.0

2014-11-17 Thread Jason Lowe

+1 (binding)
- verified signatures and digests
- verified late-arriving fixes for YARN-2846 and MAPREDUCE-6156 were present
- built from source
- deployed to a single-node cluster
- ran some sample MapReduce jobs

Jason
  From: Arun C Murthy a...@hortonworks.com
 To: common-...@hadoop.apache.org common-...@hadoop.apache.org; 
hdfs-...@hadoop.apache.org hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org yarn-...@hadoop.apache.org; 
mapreduce-dev@hadoop.apache.org mapreduce-dev@hadoop.apache.org 
 Sent: Thursday, November 13, 2014 5:08 PM
 Subject: [VOTE] Release Apache Hadoop 2.6.0 
   
Folks,

I've created another release candidate (rc1) for hadoop-2.6.0 based on the 
feedback.

The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc1
The RC tag in git is: release-2.6.0-rc1

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1013.

Please try the release and vote; the vote will run for the usual 5 days.

thanks,
Arun


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: [VOTE] Release Apache Hadoop 2.6.0

2014-11-13 Thread Jason Lowe

I just committed 2.6 blockers YARN-2846 and MAPREDUCE-6156, which should also be 
in the 2.6.0 rc1 build.

Jason
  From: Arun C Murthy a...@hortonworks.com
 To: yarn-...@hadoop.apache.org 
Cc: mapreduce-dev@hadoop.apache.org; Ravi Prakash ravi...@ymail.com; 
hdfs-...@hadoop.apache.org hdfs-...@hadoop.apache.org; 
common-...@hadoop.apache.org common-...@hadoop.apache.org 
 Sent: Wednesday, November 12, 2014 10:58 AM
 Subject: Re: [VOTE] Release Apache Hadoop 2.6.0

Sounds good. I'll create an rc1. Thanks.

Arun

On Nov 11, 2014, at 2:06 PM, Robert Kanter rkan...@cloudera.com wrote:

 Hi Arun,

 We were testing the RC and ran into a problem with the recent fixes that
 were done for POODLE for Tomcat (HADOOP-11217 for KMS and HDFS-7274 for
 HttpFS).  Basically, in disabling SSLv3, we also disabled SSLv2Hello, which
 is required for older clients (e.g. Java 6 with openssl 0.9.8x) so they
 can't connect without it.  Just to be clear, it does not mean SSLv2, which
 is insecure.  This also affects the MR shuffle in HADOOP-11243.

 The fix is super simple, so I think we should reopen these 3 JIRAs and put
 in addendum patches and get them into 2.6.0.
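
A minimal sketch of the distinction described above, assuming the protocol list is filtered by name: SSLv3 is dropped while SSLv2Hello and the TLS versions are kept. ProtocolFilter is a hypothetical helper and is not the code from the addendum patches; the protocol strings are the standard JSSE names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ProtocolFilter {
    /** Removes only SSLv3; SSLv2Hello (a hello format, not SSLv2 itself) stays enabled. */
    static String[] withoutSslv3(String[] protocols) {
        List<String> kept = new ArrayList<>();
        for (String p : protocols) {
            if (!"SSLv3".equals(p)) {
                kept.add(p);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] enabled = {"SSLv2Hello", "SSLv3", "TLSv1"};
        System.out.println(Arrays.toString(withoutSslv3(enabled)));
    }
}
```

In a real server the filtered array would typically be handed to something like SSLEngine.setEnabledProtocols; how Tomcat or the shuffle handler is actually configured is outside this sketch.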

 thanks
 - Robert

 On Tue, Nov 11, 2014 at 1:04 PM, Ravi Prakash ravi...@ymail.com wrote:

 Hi Arun!
 We are very close to completion on YARN-1964 (DockerContainerExecutor).
 I'd also like HDFS-4882 to be checked in. Do you think these issues merit
 another RC?
 Thanks
 Ravi

    On Tuesday, November 11, 2014 11:57 AM, Steve Loughran 
 ste...@hortonworks.com wrote:

 +1 binding

 -patched slider pom to build against 2.6.0

 -verified build did download, which it did at up to ~8Mbps. Faster than a
 local build.

 -full clean test runs on OS/X and Linux

 Windows 2012:

 Same thing. I did have to first build my own set of the windows native
 binaries, by checking out branch-2.6.0; doing a native build, copying the
 binaries and then purging the local m2 repository of hadoop artifacts to be
 confident I was building against. For anyone who wants those native libs
 they will be up on
 https://github.com/apache/incubator-slider/tree/develop/bin/windows/ once
 it syncs with the ASF repos.

 afterwards: the tests worked!

 On 11 November 2014 02:52, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 I've created a release candidate (rc0) for hadoop-2.6.0 that I would like
 to see released.

 The RC is available at:
 http://people.apache.org/~acmurthy/hadoop-2.6.0-rc0
 The RC tag in git is: release-2.6.0-rc0

 The maven artifacts are available via repository.apache.org at
 https://repository.apache.org/content/repositories/orgapachehadoop-1012.

 Please try the release and vote; the vote will run for the usual 5 days.

 thanks,
 Arun


--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/hdp/


[jira] [Created] (MAPREDUCE-6161) mapred hsadmin command missing from trunk

2014-11-13 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6161:
-

 Summary: mapred hsadmin command missing from trunk
 Key: MAPREDUCE-6161
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6161
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scripts
Affects Versions: trunk
Reporter: Jason Lowe


The hsadmin subcommand of the mapred script is no longer present in trunk. It 
is present in branch-2.




[jira] [Resolved] (MAPREDUCE-6159) No log of JobHistory found in all logs files

2014-11-12 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6159.
---
Resolution: Invalid

The JobHistoryEventHandler is code that runs in the ApplicationMaster rather 
than the job history server.  You'll find those log messages in the AM logs of 
individual jobs which are either aggregated to HDFS (by default) or left on the 
nodes the AMs ran on if log aggregation is disabled.

 No log of JobHistory found in all logs files
 

 Key: MAPREDUCE-6159
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6159
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver, mrv2
Affects Versions: 2.2.0
 Environment: Hadoop-2.2.0
Reporter: JasonZhu

 I intend to dig into 'mapreduce.jobhistory.intermediate-done-dir' argument, 
 the position of which is at `JHAdminConfig:73`, to get some comprehension on 
 history server. This argument is referenced at 
 `JobHistoryEventHandler.moveToDoneNow()`, where history server moves job 
 summary file 
 from $[yarn.app.mapreduce.am.staging-dir]/$[user]/.staging to 
 $[mapreduce.jobhistory.intermediate-done-dir]/$[user]. 
 The following code snippet in `moveToDoneNow()` will definitely write some 
 logs out to the log file, but I can find no sign of it in any of the logs in 
 $HADOOP_LOG_DIR via the command `grep "Copied to done location" *`.
 if (copied)
   LOG.info("Copied to done location: " + toPath);
 else 
   LOG.info("copy failed");
 Is there anything that I missed?




[jira] [Created] (MAPREDUCE-6141) History server leveldb recovery store

2014-10-28 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6141:
-

 Summary: History server leveldb recovery store
 Key: MAPREDUCE-6141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Reporter: Jason Lowe
Assignee: Jason Lowe


It would be nice to have a leveldb option to the job history server recovery 
store.  Leveldb would provide some benefits over the existing filesystem store 
such as better support for atomic operations, fewer I/O ops per state update, 
and far fewer total files on the filesystem.




[jira] [Created] (MAPREDUCE-6119) Ability to disable node update processing in MR AM

2014-10-03 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6119:
-

 Summary: Ability to disable node update processing in MR AM
 Key: MAPREDUCE-6119
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6119
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Jason Lowe







[jira] [Resolved] (MAPREDUCE-6114) TestMRCJCFileInputFormat#testAddInputPath fails in trunk

2014-09-29 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6114.
---
Resolution: Duplicate

Dup of MAPREDUCE-6094.

 TestMRCJCFileInputFormat#testAddInputPath fails in trunk
 

 Key: MAPREDUCE-6114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6114
 Project: Hadoop Map/Reduce
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor

 This can be reproduced locally:
 {code}
 Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.474 sec  
 FAILURE! - in org.apache.hadoop.mapreduce.lib.input.TestMRCJCFileInputFormat
 testAddInputPath(org.apache.hadoop.mapreduce.lib.input.TestMRCJCFileInputFormat)
   Time elapsed: 0.86 sec   ERROR!
 java.io.IOException: No FileSystem for scheme: s3
   at 
 org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2583)
   at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2590)
   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
   at 
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2629)
   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2611)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
   at 
 org.apache.hadoop.mapreduce.lib.input.TestMRCJCFileInputFormat.testAddInputPath(TestMRCJCFileInputFormat.java:55)
 {code}




Re: [VOTE] Release Apache Hadoop 2.5.1 RC0

2014-09-10 Thread Jason Lowe


+1 (binding)

- verified signatures and digests
- built from source
- examined CHANGES.txt for items fixed in 2.5.1
- deployed to a single-node cluster and ran some sample MR jobs

Jason

On 09/05/2014 07:18 PM, Karthik Kambatla wrote:

Hi folks,

I have put together a release candidate (RC0) for Hadoop 2.5.1.

The RC is available at: http://people.apache.org/~kasha/hadoop-2.5.1-RC0/
The RC git tag is release-2.5.1-RC0
The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1010/

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the now usual 5
days.

Thanks
Karthik

[jira] [Created] (MAPREDUCE-6075) HistoryServerFileSystemStateStore can create zero-length files

2014-09-05 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6075:
-

 Summary: HistoryServerFileSystemStateStore can create zero-length 
files
 Key: MAPREDUCE-6075
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6075
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe


When the history server state store writes a token file it uses 
IOUtils.cleanup() to close the file which will silently ignore errors.  This 
can lead to empty token files in the state store.
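
A minimal sketch of why swallowing close errors is dangerous, using plain java.io rather than the Hadoop classes. FailingStream is a hypothetical stream whose close() fails, and the swallowing helper only mimics the behavior described above; it is not the real IOUtils.cleanup().

```java
import java.io.IOException;
import java.io.OutputStream;

final class CloseErrors {
    /** Hypothetical stream whose close() fails, e.g. because the final flush cannot complete. */
    static final class FailingStream extends OutputStream {
        @Override public void write(int b) { /* pretend the byte is buffered */ }
        @Override public void close() throws IOException {
            throw new IOException("close failed; data never reached storage");
        }
    }

    /** Swallows close errors: the caller is told the write succeeded. */
    static boolean writeSwallowingCloseErrors(OutputStream out, byte[] data) {
        try {
            out.write(data);
        } catch (IOException e) {
            return false;
        } finally {
            try { out.close(); } catch (IOException ignored) { /* error is lost */ }
        }
        return true;
    }

    /** try-with-resources lets the close failure reach the catch block. */
    static boolean writePropagatingCloseErrors(OutputStream out, byte[] data) {
        try (OutputStream o = out) {
            o.write(data);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] token = {1, 2, 3};
        System.out.println("swallowing reports success: "
            + writeSwallowingCloseErrors(new FailingStream(), token));
        System.out.println("propagating reports success: "
            + writePropagatingCloseErrors(new FailingStream(), token));
    }
}
```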




Re: [VOTE] Migration from subversion to git for version control

2014-08-10 Thread Jason Lowe


+1

Jason

On 08/08/2014 09:57 PM, Karthik Kambatla wrote:

I have put together this proposal based on recent discussion on this topic.

Please vote on the proposal. The vote runs for 7 days.

1. Migrate from subversion to git for version control.
2. Force-push to be disabled on trunk and branch-* branches. Applying
changes from any of trunk/branch-* to any of branch-* should be through
git cherry-pick -x.
3. Force-push on feature-branches is allowed. Before pulling in a
feature, the feature-branch should be rebased on latest trunk and the
changes applied to trunk through git rebase --onto or git cherry-pick
commit-range.
4. Every time a feature branch is rebased on trunk, a tag that
identifies the state before the rebase needs to be created (e.g.
tag_feature_JIRA-2454_2014-08-07_rebase). These tags can be deleted once
the feature is pulled into trunk and the tags are no longer useful.
5. The relevance/use of tags stay the same after the migration.
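
The tag naming in item 4 can be sketched as a small helper, assuming the pattern is tag_feature_<branch>_<date>_rebase as in the example tag. RebaseTag and its parameters are hypothetical names used only for illustration.

```java
import java.time.LocalDate;

final class RebaseTag {
    /** Builds a pre-rebase tag name; LocalDate prints as yyyy-MM-dd. */
    static String name(String featureBranch, LocalDate rebaseDate) {
        return "tag_feature_" + featureBranch + "_" + rebaseDate + "_rebase";
    }

    public static void main(String[] args) {
        System.out.println(name("JIRA-2454", LocalDate.of(2014, 8, 7)));
    }
}
```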

Thanks
Karthik

PS: Per Andrew Wang, this should be an "Adoption of New Codebase" kind of
vote and will be Lazy 2/3 majority of PMC members.

Re: [VOTE] Release Apache Hadoop 2.5.0 RC2

2014-08-10 Thread Jason Lowe


+1 (binding)

- verified signatures and digests
- built from source
- deployed a single-node cluster
- ran some sample jobs

Jason

On 08/06/2014 03:59 PM, Karthik Kambatla wrote:

Hi folks,

I have put together a release candidate (rc2) for Hadoop 2.5.0.

The RC is available at: http://people.apache.org/~kasha/hadoop-2.5.0-RC2/
The RC tag in svn is here:
https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.5.0-rc2/
The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1009/

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the now usual 5
days.

Thanks

[jira] [Created] (MAPREDUCE-6021) MR AM should add working directory to LD_LIBRARY_PATH

2014-08-01 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6021:
-

 Summary: MR AM should add working directory to LD_LIBRARY_PATH
 Key: MAPREDUCE-6021
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6021
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.4.1
Reporter: Jason Lowe


Tasks implicitly pick up shared libraries added to the job because the task 
launch context explicitly adds the container working directory to 
LD_LIBRARY_PATH.  However the same is not done for the AM container which is 
inconsistent.  User code can run in the AM via output committer, speculator, 
uber job, etc., so the AM's LD_LIBRARY_PATH should have the container work 
directory for consistency with tasks.
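
A minimal sketch of the change being asked for, assuming the launch environment is available as a plain map. LibraryPathEnv is hypothetical, "$PWD" is used as a placeholder for the container working directory, and whether the directory should be prepended or appended is a design choice this sketch does not settle.

```java
import java.util.HashMap;
import java.util.Map;

final class LibraryPathEnv {
    /** Puts dir at the front of LD_LIBRARY_PATH, keeping any existing entries. */
    static void prependLibraryPath(Map<String, String> env, String dir) {
        String existing = env.get("LD_LIBRARY_PATH");
        boolean empty = (existing == null || existing.isEmpty());
        env.put("LD_LIBRARY_PATH", empty ? dir : dir + ":" + existing);
    }

    public static void main(String[] args) {
        Map<String, String> amEnv = new HashMap<>();
        amEnv.put("LD_LIBRARY_PATH", "/usr/lib/hadoop/native"); // hypothetical existing value
        prependLibraryPath(amEnv, "$PWD");
        System.out.println(amEnv.get("LD_LIBRARY_PATH"));
    }
}
```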



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Created] (MAPREDUCE-6022) map_input_file is missing from streaming job environment

2014-08-01 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6022:
-

 Summary: map_input_file is missing from streaming job environment
 Key: MAPREDUCE-6022
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6022
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Jason Lowe


When running a streaming job the 'map_input_file' environment variable is not 
being set.  This property is deprecated, but in the past deprecated properties 
still appeared in a stream job's environment.




[jira] [Created] (MAPREDUCE-6010) HistoryServerFileSystemStateStore fails to update tokens

2014-07-28 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6010:
-

 Summary: HistoryServerFileSystemStateStore fails to update tokens
 Key: MAPREDUCE-6010
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6010
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe


When token recovery is enabled and the file system state store is being used 
then tokens fail to be updated due to a rename destination conflict.
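
A minimal sketch of the write-then-rename update pattern and where a destination conflict arises, using java.nio on the local filesystem as an analogy. TokenFileUpdate is hypothetical, and Hadoop's FileSystem.rename has its own semantics that this sketch does not reproduce.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TokenFileUpdate {
    /** Writes the new content to a temporary sibling, then moves it over the target. */
    static void update(Path target, String newContent) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, newContent.getBytes(StandardCharsets.UTF_8));
        // Without REPLACE_EXISTING, this move fails when the target already exists,
        // which is the local-filesystem analogue of a rename destination conflict.
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Stores a token, updates it, and returns what is on disk afterwards. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("token-store");
            Path target = dir.resolve("token_1");
            update(target, "v1"); // first write: no conflict
            update(target, "v2"); // update: destination already exists
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("content after update: " + demo());
    }
}
```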




[jira] [Created] (MAPREDUCE-6011) Improve history server behavior during a recovery error

2014-07-28 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6011:
-

 Summary: Improve history server behavior during a recovery error
 Key: MAPREDUCE-6011
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6011
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe


Currently when the history server encounters an error during recovery it is 
fatal without specific details on the error (e.g. which token was involved 
during the recovery error).  We should either allow the history server to 
proceed past recovery errors or provide more specifics on the offending token 
involved in the fatal error to aid in manual recovery.
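
A minimal sketch that combines both suggestions: continue past a bad entry and record which token failed and why. TolerantRecovery, its Result type, and the use of an empty string to represent a corrupt entry are all illustrative assumptions, not history server code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TolerantRecovery {
    /** Outcome of one recovery pass. */
    static final class Result {
        final List<String> recovered = new ArrayList<>();
        final Map<String, String> failed = new LinkedHashMap<>(); // token id -> reason
    }

    /** stored maps a token id to its serialized form. */
    static Result recover(Map<String, String> stored) {
        Result r = new Result();
        for (Map.Entry<String, String> e : stored.entrySet()) {
            try {
                parse(e.getValue());
                r.recovered.add(e.getKey());
            } catch (IllegalArgumentException ex) {
                // Keep going, but remember exactly which token was the problem.
                r.failed.put(e.getKey(), ex.getMessage());
            }
        }
        return r;
    }

    private static void parse(String data) {
        if (data == null || data.isEmpty()) {
            throw new IllegalArgumentException("empty token data");
        }
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("token_1", "serialized-bytes");
        stored.put("token_2", ""); // stands in for a zero-length token file
        Result r = recover(stored);
        System.out.println("recovered=" + r.recovered + " failed=" + r.failed);
    }
}
```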




Re: [DISCUSS] Assume Private-Unstable for classes that are not annotated

2014-07-23 Thread Jason Lowe

I think that's a reasonable proposal as long as we understand it changes 
the burden from finding all the things that should be marked @Private to 
finding all the things that should be marked @Public. As Tom Graves 
pointed out in an earlier discussion about @LimitedPrivate, it may be 
impossible to do a straightforward task and use only interfaces marked 
@Public.  If users can't do basic things without straying from @Public 
interfaces then tons of code can break if we assume it's always fair 
game to change anything not marked @Public.  The "well, you shouldn't 
have used a non-@Public interface" argument is not very useful in that 
context.


So as long as we're good about making sure officially supported features 
have corresponding @Public interfaces to wield them then I agree it will 
be easier to track those rather than track all the classes that should 
be @Private.  Hopefully if users understand that's how things work 
they'll help file JIRAs for interfaces that need to be @Public to get 
their work done.


Jason

On 07/22/2014 04:54 PM, Karthik Kambatla wrote:

Hi devs

As you might have noticed, we have several classes and methods in them that
are not annotated at all. This is seldom intentional. Avoiding incompatible
changes to all these classes can be considerable baggage.

I was wondering if we should add an explicit disclaimer in our
compatibility guide that says, "Classes without annotations are to be
considered @Private".

For methods, is it reasonable to say - "Class members without specific
annotations inherit the annotations of the class"?

Thanks
Karthik
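
The two proposed rules can be sketched with reflection, assuming unannotated classes default to private and unannotated members take the class's audience. The Public and Private annotations below are local stand-ins defined only for this sketch, not Hadoop's real audience annotations, and AudienceRules is a hypothetical helper.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

final class AudienceRules {
    @Retention(RetentionPolicy.RUNTIME) @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) @interface Private {}

    @Public static class Annotated {
        public void unmarked() {}
        @Private public void internal() {}
    }
    static class Unannotated {
        public void unmarked() {}
    }

    /** Rule 1: a class with no annotation is treated as private. */
    static String audienceOf(Class<?> c) {
        return c.isAnnotationPresent(Public.class) ? "Public" : "Private";
    }

    /** Rule 2: a member with no annotation inherits the audience of its class. */
    static String audienceOf(Method m) {
        if (m.isAnnotationPresent(Public.class)) return "Public";
        if (m.isAnnotationPresent(Private.class)) return "Private";
        return audienceOf(m.getDeclaringClass());
    }

    /** Convenience lookup by method name, to avoid checked exceptions in callers. */
    static String audienceOfMethod(Class<?> c, String name) {
        for (Method m : c.getDeclaredMethods()) {
            if (m.getName().equals(name)) return audienceOf(m);
        }
        throw new IllegalArgumentException("no such method: " + name);
    }

    public static void main(String[] args) {
        System.out.println(audienceOf(Unannotated.class));
        System.out.println(audienceOfMethod(Annotated.class, "unmarked"));
        System.out.println(audienceOfMethod(Annotated.class, "internal"));
    }
}
```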

Re: [VOTE] Release Apache Hadoop 2.4.1

2014-06-27 Thread Jason Lowe


+1

- Verified signatures and digests
- Built from source, installed on single-node cluster and ran some 
sample jobs


Jason

On 06/21/2014 01:51 AM, Arun C Murthy wrote:

Folks,

I've created another release candidate (rc1) for hadoop-2.4.1 based on the 
feedback that I would like to push out.

The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1
The RC tag in svn is here: 
https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.1-rc1

The maven artifacts are available via repository.apache.org.

Please try the release and vote; the vote will run for the usual 7 days.

thanks,
Arun



--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/hdp/

Re: [VOTE] Release Apache Hadoop 0.23.11

2014-06-25 Thread Jason Lowe


+1 (binding)

- Verified signatures and digests
- Deployed binary tarball to a single-node cluster and ran some MR 
example jobs
- Built from source, deployed to a single-node cluster and ran some MR 
example jobs


Jason

On 06/19/2014 10:14 AM, Thomas Graves wrote:

Hey Everyone,

There have been various bug fixes that have gone into
branch-0.23 since the 0.23.10 release.  We think it's time to do a 0.23.11.

This is also the last release we plan on doing off of branch-0.23.

The RC is available at:
http://people.apache.org/~tgraves/hadoop-0.23.11-candidate-0/


The RC Tag in svn is here:
http://svn.apache.org/viewvc/hadoop/common/tags/release-0.23.11-rc0/

The maven artifacts are available via repository.apache.org.

Please try the release and vote; the vote will run for the usual 7 days
til June 26th.

I am +1 (binding).

thanks,
Tom Graves

[jira] [Resolved] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-18 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5928.
---

Resolution: Duplicate

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully 
 loaded.png.jpg, MR job stuck in deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}
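
A rough sketch of why the workaround helps, assuming reducers become eligible once ceil(fraction x total maps) maps have completed. That formula is an assumption about how the slowstart threshold is applied, and SlowStart is a hypothetical helper, not AM code.

```java
final class SlowStart {
    /** Number of completed maps required before reducers may be scheduled (assumed formula). */
    static int mapsBeforeReducers(int totalMaps, double slowstartFraction) {
        return (int) Math.ceil(slowstartFraction * totalMaps);
    }

    public static void main(String[] args) {
        int totalMaps = 27; // from the job described above
        System.out.println("default 0.05: " + mapsBeforeReducers(totalMaps, 0.05));
        System.out.println("workaround 0.99: " + mapsBeforeReducers(totalMaps, 0.99));
    }
}
```

Under that assumption the default lets reducers start after only a couple of maps finish, so they can fill the cluster while a mapper still waits; with 0.99 all 27 maps must finish first.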



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Reopened] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened MAPREDUCE-5927:
---


 Getting following error
 ---

 Key: MAPREDUCE-5927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Kedar Dixit
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker

 Hi,
 I am getting the following error while running an application on the cluster:
 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option 
 parsing not performed. Implement the Tool interface and execute your 
 application with ToolRunner to remedy this.
 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1
 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1
 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. 
 Instead, use mapreduce.job.user.name
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. 
 Instead, use mapreduce.job.jar
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is 
 deprecated. Instead, use mapreduce.job.reduces
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class 
 is deprecated. Instead, use mapreduce.job.output.value.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is 
 deprecated. Instead, use mapreduce.job.map.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is 
 deprecated. Instead, use mapreduce.job.name
 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class 
 is deprecated. Instead, use mapreduce.job.inputformat.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is 
 deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is 
 deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
 14/06/16 16:21:49 INFO Configuration.deprecation: 
 mapreduce.outputformat.class is deprecated. Instead, use 
 mapreduce.job.outputformat.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is 
 deprecated. Instead, use mapreduce.job.maps
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is 
 deprecated. Instead, use mapreduce.job.output.key.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is 
 deprecated. Instead, use mapreduce.job.working.dir
 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
 job_1402913701967_0006
 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application 
 application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032
 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: 
 http://gs-1695:8088/proxy/application_1402913701967_0006/
 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006
 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in 
 uber mode : false
 14/06/16 16:21:54 INFO mapreduce.Job:  map 0% reduce 0%
 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with 
 state FAILED due to: Application application_1402913701967_0006 failed 2 
 times due to AM Container for appattempt_1402913701967_0006_02 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException:
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 .Failing this attempt.. Failing the application.
 14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0
 Can you please help me fix this?
 Thanks,
 ~Kedar



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Resolved] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5927.
---

Resolution: Fixed

 Getting following error
 ---

 Key: MAPREDUCE-5927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Kedar Dixit
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker


[jira] [Resolved] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5927.
---

Resolution: Invalid

This is a general support question better asked on the u...@hadoop.apache.org 
list.  JIRA is for tracking bugs and features in Hadoop and not a general user 
support channel.

In this case the ApplicationMaster is crashing shortly after startup.  You'll 
need to examine the ApplicationMaster log to determine what happened: follow 
the tracking URL and go to the AM logs link from there, or use the yarn logs 
command if log aggregation is enabled on your cluster.

 Getting following error
 ---

 Key: MAPREDUCE-5927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Kedar Dixit
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker


[jira] [Resolved] (MAPREDUCE-5923) org.apache.hadoop.mapred.pipes.TestPipeApplication timeouts intermittently

2014-06-12 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5923.
---

Resolution: Duplicate

This is a duplicate of MAPREDUCE-5868.

 org.apache.hadoop.mapred.pipes.TestPipeApplication timeouts intermittently
 --

 Key: MAPREDUCE-5923
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5923
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk
Reporter: Chen He
Priority: Minor






[jira] [Reopened] (MAPREDUCE-5830) HostUtil.getTaskLogUrl is not backwards binary compatible with 2.3

2014-05-28 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened MAPREDUCE-5830:
---


Reopening this, as we should address older Hive versions.

 HostUtil.getTaskLogUrl is not backwards binary compatible with 2.3
 --

 Key: MAPREDUCE-5830
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5830
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Jason Lowe
Priority: Blocker

 HostUtil.getTaskLogUrl used to have a signature like this in Hadoop 2.3.0 and 
 earlier:
   public static String getTaskLogUrl(String taskTrackerHostName,
       String httpPort, String taskAttemptID)
 but now has a signature like this:
   public static String getTaskLogUrl(String scheme, String taskTrackerHostName,
       String httpPort, String taskAttemptID)
 This breaks source and binary backwards-compatibility.  MapReduce and Hive 
 both have references to this, so their jars compiled against 2.3 or earlier 
 do not work on 2.4.
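
 The JVM links a call by the method's exact descriptor, so a jar compiled 
 against the three-argument form fails with NoSuchMethodError when only the 
 four-argument form exists. One common way to restore compatibility is to keep 
 the old overload and have it delegate to the new one. The following is only a 
 sketch under stated assumptions, not the actual Hadoop change: the class name 
 HostUtilSketch, the default "http://" scheme, and the URL layout are all 
 hypothetical.

```java
// Hypothetical stand-in for HostUtil; the URL layout below is an assumption.
final class HostUtilSketch {
    private HostUtilSketch() {}

    /** Newer four-argument form: the caller supplies the scheme, e.g. "https://". */
    static String getTaskLogUrl(String scheme, String taskTrackerHostName,
                                String httpPort, String taskAttemptID) {
        return scheme + taskTrackerHostName + ":" + httpPort
                + "/tasklog?attemptid=" + taskAttemptID;
    }

    /**
     * Older three-argument form, kept so that callers compiled against the
     * old signature still link. It delegates with an assumed default scheme.
     */
    @Deprecated
    static String getTaskLogUrl(String taskTrackerHostName, String httpPort,
                                String taskAttemptID) {
        return getTaskLogUrl("http://", taskTrackerHostName, httpPort, taskAttemptID);
    }

    public static void main(String[] args) {
        // Old-style call site: compiles and links unchanged.
        System.out.println(getTaskLogUrl("nm-host", "8042", "attempt_x"));
        // New-style call site: passes the scheme explicitly.
        System.out.println(getTaskLogUrl("https://", "nm-host", "8044", "attempt_x"));
    }
}
```

 With both overloads present, old and new call sites resolve to different 
 descriptors and neither needs to be recompiled.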




[jira] [Created] (MAPREDUCE-5891) Improved shuffle error handling across NM restarts

2014-05-16 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-5891:
-

 Summary: Improved shuffle error handling across NM restarts
 Key: MAPREDUCE-5891
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5891
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.5.0
Reporter: Jason Lowe


To minimize the number of map fetch failures reported by reducers across an NM 
restart, it would be nice if reducers only reported a fetch failure after trying 
for a specified period of time to retrieve the data.
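
The idea can be sketched as a retry loop with a deadline: keep retrying the 
fetch and report a failure only once the retry window has elapsed. This is a 
minimal illustration, not the MapReduce shuffle code; FetchRetrySketch, 
fetchWithinWindow, and all parameter names are hypothetical, and the clock is 
injected only so the behavior can be checked without real waiting.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Hypothetical sketch: retry a fetch until a time window expires.
final class FetchRetrySketch {
    private FetchRetrySketch() {}

    /**
     * Returns true as soon as one fetch attempt succeeds. Returns false only
     * after the retry window has elapsed (or the thread is interrupted), which
     * is the point at which a fetch failure would be reported.
     */
    static boolean fetchWithinWindow(BooleanSupplier fetchAttempt,
                                     LongSupplier clockMillis,
                                     long retryWindowMillis,
                                     long pauseMillis) {
        final long deadline = clockMillis.getAsLong() + retryWindowMillis;
        while (true) {
            if (fetchAttempt.getAsBoolean()) {
                return true;                      // data retrieved, nothing to report
            }
            if (clockMillis.getAsLong() >= deadline) {
                return false;                     // window exhausted, report the failure
            }
            try {
                Thread.sleep(pauseMillis);        // back off before the next attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Toy fetch that succeeds on the third attempt.
        int[] attempts = {0};
        boolean ok = fetchWithinWindow(() -> ++attempts[0] >= 3,
                System::currentTimeMillis, 1_000L, 10L);
        System.out.println("fetched=" + ok + " attempts=" + attempts[0]);
    }
}
```

A window long enough to cover a typical NM restart would let the reducer ride 
out the restart instead of reporting the map output as lost.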



