[jira] [Reopened] (MAPREDUCE-6987) JHS Log Scanner and Cleaner blocked

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened MAPREDUCE-6987:


> JHS Log Scanner and Cleaner blocked
> ---
>
> Key: MAPREDUCE-6987
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6987
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 2.9.0, 3.0.0-alpha1
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
>
> {code}
> "Log Scanner/Cleaner #1" #81 prio=5 os_prio=0 tid=0x7fd6c010f000 
> nid=0x11db waiting on condition [0x7fd6aa859000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xd6c88a80> (a 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>   at 
> org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47)
>   at 
> org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> "Log Scanner/Cleaner #0" #80 prio=5 os_prio=0 tid=0x7fd6c010c800 
> nid=0x11da waiting on condition [0x7fd6aa95a000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xd6c8> (a 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>   at 
> org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47)
>   at 
> org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> Both threads wait on {{FutureTask.get()}} indefinitely after the first 
> execution.
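For context on the dump above: the hang generalizes to any 
ScheduledThreadPoolExecutor whose afterExecute hook blocks on Future.get(). A 
periodic ScheduledFutureTask resets itself for its next run instead of 
completing, so get() parks the worker thread forever. A minimal sketch of the 
pattern and of the isDone() guard that avoids it (illustrative code, not the 
committed Hadoop fix):

{code}
import java.util.concurrent.*;

public class AfterExecuteHangSketch {
  public static void main(String[] args) {
    ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1) {
      @Override
      protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        if (t == null && r instanceof Future<?>) {
          Future<?> future = (Future<?>) r;
          // Calling future.get() unconditionally here reproduces the dump
          // above: a periodic task is never "done" between runs, so get()
          // parks this worker thread forever. The isDone() guard avoids it.
          if (future.isDone()) {
            try {
              future.get();                    // returns immediately when done
            } catch (ExecutionException ee) {
              ee.getCause().printStackTrace(); // log the task's throwable
            } catch (InterruptedException ie) {
              Thread.currentThread().interrupt();
            }
          }
        }
      }
    };
    // The scanner/cleaner analogue: a fixed-rate task that must keep firing.
    pool.scheduleAtFixedRate(() -> System.out.println("scan"),
        0, 1, TimeUnit.SECONDS);
  }
}
{code}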






[jira] [Resolved] (MAPREDUCE-6987) JHS Log Scanner and Cleaner blocked

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved MAPREDUCE-6987.

   Resolution: Duplicate
Fix Version/s: (was: 3.1.0)
   (was: 3.0.0)
   (was: 2.9.0)







[jira] [Commented] (MAPREDUCE-5951) Add support for the YARN Shared Cache

2017-10-05 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193832#comment-16193832
 ] 

Andrew Wang commented on MAPREDUCE-5951:


If it's going into 2.9.0, I think it's safe for 3.0.0 too. Please include it in 
branch-3.0 as well, thanks!

> Add support for the YARN Shared Cache
> -
>
> Key: MAPREDUCE-5951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5951
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5951-Overview.001.pdf, 
> MAPREDUCE-5951-trunk.016.patch, MAPREDUCE-5951-trunk.017.patch, 
> MAPREDUCE-5951-trunk.018.patch, MAPREDUCE-5951-trunk.019.patch, 
> MAPREDUCE-5951-trunk-020.patch, MAPREDUCE-5951-trunk-021.patch, 
> MAPREDUCE-5951-trunk-v10.patch, MAPREDUCE-5951-trunk-v11.patch, 
> MAPREDUCE-5951-trunk-v12.patch, MAPREDUCE-5951-trunk-v13.patch, 
> MAPREDUCE-5951-trunk-v14.patch, MAPREDUCE-5951-trunk-v15.patch, 
> MAPREDUCE-5951-trunk-v1.patch, MAPREDUCE-5951-trunk-v2.patch, 
> MAPREDUCE-5951-trunk-v3.patch, MAPREDUCE-5951-trunk-v4.patch, 
> MAPREDUCE-5951-trunk-v5.patch, MAPREDUCE-5951-trunk-v6.patch, 
> MAPREDUCE-5951-trunk-v7.patch, MAPREDUCE-5951-trunk-v8.patch, 
> MAPREDUCE-5951-trunk-v9.patch
>
>
> Implement the necessary changes so that the MapReduce application can 
> leverage the new YARN shared cache (i.e. YARN-1492).
> Specifically, allow per-job configuration so that MapReduce jobs can specify 
> which set of resources they would like to cache (i.e. jobjar, libjars, 
> archives, files).
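Per-job opt-in would look something like the following in the job 
configuration. The key and value names here follow the direction of the 
attached overview document and may differ from the committed patch:

{noformat}
<property>
  <name>mapreduce.job.sharedcache.mode</name>
  <!-- resource categories to resolve through the YARN shared cache -->
  <value>jobjar,libjars,files,archives</value>
</property>
{noformat}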






[jira] [Updated] (MAPREDUCE-6925) CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6925:
---
Target Version/s: 2.9.0, 3.0.0  (was: 2.9.0, 3.0.0-beta1)

> CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and 
> YarnChild
> ---
>
> Key: MAPREDUCE-6925
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6925
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, client, task
>Affects Versions: 2.4.0
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
>
> Currently, the counter limits "mapreduce.job.counters.*" handled by 
> {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized 
> asymmetrically: on the client side and on the AM, job.xml is ignored, whereas 
> it is taken into account in YarnChild.
> It would be good to make the Limits job-configurable, such that the max 
> counters/groups is only increased when needed. With the current Limits 
> implementation relying on static constants, it's going to be challenging to 
> support tools that submit jobs with different limits concurrently, without 
> resorting to class-loading isolation.
> The patch that I am uploading is not perfect but demonstrates the issue.
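To illustrate the hazard described above, here is a minimal sketch of why 
process-wide static limits break concurrent submitters. The classes below are 
hypothetical stand-ins, not the actual Limits code:

{code}
// Hypothetical sketch, not org.apache.hadoop.mapreduce.counters.Limits.
class StaticLimits {
  private static int maxCounters;        // one value for the whole JVM

  static void init(int configuredMax) {  // stands in for reading job.xml
    maxCounters = configuredMax;
  }

  static int getMaxCounters() {
    return maxCounters;
  }
}

class ConcurrentSubmitters {
  public static void main(String[] args) {
    StaticLimits.init(120);  // job A wants the default limit
    StaticLimits.init(500);  // job B, same JVM, wants a higher limit
    // Job A now also sees 500: the last init() wins for every job in the
    // process, which is why job-scoped (non-static) limits are needed.
    System.out.println(StaticLimits.getMaxCounters());
  }
}
{code}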






[jira] [Updated] (MAPREDUCE-6946) Moving logging APIs over to slf4j in hadoop-mapreduce

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6946:
---
Target Version/s: 2.9.0, 3.0.0  (was: 2.9.0, 3.0.0-beta1)

> Moving logging APIs over to slf4j in hadoop-mapreduce
> -
>
> Key: MAPREDUCE-6946
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6946
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>
> MapReduce side of YARN-6712. This is an umbrella jira for MapReduce.






[jira] [Updated] (MAPREDUCE-6960) Shuffle Handler prints disk error stack traces for every read failure.

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6960:
---
Fix Version/s: (was: 3.0.0)
   3.0.0-beta1

> Shuffle Handler prints disk error stack traces for every read failure.
> --
>
> Key: MAPREDUCE-6960
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6960
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0, 2.8.3
>
> Attachments: MAPREDUCE-6960.001.patch
>
>
> {code}
>  } catch (IOException e) {
>   LOG.error("Shuffle error :", e);
> {code}
> In cases where a read from disk fails and throws a DiskErrorException, 
> the shuffle handler prints the entire stack trace for each and every one of 
> the failures, causing the nodemanager logs to quickly fill up the disk.
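One way to bound the log volume is to special-case disk errors and log a 
single line instead of the full trace. A sketch of that shape (illustrative, 
not necessarily the attached patch; it assumes slf4j logging):

{code}
import java.io.IOException;
import org.apache.hadoop.util.DiskChecker.DiskErrorException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class ShuffleErrorLogging {
  private static final Logger LOG =
      LoggerFactory.getLogger(ShuffleErrorLogging.class);

  // Sketch: log disk errors as one line; keep full traces for anything else.
  static void logShuffleError(IOException e) {
    if (e instanceof DiskErrorException) {
      // These can fire on every read from a bad disk; skip the stack trace.
      LOG.error("Shuffle error: " + e.getMessage());
    } else {
      LOG.error("Shuffle error :", e);
    }
  }
}
{code}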






[jira] [Updated] (MAPREDUCE-6953) Skip the testcase testJobWithChangePriority if FairScheduler is used

2017-09-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6953:
---
Fix Version/s: (was: 3.0.0)
   3.0.0-beta1

> Skip the testcase testJobWithChangePriority if FairScheduler is used
> 
>
> Key: MAPREDUCE-6953
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6953
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: client
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: MAPREDUCE-6953-001.patch
>
>
> We run the unit tests with Fair Scheduler downstream. FS does not support 
> priorities at the moment, so TestMRJobs#testJobWithChangePriority fails.
> Just add {{Assume.assumeFalse(usingFairScheduler);}} and JUnit will skip the 
> test.
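The mechanism is JUnit 4's assumption support: a failed assumption marks the 
test as skipped rather than failed. A minimal sketch, where the 
usingFairScheduler flag is assumed to be derived elsewhere from the configured 
scheduler class:

{code}
import org.junit.Assume;
import org.junit.Test;

public class TestMRJobsSketch {
  // Assumed to be set from the configured scheduler class in the real test.
  private final boolean usingFairScheduler = detectFairScheduler();

  @Test
  public void testJobWithChangePriority() throws Exception {
    // FairScheduler does not support job priorities, so skip instead of fail.
    Assume.assumeFalse(usingFairScheduler);
    // ... the priority-change assertions run only under other schedulers.
  }

  private static boolean detectFairScheduler() {
    return false; // placeholder for the real scheduler check
  }
}
{code}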






[jira] [Commented] (MAPREDUCE-6892) Issues with the count of failed/killed tasks in the jhist file

2017-09-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162092#comment-16162092
 ] 

Andrew Wang commented on MAPREDUCE-6892:


Peter, do you mind adding a release note to this JIRA summarizing the impact 
for our end users? Thanks!

> Issues with the count of failed/killed tasks in the jhist file
> --
>
> Key: MAPREDUCE-6892
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6892
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, jobhistoryserver
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Fix For: 3.0.0-beta1
>
> Attachments: MAPREDUCE-6892-001.patch, MAPREDUCE-6892-002.PATCH, 
> MAPREDUCE-6892-003.patch, MAPREDUCE-6892-004.patch, MAPREDUCE-6892-005.patch, 
> MAPREDUCE-6892-006.patch
>
>
> Recently we encountered some issues with the value of failed tasks. After 
> parsing the jhist file, {{JobInfo.getFailedMaps()}} returned 0, but actually 
> there were failures. 
> Another minor thing is that you cannot get the number of killed tasks 
> (although this can be calculated).
> The root cause is that {{JobUnsuccessfulCompletionEvent}} contains only the 
> successful map/reduce task counts. The number of failed (or killed) tasks is 
> not stored.
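For reference, this is roughly how the counts are read back from a jhist 
file; before the fix the failed count came out as 0 because the terminal event 
only carried successful task counts. A sketch, assuming a .jhist file path is 
passed in:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;

public class JhistCounts {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path jhist = new Path(args[0]);           // path to the .jhist file
    FileSystem fs = jhist.getFileSystem(conf);
    JobHistoryParser parser = new JobHistoryParser(fs, jhist);
    JobHistoryParser.JobInfo info = parser.parse();
    // Returned 0 before the fix even when map attempts had failed, since
    // JobUnsuccessfulCompletionEvent stored only successful task counts.
    System.out.println("failed maps: " + info.getFailedMaps());
  }
}
{code}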






[jira] [Commented] (MAPREDUCE-6870) Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

2017-09-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161990#comment-16161990
 ] 

Andrew Wang commented on MAPREDUCE-6870:


Hi Erik, do you mind adding a release note summarizing the incompatibility? 
Would be nice for our end users.

> Add configuration for MR job to finish when all reducers are complete (even 
> with unfinished mappers)
> 
>
> Key: MAPREDUCE-6870
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6870
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Peter Bacsko
> Fix For: 3.0.0-beta1
>
> Attachments: MAPREDUCE-6870-001.patch, MAPREDUCE-6870-002.patch, 
> MAPREDUCE-6870-003.patch, MAPREDUCE-6870-004.patch, MAPREDUCE-6870-005.patch, 
> MAPREDUCE-6870-006.patch, MAPREDUCE-6870-007.patch
>
>
> Even with MAPREDUCE-5817, there could still be cases where mappers get 
> scheduled before all reducers are complete, but those mappers run for a long 
> time, even after all reducers are complete. This could hurt the performance 
> of large MR jobs.
> In some cases, mappers don't have any materialize-able outcome other than 
> providing intermediate data to reducers. In that case, the job owner should 
> have the config option to finish the job once all reducers are complete.
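The resulting knob would be set per job along these lines; the property name 
below is my recollection of this change and should be checked against the 
committed patch:

{noformat}
<property>
  <name>mapreduce.job.finish-when-all-reducers-done</name>
  <value>true</value>
</property>
{noformat}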






[jira] [Resolved] (MAPREDUCE-6941) The default setting doesn't work for MapReduce job

2017-09-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved MAPREDUCE-6941.

Resolution: Not A Problem

I'm going to close this based on Ray's analysis. Junping, if you disagree, 
please re-open the JIRA.

> The default setting doesn't work for MapReduce job
> --
>
> Key: MAPREDUCE-6941
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6941
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Junping Du
>Priority: Blocker
>
> On a hadoop 3 cluster deployment (based on the current trunk branch) with 
> default settings, MR jobs fail with the following exception:
> {noformat}
> 2017-08-16 13:00:03,846 INFO mapreduce.Job: Job job_1502913552390_0001 
> running in uber mode : false
> 2017-08-16 13:00:03,847 INFO mapreduce.Job:  map 0% reduce 0%
> 2017-08-16 13:00:03,864 INFO mapreduce.Job: Job job_1502913552390_0001 failed 
> with state FAILED due to: Application application_1502913552390_0001 failed 2 
> times due to AM Container for appattempt_1502913552390_0001_02 exited 
> with  exitCode: 1
> Failing this attempt.Diagnostics: [2017-08-16 13:00:02.963]Exception from 
> container-launch.
> Container id: container_1502913552390_0001_02_01
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:994)
>   at org.apache.hadoop.util.Shell.run(Shell.java:887)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1212)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:295)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:455)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:275)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:90)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This is because the mapreduce-related jars are not added to the YARN 
> classpath by default. To make MR jobs run successfully, we now need to add 
> the following configuration to yarn-site.xml:
> {noformat}
> <property>
>   <name>yarn.application.classpath</name>
>   <value>
> ...
> /share/hadoop/mapreduce/*,
> /share/hadoop/mapreduce/lib/*
> ...
>   </value>
> </property>
> {noformat}
> But this config was not necessary in previous versions of Hadoop. We should 
> fix this issue before the beta release, otherwise it will be a regression in 
> configuration behavior.
> This could be more of a YARN issue (if so, we should move it), depending on 
> how we fix it finally.






[jira] [Commented] (MAPREDUCE-6941) The default setting doesn't work for MapReduce job

2017-08-29 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146431#comment-16146431
 ] 

Andrew Wang commented on MAPREDUCE-6941:


[~djp] is Ray's explanation satisfactory? Wondering if we can close this, it's 
one of two unassigned blockers right now.







[jira] [Commented] (MAPREDUCE-6941) The default setting doesn't work for MapReduce job

2017-08-25 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142217#comment-16142217
 ] 

Andrew Wang commented on MAPREDUCE-6941:


Thanks Ray. Should we just close this then? Or are the docs still lacking in 
some way?







[jira] [Commented] (MAPREDUCE-6901) Remove @deprecated tags from DistributedCache

2017-08-03 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113648#comment-16113648
 ] 

Andrew Wang commented on MAPREDUCE-6901:


A little ping since this is marked as critical and the patch looks ready for 
review. [~jlowe] or [~rkanter]?

> Remove @deprecated tags from DistributedCache
> -
>
> Key: MAPREDUCE-6901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6901
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: distributed-cache
>Affects Versions: 3.0.0-alpha3
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Critical
> Attachments: MAPREDUCE-6901.001.patch
>
>
> Doing this as part of Hadoop 3 cleanup.
> DistributedCache has been marked as deprecated forever to the point where the 
> change that did it isn't in Git.
> I don't really have a preference for whether we remove it or not, but I'd 
> like to have a discussion and have it properly documented as a release note 
> for Hadoop 3 before we hit the final release. At the very least we can have a 
> Release Note that will sum up whatever discussion we have here.






[jira] [Commented] (MAPREDUCE-6288) mapred job -status fails with AccessControlException

2017-07-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16108021#comment-16108021
 ] 

Andrew Wang commented on MAPREDUCE-6288:


Thanks for handling the reverts Junping. I filed and linked MAPREDUCE-6924 so 
the reverts show up in the beta1 changelog, since I believe they were included 
in the alpha releases.

> mapred job -status fails with AccessControlException 
> -
>
> Key: MAPREDUCE-6288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6288
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Priority: Blocker
> Attachments: MAPREDUCE-6288.002.patch, MAPREDUCE-6288-gera-001.patch, 
> MAPREDUCE-6288.patch
>
>
> After MAPREDUCE-5875, we're seeing this Exception when trying to do {{mapred 
> job -status job_1427080398288_0001}}
> {noformat}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
> Permission denied: user=jenkins, access=EXECUTE, 
> inode="/user/history/done":mapred:hadoop:drwxrwx---
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6553)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6535)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6460)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1919)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1870)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1850)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:545)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1213)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1191)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:299)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:265)
>   at 

[jira] [Created] (MAPREDUCE-6924) Revert MAPREDUCE-6199 MAPREDUCE-6286 and MAPREDUCE-5875

2017-07-31 Thread Andrew Wang (JIRA)
Andrew Wang created MAPREDUCE-6924:
--

 Summary: Revert MAPREDUCE-6199 MAPREDUCE-6286 and MAPREDUCE-5875
 Key: MAPREDUCE-6924
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6924
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang
Assignee: Junping Du


Filing this JIRA so the reverts show up in the changelog.






[jira] [Resolved] (MAPREDUCE-6924) Revert MAPREDUCE-6199 MAPREDUCE-6286 and MAPREDUCE-5875

2017-07-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved MAPREDUCE-6924.

   Resolution: Fixed
Fix Version/s: 3.0.0-beta1

Resolving this changelog tracking JIRA. Thanks to [~djp] for doing the reverts!

> Revert MAPREDUCE-6199 MAPREDUCE-6286 and MAPREDUCE-5875
> ---
>
> Key: MAPREDUCE-6924
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6924
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Junping Du
> Fix For: 3.0.0-beta1
>
>
> Filing this JIRA so the reverts show up in the changelog.






[jira] [Updated] (MAPREDUCE-6734) Add option to distcp to preserve file path structure of source files at the destination

2017-07-07 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6734:
---
Fix Version/s: (was: 3.0.0-alpha4)

> Add option to distcp to preserve file path structure of source files at the 
> destination
> ---
>
> Key: MAPREDUCE-6734
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6734
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 3.0.0-alpha2
> Environment: Software platform
>Reporter: Frederick Tucker
>  Labels: distcp, newbie, patch
> Attachments: MAPREDUCE-6734.3.0.0-alpha2.patch, 
> MAPREDUCE-6734.3.0.0-alpha2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When copying files using distcp with globbed source files, all the matched 
> files in the glob are copied into a single flat directory. This causes 
> problems when the file structure at the source is important. It is also an 
> issue when two files matched in the glob have the same name, because that 
> causes a duplicate file error at the target. I'd like to have an option 
> to preserve the file structure of the source files when globbing inputs.
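A concrete illustration of the flattening and the resulting name collision 
(the paths are made up):

{noformat}
hadoop distcp 'hdfs://nn1/logs/2016/*/app.log' hdfs://nn2/backup/

# Both matched files map to the same flat destination name:
#   /logs/2016/jan/app.log  ->  /backup/app.log
#   /logs/2016/feb/app.log  ->  /backup/app.log   (duplicate file error)
{noformat}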






[jira] [Updated] (MAPREDUCE-6697) Concurrent task limits should only be applied when necessary

2017-06-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6697:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

I'm going to close this so I can roll a release; please re-open if you need a 
Jenkins run afterwards.

> Concurrent task limits should only be applied when necessary
> 
>
> Key: MAPREDUCE-6697
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6697
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: Nathan Roberts
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: MAPREDUCE-6697-v1.patch
>
>
> The concurrent task limit feature should only adjust the ANY portion of the 
> AM heartbeat ask when a limit is truly necessary; otherwise, extraneous 
> containers could be allocated by the RM to the AM, adding some overhead to 
> both. Specifying a concurrent task limit that is beyond the total number of 
> tasks in the job should be the same as asking for no limit.
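In sketch form, the idea is to leave the ANY ask untouched unless the 
configured limit (mapreduce.job.running.map.limit and its reduce counterpart) 
actually binds. The surrounding logic is illustrative, not the patch:

{code}
class TaskLimitSketch {
  // Illustrative only: clamp the ANY ask just when the limit binds.
  static int anyAsk(int configuredLimit, int totalMaps,
                    int runningMaps, int outstandingMaps) {
    if (configuredLimit > 0 && configuredLimit < totalMaps) {
      // The limit is real: never ask for more containers than it allows.
      return Math.min(outstandingMaps,
                      Math.max(0, configuredLimit - runningMaps));
    }
    // A limit at or above the job's total tasks is the same as no limit.
    return outstandingMaps;
  }
}
{code}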






[jira] [Updated] (MAPREDUCE-6829) Add peak memory usage counter for each task

2017-04-21 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6829:
---
Fix Version/s: 3.0.0-alpha3

> Add peak memory usage counter for each task
> ---
>
> Key: MAPREDUCE-6829
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6829
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Reporter: Yufei Gu
>Assignee: Miklos Szegedi
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6829.000.patch, MAPREDUCE-6829.001.patch, 
> MAPREDUCE-6829.002.patch, MAPREDUCE-6829.003.patch, MAPREDUCE-6829.004.patch, 
> MAPREDUCE-6829.005.patch
>
>
> Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
> are snapshots of memory usage of that task. They are not sufficient for users 
> to understand peak memory usage by that task, e.g. in order to diagnose task 
> failures, tune job parameters or change application design. This new feature 
> will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
> VIRTUAL_MEMORY_BYTES_MAX.
> This JIRA has the same feature as MAPREDUCE-4710. I filed this new YARN 
> JIRA since MAPREDUCE-4710 is a pretty old one from the MR 1.x era; it more or 
> less assumes a branch-1 architecture and should be closed at this point.
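Once available, the peaks would be read like any other task counter. A 
sketch, assuming the _MAX names land on TaskCounter as described above:

{code}
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

public class PeakMemorySketch {
  // Sketch: read the new peak-memory counters after a job completes.
  static void printPeaks(Job job) throws Exception {
    Counters counters = job.getCounters();
    long physPeak = counters.findCounter(
        TaskCounter.PHYSICAL_MEMORY_BYTES_MAX).getValue();
    long virtPeak = counters.findCounter(
        TaskCounter.VIRTUAL_MEMORY_BYTES_MAX).getValue();
    System.out.println("peak phys=" + physPeak + " peak virt=" + virtPeak);
  }
}
{code}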






[jira] [Commented] (MAPREDUCE-6288) mapred job -status fails with AccessControlException

2017-03-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950343#comment-15950343
 ] 

Andrew Wang commented on MAPREDUCE-6288:


Pinging this JIRA as it's still marked as a blocker for 3.x and unassigned. Is 
anyone planning on picking it up?

> mapred job -status fails with AccessControlException 
> -
>
> Key: MAPREDUCE-6288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6288
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Priority: Blocker
> Attachments: MAPREDUCE-6288.002.patch, MAPREDUCE-6288-gera-001.patch, 
> MAPREDUCE-6288.patch
>
>
> After MAPREDUCE-5875, we're seeing this Exception when trying to do {{mapred 
> job -status job_1427080398288_0001}}
> {noformat}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
> Permission denied: user=jenkins, access=EXECUTE, 
> inode="/user/history/done":mapred:hadoop:drwxrwx---
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6553)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6535)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6460)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1919)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1870)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1850)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:545)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1213)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1191)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:299)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:265)
>   at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:257)
>   at 

[jira] [Updated] (MAPREDUCE-6873) MR Job Submission Fails if MR framework application path not on defaultFS

2017-03-30 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6873:
---
Fix Version/s: (was: 3.0.0-beta1)
   3.0.0-alpha3

> MR Job Submission Fails if MR framework application path not on defaultFS
> -
>
> Key: MAPREDUCE-6873
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6873
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.6.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Minor
> Fix For: 2.9.0, 2.7.4, 2.8.1, 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6873.000.patch
>
>
> {{JobSubmitter#addMRFrameworkPathToDistributedCache()}} assumes that 
> {{mapreduce.framework.application.path}} has a FS which matches 
> {{fs.defaultFS}} which may not always be true. This is just a consequence of 
> using {{FileSystem.get(Configuration)}} instead of {{FileSystem.get(URI, 
> Configuration)}}. 
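The distinction in one sketch, where frameworkPath stands for the configured 
framework application path:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FrameworkFsSketch {
  static FileSystem resolve(Configuration conf, Path frameworkPath)
      throws java.io.IOException {
    // Buggy pattern: always binds to fs.defaultFS, so a framework path on a
    // different filesystem fails at submit time.
    FileSystem defaultFs = FileSystem.get(conf);
    // Fix: resolve the filesystem from the path's own URI instead.
    return FileSystem.get(frameworkPath.toUri(), conf);
  }
}
{code}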






[jira] [Updated] (MAPREDUCE-6101) on job submission, if input or output directories are encrypted, shuffle data should be encrypted at rest

2017-03-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6101:
---
Status: Open  (was: Patch Available)

> on job submission, if input or output directories are encrypted, shuffle data 
> should be encrypted at rest
> -
>
> Key: MAPREDUCE-6101
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6101
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: job submission, security
>Affects Versions: 2.6.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: MAPREDUCE-6101.1.patch, MAPREDUCE-6101.2.patch
>
>
> Currently, setting shuffle data at-rest encryption has to be done explicitly 
> to work. If it is not set explicitly (ON or OFF) but the input or output HDFS 
> directories of the job are in an encryption zone, we should set it to ON.
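A sketch of the proposed submit-time behavior, assuming the 
mapreduce.job.encrypted-intermediate-data key and 
HdfsAdmin.getEncryptionZoneForPath (illustrative, not the attached patches):

{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsAdmin;

public class ShuffleEncryptionDefault {
  static void maybeEnable(Configuration conf, Path in, Path out)
      throws Exception {
    // Only apply the default when the user has not set it explicitly.
    if (conf.get("mapreduce.job.encrypted-intermediate-data") != null) {
      return;
    }
    HdfsAdmin admin =
        new HdfsAdmin(URI.create(conf.get("fs.defaultFS")), conf);
    boolean inEZ = admin.getEncryptionZoneForPath(in) != null
                || admin.getEncryptionZoneForPath(out) != null;
    // Input or output in an encryption zone implies encrypted shuffle data.
    conf.setBoolean("mapreduce.job.encrypted-intermediate-data", inEZ);
  }
}
{code}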






[jira] [Updated] (MAPREDUCE-6854) Each map task should create a unique temporary name that includes an object name

2017-03-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6854:
---
Target Version/s: 3.0.0-alpha3  (was: 3.0.0-alpha2)

> Each map task should create a unique temporary name that includes an object 
> name
> 
>
> Key: MAPREDUCE-6854
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6854
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 3.0.0-alpha2
>Reporter: Gil Vernik
>  Labels: patch
> Attachments: HADOOP-6854-001.patch, HADOOP-6854-002.patch
>
>
> Consider an example: a local file "/data/a.txt" needs to be copied to 
> swift://container.service/data/a.txt
> The way distcp works is that it first uploads "/data/a.txt" to 
> swift://container.mil01/data/.distcp.tmp.attempt_local2036034928_0001_m_00_0
> Upon completion distcp moves 
> swift://container.mil01/data/.distcp.tmp.attempt_local2036034928_0001_m_00_0
>  into swift://container.mil01/data/a.txt
> 
> The temporary file naming convention assumes that each map task will 
> sequentially create objects as swift://container.mil01/.distcp.tmp.attempt_ID
> and then rename them to their final names. Most Hadoop ecosystem 
> components use the object name as part of the temporary name; distcp, 
> however, does not take that approach.
> This JIRA proposes adding a configuration key indicating that temporary 
> objects will also include the object name as part of their temporary file 
> name. For example, "/data/a.txt" would be uploaded to 
> "swift://container.mil01/data/a.txt.distcp.tmp.attempt_local2036034928_0001_m_00_0"
> "a.txt.distcp.tmp.attempt_local2036034928_0001_m_00_0" does not affect 
> flows in the access drivers, since "a.txt" is not considered a 
> sub-directory, so no special operations are taken.
> The benefits of the patch:
> 1. Temporary object names will be better distributed in object stores, since 
> they all have different prefixes.
> 2. Sometimes it's not possible to debug what data was copied and what failed, 
> because temporary files are not always renamed. With a predictable temporary 
> name, one can figure out which object names were copied.
> 3. Different systems may expect 
> "a.txt.distcp.tmp.attempt_local2036034928_0001_m_00_0" and extract the value 
> before "distcp.tmp", thus obtaining the destination object name.
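In sketch form, the proposal keeps the destination name as a prefix of the 
temporary name (string shapes only, not the attached patches):

{code}
class TmpNameSketch {
  // Current convention: the temp name carries only the attempt id.
  static String currentTmpName(String attemptId) {
    return ".distcp.tmp." + attemptId;
  }

  // Proposed convention: prefix with the destination object name so temp
  // objects spread across key prefixes and are attributable if left over.
  static String proposedTmpName(String targetName, String attemptId) {
    // e.g. "a.txt.distcp.tmp.attempt_local2036034928_0001_m_00_0"
    return targetName + ".distcp.tmp." + attemptId;
  }
}
{code}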






[jira] [Updated] (MAPREDUCE-6857) Reduce number of exists() calls on the target object

2017-03-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6857:
---
Target Version/s: 3.0.0-alpha3  (was: 3.0.0-alpha2)
   Fix Version/s: (was: 3.0.0-alpha2)

> Reduce number of exists() calls on the target object
> 
>
> Key: MAPREDUCE-6857
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6857
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 3.0.0-alpha2
>Reporter: Gil Vernik
> Attachments: HADOOP-6857-002.patch
>
>
> CopyMapper.map(..) calls targetStatus = targetFS.getFileStatus(target).
> A few steps later, RetriableFileCopyCommand.promoteTmpToTarget(..) calls 
> exists(target) again and deletes the target if present.
> The second exists() is redundant: targetStatus already tells us whether the 
> target is present, so when overwrite mode is active the target object can be 
> deleted based on it.
> The purpose of this patch is to delete the target object using targetStatus 
> and thus avoid calling the exists() method.
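The shape of the change as a sketch; the names follow the classes mentioned 
above but the code is illustrative:

{code}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class PromoteSketch {
  // Pass the status fetched once in CopyMapper.map(...) down here instead
  // of probing the store again with exists(target).
  static void promote(FileSystem targetFS, Path tmp, Path target,
                      FileStatus targetStatus, boolean overwrite)
      throws java.io.IOException {
    if (targetStatus != null && overwrite) {
      targetFS.delete(target, false);   // we already know it exists
    }
    if (!targetFS.rename(tmp, target)) {
      throw new java.io.IOException("rename failed: " + tmp + " -> " + target);
    }
  }
}
{code}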






[jira] [Updated] (MAPREDUCE-6854) Each map task should create a unique temporary name that includes an object name

2017-03-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6854:
---
Fix Version/s: (was: 3.0.0-alpha2)







[jira] [Updated] (MAPREDUCE-6734) Add option to distcp to preserve file path structure of source files at the destination

2017-01-25 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6734:
---
Fix Version/s: (was: 3.0.0-alpha2)
   3.0.0-alpha3

> Add option to distcp to preserve file path structure of source files at the 
> destination
> ---
>
> Key: MAPREDUCE-6734
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6734
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 3.0.0-alpha2
> Environment: Software platform
>Reporter: Frederick Tucker
>  Labels: distcp, newbie, patch
> Fix For: 3.0.0-alpha3
>
> Attachments: MAPREDUCE-6734.3.0.0-alpha2.patch, 
> MAPREDUCE-6734.3.0.0-alpha2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h






[jira] [Updated] (MAPREDUCE-6728) Give fetchers hint when ShuffleHandler rejects a shuffling connection

2017-01-19 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6728:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Resolving this so it gets picked up in the 3.0.0-alpha2 release notes. Please 
reopen if/when you need a branch-2 precommit run.

> Give fetchers hint when ShuffleHandler rejects a shuffling connection
> -
>
> Key: MAPREDUCE-6728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6728
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: mapreduce6728.001.patch, mapreduce6728.002.patch, 
> mapreduce6728.003.patch, mapreduce6728.004.patch, mapreduce6728.005.patch, 
> mapreduce6728.006.patch, MAPREDUCE-6728-branch-2.8.06.patch, 
> mapreduce6728.branch-2.8.patch, mapreduce6728.prelim.patch
>
>
> If the number of open shuffle connections to a node goes over the max, 
> ShuffleHandler closes the connection immediately without giving fetchers any 
> hint of the reason, which causes fetchers to fail with exceptions such as:
> java.net.SocketException: Unexpected end of file from server
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:772)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>   at 
> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:430)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:395)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.openShuffleUrl(Fetcher.java:266)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:323)
>   at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
> OR 
> java.net.SocketException: Connection reset
>   at java.net.SocketInputStream.read(SocketInputStream.java:196)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769)
>   at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>   at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>   at 
> java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:430)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:395)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.openShuffleUrl(Fetcher.java:266)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java
> Such failures are counted as fetcher failures
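> A minimal sketch of the fetcher-side handling this would enable, assuming 
> the ShuffleHandler replies with an explicit "server busy" HTTP status before 
> closing (illustrative only, not necessarily the committed protocol):
> {code}
> import java.io.IOException;
> import java.net.HttpURLConnection;
> import java.net.URL;
>
> class ShuffleFetchSketch {
>   static void verifyConnection(URL url) throws IOException {
>     HttpURLConnection conn = (HttpURLConnection) url.openConnection();
>     int rc = conn.getResponseCode();
>     if (rc == HttpURLConnection.HTTP_UNAVAILABLE) {
>       // Deliberate "too busy" signal: the fetcher can back off and retry
>       // instead of charging this against its fetch-failure counters.
>       throw new IOException("Shuffle server busy, retry later: " + url.getHost());
>     } else if (rc != HttpURLConnection.HTTP_OK) {
>       throw new IOException("Unexpected shuffle response " + rc + " from " + url);
>     }
>   }
> }
> {code}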



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6791) remove unnecessary dependency from hadoop-mapreduce-client-jobclient to hadoop-mapreduce-client-shuffle

2017-01-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6791:
---
Release Note: An unnecessary dependency on hadoop-mapreduce-client-shuffle 
in hadoop-mapreduce-client-jobclient has been removed.

> remove unnecessary dependency from hadoop-mapreduce-client-jobclient to 
> hadoop-mapreduce-client-shuffle
> ---
>
> Key: MAPREDUCE-6791
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6791
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>  Labels: Incompatible
> Fix For: 3.0.0-alpha2
>
> Attachments: mapreduce6791.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6704) Container fail to launch for mapred application

2016-12-15 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752741#comment-15752741
 ] 

Andrew Wang commented on MAPREDUCE-6704:


Ping, are we getting close to resolving this JIRA?

> Container fail to launch for mapred application
> ---
>
> Key: MAPREDUCE-6704
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6704
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Blocker
> Attachments: 0001-MAPREDUCE-6704.patch, 0001-YARN-5026.patch, 
> ClusterSetup.html, MAPREDUCE-6704.0002.patch, MR-6704-branch2.8.tar.gz, 
> MR-6704-trunk-tempPatch.tar.gz, MR-6704-trunk.tar.gz, SingleCluster.html, 
> container-whitelist-env-wip.patch, temp.patch
>
>
> Containers fail to launch for MapReduce applications.
> The launch script does not set a default value for {{HADOOP_MAPRED_HOME}}. After 
> https://github.com/apache/hadoop/commit/9d4d30243b0fc9630da51a2c17b543ef671d035c, 
> {{HADOOP_MAPRED_HOME}} can no longer be obtained from {{builder.environment()}}, 
> since {{DefaultContainerExecutor#buildCommandExecutor}} sets inherit to false.
> {noformat}
> 16/05/02 09:16:05 INFO mapreduce.Job: Job job_1462155939310_0004 failed with 
> state FAILED due to: Application application_1462155939310_0004 failed 2 
> times due to AM Container for appattempt_1462155939310_0004_02 exited 
> with  exitCode: 1
> Failing this attempt.Diagnostics: Exception from container-launch.
> Container id: container_1462155939310_0004_02_01
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:946)
> at org.apache.hadoop.util.Shell.run(Shell.java:850)
> at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1144)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:385)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:281)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:89)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseSplitVerifier; 
> support was removed in 8.0
> Error: Could not find or load main class 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseSplitVerifier; 
> support was removed in 8.0
> {noformat}
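> A minimal sketch of the underlying {{ProcessBuilder}} behavior (illustrative, 
> not Hadoop's actual Shell code; run on a Unix-like system):
> {code}
> import java.util.Map;
>
> public class InheritEnvSketch {
>   public static void main(String[] args) throws Exception {
>     // builder.environment() starts as a copy of the parent environment...
>     ProcessBuilder builder = new ProcessBuilder("printenv", "HADOOP_MAPRED_HOME");
>     Map<String, String> env = builder.environment();
>     // ...but an inherit=false launch effectively clears it, so anything not
>     // explicitly re-added, such as HADOOP_MAPRED_HOME, is invisible to the
>     // launched container script.
>     env.clear();
>     builder.inheritIO();
>     int rc = builder.start().waitFor();
>     System.out.println("printenv exit code: " + rc); // non-zero: var missing
>   }
> }
> {code}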



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6288) mapred job -status fails with AccessControlException

2016-12-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6288:
---
Target Version/s: 2.8.0, 3.0.0-beta1  (was: 2.8.0, 3.0.0-alpha2)

I'm going to retarget this for 3.0.0-beta1 rather than 3.0.0-alpha2, since it 
won't be a regression from alpha1.

Is someone actively working on this JIRA? If not, I'd like to revert this code 
out of trunk and branch-2 too rather than kicking the can down the road every 
release.

> mapred job -status fails with AccessControlException 
> -
>
> Key: MAPREDUCE-6288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6288
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: MAPREDUCE-6288-gera-001.patch, MAPREDUCE-6288.002.patch, 
> MAPREDUCE-6288.patch
>
>
> After MAPREDUCE-5875, we're seeing this Exception when trying to do {{mapred 
> job -status job_1427080398288_0001}}
> {noformat}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
> Permission denied: user=jenkins, access=EXECUTE, 
> inode="/user/history/done":mapred:hadoop:drwxrwx---
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6553)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6535)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6460)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1919)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1870)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1850)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:545)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1213)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1191)
>   at 
> 

[jira] [Updated] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.

2016-12-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6565:
---
Fix Version/s: 3.0.0-alpha2

> Configuration to use host name in delegation token service is not read from 
> job.xml during MapReduce job execution.
> ---
>
> Key: MAPREDUCE-6565
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Chris Nauroth
>Assignee: Li Lu
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: MAPREDUCE-6565-trunk.001.patch
>
>
> By default, the service field of a delegation token is populated based on 
> server IP address.  Setting {{hadoop.security.token.service.use_ip}} to 
> {{false}} changes this behavior to use host name instead of IP address.  
> However, this configuration property is not read from job.xml.  Instead, it's 
> read from a separate {{Configuration}} instance created during static 
> initialization of {{SecurityUtil}}.  This does not work correctly with 
> MapReduce jobs if the framework is distributed by setting 
> {{mapreduce.application.framework.path}} and the 
> {{mapreduce.application.classpath}} is isolated to avoid reading 
> core-site.xml from the cluster nodes.  MapReduce tasks will fail to 
> authenticate to HDFS, because they'll try to find a delegation token based on 
> the NameNode IP address, even though at job submission time the tokens were 
> generated using the host name.
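> A minimal sketch of the mismatch, simulating the two {{Configuration}} 
> instances involved (illustrative only):
> {code}
> import org.apache.hadoop.conf.Configuration;
>
> public class TokenServiceConfigSketch {
>   public static void main(String[] args) {
>     // What job.xml asks for (simulated here): host-based token services.
>     Configuration jobConf = new Configuration(false);
>     jobConf.setBoolean("hadoop.security.token.service.use_ip", false);
>
>     // What SecurityUtil actually consults: a Configuration built during
>     // static initialization, which sees only the default resources
>     // (e.g. core-site.xml) on the task's classpath.
>     Configuration staticConf = new Configuration();
>
>     System.out.println("job.xml use_ip = "
>         + jobConf.getBoolean("hadoop.security.token.service.use_ip", true));
>     System.out.println("SecurityUtil sees use_ip = "
>         + staticConf.getBoolean("hadoop.security.token.service.use_ip", true));
>   }
> }
> {code}
> When the two disagree, the task looks up its delegation token under a service 
> name that was never issued, producing the authentication failure described 
> above.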



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6288) mapred job -status fails with AccessControlException

2016-12-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6288:
---
Target Version/s: 2.8.0, 3.0.0-alpha2  (was: 2.8.0)

> mapred job -status fails with AccessControlException 
> -
>
> Key: MAPREDUCE-6288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6288
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: MAPREDUCE-6288-gera-001.patch, MAPREDUCE-6288.002.patch, 
> MAPREDUCE-6288.patch
>
>
> After MAPREDUCE-5875, we're seeing this Exception when trying to do {{mapred 
> job -status job_1427080398288_0001}}
> {noformat}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
> Permission denied: user=jenkins, access=EXECUTE, 
> inode="/user/history/done":mapred:hadoop:drwxrwx---
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6553)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6535)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6460)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1919)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1870)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1850)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:545)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1213)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1191)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:299)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:265)
>   at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:257)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1490)
>   at 
> 

[jira] [Updated] (MAPREDUCE-6682) TestMRCJCFileOutputCommitter fails intermittently

2016-11-21 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6682:
---
Fix Version/s: (was: 3.0.0-alpha2)
   3.0.0-alpha1

> TestMRCJCFileOutputCommitter fails intermittently
> -
>
> Key: MAPREDUCE-6682
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6682
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Akira Ajisaka
> Fix For: 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6682.00.patch, MAPREDUCE-6682.01.patch, 
> MAPREDUCE-6682.02.patch, MAPREDUCE-6682.03.patch, MAPREDUCE-6682.04.patch
>
>
> {noformat}
> java.lang.AssertionError: Output directory not empty expected:<0> but was:<4>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.mapred.TestMRCJCFileOutputCommitter.testAbort(TestMRCJCFileOutputCommitter.java:153)
> {noformat}
> *PreCommit Report* 
> https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6434/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6288) mapred job -status fails with AccessControlException

2016-11-15 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668967#comment-15668967
 ] 

Andrew Wang commented on MAPREDUCE-6288:


Hi folks, anything more to be said about this JIRA? It's marked as a blocker, 
and there's been no action for over a year.

> mapred job -status fails with AccessControlException 
> -
>
> Key: MAPREDUCE-6288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6288
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: MAPREDUCE-6288-gera-001.patch, MAPREDUCE-6288.002.patch, 
> MAPREDUCE-6288.patch
>
>
> After MAPREDUCE-5875, we're seeing this Exception when trying to do {{mapred 
> job -status job_1427080398288_0001}}
> {noformat}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: 
> Permission denied: user=jenkins, access=EXECUTE, 
> inode="/user/history/done":mapred:hadoop:drwxrwx---
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6553)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6535)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6460)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1919)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1870)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1850)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1822)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:545)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1213)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1191)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:299)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:265)
>   at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:257)
>   at 

[jira] [Updated] (MAPREDUCE-6467) Submitting streaming job is not thread safe

2016-11-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6467:
---
Target Version/s: 3.0.0-alpha2  (was: 3.0.0-alpha1)

> Submitting streaming job is not thread safe
> ---
>
> Key: MAPREDUCE-6467
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6467
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission
>Affects Versions: 2.7.1
>Reporter: jeremie simon
>Assignee: Ivo Udelsmann
>Priority: Minor
>  Labels: easyfix, streaming, thread-safety
> Attachments: MAPREDUCE-6467.001.patch
>
>
> Submitting a streaming job is not thread safe. 
> That is because the {{StreamJob}} class uses {{OptionBuilder}}, which is 
> itself not thread safe, since it keeps its state in static fields. 
> This can cause very hard-to-diagnose bugs. 
> An easy fix would be to create {{Option}} instances through the normal 
> constructor and decorate the object as necessary. 
> This fix should be applied to the functions createOption and 
> createBoolOption. 
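> A minimal sketch of that fix, building each Commons CLI {{Option}} through 
> the plain constructor instead of the static {{OptionBuilder}} (method names 
> mirror the ones above, but this is a sketch, not the actual patch):
> {code}
> import org.apache.commons.cli.Option;
>
> class SafeOptionFactory {
>   // Each call builds an independent Option, so concurrent job submissions
>   // cannot corrupt shared static builder state.
>   static Option createOption(String name, String argName, String desc,
>                              boolean required) {
>     Option opt = new Option(name, true, desc); // hasArg = true
>     opt.setArgName(argName);
>     opt.setRequired(required);
>     return opt;
>   }
>
>   static Option createBoolOption(String name, String desc) {
>     return new Option(name, false, desc); // flag option, no argument
>   }
> }
> {code}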



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6704) Container fail to launch for mapred application

2016-10-21 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596201#comment-15596201
 ] 

Andrew Wang commented on MAPREDUCE-6704:


Folks, is there any progress we can make on this JIRA? That this doesn't work 
out of the box anymore has been very surprising to our users. I'd like to get 
it fixed for alpha2 if possible.

> Container fail to launch for mapred application
> ---
>
> Key: MAPREDUCE-6704
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6704
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Blocker
> Attachments: 0001-MAPREDUCE-6704.patch, 0001-YARN-5026.patch
>
>
> Containers fail to launch for MapReduce applications.
> The launch script does not set a default value for {{HADOOP_MAPRED_HOME}}. After 
> https://github.com/apache/hadoop/commit/9d4d30243b0fc9630da51a2c17b543ef671d035c, 
> {{HADOOP_MAPRED_HOME}} can no longer be obtained from {{builder.environment()}}, 
> since {{DefaultContainerExecutor#buildCommandExecutor}} sets inherit to false.
> {noformat}
> 16/05/02 09:16:05 INFO mapreduce.Job: Job job_1462155939310_0004 failed with 
> state FAILED due to: Application application_1462155939310_0004 failed 2 
> times due to AM Container for appattempt_1462155939310_0004_02 exited 
> with  exitCode: 1
> Failing this attempt.Diagnostics: Exception from container-launch.
> Container id: container_1462155939310_0004_02_01
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:946)
> at org.apache.hadoop.util.Shell.run(Shell.java:850)
> at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1144)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:385)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:281)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:89)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseSplitVerifier; 
> support was removed in 8.0
> Error: Could not find or load main class 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseSplitVerifier; 
> support was removed in 8.0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6536) hadoop-pipes doesn't use maven properties for openssl

2016-10-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583966#comment-15583966
 ] 

Andrew Wang commented on MAPREDUCE-6536:


Ping on this JIRA. It looks like it's pretty close; should we target it for alpha2?

> hadoop-pipes doesn't use maven properties for openssl
> -
>
> Key: MAPREDUCE-6536
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6536
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: pipes
>Affects Versions: 3.0.0-alpha1
> Environment: OS X
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Blocker
> Attachments: HADOOP-12518.00.patch, HADOOP-12518.01.patch, 
> HADOOP-12518.02.patch, HADOOP-12518.03.patch, MAPREDUCE-6536.04.patch
>
>
> hadoop-common has some maven properties that are used to define where OpenSSL 
> lives.  hadoop-pipes should also use them so we can enable automated testing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-5506) Hadoop-1.1.1 occurs ArrayIndexOutOfBoundsException with MultithreadedMapRunner

2016-10-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved MAPREDUCE-5506.

Resolution: Won't Fix

Resolving as WONTFIX since mr1 has been removed.

> Hadoop-1.1.1 occurs ArrayIndexOutOfBoundsException with MultithreadedMapRunner
> --
>
> Key: MAPREDUCE-5506
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5506
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 1.1.1
> Environment: RHEL 6.3 x86_64
>Reporter: sam liu
>Priority: Blocker
>
> After I set:
> - 'jobConf.setMapRunnerClass(MultithreadedMapRunner.class);' in MR app
> - 'mapred.map.multithreadedrunner.threads = 2' in mapred-site.xml
> A simple MR app failed because its Map task encountered an 
> ArrayIndexOutOfBoundsException, as below (please ignore the line numbers in 
> the exception, as I added some log statements):
> java.lang.ArrayIndexOutOfBoundsException
> at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1331)
> at java.io.DataOutputStream.write(DataOutputStream.java:101)
> at org.apache.hadoop.io.Text.write(Text.java:282)
> at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
> at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
> at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1060)
> at 
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:591)
> at study.hadoop.mapreduce.sample.WordCount$Map.map(WordCount.java:41)
> at study.hadoop.mapreduce.sample.WordCount$Map.map(WordCount.java:1)
> at 
> org.apache.hadoop.mapred.lib.MultithreadedMapRunner$MapperInvokeRunable.run(MultithreadedMapRunner.java:231)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:897)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:919)
> at java.lang.Thread.run(Thread.java:738)
> The exception happens on the line 'System.arraycopy(b, off, kvbuffer, 
> bufindex, len)' in MapTask.java#MapOutputBuffer#Buffer#write(). When the 
> exception occurs, 'b.length=4' but 'len=9'. 
> If I set 'mapred.map.multithreadedrunner.threads = 1', no exception occurs, 
> so it appears to be an issue caused by multiple threads.
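> A minimal sketch of a workaround (not the official fix): serialize access to 
> the shared collector so concurrent map threads cannot interleave writes into 
> {{MapOutputBuffer}}.
> {code}
> import java.io.IOException;
> import org.apache.hadoop.mapred.OutputCollector;
>
> class SynchronizedCollector<K, V> implements OutputCollector<K, V> {
>   private final OutputCollector<K, V> delegate;
>
>   SynchronizedCollector(OutputCollector<K, V> delegate) {
>     this.delegate = delegate;
>   }
>
>   // One writer at a time: collect() is not safe to call concurrently when
>   // it feeds a shared MapOutputBuffer.
>   @Override
>   public synchronized void collect(K key, V value) throws IOException {
>     delegate.collect(key, value);
>   }
> }
> {code}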



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6792:
---
Target Version/s: 2.9.0, 3.0.0-alpha2  (was: 2.9.0, 3.0.0-alpha1)

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Attachments: MAPREDUCE-6792.1.patch
>
>
> Background - 
> Currently, {{JobSubmissionFiles#JobStagingDir()}} assumes that the file owner 
> returned by {{FileSystem#getFileStatus()}} is always the user's short 
> principal name, which is true for HDFS. But some HDFS-compatible file 
> systems, like [Azure Data Lake Store (ADLS) 
> |https://azure.microsoft.com/en-in/services/data-lake-store/], operate in 
> multi-tenant environments and can have users with the same name belonging to 
> different domains, for example {{us...@company1.com}} and 
> {{us...@company2.com}}. It is ambiguous if 
> {{FileSystem#getFileStatus()}} returns only the user's short principal name 
> (without the domain name) as the owner of the file/directory. 
> The following code block allows only the short user principal name as owner. 
> It simply fails, saying that ownership on the staging directory is not as 
> expected, if the owner returned by {{FileStatus#getOwner()}} is not equal to 
> the short principal name of the current user.
> {code}
> String realUser;
> String currentUser;
> UserGroupInformation ugi = UserGroupInformation.getLoginUser();
> realUser = ugi.getShortUserName();
> currentUser = UserGroupInformation.getCurrentUser().getShortUserName();
> if (fs.exists(stagingArea)) {
>   FileStatus fsStatus = fs.getFileStatus(stagingArea);
>   String owner = fsStatus.getOwner();
>   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
>  throw new IOException("The ownership on the staging directory " +
>   stagingArea + " is not as expected. " +
>   "It is owned by " + owner + ". The directory must " +
>   "be owned by the submitter " + currentUser + " or " +
>   "by " + realUser);
>   }
> {code}
> The proposal is to remove the strict restriction to the short principal name 
> by also allowing the user's full principal name as the owner of the staging 
> directory in {{JobSubmissionFiles#JobStagingDir()}}.
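> A minimal sketch of the relaxed check (assumption: {{FileStatus#getOwner()}} 
> may return either the short name or the full Kerberos principal):
> {code}
> import java.io.IOException;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.security.UserGroupInformation;
>
> class StagingDirOwnerCheck {
>   static void check(FileSystem fs, Path stagingArea) throws IOException {
>     UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
>     String shortName = ugi.getShortUserName(); // e.g. "user1"
>     String fullName = ugi.getUserName();       // e.g. "us...@company1.com"
>     String owner = fs.getFileStatus(stagingArea).getOwner();
>     if (!owner.equals(shortName) && !owner.equals(fullName)) {
>       throw new IOException("The ownership on the staging directory "
>           + stagingArea + " is not as expected. It is owned by " + owner
>           + ". The directory must be owned by the submitter "
>           + shortName + " or by " + fullName);
>     }
>   }
> }
> {code}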



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6458) Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells

2016-10-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6458:
---
Target Version/s: 3.0.0-alpha2  (was: 3.0.0-alpha1)

> Figure out the way to pass build-in classpath (files in distributed cache, 
> etc.) from parent to spawned shells
> --
>
> Key: MAPREDUCE-6458
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6458
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Dustin Cote
> Attachments: MAPREDUCE-6458.00.patch
>
>
> In MAPREDUCE-6454 (targeted for branch-2.x), we provide a constrained way to 
> pass the built-in classpath from the parent to the child shell via 
> HADOOP_CLASSPATH, so jars in the distributed cache still work in child 
> tasks. In trunk, we may take a different approach, such as introducing an 
> additional environment variable to safely pass the built-in classpath.
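> A minimal sketch of that idea, assuming a dedicated variable (the name 
> HADOOP_BUILTIN_CLASSPATH below is hypothetical):
> {code}
> import java.io.File;
>
> class BuiltinClasspathSketch {
>   // Child-side view: prepend the built-in classpath handed down by the
>   // parent without clobbering whatever the user put in HADOOP_CLASSPATH.
>   static String effectiveClasspath() {
>     String builtin = System.getenv("HADOOP_BUILTIN_CLASSPATH"); // hypothetical
>     String own = System.getProperty("java.class.path");
>     return (builtin == null || builtin.isEmpty())
>         ? own
>         : builtin + File.pathSeparator + own;
>   }
> }
> {code}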



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO

2016-08-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6729:
---
Flags:   (was: Important)

> Accurately compute the test execute time in DFSIO
> -
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, performance, test
>Affects Versions: 2.9.0
>Reporter: mingleizhang
>Assignee: mingleizhang
>Priority: Minor
>  Labels: performance, test
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch
>
>
> When using DFSIO as a distributed I/O benchmark, writing or reading a large 
> number of files can skew the measured results. The existing code deletes the 
> old test files inside the timed region before running a job, which adds 
> extra time and causes statistical timing error and imprecise throughput when 
> there are many files. We should improve this so that cleanup is not counted 
> in the measured execution time.
> {code}
> public static void testWrite() throws Exception {
> FileSystem fs = cluster.getFileSystem();
> long tStart = System.currentTimeMillis();
> bench.writeTest(fs); // this line of code will cause extra time 
> consumption because of fs.delete(*,*) by the writeTest method
> long execTime = System.currentTimeMillis() - tStart;
> bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
>   }
> private void writeTest(FileSystem fs) throws IOException {
>   Path writeDir = getWriteDir(config);
>   fs.delete(getDataDir(config), true);
>   fs.delete(writeDir, true);
>   runIOTest(WriteMapper.class, writeDir);
>   }
> {code} 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
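> A minimal sketch of the requested improvement (names follow the quoted 
> snippet; {{deleteTestFiles}} is a hypothetical helper, and the sketch assumes 
> {{writeTest}} no longer deletes internally), so only the I/O job itself is 
> timed:
> {code}
> public static void testWrite() throws Exception {
>   FileSystem fs = cluster.getFileSystem();
>   bench.deleteTestFiles(fs); // hypothetical: do the fs.delete() calls here
>   long tStart = System.currentTimeMillis();
>   bench.writeTest(fs);       // now measures only the actual write workload
>   long execTime = System.currentTimeMillis() - tStart;
>   bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
> }
> {code}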



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6701) application master log can not be available when clicking jobhistory's am logs link

2016-08-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6701:
---
Flags: Patch  (was: Patch,Important)

> application master log can not be available when clicking jobhistory's am 
> logs link
> ---
>
> Key: MAPREDUCE-6701
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6701
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 2.9.0
>Reporter: chenyukang
>Assignee: Haibo Chen
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: yarn5041.001.patch, yarn5041.002.patch
>
>
> In the history server webapp, the application master logs link is wrong: it 
> shows "No logs available for container container_1462419429440_0003_01_01". 
> It points to the wrong NodeManager HTTP port instead of the NodeManager's 
> container management port. I think YARN-4701 introduced this bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6714) Refactor UncompressedSplitLineReader.fillBuffer()

2016-08-29 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447501#comment-15447501
 ] 

Andrew Wang commented on MAPREDUCE-6714:


FYI for git greppers, this was typo'd as MAPREDUCE-6741 in the message.

> Refactor UncompressedSplitLineReader.fillBuffer()
> -
>
> Key: MAPREDUCE-6714
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6714
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6714.001.patch
>
>
> MAPREDUCE-6635 made this change:
> {code}
> -  maxBytesToRead = Math.min(maxBytesToRead,
> -(int)(splitLength - totalBytesRead));
> +  long leftBytesForSplit = splitLength - totalBytesRead;
> +  // check if leftBytesForSplit exceed Integer.MAX_VALUE
> +  if (leftBytesForSplit <= Integer.MAX_VALUE) {
> +maxBytesToRead = Math.min(maxBytesToRead, (int)leftBytesForSplit);
> +  }
> {code}
> The result is one more comparison than necessary and code that's a little 
> convoluted.  The code can be simplified as:
> {code}
>   long leftBytesForSplit = splitLength - totalBytesRead;
>   if (leftBytesForSplit < maxBytesToRead) {
> maxBytesToRead = (int)leftBytesForSplit;
>   }
> {code}
> The comparison will auto promote {{maxBytesToRead}}, making it safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Reopened] (MAPREDUCE-6462) JobHistoryServer to support JvmPauseMonitor as a service

2016-08-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened MAPREDUCE-6462:


> JobHistoryServer to support JvmPauseMonitor as a service
> 
>
> Key: MAPREDUCE-6462
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6462
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.8.0
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Minor
> Attachments: 0001-MAPREDUCE-6462.patch, 0002-MAPREDUCE-6462.patch, 
> HADOOP-12321-003.patch, HADOOP-12321-005-aggregated.patch
>
>
> As JvmPauseMonitor has been made an AbstractService, corresponding method 
> changes are needed in all places that use the monitor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-6462) JobHistoryServer to support JvmPauseMonitor as a service

2016-08-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved MAPREDUCE-6462.

   Resolution: Duplicate
Fix Version/s: (was: 2.9.0)

> JobHistoryServer to support JvmPauseMonitor as a service
> 
>
> Key: MAPREDUCE-6462
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6462
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.8.0
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Minor
> Attachments: 0001-MAPREDUCE-6462.patch, 0002-MAPREDUCE-6462.patch, 
> HADOOP-12321-003.patch, HADOOP-12321-005-aggregated.patch
>
>
> As JvmPauseMonitor has been made an AbstractService, corresponding method 
> changes are needed in all places that use the monitor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6454) MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache.

2016-08-29 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447492#comment-15447492
 ] 

Andrew Wang commented on MAPREDUCE-6454:


Hi folks, I noticed this JIRA is not present in trunk, though Vinod's comment 
says:

{quote}
Committed this to trunk, branch-2, 2.7 and 2.6. Thanks Junping.
{quote}

Based on the above discussion, I think this was only intended for branch-2 and 
thus git is correct, but I would appreciate clarification.

> MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache.
> 
>
> Key: MAPREDUCE-6454
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6454
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.2, 2.6.2
>
> Attachments: MAPREDUCE-6454-v2.1.patch, MAPREDUCE-6454-v2.patch, 
> MAPREDUCE-6454-v3.1.patch, MAPREDUCE-6454-v3.patch, MAPREDUCE-6454.patch
>
>
> We already add lib jars from the distributed cache to CLASSPATH. However, in 
> some corner cases (e.g. MR local mode, Hive map-side local join), we also 
> need these jars on HADOOP_CLASSPATH so the hadoop scripts can pick them up 
> when launching the RunJar process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6458) Figure out the way to pass build-in classpath (files in distributed cache, etc.) from parent to spawned shells

2016-08-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6458:
---
Target Version/s: 3.0.0-alpha1  (was: )

> Figure out the way to pass build-in classpath (files in distributed cache, 
> etc.) from parent to spawned shells
> --
>
> Key: MAPREDUCE-6458
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6458
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Dustin Cote
> Attachments: MAPREDUCE-6458.00.patch
>
>
> In MAPREDUCE-6454 (targeted for branch-2.x), we provide a constrained way to 
> pass the built-in classpath from the parent to the child shell via 
> HADOOP_CLASSPATH, so jars in the distributed cache still work in child 
> tasks. In trunk, we may take a different approach, such as introducing an 
> additional environment variable to safely pass the built-in classpath.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6704) Container fail to launch for mapred application

2016-07-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6704:
---
Target Version/s: 3.0.0-alpha2  (was: 3.0.0-alpha1)

> Container fail to launch for mapred application
> ---
>
> Key: MAPREDUCE-6704
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6704
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Blocker
> Attachments: 0001-MAPREDUCE-6704.patch, 0001-YARN-5026.patch
>
>
> Containers fail to launch for MapReduce applications.
> The launch script does not set a default value for {{HADOOP_MAPRED_HOME}}. After 
> https://github.com/apache/hadoop/commit/9d4d30243b0fc9630da51a2c17b543ef671d035c, 
> {{HADOOP_MAPRED_HOME}} can no longer be obtained from {{builder.environment()}}, 
> since {{DefaultContainerExecutor#buildCommandExecutor}} sets inherit to false.
> {noformat}
> 16/05/02 09:16:05 INFO mapreduce.Job: Job job_1462155939310_0004 failed with 
> state FAILED due to: Application application_1462155939310_0004 failed 2 
> times due to AM Container for appattempt_1462155939310_0004_02 exited 
> with  exitCode: 1
> Failing this attempt.Diagnostics: Exception from container-launch.
> Container id: container_1462155939310_0004_02_01
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:946)
> at org.apache.hadoop.util.Shell.run(Shell.java:850)
> at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1144)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:385)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:281)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:89)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseSplitVerifier; 
> support was removed in 8.0
> Error: Could not find or load main class 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseSplitVerifier; 
> support was removed in 8.0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-4683) We need to fix our build to create/distribute hadoop-mapreduce-client-core-tests.jar

2016-07-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-4683:
---
Target Version/s: 3.0.0-alpha2  (was: 3.0.0-alpha1)

> We need to fix our build to create/distribute 
> hadoop-mapreduce-client-core-tests.jar
> 
>
> Key: MAPREDUCE-4683
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4683
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Reporter: Arun C Murthy
>Assignee: Akira Ajisaka
>Priority: Critical
> Attachments: MAPREDUCE-4683.patch
>
>
> We need to fix our build to create/distribute 
> hadoop-mapreduce-client-core-tests.jar, need this before MAPREDUCE-4253



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-4522) DBOutputFormat Times out on large batch inserts

2016-07-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-4522:
---
Fix Version/s: (was: 3.0.0-alpha1)

> DBOutputFormat Times out on large batch inserts
> ---
>
> Key: MAPREDUCE-4522
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4522
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task-controller
>Affects Versions: 0.20.205.0
>Reporter: Nathan Jarus
>Assignee: Shyam Gavulla
>  Labels: newbie
> Attachments: MAPREDUCE-4522.001.patch
>
>
> In DBRecordWriter#close(), progress is never updated. For large batch 
> inserts, this can cause the reduce task to time out because of the time it 
> takes the SQL engine to process the batch. 
> Potential solutions I can see:
> * Don't batch inserts; do the insert when DBRecordWriter#write() is called 
> (awful)
> * Spin up a thread in DBRecordWriter#close() and update progress in that 
> (gross; sketched below)
> I can provide code for either if you're interested. 
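> A minimal sketch of the second option (illustrative only): a daemon thread 
> that reports progress while the batch runs.
> {code}
> import java.sql.PreparedStatement;
> import org.apache.hadoop.util.Progressable;
>
> class BatchExecutor {
>   static void executeWithProgress(PreparedStatement stmt,
>                                   Progressable progress) throws Exception {
>     Thread heartbeat = new Thread(() -> {
>       while (!Thread.currentThread().isInterrupted()) {
>         progress.progress();      // tell the framework the task is alive
>         try {
>           Thread.sleep(10_000);   // well under the task timeout
>         } catch (InterruptedException e) {
>           Thread.currentThread().interrupt(); // exit on interrupt
>         }
>       }
>     });
>     heartbeat.setDaemon(true);
>     heartbeat.start();
>     try {
>       stmt.executeBatch();        // may run for many minutes
>     } finally {
>       heartbeat.interrupt();
>     }
>   }
> }
> {code}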



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6313) Audit/optimize tests in hadoop-mapreduce-client-jobclient

2016-07-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6313:
---
Fix Version/s: (was: 3.0.0-alpha1)

> Audit/optimize tests in hadoop-mapreduce-client-jobclient
> -
>
> Key: MAPREDUCE-6313
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6313
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>Reporter: Allen Wittenauer
>Assignee: nijel
>  Labels: newbie
>
> The tests in this package take an extremely long time to run, with some tests 
> taking 15-20 minutes on their own.  It would be worthwhile to verify and 
> optimize any tests in this package in order to reduce patch testing time or 
> perhaps even splitting the package up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6274) [Rumen] Support compact property description in configuration XML

2016-07-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6274:
---
Fix Version/s: (was: 3.0.0-alpha1)

> [Rumen] Support compact property description in configuration XML
> -
>
> Key: MAPREDUCE-6274
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6274
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tools/rumen
>Reporter: Kengo Seki
>Assignee: Shen Yinjie
>  Labels: newbie, rumen
>
> HADOOP-6964 made it possible to define configuration properties using XML 
> attributes, but Rumen has its own configuration parsers and they don't 
> recognize XML attributes. So it would be better to support the new syntax.
> We could simply apply the same modification as HADOOP-6964 to Rumen, but it 
> might be worth considering moving the parsing logic into common (possibly 
> sharing part of o.a.h.conf.Configuration.loadResource()), because Rumen has 
> similar code in JobConfigurationParser and ParsedConfigFile.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-4695) Fix LocalRunner on trunk after MAPREDUCE-3223 broke it

2016-05-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-4695:
---
Component/s: test

> Fix LocalRunner on trunk after MAPREDUCE-3223 broke it
> --
>
> Key: MAPREDUCE-4695
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4695
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0-alpha1
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Blocker
> Fix For: 3.0.0-alpha1
>
> Attachments: MAPREDUCE-4695.patch, MAPREDUCE-4695.patch
>
>
> MAPREDUCE-3223 removed the mapreduce.cluster.local.dir property from 
> mapred-default.xml (since NM local dirs are now used) but failed to account 
> for the fact that LocalJobRunner, etc. still use it.
> {code}
> mr-3223.txt:-  mapreduce.cluster.local.dir
> mr-3223.txt--  ${hadoop.tmp.dir}/mapred/local
> {code}
> All local job tests have been failing since then.
> This JIRA is to reintroduce it or provide an equivalent new config for fixing 
> it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-3149) add a test to verify that buildDTAuthority works for cases with no authority.

2016-05-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-3149:
---
Component/s: test

> add a test to verify that buildDTAuthority works for cases with no authority.
> -
>
> Key: MAPREDUCE-3149
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3149
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: John George
>Assignee: John George
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-7602.patch
>
>
> Add a test to verify that buildDTAuthority works for cases with no Authority.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.

2016-05-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-2632:
---
Target Version/s:   (was: )
Release Note: A partitioner is now only created if there are multiple 
reducers.

I added a release note based on my understanding of this patch, please update 
if something's off.

> Avoid calling the partitioner when the numReduceTasks is 1.
> ---
>
> Key: MAPREDUCE-2632
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Ravi Teja Ch N V
>Assignee: Sunil G
> Fix For: 3.0.0-alpha1
>
> Attachments: 0001-MAPREDUCE-2632.patch, MAPREDUCE-2632-1.patch, 
> MAPREDUCE-2632.patch, mr-2632-2.patch, mr-2632-3.patch, mr-2632-4.patch
>
>
> We can avoid the call to the partitioner when the number of reducers is 1. 
> This will avoid unnecessary computation by the partitioner.
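> A minimal sketch of the idea (illustrative, not the committed patch; K and V 
> are the map output key/value types in MapTask's collector setup):
> {code}
> // Bypass the user-supplied Partitioner when there is a single reducer
> if (numReduceTasks == 1) {
>   partitioner = new Partitioner<K, V>() {
>     @Override
>     public int getPartition(K key, V value, int numPartitions) {
>       return 0; // everything goes to the lone reducer
>     }
>   };
> } else {
>   partitioner = ReflectionUtils.newInstance(
>       jobContext.getPartitionerClass(), job);
> }
> {code}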



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6223) TestJobConf#testNegativeValueForTaskVmem failures

2016-05-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6223:
---
Target Version/s:   (was: )
Hadoop Flags: Reviewed  (was: Incompatible change,Reviewed)

> TestJobConf#testNegativeValueForTaskVmem failures
> -
>
> Key: MAPREDUCE-6223
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6223
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0-alpha1
>Reporter: Gera Shegalov
>Assignee: Varun Saxena
> Fix For: 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6223.001.patch, MAPREDUCE-6223.002.patch, 
> MAPREDUCE-6223.003.patch, MAPREDUCE-6223.004.patch, MAPREDUCE-6223.005.patch, 
> MAPREDUCE-6223.006.patch
>
>
> {code}
> Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.328 sec <<< 
> FAILURE! - in org.apache.hadoop.conf.TestJobConf
> testNegativeValueForTaskVmem(org.apache.hadoop.conf.TestJobConf)  Time 
> elapsed: 0.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<-1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.conf.TestJobConf.testNegativeValueForTaskVmem(TestJobConf.java:111)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-05-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6628:
---
Affects Version/s: 2.6.4
 Target Version/s: 2.8.0

> Potential memory leak in CryptoOutputStream
> ---
>
> Key: MAPREDUCE-6628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.6.4
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-6628.001.patch, MAPREDUCE-6628.002.patch, 
> MAPREDUCE-6628.003.patch, MAPREDUCE-6628.004.patch, MAPREDUCE-6628.005.patch
>
>
> There is a potential memory leak in {{CryptoOutputStream.java}}.  It 
> allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that get 
> freed when the {{close()}} method is called.  Most of the time, {{close()}} 
> is called.  However, when writing to the intermediate Map output file or 
> the spill files in {{MapTask}}, {{close()}} is never called, since doing so 
> would close the underlying stream, which is not desirable.  There is a single 
> underlying physical stream that contains multiple logical streams, one per 
> partition of the Map output.
> By default the amount of memory allocated per byte buffer is 128 KB, so 
> the total memory allocated is 256 KB.  This may not sound like much.  However, 
> if the number of partitions (or number of reducers) is large (in the hundreds) 
> and/or there are spill files created in {{MapTask}}, this can grow to a few 
> hundred MB.
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal 
> garbage-collected heap, and so their impact upon the memory footprint of an 
> application might not be obvious.  It is therefore recommended that direct 
> buffers be allocated primarily for large, long-lived buffers that are subject 
> to the underlying system's native I/O operations.  In general it is best to 
> allocate direct buffers only when they yield a measureable gain in program 
> performance.
> {quote}
> It is not clear to me whether there is any benefit of allocating direct byte 
> buffers in {{CryptoOutputStream.java}}.  In fact, there is a slight CPU 
> overhead in moving data from {{outBuffer}} to a temporary byte array as per 
> the following code in {{CryptoOutputStream.java}}.
> {code}
> /*
>  * If underlying stream supports {@link ByteBuffer} write in future, needs
>  * refine here. 
>  */
> final byte[] tmp = getTmpBuf();
> outBuffer.get(tmp, 0, len);
> out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in 
> OS parlance), it is not clear whether it will yield any measurable 
> performance gain.
> The fix would be to allocate a ByteBuffer on the heap for inBuffer and wrap a 
> byte array in a {{ByteBuffer}} for {{outBuffer}}.  By the way, the 
> {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}} as demanded by the 
> {{encrypt()}} method in {{Encryptor}}.
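> As a rough sketch (names are illustrative, not from a patch), the allocation 
> change would look like:
> {code}
> // Heap-backed buffers instead of ByteBuffer.allocateDirect(bufferSize)
> ByteBuffer inBuffer = ByteBuffer.allocate(bufferSize);
> byte[] outArray = new byte[bufferSize];
> ByteBuffer outBuffer = ByteBuffer.wrap(outArray);
> // ... after encrypt() fills outBuffer, the temporary copy is unnecessary:
> out.write(outArray, 0, len);
> {code}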
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can 
> create a new constructor to {{CryptoOutputStream}} and pass a boolean flag 
> {{ownOutputStream}} to indicate whether the underlying stream will be owned 
> by {{CryptoOutputStream}}. If it is true, then calling the {{close()}} method 
> will close the underlying stream.  Otherwise, when {{close()}} is called only 
> the direct byte buffers will be freed and the underlying stream will not be 
> closed.
> The scope of changes for this fix will be somewhat wider.  We need to modify 
> {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} 
> as well to pass the ownership flag mentioned above.
> I can post a patch for either of the above.  I welcome any other ideas from 
> developers to fix this issue.
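> To make Fix 2 concrete, a rough sketch of the constructor flag (hypothetical 
> names; {{freeBuffers()}} stands in for whatever releases the direct buffers):
> {code}
> public CryptoOutputStream(OutputStream out, CryptoCodec codec,
>     byte[] key, byte[] iv, boolean ownOutputStream) throws IOException {
>   // ... existing initialization ...
>   this.ownOutputStream = ownOutputStream;
> }
>
> @Override
> public synchronized void close() throws IOException {
>   freeBuffers();              // always release inBuffer/outBuffer
>   if (ownOutputStream) {
>     out.close();              // only close the stream we own
>   }
> }
> {code}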



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-05-04 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270264#comment-15270264
 ] 

Andrew Wang commented on MAPREDUCE-6628:


Thanks for sticking with this for so long, Mariappan. The stream-related changes 
overall look good to me. One naming nit: could we call the boolean 
"closeWrapperStream" rather than "ownOutputStream"? I think that's more 
descriptive.

The test should also be JUnit4 rather than JUnit3.
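
(For reference, a minimal JUnit4-style skeleton -- annotation-driven rather 
than extending TestCase; class and method names here are illustrative:)
{code}
import org.junit.Test;
import static org.junit.Assert.assertTrue;

public class TestCryptoStreamClose {
  @Test
  public void closeFreesBuffersWithoutClosingWrappedStream() {
    // JUnit4: @Test annotation, no TestCase superclass,
    // no test* method-name requirement
    assertTrue(true); // placeholder assertion
  }
}
{code}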

Can someone more familiar with the MR side review the MapTask and unit test 
changes? It'd also be good to get confirmation about the overall idea from an 
MR person.

> Potential memory leak in CryptoOutputStream
> ---
>
> Key: MAPREDUCE-6628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-6628.001.patch, MAPREDUCE-6628.002.patch, 
> MAPREDUCE-6628.003.patch, MAPREDUCE-6628.004.patch, MAPREDUCE-6628.005.patch
>
>
> There is a potential memory leak in {{CryptoOutputStream.java}}.  It 
> allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that get 
> freed when the {{close()}} method is called.  Most of the time, {{close()}} 
> is called.  However, when writing to the intermediate Map output file or 
> the spill files in {{MapTask}}, {{close()}} is never called, since doing so 
> would close the underlying stream, which is not desirable.  There is a single 
> underlying physical stream that contains multiple logical streams, one per 
> partition of the Map output.
> By default the amount of memory allocated per byte buffer is 128 KB, so 
> the total memory allocated is 256 KB.  This may not sound like much.  However, 
> if the number of partitions (or number of reducers) is large (in the hundreds) 
> and/or there are spill files created in {{MapTask}}, this can grow to a few 
> hundred MB.
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal 
> garbage-collected heap, and so their impact upon the memory footprint of an 
> application might not be obvious.  It is therefore recommended that direct 
> buffers be allocated primarily for large, long-lived buffers that are subject 
> to the underlying system's native I/O operations.  In general it is best to 
> allocate direct buffers only when they yield a measureable gain in program 
> performance.
> {quote}
> It is not clear to me whether there is any benefit of allocating direct byte 
> buffers in {{CryptoOutputStream.java}}.  In fact, there is a slight CPU 
> overhead in moving data from {{outBuffer}} to a temporary byte array as per 
> the following code in {{CryptoOutputStream.java}}.
> {code}
> /*
>  * If underlying stream supports {@link ByteBuffer} write in future, needs
>  * refine here. 
>  */
> final byte[] tmp = getTmpBuf();
> outBuffer.get(tmp, 0, len);
> out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in 
> OS parlance), it is not clear whether it will yield any measurable 
> performance gain.
> The fix would be to allocate a ByteBuffer on the heap for inBuffer and wrap a 
> byte array in a {{ByteBuffer}} for {{outBuffer}}.  By the way, the 
> {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}} as demanded by the 
> {{encrypt()}} method in {{Encryptor}}.
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can 
> create a new constructor to {{CryptoOutputStream}} and pass a boolean flag 
> {{ownOutputStream}} to indicate whether the underlying stream will be owned 
> by {{CryptoOutputStream}}. If it is true, then calling the {{close()}} method 
> will close the underlying stream.  Otherwise, when {{close()}} is called only 
> the direct byte buffers will be freed and the underlying stream will not be 
> closed.
> The scope of changes for this fix will be somewhat wider.  We need to modify 
> {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} 
> as well to pass the ownership flag mentioned above.
> I can post a patch for either of the above.  I welcome any other ideas from 
> developers to fix this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6526) Remove usage of metrics v1 from hadoop-mapreduce

2016-05-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267721#comment-15267721
 ] 

Andrew Wang commented on MAPREDUCE-6526:


Still LGTM :) +1

> Remove usage of metrics v1 from hadoop-mapreduce
> 
>
> Key: MAPREDUCE-6526
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6526
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Blocker
> Attachments: MAPREDUCE-6526.00.patch, MAPREDUCE-6526.01.patch, 
> MAPREDUCE-6526.02.patch, MAPREDUCE-6526.03.patch
>
>
> LocalJobRunnerMetrics and ShuffleClientMetrics are still using metrics v1. We 
> should remove these metrics or rewrite them to use metrics v2.
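> For reference, metrics v2 sources are annotation-driven; a minimal sketch of 
> what a rewritten source could look like (names illustrative, not the patch):
> {code}
> import org.apache.hadoop.metrics2.annotation.Metric;
> import org.apache.hadoop.metrics2.annotation.Metrics;
> import org.apache.hadoop.metrics2.lib.MutableCounterInt;
>
> @Metrics(about = "Shuffle client metrics", context = "mapred")
> public class ShuffleClientMetrics {
>   @Metric MutableCounterInt numFailedFetches;
>
>   void failedFetch() {
>     numFailedFetches.incr();
>   }
> }
> {code}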



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6537) Include hadoop-pipes examples in the release tarball

2016-05-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6537:
---
Affects Version/s: (was: 3.0.0)
   2.8.0

> Include hadoop-pipes examples in the release tarball
> 
>
> Key: MAPREDUCE-6537
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6537
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: pipes
>Affects Versions: 2.8.0
>Reporter: Allen Wittenauer
>Assignee: Kai Sasaki
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: HADOOP-12381.00.patch
>
>
> Hadoop pipes examples are built but never packaged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6537) Include hadoop-pipes examples in the release tarball

2016-05-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6537:
---
  Resolution: Fixed
   Fix Version/s: 2.8.0
Target Version/s: 2.8.0  (was: 3.0.0)
  Status: Resolved  (was: Patch Available)

This affected branch-2 too, so I committed back through branch-2.8. Thanks 
again to Kai for the patch and to Allen for finding this issue.

> Include hadoop-pipes examples in the release tarball
> 
>
> Key: MAPREDUCE-6537
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6537
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: pipes
>Affects Versions: 2.8.0
>Reporter: Allen Wittenauer
>Assignee: Kai Sasaki
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: HADOOP-12381.00.patch
>
>
> Hadoop pipes examples are built but never packaged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6537) Include hadoop-pipes examples in the release tarball

2016-05-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6537:
---
Summary: Include hadoop-pipes examples in the release tarball  (was: hadoop 
pipes examples aren't in the mvn package tar ball)

> Include hadoop-pipes examples in the release tarball
> 
>
> Key: MAPREDUCE-6537
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6537
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: pipes
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Kai Sasaki
>Priority: Blocker
> Attachments: HADOOP-12381.00.patch
>
>
> Hadoop pipes examples are built but never packaged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6537) hadoop pipes examples aren't in the mvn package tar ball

2016-05-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267636#comment-15267636
 ] 

Andrew Wang commented on MAPREDUCE-6537:


Tested the before and after, LGTM. Will commit shortly, thanks for the 
contribution [~lewuathe]!

> hadoop pipes examples aren't in the mvn package tar ball
> 
>
> Key: MAPREDUCE-6537
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6537
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: pipes
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: Kai Sasaki
>Priority: Blocker
> Attachments: HADOOP-12381.00.patch
>
>
> Hadoop pipes examples are built but never packaged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6526) Remove usage of metrics v1 from hadoop-mapreduce

2016-05-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267590#comment-15267590
 ] 

Andrew Wang commented on MAPREDUCE-6526:


+1 LGTM, thanks Akira!

> Remove usage of metrics v1 from hadoop-mapreduce
> 
>
> Key: MAPREDUCE-6526
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6526
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Blocker
> Attachments: MAPREDUCE-6526.00.patch, MAPREDUCE-6526.01.patch, 
> MAPREDUCE-6526.02.patch
>
>
> LocalJobRunnerMetrics and ShuffleClientMetrics are still using metrics v1. We 
> should remove these metrics or rewrite them to use metrics v2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.

2016-03-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176874#comment-15176874
 ] 

Andrew Wang commented on MAPREDUCE-2632:


[~kasha] mind adding some release notes for this change? Doing some 
3.0.0-related cleanup.

> Avoid calling the partitioner when the numReduceTasks is 1.
> ---
>
> Key: MAPREDUCE-2632
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Ravi Teja Ch N V
>Assignee: Sunil G
> Fix For: 3.0.0
>
> Attachments: 0001-MAPREDUCE-2632.patch, MAPREDUCE-2632-1.patch, 
> MAPREDUCE-2632.patch, mr-2632-2.patch, mr-2632-3.patch, mr-2632-4.patch
>
>
> We can avoid the call to the partitioner when the number of reducers is 1. 
> This will avoid unnecessary computation by the partitioner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6223) TestJobConf#testNegativeValueForTaskVmem failures

2016-03-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176873#comment-15176873
 ] 

Andrew Wang commented on MAPREDUCE-6223:


Same question here as I just posed on MAPREDUCE-6234, do we need to mark this 
change as incompatible if it's only present with MAPREDUCE-5785, which is 
already marked incompatible and only in 3.0.0?

> TestJobConf#testNegativeValueForTaskVmem failures
> -
>
> Key: MAPREDUCE-6223
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6223
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Gera Shegalov
>Assignee: Varun Saxena
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-6223.001.patch, MAPREDUCE-6223.002.patch, 
> MAPREDUCE-6223.003.patch, MAPREDUCE-6223.004.patch, MAPREDUCE-6223.005.patch, 
> MAPREDUCE-6223.006.patch
>
>
> {code}
> Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.328 sec <<< 
> FAILURE! - in org.apache.hadoop.conf.TestJobConf
> testNegativeValueForTaskVmem(org.apache.hadoop.conf.TestJobConf)  Time 
> elapsed: 0.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<-1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.conf.TestJobConf.testNegativeValueForTaskVmem(TestJobConf.java:111)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6234) TestHighRamJob fails due to the change in MAPREDUCE-5785

2016-03-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176871#comment-15176871
 ] 

Andrew Wang commented on MAPREDUCE-6234:


Should this change be marked incompatible? Sounds like it's fixing an issue 
only introduced by MAPREDUCE-5785, which is already marked incompatible and 
only checked into trunk.

> TestHighRamJob fails due to the change in MAPREDUCE-5785
> 
>
> Key: MAPREDUCE-6234
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6234
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/gridmix, mrv2
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-6234.001.patch, MAPREDUCE-6234.002.patch, 
> MAPREDUCE-6234.003.patch
>
>
> TestHighRamJob fails by this.
> {code}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.mapred.gridmix.TestHighRamJob
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.162 sec <<< 
> FAILURE! - in org.apache.hadoop.mapred.gridmix.TestHighRamJob
> testHighRamFeatureEmulation(org.apache.hadoop.mapred.gridmix.TestHighRamJob)  
> Time elapsed: 1.102 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<-1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamConfig(TestHighRamJob.java:98)
>   at 
> org.apache.hadoop.mapred.gridmix.TestHighRamJob.testHighRamFeatureEmulation(TestHighRamJob.java:117)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6637) Testcase Failure : TestFileInputFormat.testSplitLocationInfo

2016-02-19 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155215#comment-15155215
 ] 

Andrew Wang commented on MAPREDUCE-6637:


+1 LGTM thanks Brahma! Committing shortly.

> Testcase Failure : TestFileInputFormat.testSplitLocationInfo
> 
>
> Key: MAPREDUCE-6637
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6637
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: MAPREDUCE-6637.patch
>
>
> The following testcase is failing after HADOOP-12810
> {noformat}
> FAILED:  org.apache.hadoop.mapred.TestFileInputFormat.testSplitLocationInfo[0]
> Error Message:
> expected:<2> but was:<1>
> Stack Trace:
> java.lang.AssertionError: expected:<2> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.mapred.TestFileInputFormat.testSplitLocationInfo(TestFileInputFormat.java:115)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6637) Testcase Failure : TestFileInputFormat.testSplitLocationInfo

2016-02-19 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6637:
---
   Resolution: Fixed
Fix Version/s: 2.7.3
   Status: Resolved  (was: Patch Available)

Pushed to trunk, branch-2, branch-2.8, and branch-2.7 for 2.7.3. Thanks for 
the find and fix, Brahma!

> Testcase Failure : TestFileInputFormat.testSplitLocationInfo
> 
>
> Key: MAPREDUCE-6637
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6637
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Fix For: 2.7.3
>
> Attachments: MAPREDUCE-6637.patch
>
>
> The following testcase is failing after HADOOP-12810
> {noformat}
> FAILED:  org.apache.hadoop.mapred.TestFileInputFormat.testSplitLocationInfo[0]
> Error Message:
> expected:<2> but was:<1>
> Stack Trace:
> java.lang.AssertionError: expected:<2> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.mapred.TestFileInputFormat.testSplitLocationInfo(TestFileInputFormat.java:115)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-02-08 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15137493#comment-15137493
 ] 

Andrew Wang commented on MAPREDUCE-6628:


[~hitliuyi], any thoughts on this one?

> Potential memory leak in CryptoOutputStream
> ---
>
> Key: MAPREDUCE-6628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
>
> There is a potential memory leak in {{CryptoOutputStream.java}}.  It 
> allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that get 
> freed when the {{close()}} method is called.  Most of the time, {{close()}} 
> is called.  However, when writing to the intermediate Map output file or 
> the spill files in {{MapTask}}, {{close()}} is never called, since doing so 
> would close the underlying stream, which is not desirable.  There is a single 
> underlying physical stream that contains multiple logical streams, one per 
> partition of the Map output.
> By default the amount of memory allocated per byte buffer is 128 KB, so 
> the total memory allocated is 256 KB.  This may not sound like much.  However, 
> if the number of partitions (or number of reducers) is large (in the hundreds) 
> and/or there are spill files created in {{MapTask}}, this can grow to a few 
> hundred MB.
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal 
> garbage-collected heap, and so their impact upon the memory footprint of an 
> application might not be obvious.  It is therefore recommended that direct 
> buffers be allocated primarily for large, long-lived buffers that are subject 
> to the underlying system's native I/O operations.  In general it is best to 
> allocate direct buffers only when they yield a measureable gain in program 
> performance.
> {quote}
> It is not clear to me whether there is any benefit of allocating direct byte 
> buffers in {{CryptoOutputStream.java}}.  In fact, there is a slight CPU 
> overhead in moving data from {{outBuffer}} to a temporary byte array as per 
> the following code in {{CryptoOutputStream.java}}.
> {code}
> /*
>  * If underlying stream supports {@link ByteBuffer} write in future, needs
>  * refine here. 
>  */
> final byte[] tmp = getTmpBuf();
> outBuffer.get(tmp, 0, len);
> out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in 
> OS parlance), it is not clear whether it will yield any measurable 
> performance gain.
> The fix would be to allocate a ByteBuffer on the heap for inBuffer and wrap a 
> byte array in a {{ByteBuffer}} for {{outBuffer}}.  By the way, the 
> {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}} as demanded by the 
> {{encrypt()}} method in {{Encryptor}}.
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can 
> create a new constructor to {{CryptoOutputStream}} and pass a boolean flag 
> {{ownOutputStream}} to indicate whether the underlying stream will be owned 
> by {{CryptoOutputStream}}. If it is true, then calling the {{close()}} method 
> will close the underlying stream.  Otherwise, when {{close()}} is called only 
> the direct byte buffers will be freed and the underlying stream will not be 
> closed.
> The scope of changes for this fix will be somewhat wider.  We need to modify 
> {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} 
> as well to pass the ownership flag mentioned above.
> I can post a patch for either of the above.  I welcome any other ideas from 
> developers to fix this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6455) Unable to use surefire > 2.18

2015-08-27 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717082#comment-14717082
 ] 

Andrew Wang commented on MAPREDUCE-6455:


LGTM! Thanks Charlie, I'll revert the original and commit this one.

 Unable to use surefire > 2.18
 -

 Key: MAPREDUCE-6455
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6455
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Charlie Helin
Assignee: Charlie Helin
 Fix For: 3.0.0

 Attachments: mr-6455.1.patch, mr-6455.2.patch, mr-6455.2.patch, 
 mr-6455.3.patch, mr-6455.4.patch


 There are some compelling features in later versions of Surefire which let 
 one exclude/include tests based on the content of a file, re-run test cases, 
 etc.
 However, Surefire 2.18 also introduced 
 https://issues.apache.org/jira/browse/SUREFIRE-649, which changed the 
 convention of null properties to empty string values (""). This only applies 
 to forked tests such as the MapReduce tests, and causes a couple of them to 
 fail because of functionality that is directly or indirectly dependent on the 
 value being null. Examples are Configuration.substituteVars() and 
 TaskLog.getBaseLogDir().
 substituteVars() shows the issue when getProperty returns an empty String, 
 skipping the getRaw(var) expression. One way to work around this could be
 {code}
 if (val == null || val.isEmpty()) {
   String raw = getRaw(var);
   if (raw != null) {
     // raw contains a value; otherwise default to whatever
     // System.getProperty returned, since it could be an empty string
     val = raw;
   }
 }
 {code}
 getBaseLogDir() is similar: when it returns an empty string, the semantics of 
 java.io.File differ depending on whether the parent is null or "". A null 
 value is interpreted as new File(file); whereas "" will be interpreted as new 
 File(defaultParent /* "/" */, file);
 This could simply be addressed with 
 {code}
 static String getBaseLogDir() {
   String logDir = System.getProperty("hadoop.log.dir");
   // there is a difference in how null and "" are treated as a parent
   // directory when creating a file
   return logDir == null || logDir.isEmpty() ? null : logDir;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6455) Unable to use surefire > 2.18

2015-08-26 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715842#comment-14715842
 ] 

Andrew Wang commented on MAPREDUCE-6455:


Nice, this pom change fixes it? Only nit is that we should keep test.build.dir 
with its comment, broken by the reordering, i.e.:

{noformat}
<!-- TODO: all references in testcases should be updated to this default -->
<test.build.dir>${test.build.dir}</test.build.dir>
{noformat}

 Unable to use surefire > 2.18
 -

 Key: MAPREDUCE-6455
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6455
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Charlie Helin
Assignee: Charlie Helin
 Fix For: 3.0.0

 Attachments: mr-6455.1.patch, mr-6455.2.patch, mr-6455.2.patch, 
 mr-6455.3.patch


 There are some compelling features in later versions of Surefire which let 
 one exclude/include tests based on the content of a file, re-run test cases, 
 etc.
 However, Surefire 2.18 also introduced 
 https://issues.apache.org/jira/browse/SUREFIRE-649, which changed the 
 convention of null properties to empty string values (""). This only applies 
 to forked tests such as the MapReduce tests, and causes a couple of them to 
 fail because of functionality that is directly or indirectly dependent on the 
 value being null. Examples are Configuration.substituteVars() and 
 TaskLog.getBaseLogDir().
 substituteVars() shows the issue when getProperty returns an empty String, 
 skipping the getRaw(var) expression. One way to work around this could be
 {code}
 if (val == null || val.isEmpty()) {
   String raw = getRaw(var);
   if (raw != null) {
     // raw contains a value; otherwise default to whatever
     // System.getProperty returned, since it could be an empty string
     val = raw;
   }
 }
 {code}
 getBaseLogDir() is similar: when it returns an empty string, the semantics of 
 java.io.File differ depending on whether the parent is null or "". A null 
 value is interpreted as new File(file); whereas "" will be interpreted as new 
 File(defaultParent /* "/" */, file);
 This could simply be addressed with 
 {code}
 static String getBaseLogDir() {
   String logDir = System.getProperty("hadoop.log.dir");
   // there is a difference in how null and "" are treated as a parent
   // directory when creating a file
   return logDir == null || logDir.isEmpty() ? null : logDir;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6455) Unable to use surefire > 2.18

2015-08-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709967#comment-14709967
 ] 

Andrew Wang commented on MAPREDUCE-6455:


Sorry if I'm missing something, but IIUC that surefire change affects parsing 
of java system properties when running a test. Why are the fixes happening in 
Configuration, vs. in a test class or the pom or something?

 Unable to use surefire > 2.18
 -

 Key: MAPREDUCE-6455
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6455
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Charlie Helin
Assignee: Charlie Helin
 Fix For: 3.0.0

 Attachments: mr-6455.1.patch, mr-6455.2.patch, mr-6455.2.patch


 There are some compelling features in later versions of Surefire which let 
 one exclude/include tests based on the content of a file, re-run test cases, 
 etc.
 However, Surefire 2.18 also introduced 
 https://issues.apache.org/jira/browse/SUREFIRE-649, which changed the 
 convention of null properties to empty string values (""). This only applies 
 to forked tests such as the MapReduce tests, and causes a couple of them to 
 fail because of functionality that is directly or indirectly dependent on the 
 value being null. Examples are Configuration.substituteVars() and 
 TaskLog.getBaseLogDir().
 substituteVars() shows the issue when getProperty returns an empty String, 
 skipping the getRaw(var) expression. One way to work around this could be
 {code}
 if (val == null || val.isEmpty()) {
   String raw = getRaw(var);
   if (raw != null) {
     // raw contains a value; otherwise default to whatever
     // System.getProperty returned, since it could be an empty string
     val = raw;
   }
 }
 {code}
 getBaseLogDir() is similar: when it returns an empty string, the semantics of 
 java.io.File differ depending on whether the parent is null or "". A null 
 value is interpreted as new File(file); whereas "" will be interpreted as new 
 File(defaultParent /* "/" */, file);
 This could simply be addressed with 
 {code}
 static String getBaseLogDir() {
   String logDir = System.getProperty("hadoop.log.dir");
   // there is a difference in how null and "" are treated as a parent
   // directory when creating a file
   return logDir == null || logDir.isEmpty() ? null : logDir;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6455) Unable to use surefire > 2.18

2015-08-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710053#comment-14710053
 ] 

Andrew Wang commented on MAPREDUCE-6455:


I talked with [~chelin] about this offline; it seems pretty complex. Charlie's 
current thinking is that surefire is somehow passing some properties 
incorrectly when running in fork mode, leading to some expected variables like 
hadoop.log.dir being unset, and then us running into this surefire behavior 
change. That sounds like a more fundamental issue than null vs. {{""}}. The 
cleaner fix seems to be setting these variables properly rather than relying 
on null/{{""}}/default parsing, which also avoids modifying non-test and 
non-pom code.

Thanks again to [~chelin] for the discussion and working on this issue.
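
(For illustration, "setting these variables properly" would mean declaring 
them explicitly in the surefire plugin configuration; the property value below 
is an example, not the actual fix:)

{noformat}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>
      <hadoop.log.dir>${project.build.directory}/log</hadoop.log.dir>
    </systemPropertyVariables>
  </configuration>
</plugin>
{noformat}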

 Unable to use surefire > 2.18
 -

 Key: MAPREDUCE-6455
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6455
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Charlie Helin
Assignee: Charlie Helin
 Fix For: 3.0.0

 Attachments: mr-6455.1.patch, mr-6455.2.patch, mr-6455.2.patch


 There are some compelling features in later versions of Surefire which let 
 one exclude/include tests based on the content of a file, re-run test cases, 
 etc.
 However, Surefire 2.18 also introduced 
 https://issues.apache.org/jira/browse/SUREFIRE-649, which changed the 
 convention of null properties to empty string values (""). This only applies 
 to forked tests such as the MapReduce tests, and causes a couple of them to 
 fail because of functionality that is directly or indirectly dependent on the 
 value being null. Examples are Configuration.substituteVars() and 
 TaskLog.getBaseLogDir().
 substituteVars() shows the issue when getProperty returns an empty String, 
 skipping the getRaw(var) expression. One way to work around this could be
 {code}
 if (val == null || val.isEmpty()) {
   String raw = getRaw(var);
   if (raw != null) {
     // raw contains a value; otherwise default to whatever
     // System.getProperty returned, since it could be an empty string
     val = raw;
   }
 }
 {code}
 getBaseLogDir() is similar: when it returns an empty string, the semantics of 
 java.io.File differ depending on whether the parent is null or "". A null 
 value is interpreted as new File(file); whereas "" will be interpreted as new 
 File(defaultParent /* "/" */, file);
 This could simply be addressed with 
 {code}
 static String getBaseLogDir() {
   String logDir = System.getProperty("hadoop.log.dir");
   // there is a difference in how null and "" are treated as a parent
   // directory when creating a file
   return logDir == null || logDir.isEmpty() ? null : logDir;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (MAPREDUCE-6171) The visibilities of the distributed cache files and archives should be determined by both their permissions and if they are located in HDFS encryption zone

2014-12-01 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved MAPREDUCE-6171.

   Resolution: Duplicate
Fix Version/s: 2.7.0

Duping this to HADOOP-11341 since Dian reports that it fixes this issue. Thanks 
again Dian/Arun for finding and working on this.

 The visibilities of the distributed cache files and archives should be 
 determined by both their permissions and if they are located in HDFS 
 encryption zone
 ---

 Key: MAPREDUCE-6171
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6171
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Dian Fu
 Fix For: 2.7.0


 The visibilities of the distributed cache files and archives are currently 
 determined by the permission of these files or archives. 
 The following is the logic of method isPublic() in class 
 ClientDistributedCacheManager:
 {code}
 static boolean isPublic(Configuration conf, URI uri,
     Map<URI, FileStatus> statCache) throws IOException {
   FileSystem fs = FileSystem.get(uri, conf);
   Path current = new Path(uri.getPath());
   // the leaf-level file should be readable by others
   if (!checkPermissionOfOther(fs, current, FsAction.READ, statCache)) {
     return false;
   }
   return ancestorsHaveExecutePermissions(fs, current.getParent(),
       statCache);
 }
 {code}
 On the NodeManager side, it will use the yarn user to download public files 
 and the user who submits the job to download private files. In normal cases, 
 there is no problem with this. However, if the files are located in an 
 encryption zone (HDFS-6134) and the yarn user is disallowed by KMS from 
 fetching the DataEncryptionKey (DEK) of this encryption zone, the download 
 of this file will fail. 
 You can reproduce this issue with the following steps (assume you submit the 
 job with user "testUser"): 
 # create a clean cluster which has the HDFS cryptographic FileSystem feature
 # create directory /data/ in HDFS and make it an encryption zone with 
 keyName "testKey"
 # configure KMS so that only user "testUser" can decrypt the DEK of key 
 "testKey":
 {code}
   <property>
     <name>key.acl.testKey.DECRYPT_EEK</name>
     <value>testUser</value>
   </property>
 {code}
 # execute job teragen with user "testUser":
 {code}
 su -s /bin/bash testUser -c "hadoop jar hadoop-mapreduce-examples*.jar 
 teragen 1 /data/terasort-input"
 {code}
 # execute job terasort with user "testUser":
 {code}
 su -s /bin/bash testUser -c "hadoop jar hadoop-mapreduce-examples*.jar 
 terasort /data/terasort-input /data/terasort-output"
 {code}
 You will see logs like this at the job submitter's console:
 {code}
 INFO mapreduce.Job: Job job_1416860917658_0002 failed with state FAILED due 
 to: Application application_1416860917658_0002 failed 2 times due to AM 
 Container for appattempt_1416860917658_0002_02 exited with  exitCode: 
 -1000 due to: org.apache.hadoop.security.authorize.AuthorizationException: 
 User [yarn] is not authorized to perform [DECRYPT_EEK] on key with ACL name 
 [testKey]!!
 {code}
 The initial idea to solve this issue is to modify the logic in 
 ClientDistributedCacheManager.isPublic to also consider whether the file is 
 in an encryption zone. If it is in an encryption zone, the file should be 
 considered private. Then, at the NodeManager side, the user who submits 
 the job will be used to fetch the file.
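 A rough sketch of that check (a hypothetical addition to isPublic(); 
 getEZForPath is HDFS-specific, hence the instanceof guard):
 {code}
 // Files inside an HDFS encryption zone should be treated as private
 if (fs instanceof DistributedFileSystem
     && ((DistributedFileSystem) fs).getEZForPath(current) != null) {
   return false;
 }
 {code}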



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (MAPREDUCE-6041) Fix TestOptionsParser

2014-08-20 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang moved HDFS-6872 to MAPREDUCE-6041:
--

  Component/s: (was: security)
   (was: namenode)
   security
Fix Version/s: (was: fs-encryption (HADOOP-10150 and HDFS-6134))
   fs-encryption
 Target Version/s: fs-encryption  (was: fs-encryption (HADOOP-10150 and 
HDFS-6134))
Affects Version/s: (was: fs-encryption (HADOOP-10150 and HDFS-6134))
  Key: MAPREDUCE-6041  (was: HDFS-6872)
  Project: Hadoop Map/Reduce  (was: Hadoop HDFS)

 Fix TestOptionsParser
 -

 Key: MAPREDUCE-6041
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6041
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Charles Lamb
Assignee: Charles Lamb
 Fix For: fs-encryption

 Attachments: HDFS-6872.001.patch


 Error Message
 expected:<...argetPathExists=true[]}> but was:<...argetPathExists=true[, 
 preserveRawXattrs=false]}>
 Stacktrace
 org.junit.ComparisonFailure: expected:<...argetPathExists=true[]}> but 
 was:<...argetPathExists=true[, preserveRawXattrs=false]}>
   at org.junit.Assert.assertEquals(Assert.java:115)
   at org.junit.Assert.assertEquals(Assert.java:144)
   at 
 org.apache.hadoop.tools.TestOptionsParser.testToString(TestOptionsParser.java:361)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-6041) Fix TestOptionsParser

2014-08-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14104387#comment-14104387
 ] 

Andrew Wang commented on MAPREDUCE-6041:


While doing the CHANGES.TXT update, I noticed this was an HDFS JIRA in the MR 
CHANGES.txt, so I moved this to a MAPREDUCE JIRA.

 Fix TestOptionsParser
 -

 Key: MAPREDUCE-6041
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6041
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Charles Lamb
Assignee: Charles Lamb
 Fix For: fs-encryption

 Attachments: HDFS-6872.001.patch


 Error Message
 expected:<...argetPathExists=true[]}> but was:<...argetPathExists=true[, 
 preserveRawXattrs=false]}>
 Stacktrace
 org.junit.ComparisonFailure: expected:<...argetPathExists=true[]}> but 
 was:<...argetPathExists=true[, preserveRawXattrs=false]}>
   at org.junit.Assert.assertEquals(Assert.java:115)
   at org.junit.Assert.assertEquals(Assert.java:144)
   at 
 org.apache.hadoop.tools.TestOptionsParser.testToString(TestOptionsParser.java:361)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-6040) Automatically use /.reserved/raw when run by the superuser

2014-08-19 Thread Andrew Wang (JIRA)
Andrew Wang created MAPREDUCE-6040:
--

 Summary: Automatically use /.reserved/raw when run by the superuser
 Key: MAPREDUCE-6040
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6040
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: fs-encryption
Reporter: Andrew Wang
Assignee: Charles Lamb


On HDFS-6134, [~sanjay.radia] asked for distcp to automatically prepend 
/.reserved/raw if the distcp is being performed by the superuser and 
/.reserved/raw is supported by both the source and destination filesystems.

Naturally, we'd also want a flag to disable this behavior.
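
A rough sketch of the intended behavior (all names, especially the opt-out 
flag, are illustrative, not an actual API):

{code}
Path reservedRaw = new Path("/.reserved/raw");
if (UserGroupInformation.getCurrentUser().getShortUserName().equals(superUser)
    && srcFs.exists(reservedRaw) && dstFs.exists(reservedRaw)
    && !options.shouldDisableReservedRaw()) {  // hypothetical opt-out flag
  // Prepend /.reserved/raw to both sides so raw.* xattrs travel with the data
  srcPath = Path.mergePaths(reservedRaw, srcPath);
  targetPath = Path.mergePaths(reservedRaw, targetPath);
}
{code}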



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-6008) Update distcp docs to include new option that suppresses preservation of RAW.* namespace extended attributes

2014-08-08 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved MAPREDUCE-6008.


Resolution: Not a Problem

We took care of the docs in the parent JIRA, so there's no need for this one.

 Update distcp docs to include new option that suppresses preservation of 
 RAW.* namespace extended attributes
 

 Key: MAPREDUCE-6008
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6008
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: distcp
Affects Versions: fs-encryption
Reporter: Charles Lamb
Assignee: Charles Lamb

 Update the docs to include this new option.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-6007) Create a new option for distcp -p which causes raw.* namespace extended attributes to not be preserved

2014-08-07 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089876#comment-14089876
 ] 

Andrew Wang commented on MAPREDUCE-6007:


Nice work, this is way simpler. I think we're pretty close.

* In the md.vm file, let's scratch the change to the table, I think the section 
is enough by itself.
* Not sure we're fully qualifying relative paths correctly. I wrote a small 
test which I expected to work. Could you confirm? I think we just need to 
qualify the src paths with the src FileSystem first.

{code}
  @Test
  public void testWorkingDir() throws Exception {
    final Path wd = fs.getWorkingDirectory();
    try {
      fs.setWorkingDirectory(new Path("/.reserved/raw/"));
      doTestPreserveRawXAttrs("raw/src", "raw/dest", "-px", true, true,
          DistCpConstants.SUCCESS);
    } finally {
      fs.setWorkingDirectory(wd);
    }
  }
{code}

 Create a new option for distcp -p which causes raw.* namespace extended 
 attributes to not be preserved
 --

 Key: MAPREDUCE-6007
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: distcp
Affects Versions: fs-encryption
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: MAPREDUCE-6007.001.patch, MAPREDUCE-6007.002.patch, 
 MAPREDUCE-6007.003.patch


 As part of the Data at Rest Encryption work (HDFS-6134), we need to create a 
 new option for distcp which causes raw.* namespace extended attributes to not 
 be preserved. See the doc in HDFS-6509 for details. The default for this 
 option will be to preserve raw.* xattrs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-6007) Create a new option for distcp -p which causes raw.* namespace extended attributes to not be preserved

2014-08-07 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089878#comment-14089878
 ] 

Andrew Wang commented on MAPREDUCE-6007:


Eh, I looked at the test output, and it's complaining that raw/src doesn't 
exist. I guess distcp doesn't support relative paths?

In that case, +1 pending the doc change.

 Create a new option for distcp -p which causes raw.* namespace extended 
 attributes to not be preserved
 --

 Key: MAPREDUCE-6007
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: distcp
Affects Versions: fs-encryption
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: MAPREDUCE-6007.001.patch, MAPREDUCE-6007.002.patch, 
 MAPREDUCE-6007.003.patch


 As part of the Data at Rest Encryption work (HDFS-6134), we need to create a 
 new option for distcp which causes raw.* namespace extended attributes to not 
 be preserved. See the doc in HDFS-6509 for details. The default for this 
 option will be to preserve raw.* xattrs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-6007) Create a new option for distcp -p which causes raw.* namespace extended attributes to not be preserved

2014-08-07 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089890#comment-14089890
 ] 

Andrew Wang commented on MAPREDUCE-6007:


Okay, so I need to stop rushing this :) If you fix my above test by removing 
the raw path components, you'll see that the target path isn't being 
qualified before being checked. Try adding this near the top of 
SimpleCopyListing#validatePaths:

{code}
// Qualify the target path before checking
targetPath = targetFS.makeQualified(targetPath);
final boolean targetIsReservedRaw =
    Path.getPathWithoutSchemeAndAuthority(targetPath).toString().
        startsWith(HDFS_RESERVED_RAW_DIRECTORY_NAME);
{code}

 Create a new option for distcp -p which causes raw.* namespace extended 
 attributes to not be preserved
 --

 Key: MAPREDUCE-6007
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: distcp
Affects Versions: fs-encryption
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: MAPREDUCE-6007.001.patch, MAPREDUCE-6007.002.patch, 
 MAPREDUCE-6007.003.patch


 As part of the Data at Rest Encryption work (HDFS-6134), we need to create a 
 new option for distcp which causes raw.* namespace extended attributes to not 
 be preserved. See the doc in HDFS-6509 for details. The default for this 
 option will be to preserve raw.* xattrs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-6007) Create a new option for distcp -p which causes raw.* namespace extended attributes to not be preserved

2014-08-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088300#comment-14088300
 ] 

Andrew Wang commented on MAPREDUCE-6007:


bq. the only typo above is the last line which should be "no raw xattrs are 
preserved"

If none of these flags are specified, AFAIK neither non-raw nor raw xattrs are 
preserved, i.e. no xattrs. Yes?

bq. I convinced myself that a relative path could never be relative to 
/.reserved/raw since you can't set your working directory to that.

AFAIK you can set your wd to whatever you want, and you can have ".." in 
absolute paths too. We need to make sure that this path is fully normalized if 
we're doing a prefix check. Paths from a FileStatus are normalized, but paths 
coming from the user (like the ones coming out of a DistCpOptions) are suspect. 
setTargetPathExists has one of these suspect checks.

Doc
* This is hard to read, could we expand this into a separate section and a new 
table? I'd particularly like to see a fuller explanation of what happens with 
different dst options.

CopyListing
* Let's improve the InvalidInputException message. Paths don't really "specify" 
something; you could say "starts with" or something instead. We should also 
print the target path.
* I don't quite understand this error either: why is a {{/.r/r}} src and 
{{-pd}} not okay? The exception also mentions the target not starting with 
{{/.r/r}}, but that's not part of the if check.
* Line longer than 80 chars
* I expected to see a check that was if (-p || -px) && !-pd && src is /.r/r, 
then also check that the dst supports xattrs and is /.r/r. I wish there was a 
way to test that it's HDFS too, but looking for the dest having /.r/r is 
probably good enough.

CopyMapper
* Can we expand the block comment to say that toCopyListingFileStatus is used 
to filter xattrs, and passing copyXAttrs in twice is okay because we already 
did it earlier? The double passing looks weird, though logically correct.

DistCp:
* I really don't like setting the DISABLERAWXATTRS flag in setTargetPathExists, 
since the expectation is that Options flags are set by the user. This method is 
also not named such that doing this there makes sense. We have the target path 
via the DistCpOptions, so let's be explicit and verbose with the checks 
instead. This is quite possibly why the CopyListing check is confusing to me.
* To expand on the above, -px means preserving all xattrs, while -pxd means 
preserving non-raw xattrs. Then we have {{toCopyListingFileStatus}} where the 
{{preserveXAttrs}} parameter actually means preserve non-raw xattrs. This is 
also definitely confusing...

DistCpOptionSwitch:
* XATTR is not a standard capitalization style; let's lower-case it as "xattr" 
here. XAttr isn't standard either, but that ship has sailed.

Test
* I'd like tests for weird src and dst paths, i.e. relative or containing ".."
* We could also test the "no preserve flags" behavior, i.e. that no xattrs at 
all are preserved.

 Create a new option for distcp -p which causes raw.* namespace extended 
 attributes to not be preserved
 --

 Key: MAPREDUCE-6007
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: distcp
Affects Versions: fs-encryption
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: MAPREDUCE-6007.001.patch, MAPREDUCE-6007.002.patch


 As part of the Data at Rest Encryption work (HDFS-6134), we need to create a 
 new option for distcp which causes raw.* namespace extended attributes to not 
 be preserved. See the doc in HDFS-6509 for details. The default for this 
 option will be to preserve raw.* xattrs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-6007) Create a new option for distcp -p which causes raw.* namespace extended attributes to not be preserved

2014-08-04 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085543#comment-14085543
 ] 

Andrew Wang commented on MAPREDUCE-6007:


Hi Charles, thanks for the patch,

I had to take notes while reviewing this patch; the behavior is kind of 
complicated. We have a variety of flags that can be specified, and the 
destination FS can have different levels of support. It'd be very useful to 
specify this behavior in gory detail in the DistCp documentation.

Check me on this though:

Options:

{noformat}
-px : preserve raw and non-raw xattrs
-pr : no xattrs are preserved
-p  : preserve raw xattrs
-pxr: preserve non-raw xattrs
: no xattrs are preserved
{noformat}

Behavior with a given src and dst, varying levels of dst support:

* raw src, raw dst: the options apply as specified above
* raw src, non-raw dst, dst supports xattrs but no {{/.reserved/raw}}: we will 
fail to set raw xattrs at runtime.
* raw src, dst doesn't support xattrs: if {{-pX}} is specified, throws an 
exception. Else, silently discards raw xattrs.

Some discussion on the above:
* If the src is {{/.reserved/raw}}, the user is expecting preservation of raw 
xattrs when {{-p}} or {{-pX}} is specified. In this scenario, we should test 
that the dest is {{/.reserved/raw}} and that it's present on the dstFS.
* There might be other weird cases, haven't thought through all of them

Some code review comments:

Misc:
- We have both {{noPreserveRaw}} and {{preserveRaw}} booleans, can we 
standardize on one everywhere? I'd like a negative one, call it {{disableRaw}} 
or {{excludeRaw}} since it better captures the meaning of the flag. {{exclude}} 
feels a bit better IMO, but it looks like {{-pe}} is taken.
- What's the expected behavior when the dest doesn't support xattrs or reserved 
raw, or supports xattrs but not reserved raw?
- CopyListing, this is where we'd also test to see if the destFS has a 
/.reserved/raw directory
- CopyMapper, two periods in the block comment

Documentation:
- I don't want to tie raw preservation just to encryption since we might also 
use it for compression; how about this instead:
{quote}
d: disable preservation of raw namespace extended attributes
...
raw namespace extended attributes are preserved by default if supported. 
Specifying -pd disables preservation of these xattrs.
{quote}
- As noted above, it'd be good to have the expected preservation behavior laid 
out in the distcp documentation.

DistCp:
{code}
if (!Path.getPathWithoutSchemeAndAuthority(target).toString().
{code}
What if the target is a relative path here?

Test:
- Any reason this isn't part of the existing XAttr test? They seem pretty 
similar, and you also added a PXD test to the existing test.
- Don't need to do makeFilesAndDirs in the BeforeClass
- Doesn't there need to be a non-raw attribute set so you can test some of 
these combinations?
- Can we test what happens when the dest FS doesn't support xattrs or raw 
xattrs?

 Create a new option for distcp -p which causes raw.* namespace extended 
 attributes to not be preserved
 --

 Key: MAPREDUCE-6007
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: distcp
Affects Versions: fs-encryption
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: MAPREDUCE-6007.001.patch


 As part of the Data at Rest Encryption work (HDFS-6134), we need to create a 
 new option for distcp which causes raw.* namespace extended attributes to not 
 be preserved. See the doc in HDFS-6509 for details. The default for this 
 option will be to preserve raw.* xattrs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5971) move the default options for distcp -p to DistCpOptionSwitch

2014-07-16 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063917#comment-14063917
 ] 

Andrew Wang commented on MAPREDUCE-5971:


Hi Charles, thanks for working on this, I took a quick look:

Since we're adding a new getDefaultValue to DistCpOptionSwitch, shouldn't we 
make the handling of default values in CustomParser generic as well? Right now 
using the default value is still special-cased only for PRESERVE_STATUS. Maybe 
build a map with the default values in CustomParser?
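
A minimal sketch of what that could look like; the enum and member names below 
are hypothetical, not the actual DistCpOptionSwitch/CustomParser code:

{code}
import java.util.EnumMap;
import java.util.Map;

class CustomParserSketch {
  // Hypothetical stand-in for DistCpOptionSwitch.
  enum Opt { PRESERVE_STATUS, ATOMIC_COMMIT }

  // One map carries every option's default, instead of special-casing
  // PRESERVE_STATUS inline in the parsing code.
  private static final Map<Opt, String> DEFAULTS = new EnumMap<>(Opt.class);
  static {
    DEFAULTS.put(Opt.PRESERVE_STATUS, "ugp"); // illustrative default value
  }

  static String valueOrDefault(Opt opt, String parsedValue) {
    return parsedValue != null ? parsedValue : DEFAULTS.get(opt);
  }
}
{code}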

 move the default options for distcp -p to DistCpOptionSwitch
 

 Key: MAPREDUCE-5971
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5971
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distcp
Affects Versions: trunk
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Trivial
 Attachments: MAPREDUCE-5971.001.patch


 The default preserve flags for distcp -p are embedded in the OptionsParser 
 code. Refactor to co-locate them with the actual flag initialization.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5971) move the default options for distcp -p to DistCpOptionSwitch

2014-07-16 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064171#comment-14064171
 ] 

Andrew Wang commented on MAPREDUCE-5971:


+1 pending, thanks Charles

 move the default options for distcp -p to DistCpOptionSwitch
 

 Key: MAPREDUCE-5971
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5971
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distcp
Affects Versions: trunk
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Trivial
 Attachments: MAPREDUCE-5971.001.patch, MAPREDUCE-5971.002.patch


 The default preserve flags for distcp -p are embedded in the OptionsParser 
 code. Refactor to co-locate them with the actual flag initialization.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5971) Move the default options for distcp -p to DistCpOptionSwitch

2014-07-16 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-5971:
---

Summary: Move the default options for distcp -p to DistCpOptionSwitch  
(was: move the default options for distcp -p to DistCpOptionSwitch)

 Move the default options for distcp -p to DistCpOptionSwitch
 

 Key: MAPREDUCE-5971
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5971
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distcp
Affects Versions: trunk
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Trivial
 Attachments: MAPREDUCE-5971.001.patch, MAPREDUCE-5971.002.patch


 The default preserve flags for distcp -p are embedded in the OptionsParser 
 code. Refactor to co-locate them with the actual flag initialization.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5971) Move the default options for distcp -p to DistCpOptionSwitch

2014-07-16 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-5971:
---

   Resolution: Fixed
Fix Version/s: 2.6.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks Charles

 Move the default options for distcp -p to DistCpOptionSwitch
 

 Key: MAPREDUCE-5971
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5971
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: distcp
Affects Versions: trunk
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Trivial
 Fix For: 2.6.0

 Attachments: MAPREDUCE-5971.001.patch, MAPREDUCE-5971.002.patch


 The default preserve flags for distcp -p are embedded in the OptionsParser 
 code. Refactor to co-locate them with the actual flag initialization.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy

2014-05-19 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002373#comment-14002373
 ] 

Andrew Wang commented on MAPREDUCE-5867:


Hey Devraj, I think TestKillAMPreemptionPolicy.java was committed with CRLFs 
rather than LFs, which messes up {{git diff}} for those of us using the git 
mirror. Do you mind fixing this? Thanks.

 Possible NPE in KillAMPreemptionPolicy related to 
 ProportionalCapacityPreemptionPolicy
 --

 Key: MAPREDUCE-5867
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Sunil G
Assignee: Sunil G
 Fix For: 3.0.0

 Attachments: MapReduce-5867-updated.patch, 
 MapReduce-5867-updated.patch, MapReduce-5867.2.patch, MapReduce-5867.3.patch, 
 Yarn-1980.1.patch


 I configured KillAMPreemptionPolicy for My Application Master and tried to 
 check preemption of queues.
 In one scenario I have seen below NPE in my AM
 2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
 CONTACTING RM. 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
   at java.lang.Thread.run(Thread.java:662)
 I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5867) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy

2014-05-19 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002412#comment-14002412
 ] 

Andrew Wang commented on MAPREDUCE-5867:


Actually, never mind, I fixed it myself. I learned something new about SVN: 
apparently we should be doing {{svn propset svn:eol-style native <file>}} on new 
files (thanks cmccabe for the tip). I ran {{dos2unix}} to convert the newlines 
too.

 Possible NPE in KillAMPreemptionPolicy related to 
 ProportionalCapacityPreemptionPolicy
 --

 Key: MAPREDUCE-5867
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5867
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Sunil G
Assignee: Sunil G
 Fix For: 3.0.0

 Attachments: MapReduce-5867-updated.patch, 
 MapReduce-5867-updated.patch, MapReduce-5867.2.patch, MapReduce-5867.3.patch, 
 Yarn-1980.1.patch


 I configured KillAMPreemptionPolicy for My Application Master and tried to 
 check preemption of queues.
 In one scenario I have seen below NPE in my AM
 2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
 CONTACTING RM. 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
   at java.lang.Thread.run(Thread.java:662)
 I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5790) Default map hprof profile options do not work

2014-03-10 Thread Andrew Wang (JIRA)
Andrew Wang created MAPREDUCE-5790:
--

 Summary: Default map hprof profile options do not work
 Key: MAPREDUCE-5790
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5790
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.3.0
 Environment: java version 1.6.0_31
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
Reporter: Andrew Wang


I have an MR job doing the following:

{code}
Job job = Job.getInstance(conf);

// Enable profiling
job.setProfileEnabled(true);
job.setProfileTaskRange(true, "0");   // profile map task 0
job.setProfileTaskRange(false, "0");  // profile reduce task 0
{code}

When I run this job, some of my map tasks fail with this error message:

{noformat}
org.apache.hadoop.util.Shell$ExitCodeException: 
/data/5/yarn/nm/usercache/hdfs/appcache/application_1394482121761_0012/container_1394482121761_0012_01_41/launch_container.sh:
 line 32: $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true 
-Dhadoop.metrics.log.level=WARN   -Xmx825955249 -Djava.io.tmpdir=$PWD/tmp 
-Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1394482121761_0012/container_1394482121761_0012_01_41
 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
${mapreduce.task.profile.params} org.apache.hadoop.mapred.YarnChild 
10.20.212.12 43135 attempt_1394482121761_0012_r_00_0 41 
1>/var/log/hadoop-yarn/container/application_1394482121761_0012/container_1394482121761_0012_01_41/stdout
 
2>/var/log/hadoop-yarn/container/application_1394482121761_0012/container_1394482121761_0012_01_41/stderr
 : bad substitution
{noformat}

It looks like ${mapreduce.task.profile.params} is not getting subbed in 
correctly.
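
A possible workaround (an untested assumption, not this issue's resolution) is 
to set the profile params explicitly, so the launch command carries a concrete 
hprof string rather than the unsubstituted placeholder:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ProfileParamsWorkaround {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration());
    job.setProfileEnabled(true);
    job.setProfileTaskRange(true, "0");   // profile map task 0
    job.setProfileTaskRange(false, "0");  // profile reduce task 0
    // %s is replaced with the task's profile output file path.
    job.setProfileParams("-agentlib:hprof=cpu=samples,heap=sites,"
        + "force=n,thread=y,verbose=n,file=%s");
  }
}
{code}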



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5620) distcp1 -delete fails when target directory contains files with percent signs

2013-11-12 Thread Andrew Wang (JIRA)
Andrew Wang created MAPREDUCE-5620:
--

 Summary: distcp1 -delete fails when target directory contains 
files with percent signs
 Key: MAPREDUCE-5620
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5620
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Andrew Wang
Assignee: Andrew Wang


Debugging a distcp1 issue, it fails to delete extra files in the target 
directory when there is a percent sign in the filename. I'm pretty sure this is 
an issue with how percent encoding is handled in FsShell (reproduced with just 
hadoop fs -rmr), but we can also fix this in distcp1 by using FileSystem 
instead of FsShell. This is what distcp2 does.
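
As a hedged illustration of the FileSystem-based approach (not the actual 
distcp2 code; the /target/extra path is made up), deleting through the API 
avoids the string round-trip entirely:

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeleteExtras {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Paths come from FileStatus objects, so a '%' in a name is never
    // re-parsed through a shell/URI layer the way "hadoop fs -rmr <uri>"
    // re-parses its argument.
    for (FileStatus stat : fs.listStatus(new Path("/target/extra"))) {
      fs.delete(stat.getPath(), true);
    }
  }
}
{code}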



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (MAPREDUCE-5620) distcp1 -delete fails when target directory contains files with percent signs

2013-11-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved MAPREDUCE-5620.


Resolution: Invalid

Turns out this was due to running distcp1 with hadoop 2's FsShell. I couldn't 
repro this on a pure branch-1 setup, so resolving as invalid.

 distcp1 -delete fails when target directory contains files with percent signs
 -

 Key: MAPREDUCE-5620
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5620
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Andrew Wang
Assignee: Andrew Wang

 Debugging a distcp1 issue, it fails to delete extra files in the target 
 directory when there is a percent sign in the filename. I'm pretty sure this 
 is an issue with how percent encoding is handled in FsShell (reproduced with 
 just hadoop fs -rmr), but we can also fix this in distcp1 by using 
 FileSystem instead of FsShell. This is what distcp2 does.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-5379) Include token tracking ids in jobconf

2013-09-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766901#comment-13766901
 ] 

Andrew Wang commented on MAPREDUCE-5379:


Thanks Karthik, the patch looks good to me. As I'm not well-versed in the ways 
of MR, it'd be good to get confirmation from someone else as well.

 Include token tracking ids in jobconf
 -

 Key: MAPREDUCE-5379
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5379
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: job submission, security
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Karthik Kambatla
 Attachments: MAPREDUCE-5379-1.patch, MAPREDUCE-5379-2.patch, 
 MAPREDUCE-5379.patch, mr-5379-3.patch


 HDFS-4680 enables audit logging of delegation tokens. By storing the tracking 
 ids in the job conf, we can track which files each job touches.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5193) A few MR tests use block sizes which are smaller than the default minimum block size

2013-05-01 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-5193:
---

Attachment: mapreduce-5193-1.patch

ATM told me it was okay to poach this, so here's a patch. It sets the min block 
size to 0 in the {{src/test/resources}} {{hdfs-site.xml}}, which is the same fix 
we used for the HDFS tests.
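
Presumably the override looks something like the following (the property name 
is the one HDFS-4305 introduced; the exact test file contents are an 
assumption):

{code}
<!-- In src/test/resources/hdfs-site.xml: 0 disables the 1MB minimum added by
     HDFS-4305 so tests can use tiny block sizes. -->
<property>
  <name>dfs.namenode.fs-limits.min-block-size</name>
  <value>0</value>
</property>
{code}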

I ran the failed tests from the MAPREDUCE-5156 patch successfully. Looking at 
the daily build, most of the other components are fine. I also ran the tests in 
the skipped components {{hs-plugin}} and examples successfully, so hopefully 
it'll fix everything.

 A few MR tests use block sizes which are smaller than the default minimum 
 block size
 

 Key: MAPREDUCE-5193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.5-beta
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Attachments: MAPREDUCE-5156.1.patch, mapreduce-5193-1.patch


 HDFS-4305 introduced a new configurable minimum block size of 1MB. A few MR 
 tests deliberately set much smaller block sizes. This JIRA is to update those 
 tests so they no longer fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)

2013-02-26 Thread Andrew Wang (JIRA)
Andrew Wang created MAPREDUCE-5033:
--

 Summary: mapred shell script should respect usage flags (--help 
-help -h)
 Key: MAPREDUCE-5033
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5033
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Minor


Like in HADOOP-9267, the mapred shell script should respect the normal Unix-y 
help flags.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)

2013-02-26 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-5033:
---

Attachment: mapreduce-5033-1.patch

Little patch attached. Tested manually by running the mapred script.

 mapred shell script should respect usage flags (--help -help -h)
 

 Key: MAPREDUCE-5033
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5033
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Minor
 Attachments: mapreduce-5033-1.patch


 Like in HADOOP-9267, the mapred shell script should respect the normal Unix-y 
 help flags.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5033) mapred shell script should respect usage flags (--help -help -h)

2013-02-26 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-5033:
---

Status: Patch Available  (was: Open)

 mapred shell script should respect usage flags (--help -help -h)
 

 Key: MAPREDUCE-5033
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5033
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.3-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Minor
 Attachments: mapreduce-5033-1.patch


 Like in HADOOP-9267, the mapred shell script should respect the normal Unix-y 
 help flags.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Moved] (MAPREDUCE-5026) For shortening the time of TaskTracker heartbeat, decouple the statics collection operations

2013-02-25 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang moved HDFS-4527 to MAPREDUCE-5026:
--

  Component/s: (was: performance)
   tasktracker
   performance
Fix Version/s: (was: 1.1.1)
   1.1.1
 Target Version/s:   (was: 1.1.1)
Affects Version/s: (was: 1.1.1)
   1.1.1
  Key: MAPREDUCE-5026  (was: HDFS-4527)
  Project: Hadoop Map/Reduce  (was: Hadoop HDFS)

 For shortening the time of TaskTracker heartbeat, decouple the statics 
 collection operations
 

 Key: MAPREDUCE-5026
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5026
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: performance, tasktracker
Affects Versions: 1.1.1
Reporter: sam liu
  Labels: patch
 Fix For: 1.1.1

 Attachments: HDFS-4527.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 In each heartbeat of the TaskTracker, it calculates some system statics, like 
 free disk space, available virtual/physical memory, cpu usage, etc. However, 
 it's not necessary to calculate all the statics in every heartbeat; doing so 
 consumes a lot of system resources and impacts the performance of the 
 TaskTracker heartbeat. Furthermore, the characteristics of system properties 
 (disk, memory, cpu) are different, so it's better to collect their statics at 
 different intervals.
 To reduce the latency of the TaskTracker heartbeat, one solution is to 
 decouple all the system statics collection operations from it and start 
 separate threads to do the statics collection work when the TaskTracker 
 starts. There could be three threads: the first collects cpu-related statics 
 at a short interval; the second collects memory-related statics at a normal 
 interval; the third collects disk-related statics at a long interval. All the 
 intervals could be customized by the parameter 
 mapred.stats.collection.interval in mapred-site.xml. The heartbeat could then 
 read the statics values directly from memory.
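
A hedged sketch of that design (hypothetical class and probe names; modern 
Java for brevity): background tasks refresh cached statistics on their own 
intervals, and the heartbeat just reads the cached values.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class StatsCollectors {
  final AtomicLong cpuUsage = new AtomicLong();    // refreshed often
  final AtomicLong freeMemory = new AtomicLong();  // refreshed less often
  final AtomicLong freeDisk = new AtomicLong();    // refreshed rarely

  private final ScheduledExecutorService pool =
      Executors.newScheduledThreadPool(3);

  void start(long cpuMs, long memMs, long diskMs) {
    pool.scheduleAtFixedRate(() -> cpuUsage.set(sampleCpu()),
        0, cpuMs, TimeUnit.MILLISECONDS);
    pool.scheduleAtFixedRate(() -> freeMemory.set(sampleMemory()),
        0, memMs, TimeUnit.MILLISECONDS);
    pool.scheduleAtFixedRate(() -> freeDisk.set(sampleDisk()),
        0, diskMs, TimeUnit.MILLISECONDS);
  }

  // Placeholder probes; a real TaskTracker would use platform-specific code.
  long sampleCpu() { return 0; }
  long sampleMemory() { return Runtime.getRuntime().freeMemory(); }
  long sampleDisk() { return 0; }
}
{code}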

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5026) For shortening the time of TaskTracker heartbeat, decouple the statics collection operations

2013-02-25 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586136#comment-13586136
 ] 

Andrew Wang commented on MAPREDUCE-5026:


Hi Sam,

Thanks for the patch. I moved your issue to MAPREDUCE, since the TaskTracker 
isn't a component of HDFS.

A few minor comments:

* Please rename Statics to Statistics in the code.
* Could you provide some performance numbers, to quantify the before and after 
improvement?

 For shortening the time of TaskTracker heartbeat, decouple the statics 
 collection operations
 

 Key: MAPREDUCE-5026
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5026
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: performance, tasktracker
Affects Versions: 1.1.1
Reporter: sam liu
  Labels: patch
 Fix For: 1.1.1

 Attachments: HDFS-4527.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 In each heartbeat of the TaskTracker, it calculates some system statics, like 
 free disk space, available virtual/physical memory, cpu usage, etc. However, 
 it's not necessary to calculate all the statics in every heartbeat; doing so 
 consumes a lot of system resources and impacts the performance of the 
 TaskTracker heartbeat. Furthermore, the characteristics of system properties 
 (disk, memory, cpu) are different, so it's better to collect their statics at 
 different intervals.
 To reduce the latency of the TaskTracker heartbeat, one solution is to 
 decouple all the system statics collection operations from it and start 
 separate threads to do the statics collection work when the TaskTracker 
 starts. There could be three threads: the first collects cpu-related statics 
 at a short interval; the second collects memory-related statics at a normal 
 interval; the third collects disk-related statics at a long interval. All the 
 intervals could be customized by the parameter 
 mapred.stats.collection.interval in mapred-site.xml. The heartbeat could then 
 read the statics values directly from memory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

