[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set

2016-02-10 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140912#comment-15140912
 ] 

Akira AJISAKA commented on MAPREDUCE-6607:
--

Fixed. Thanks [~lewuathe] for reporting this.

> .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or 
> mapreduce.task.files.preserve.filepattern are set
> ---
>
> Key: MAPREDUCE-6607
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6607
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.7.1
>Reporter: Maysam Yabandeh
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: MAPREDUCE-6607.01.patch
>
>
> If either of the following configs is set, then the .staging dir is not 
> cleaned up:
> * mapreduce.task.files.preserve.failedtask 
> * mapreduce.task.files.preserve.filepattern
> The former was supposed to keep only the .staging files of failed tasks, and 
> the latter was supposed to apply only if the task name matches the specified 
> regular expression.
> {code}
>   protected boolean keepJobFiles(JobConf conf) {
>     return (conf.getKeepTaskFilesPattern() != null || conf
>         .getKeepFailedTaskFiles());
>   }
> {code}
> {code}
>   public void cleanupStagingDir() throws IOException {
>     /* make sure we clean the staging files */
>     String jobTempDir = null;
>     FileSystem fs = getFileSystem(getConfig());
>     try {
>       if (!keepJobFiles(new JobConf(getConfig()))) {
>         jobTempDir = getConfig().get(MRJobConfig.MAPREDUCE_JOB_DIR);
>         if (jobTempDir == null) {
>           LOG.warn("Job Staging directory is null");
>           return;
>         }
>         Path jobTempDirPath = new Path(jobTempDir);
>         LOG.info("Deleting staging directory " + 
>             FileSystem.getDefaultUri(getConfig()) +
>             " " + jobTempDir);
>         fs.delete(jobTempDirPath, true);
>       }
>     } catch(IOException io) {
>       LOG.error("Failed to cleanup staging dir " + jobTempDir, io);
>     }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set

2016-02-10 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140967#comment-15140967
 ] 

Akira AJISAKA commented on MAPREDUCE-6607:
--

Hi [~maysamyabandeh] and [~lewuathe], I think it is reasonable that the 
.staging dir is not cleaned up if either of the two parameters is set, 
because there may be some failed tasks even if the MapReduce job succeeds.

bq. The former was supposed to keep only .staging of failed tasks
AFAIK, the files in .staging can be used by all tasks, so I think it is 
difficult to determine which files in .staging belong to the failed tasks.

By the way, no regex matching is currently done even if 
"mapreduce.task.files.preserve.filepattern" is set. We need to fix that.
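
To make the point about the missing regex match concrete, here is a minimal, self-contained sketch. It uses plain parameters instead of JobConf: keepJobFilesCurrent mirrors the keepJobFiles predicate quoted in the issue description, while keepJobFilesWithRegex and the task names are hypothetical and only illustrate what applying the pattern could look like. It is not the attached patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class KeepJobFilesSketch {

    /** Mirrors the quoted predicate: any non-null pattern, or the flag, keeps .staging. */
    public static boolean keepJobFilesCurrent(String keepPattern, boolean keepFailedTaskFiles) {
        return keepPattern != null || keepFailedTaskFiles;
    }

    /**
     * Hypothetical variant (an assumption, not Hadoop code): the pattern only
     * keeps .staging if it matches at least one of the given task names.
     */
    public static boolean keepJobFilesWithRegex(String keepPattern,
                                                boolean keepFailedTaskFiles,
                                                Iterable<String> taskNames) {
        if (keepFailedTaskFiles) {
            return true;
        }
        if (keepPattern == null) {
            return false;
        }
        Pattern pattern = Pattern.compile(keepPattern);
        for (String name : taskNames) {
            if (pattern.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative task names only.
        List<String> tasks = Arrays.asList("task_1_0001_m_000000", "task_1_0001_r_000000");
        // The pattern matches no task, yet the current predicate still keeps .staging.
        System.out.println(keepJobFilesCurrent(".*_m_999999", false));          // true
        System.out.println(keepJobFilesWithRegex(".*_m_999999", false, tasks)); // false
    }
}
```

With the current predicate, merely setting the pattern disables cleanup, whether or not anything matches it.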






[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set

2016-02-10 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140892#comment-15140892
 ] 

Akira AJISAKA commented on MAPREDUCE-6607:
--

bq. Is it included by accident?
Yes. I'm sorry about that. I'll fix it shortly.






[jira] [Updated] (MAPREDUCE-6191) TestJavaSerialization fails with getting incorrect MR job result

2016-02-10 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-6191:
--
Fix Version/s: 2.6.5
   2.7.3

I committed this to branch-2, branch-2.8, branch-2.7, and branch-2.6.

> TestJavaSerialization fails with getting incorrect MR job result
> 
>
> Key: MAPREDUCE-6191
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6191
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.6.0
>Reporter: sam liu
>Assignee: sam liu
>Priority: Minor
> Fix For: 2.7.3, 2.6.5
>
> Attachments: MAPREDUCE-6191.patch
>
>
> TestJavaSerialization#testMapReduceJob() fails with getting incorrect MR job 
> result:
> "junit.framework.ComparisonFailure: expected:<[a ]1> but was:<[0 1]1>"





[jira] [Commented] (MAPREDUCE-6191) TestJavaSerialization fails with getting incorrect MR job result

2016-02-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141186#comment-15141186
 ] 

Hudson commented on MAPREDUCE-6191:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9274 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9274/])
Update CHANGES.txt for commit of MAPREDUCE-6191 to other branches. (jlowe: rev 
a429f857b2aea63e23128728274bb2985c5bf087)
* hadoop-mapreduce-project/CHANGES.txt







[jira] [Created] (MAPREDUCE-6632) Master.getMasterAddress() should be updated to use YARN-4629

2016-02-10 Thread Daniel Templeton (JIRA)
Daniel Templeton created MAPREDUCE-6632:
---

 Summary: Master.getMasterAddress() should be updated to use 
YARN-4629
 Key: MAPREDUCE-6632
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6632
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster
Reporter: Daniel Templeton
Assignee: Daniel Templeton
Priority: Minor


The new {{YarnClientUtil.getRmPrincipal()}} method can replace most of the 
{{Master.getMasterAddress()}} method, and it should, to reduce redundancy and 
improve serviceability.





[jira] [Created] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters commpression related errors.

2016-02-10 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created MAPREDUCE-6633:
-

 Summary: AM should retry map attempts if the reduce task 
encounters commpression related errors.
 Key: MAPREDUCE-6633
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.2
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah


When a reduce task encounters compression-related errors, the AM doesn't retry 
the corresponding map task.
Here is the stack trace from one case we encountered.
{noformat}
2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#29
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
    at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
{noformat}
In this case, the node on which the map task ran had a bad drive.
If the AM had retried running that map task somewhere else, the jib definitely 
would have succeeded.
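
As a rough illustration of the idea, the sketch below shows one way a reducer-side error could be recognized as "probably corrupt map output" so that it might be attributed to the map attempt rather than to the reducer. Everything here is an assumption for illustration: the class, the method name raisedInsideDecompressor, and the class-name heuristic are hypothetical and are not part of Hadoop or of any patch on this issue.

```java
public class ShuffleErrorClassifierSketch {

    /**
     * Hypothetical heuristic: walks the cause chain and reports whether any
     * stack frame belongs to a class whose name ends in "Decompressor" or
     * "DecompressorStream", as in the stack trace above.
     */
    public static boolean raisedInsideDecompressor(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            for (StackTraceElement frame : t.getStackTrace()) {
                String className = frame.getClassName();
                if (className.endsWith("Decompressor")
                        || className.endsWith("DecompressorStream")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rebuild a two-level error shaped like the reported one.
        RuntimeException cause = new ArrayIndexOutOfBoundsException();
        cause.setStackTrace(new StackTraceElement[] {
            new StackTraceElement("com.hadoop.compression.lzo.LzoDecompressor",
                "setInput", "LzoDecompressor.java", 196)
        });
        RuntimeException shuffleError = new RuntimeException("error in shuffle", cause);
        System.out.println(raisedInsideDecompressor(shuffleError));
    }
}
```

A check of this kind would only be a trigger; whether and how the AM then reschedules the map attempt is exactly what this issue leaves open.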





[jira] [Updated] (MAPREDUCE-6460) TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException fails

2016-02-10 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-6460:
--
Fix Version/s: 2.7.3

I committed this to branch-2.7.

> TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException 
> fails
> ---
>
> Key: MAPREDUCE-6460
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6460
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.8.0, 2.7.3
>
> Attachments: MAPREDUCE-6460.000.patch
>
>
> TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException 
> fails with the following logs:
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator
> Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 94.525 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator
> testAttemptNotFoundCausesRMCommunicatorException(org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator)  Time elapsed: 2.606 sec  <<< FAILURE!
> java.lang.AssertionError: Expected exception: org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocationException
>   at org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:32)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
>   at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
>   at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
>   at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Results :
> Failed tests: 
>   TestRMContainerAllocator.testAttemptNotFoundCausesRMCommunicatorException Expected exception: org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocationException
> Tests run: 24, Failures: 1, Errors: 0, Skipped: 0





[jira] [Updated] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters commpression related errors.

2016-02-10 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated MAPREDUCE-6633:
--
Description: 
When a reduce task encounters compression-related errors, the AM doesn't retry 
the corresponding map task.
Here is the stack trace from one case we encountered.
{noformat}
2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#29
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
    at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
{noformat}
In this case, the node on which the map task ran had a bad drive.
If the AM had retried running that map task somewhere else, the job definitely 
would have succeeded.

  was:
When a reduce task encounters compression-related errors, the AM doesn't retry 
the corresponding map task.
Here is the stack trace from one case we encountered.
{noformat}
2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#29
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
    at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
{noformat}
In this case, the node on which the map task ran had a bad drive.
If the AM had retried running that map task somewhere else, the jib definitely 
would have succeeded.



[jira] [Commented] (MAPREDUCE-6630) TestUserGroupInformation#testGetServerSideGroups fails in chroot

2016-02-10 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141693#comment-15141693
 ] 

Eric Badger commented on MAPREDUCE-6630:


The TestCopyPreserveFlag JUnit failure is related to 
[HADOOP-12589|https://issues.apache.org/jira/browse/HADOOP-12589] and is 
completely separate from the patch for this JIRA. The patch for this JIRA fixes 
the TestUserGroupInformation#testGetServerSideGroups issue. 

> TestUserGroupInformation#testGetServerSideGroups fails in chroot
> 
>
> Key: MAPREDUCE-6630
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6630
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security, test
>Affects Versions: 2.1.0-beta
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Minor
> Attachments: MAPREDUCE-6630-001.patch
>
>
> The bug fixed by [HADOOP-7811] was broken again by [HADOOP-8562]. We need to 
> re-introduce the fix. 





[jira] [Commented] (MAPREDUCE-6630) TestUserGroupInformation#testGetServerSideGroups fails in chroot

2016-02-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141712#comment-15141712
 ] 

Jason Lowe commented on MAPREDUCE-6630:
---

+1 lgtm.  Committing this.






[jira] [Commented] (MAPREDUCE-6603) Add counters for failed task attempts

2016-02-10 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141417#comment-15141417
 ] 

Kuhu Shukla commented on MAPREDUCE-6603:


Making this change configurable will preserve backward compatibility: we can 
turn off the reporting of failed task counters so that legacy consumers of the 
counters are not affected. Also, the present code has an inconsistency: a 
failed task with one or more killed attempts still reports counters, whereas a 
failed task whose attempts all failed does not. This stems from how 
TaskAttemptImpl.EMPTY_COUNTERS is used when the bestAttempt is null. 
EMPTY_COUNTERS is a static final variable, but it has groups attached to it 
that can get populated. So when we assign counters to be EMPTY_COUNTERS, it may 
already have counter groups populated through {{addGroups}}, which makes the 
{{isEmpty}} call return false. I will update with more findings on this. 

Appreciate any comments/ideas/suggestions from the community on this.
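
The pitfall described above, a shared "empty" constant that is final but still mutable, can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the Counters class, addGroup, and countersFor below are hypothetical stand-ins written for illustration and are not the Hadoop classes or methods.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedEmptyCountersSketch {

    /** Hypothetical stand-in for a counters container; not the Hadoop Counters class. */
    public static class Counters {
        private final Map<String, Long> groups = new HashMap<>();

        public void addGroup(String name) {
            groups.putIfAbsent(name, 0L);
        }

        public boolean isEmpty() {
            return groups.isEmpty();
        }
    }

    /** Shared "empty" instance: the reference is final, the object is still mutable. */
    public static final Counters EMPTY_COUNTERS = new Counters();

    /** Falls back to the shared instance when there is no best attempt. */
    public static Counters countersFor(Counters bestAttempt) {
        return bestAttempt == null ? EMPTY_COUNTERS : bestAttempt;
    }

    public static void main(String[] args) {
        Counters first = countersFor(null);
        System.out.println(first.isEmpty());   // true: nothing populated yet
        first.addGroup("FileSystemCounter");   // any caller can populate the shared object
        Counters second = countersFor(null);
        System.out.println(second.isEmpty());  // false: the "empty" constant is no longer empty
    }
}
```

Returning a fresh or immutable empty instance instead of a shared mutable one would avoid this kind of leak, at the cost of an allocation per call.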

> Add counters for failed task attempts
> -
>
> Key: MAPREDUCE-6603
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6603
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Minor
>
> The counters for failed task attempts are currently unavailable and would be 
> nice to have for troubleshooting whilst not including them in the aggregate 
> counters at task or job level. One should be able to view them at attempt 
> level.
> {code}
> Sorry it looks like task_1_2_r_3 has no counters. 
> {code}





[jira] [Commented] (MAPREDUCE-6579) JobStatus#getFailureInfo should not output diagnostic information when the job is running

2016-02-10 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142217#comment-15142217
 ] 

Rohith Sharma K S commented on MAPREDUCE-6579:
--

The patch looks mostly reasonable to me. My only doubt is that I am not sure 
how getFailureInfo behaved for the success/killed/failed states before. The 
current patch returns an empty string for failed/killed. Should it return an 
empty string for the success state?
[~jlowe], would you please share your thoughts on this?
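
The question comes down to which job states should map to an empty string. The sketch below shows one possible policy only, in which diagnostics are exposed solely for terminal failure states; it is an assumption for discussion and not necessarily what any of the attached patches do. The State enum and the two-argument getFailureInfo are hypothetical and do not match the real JobStatus API.

```java
public class FailureInfoSketch {

    /** Hypothetical job states for illustration. */
    public enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    /**
     * One possible policy (an assumption): return diagnostics only when the
     * job has failed or been killed, and an empty string in every other state,
     * including RUNNING and SUCCEEDED.
     */
    public static String getFailureInfo(State state, String diagnostics) {
        boolean terminalFailure = state == State.FAILED || state == State.KILLED;
        return (terminalFailure && diagnostics != null) ? diagnostics : "";
    }

    public static void main(String[] args) {
        String waiting = "Application is Activated, waiting for resources";
        System.out.println("[" + getFailureInfo(State.RUNNING, waiting) + "]");
        System.out.println("[" + getFailureInfo(State.FAILED, "Task failed") + "]");
    }
}
```

Under this policy a running job never leaks scheduler diagnostics into the failure info, which is the behavior the issue title asks for.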

> JobStatus#getFailureInfo should not output diagnostic information when the 
> job is running
> -
>
> Key: MAPREDUCE-6579
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6579
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Rohith Sharma K S
>Assignee: Akira AJISAKA
>Priority: Blocker
> Attachments: MAPREDUCE-6579.01.patch, MAPREDUCE-6579.02.patch, 
> MAPREDUCE-6579.03.patch, MAPREDUCE-6579.04.patch, MAPREDUCE-6579.05.patch
>
>
> From 
> [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt]
>  TestNetworkedJob fails intermittently.
> {code}
> Running org.apache.hadoop.mapred.TestNetworkedJob
> Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 81.131 sec 
> <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob
> testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob)  Time elapsed: 
> 30.55 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<[[Tue Dec 15 14:02:45 + 2015] 
> Application is Activated, waiting for resources to be assigned for AM.  
> Details : AM Partition =  ; Partition Resource = 
>  ; Queue's Absolute capacity = 100.0 % ; Queue's 
> Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> 
> but was:<[]>
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:174)
> {code}





[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set

2016-02-10 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140678#comment-15140678
 ] 

Tsuyoshi Ozawa commented on MAPREDUCE-6607:
---

[~lewuathe] the patch seems to be stale, so could you rebase it on trunk?






[jira] [Commented] (MAPREDUCE-6607) .staging dir is not cleaned up if mapreduce.task.files.preserve.failedtask or mapreduce.task.files.preserve.filepattern are set

2016-02-10 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140735#comment-15140735
 ] 

Kai Sasaki commented on MAPREDUCE-6607:
---

[~ozawa] Sure, but something is odd. Although the HDFS-9686 patch does not seem 
to include changes to {{TestStagingCleanup}}, the commit includes the test code 
I wrote here.
{code}
git show fe124da5ffc16e4795c3dd5542accd58361e1b08
...
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java
+++ b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java
@@ -245,6 +245,63 @@ public void testDeletionofStagingOnKillLastTry() throws IOException {
      verify(fs).delete(stagingJobPath, true);
    }
 
+  @Test
+  public void testByPreserveFailedStaging() throws IOException {
+    conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, stagingJobDir);
+    // Failed task's staging files should be kept
+    conf.setBoolean(MRJobConfig.PRESERVE_FAILED_TASK_FILES, true);
+    fs = mock(FileSystem.class);
+    when(fs.delete(any(Path.class), anyBoolean())).thenReturn(true);
...
{code}

Updating {{TestStagingCleanup}} does not seem to be related to HDFS-9686. 
Is it included by accident? 



