[jira] [Commented] (MAPREDUCE-6660) Add MR Counters for bytes-read-by-network-distance FileSystem metrics

2016-04-08 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233200#comment-15233200
 ] 

Junping Du commented on MAPREDUCE-6660:
---

Thanks [~mingma] for delivering the patch! I quickly went through the patch and have a few 
comments:
1. About naming of new FileSystemCounter:
{noformat}
+  BYTES_READ_LOCAL_HOST,
+  BYTES_READ_LOCAL_RACK,
+  BYTES_READ_FIRST_DEGREE_REMOTE_RACK,
+  BYTES_READ_SECOND_OR_MORE_DEGREE_REMOTE_RACK,
{noformat}
Shall we simply name them BYTES_READ_LOCAL_HOST, BYTES_READ_LOCAL_RACK, 
BYTES_READ_LOCAL_DATACENTER, BYTES_READ_REMOTE_DATACENTER and put some 
comments on them? I think that reads more clearly. 
BTW, the trailing comma is not necessary.

Also, shall we add a simple test to verify that BYTES_READ equals the sum of the 
four new read counters?
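
For illustration, a minimal sketch of such a check (the counter names follow the renaming proposed above, and the group/name lookup is my assumption, not the patch):
{code}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

class NetworkDistanceCounterCheck {
  // Sketch only: counter names and Counters API usage are assumptions.
  static void verify(Job job, String scheme) throws Exception {
    Counters counters = job.getCounters();
    long bytesRead = counters.findCounter(scheme, "BYTES_READ").getValue();
    long sum = counters.findCounter(scheme, "BYTES_READ_LOCAL_HOST").getValue()
        + counters.findCounter(scheme, "BYTES_READ_LOCAL_RACK").getValue()
        + counters.findCounter(scheme, "BYTES_READ_LOCAL_DATACENTER").getValue()
        + counters.findCounter(scheme, "BYTES_READ_REMOTE_DATACENTER").getValue();
    // every byte read should fall into exactly one distance bucket
    assertEquals(bytesRead, sum);
  }
}
{code}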


 

> Add MR Counters for bytes-read-by-network-distance FileSystem metrics
> -
>
> Key: MAPREDUCE-6660
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6660
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: MAPREDUCE-6660.patch, MAPREDUCE-6660.png
>
>
> This is the MR part of the change which is to consume 
> bytes-read-by-network-distance metrics generated by 
> https://issues.apache.org/jira/browse/HDFS-9579.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.

2016-04-08 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233181#comment-15233181
 ] 

Eric Payne commented on MAPREDUCE-6633:
---

{quote}
In this case the decompressor threw a RuntimeException 
(ArrayIndexOutOfBoundsException is a subclass).
If we had re-run the map on another node, the job would have succeeded.
...
I understand your concern, but I still think it's a good change.
{quote}
Thanks [~shahrs87]. It would be ideal to come up with a subset that covers 
only the exceptions that could actually be thrown, but I agree that the change is 
fine as it is.
+1
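
As a self-contained illustration of the subclass point quoted above (demo code only, not the Fetcher internals):
{code}
public class ShuffleErrorDemo {
  public static void main(String[] args) {
    try {
      byte[] buf = new byte[4];
      int unused = buf[10];  // simulates the out-of-bounds read in the decompressor
    } catch (RuntimeException e) {
      // ArrayIndexOutOfBoundsException is a RuntimeException, so a catch like
      // this lets a fetcher report a fetch failure -- and the AM re-run the
      // map attempt elsewhere -- instead of crashing the reducer.
      System.out.println("caught: " + e);
    }
  }
}
{code}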

> AM should retry map attempts if the reduce task encounters compression 
> related errors.
> ---
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-6633.patch
>
>
> When a reduce task encounters compression-related errors, the AM doesn't retry the 
> corresponding map task.
> In one of the cases we encountered, here is the stack trace.
> {noformat}
> 2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : 
> org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
> shuffle in fetcher#29
>   at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at 
> com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
>   at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
>   at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
>   at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
> {noformat}
> In this case, the node on which the map task ran had a bad drive.
> If the AM had retried running that map task somewhere else, the job 
> definitely would have succeeded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233178#comment-15233178
 ] 

Hadoop QA commented on MAPREDUCE-6628:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
6s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 42s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 50s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
7s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
5s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 45s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 5s 
{color} | {color:red} root: patch generated 1 new + 232 unchanged - 4 fixed = 
233 total (was 236) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 1s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 45s 
{color} | {color:red} 
hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core-jdk1.8.0_77
 with JDK v1.8.0_77 generated 4 new + 96 unchanged - 4 fixed = 100 total (was 
100) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 33s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 57s 
{color} | {color:green} hadoop-mapreduce-client-core in the patch passed with 
JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 58s 
{color} | {color:green} hadoop-common in the patch passed with 

[jira] [Updated] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-04-08 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-6628:

Status: Open  (was: Patch Available)

> Potential memory leak in CryptoOutputStream
> ---
>
> Key: MAPREDUCE-6628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-6628.001.patch, MAPREDUCE-6628.002.patch
>
>
> There is a potential memory leak in {{CryptoOutputStream.java}}.  It 
> allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that get 
> freed when the {{close()}} method is called.  Most of the time, {{close()}} 
> is called.  However, when writing to the intermediate Map output file or 
> the spill files in {{MapTask}}, {{close()}} is never called, since doing so 
> would close the underlying stream, which is not desirable.  There is a single 
> underlying physical stream that contains multiple logical streams, one per 
> partition of the Map output.
> By default the amount of memory allocated per byte buffer is 128 KB, so 
> the total memory allocated is 256 KB.  This may not sound like much.  However, if 
> the number of partitions (or number of reducers) is large (in the hundreds) 
> and/or there are spill files created in {{MapTask}}, this can grow to a few 
> hundred MB.
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal 
> garbage-collected heap, and so their impact upon the memory footprint of an 
> application might not be obvious.  It is therefore recommended that direct 
> buffers be allocated primarily for large, long-lived buffers that are subject 
> to the underlying system's native I/O operations.  In general it is best to 
> allocate direct buffers only when they yield a measurable gain in program 
> performance.
> {quote}
> It is not clear to me whether there is any benefit of allocating direct byte 
> buffers in {{CryptoOutputStream.java}}.  In fact, there is a slight CPU 
> overhead in moving data from {{outBuffer}} to a temporary byte array as per 
> the following code in {{CryptoOutputStream.java}}.
> {code}
> /*
>  * If underlying stream supports {@link ByteBuffer} write in future, needs
>  * refine here. 
>  */
> final byte[] tmp = getTmpBuf();
> outBuffer.get(tmp, 0, len);
> out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in 
> OS parlance), it is not clear whether it will yield any measurable 
> performance gain.
> The fix would be to allocate a {{ByteBuffer}} on the heap for {{inBuffer}} and wrap a 
> byte array in a {{ByteBuffer}} for {{outBuffer}}.  By the way, the 
> {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}} as demanded by the 
> {{encrypt()}} method in {{Encryptor}}.
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can 
> create a new constructor to {{CryptoOutputStream}} and pass a boolean flag 
> {{ownOutputStream}} to indicate whether the underlying stream will be owned 
> by {{CryptoOutputStream}}. If it is true, then calling the {{close()}} method 
> will close the underlying stream.  Otherwise, when {{close()}} is called only 
> the direct byte buffers will be freed and the underlying stream will not be 
> closed.
> The scope of changes for this fix will be somewhat wider.  We need to modify 
> {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} 
> as well to pass the ownership flag mentioned above.
> I can post a patch for either of the above.  I welcome any other ideas from 
> developers to fix this issue.
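
For illustration, a minimal sketch of "Possible Fix - 1" above (the shape is assumed, not the actual patch): heap-backed buffers still satisfy {{Encryptor.encrypt()}}, and a wrapped array would let {{write()}} skip the temporary-copy step quoted earlier.
{code}
import java.nio.ByteBuffer;

class HeapCryptoBuffers {
  static final int BUFFER_SIZE = 128 * 1024;

  // inBuffer on the heap instead of ByteBuffer.allocateDirect(BUFFER_SIZE)
  final ByteBuffer inBuffer = ByteBuffer.allocate(BUFFER_SIZE);

  // outBuffer wraps a plain byte[]; write() could hand the backing array
  // straight to the underlying stream without the getTmpBuf() copy
  final byte[] outArray = new byte[BUFFER_SIZE];
  final ByteBuffer outBuffer = ByteBuffer.wrap(outArray);
}
{code}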



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-04-08 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233063#comment-15233063
 ] 

Mariappan Asokan commented on MAPREDUCE-6628:
-

Uploaded a patch based on "Possible Fix - 2."
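
For readers following along, a minimal sketch of the ownership-flag idea from "Possible Fix - 2" (names and shape are my assumptions, not the attached patch):
{code}
import java.io.IOException;
import java.io.OutputStream;

class OwnedCryptoOutputStream extends OutputStream {
  private final OutputStream out;
  private final boolean ownOutputStream;

  OwnedCryptoOutputStream(OutputStream out, boolean ownOutputStream) {
    this.out = out;
    this.ownOutputStream = ownOutputStream;
  }

  @Override
  public void write(int b) throws IOException {
    out.write(b);
  }

  @Override
  public void close() throws IOException {
    freeBuffers();        // always release the direct byte buffers
    if (ownOutputStream) {
      out.close();        // close the underlying stream only if we own it
    }
  }

  private void freeBuffers() {
    // release inBuffer/outBuffer here
  }
}
{code}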


> Potential memory leak in CryptoOutputStream
> ---
>
> Key: MAPREDUCE-6628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-6628.001.patch, MAPREDUCE-6628.002.patch
>
>
> There is a potential memory leak in {{CryptoOutputStream.java}}.  It 
> allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that get 
> freed when the {{close()}} method is called.  Most of the time, {{close()}} 
> is called.  However, when writing to the intermediate Map output file or 
> the spill files in {{MapTask}}, {{close()}} is never called, since doing so 
> would close the underlying stream, which is not desirable.  There is a single 
> underlying physical stream that contains multiple logical streams, one per 
> partition of the Map output.
> By default the amount of memory allocated per byte buffer is 128 KB, so 
> the total memory allocated is 256 KB.  This may not sound like much.  However, if 
> the number of partitions (or number of reducers) is large (in the hundreds) 
> and/or there are spill files created in {{MapTask}}, this can grow to a few 
> hundred MB.
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal 
> garbage-collected heap, and so their impact upon the memory footprint of an 
> application might not be obvious.  It is therefore recommended that direct 
> buffers be allocated primarily for large, long-lived buffers that are subject 
> to the underlying system's native I/O operations.  In general it is best to 
> allocate direct buffers only when they yield a measurable gain in program 
> performance.
> {quote}
> It is not clear to me whether there is any benefit of allocating direct byte 
> buffers in {{CryptoOutputStream.java}}.  In fact, there is a slight CPU 
> overhead in moving data from {{outBuffer}} to a temporary byte array as per 
> the following code in {{CryptoOutputStream.java}}.
> {code}
> /*
>  * If underlying stream supports {@link ByteBuffer} write in future, needs
>  * refine here. 
>  */
> final byte[] tmp = getTmpBuf();
> outBuffer.get(tmp, 0, len);
> out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in 
> OS parlance), it is not clear whether it will yield any measurable 
> performance gain.
> The fix would be to allocate a {{ByteBuffer}} on the heap for {{inBuffer}} and wrap a 
> byte array in a {{ByteBuffer}} for {{outBuffer}}.  By the way, the 
> {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}} as demanded by the 
> {{encrypt()}} method in {{Encryptor}}.
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can 
> create a new constructor to {{CryptoOutputStream}} and pass a boolean flag 
> {{ownOutputStream}} to indicate whether the underlying stream will be owned 
> by {{CryptoOutputStream}}. If it is true, then calling the {{close()}} method 
> will close the underlying stream.  Otherwise, when {{close()}} is called only 
> the direct byte buffers will be freed and the underlying stream will not be 
> closed.
> The scope of changes for this fix will be somewhat wider.  We need to modify 
> {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} 
> as well to pass the ownership flag mentioned above.
> I can post a patch for either of the above.  I welcome any other ideas from 
> developers to fix this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-04-08 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-6628:

Status: Patch Available  (was: Open)

> Potential memory leak in CryptoOutputStream
> ---
>
> Key: MAPREDUCE-6628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-6628.001.patch, MAPREDUCE-6628.002.patch
>
>
> There is a potential memory leak in {{CryptoOutputStream.java}}.  It 
> allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that get 
> freed when the {{close()}} method is called.  Most of the time, {{close()}} 
> is called.  However, when writing to the intermediate Map output file or 
> the spill files in {{MapTask}}, {{close()}} is never called, since doing so 
> would close the underlying stream, which is not desirable.  There is a single 
> underlying physical stream that contains multiple logical streams, one per 
> partition of the Map output.
> By default the amount of memory allocated per byte buffer is 128 KB, so 
> the total memory allocated is 256 KB.  This may not sound like much.  However, if 
> the number of partitions (or number of reducers) is large (in the hundreds) 
> and/or there are spill files created in {{MapTask}}, this can grow to a few 
> hundred MB.
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal 
> garbage-collected heap, and so their impact upon the memory footprint of an 
> application might not be obvious.  It is therefore recommended that direct 
> buffers be allocated primarily for large, long-lived buffers that are subject 
> to the underlying system's native I/O operations.  In general it is best to 
> allocate direct buffers only when they yield a measurable gain in program 
> performance.
> {quote}
> It is not clear to me whether there is any benefit of allocating direct byte 
> buffers in {{CryptoOutputStream.java}}.  In fact, there is a slight CPU 
> overhead in moving data from {{outBuffer}} to a temporary byte array as per 
> the following code in {{CryptoOutputStream.java}}.
> {code}
> /*
>  * If underlying stream supports {@link ByteBuffer} write in future, needs
>  * refine here. 
>  */
> final byte[] tmp = getTmpBuf();
> outBuffer.get(tmp, 0, len);
> out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in 
> OS parlance), it is not clear whether it will yield any measurable 
> performance gain.
> The fix would be to allocate a {{ByteBuffer}} on the heap for {{inBuffer}} and wrap a 
> byte array in a {{ByteBuffer}} for {{outBuffer}}.  By the way, the 
> {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}} as demanded by the 
> {{encrypt()}} method in {{Encryptor}}.
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can 
> create a new constructor to {{CryptoOutputStream}} and pass a boolean flag 
> {{ownOutputStream}} to indicate whether the underlying stream will be owned 
> by {{CryptoOutputStream}}. If it is true, then calling the {{close()}} method 
> will close the underlying stream.  Otherwise, when {{close()}} is called only 
> the direct byte buffers will be freed and the underlying stream will not be 
> closed.
> The scope of changes for this fix will be somewhat wider.  We need to modify 
> {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} 
> as well to pass the ownership flag mentioned above.
> I can post a patch for either of the above.  I welcome any other ideas from 
> developers to fix this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-04-08 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-6628:

Attachment: MAPREDUCE-6628.002.patch

> Potential memory leak in CryptoOutputStream
> ---
>
> Key: MAPREDUCE-6628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-6628.001.patch, MAPREDUCE-6628.002.patch
>
>
> There is a potential memory leak in {{CryptoOutputStream.java}}.  It 
> allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that get 
> freed when the {{close()}} method is called.  Most of the time, {{close()}} 
> is called.  However, when writing to the intermediate Map output file or 
> the spill files in {{MapTask}}, {{close()}} is never called, since doing so 
> would close the underlying stream, which is not desirable.  There is a single 
> underlying physical stream that contains multiple logical streams, one per 
> partition of the Map output.
> By default the amount of memory allocated per byte buffer is 128 KB, so 
> the total memory allocated is 256 KB.  This may not sound like much.  However, if 
> the number of partitions (or number of reducers) is large (in the hundreds) 
> and/or there are spill files created in {{MapTask}}, this can grow to a few 
> hundred MB.
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal 
> garbage-collected heap, and so their impact upon the memory footprint of an 
> application might not be obvious.  It is therefore recommended that direct 
> buffers be allocated primarily for large, long-lived buffers that are subject 
> to the underlying system's native I/O operations.  In general it is best to 
> allocate direct buffers only when they yield a measurable gain in program 
> performance.
> {quote}
> It is not clear to me whether there is any benefit of allocating direct byte 
> buffers in {{CryptoOutputStream.java}}.  In fact, there is a slight CPU 
> overhead in moving data from {{outBuffer}} to a temporary byte array as per 
> the following code in {{CryptoOutputStream.java}}.
> {code}
> /*
>  * If underlying stream supports {@link ByteBuffer} write in future, needs
>  * refine here. 
>  */
> final byte[] tmp = getTmpBuf();
> outBuffer.get(tmp, 0, len);
> out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in 
> OS parlance), it is not clear whether it will yield any measurable 
> performance gain.
> The fix would be to allocate a {{ByteBuffer}} on the heap for {{inBuffer}} and wrap a 
> byte array in a {{ByteBuffer}} for {{outBuffer}}.  By the way, the 
> {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}} as demanded by the 
> {{encrypt()}} method in {{Encryptor}}.
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can 
> create a new constructor to {{CryptoOutputStream}} and pass a boolean flag 
> {{ownOutputStream}} to indicate whether the underlying stream will be owned 
> by {{CryptoOutputStream}}. If it is true, then calling the {{close()}} method 
> will close the underlying stream.  Otherwise, when {{close()}} is called only 
> the direct byte buffers will be freed and the underlying stream will not be 
> closed.
> The scope of changes for this fix will be somewhat wider.  We need to modify 
> {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} 
> as well to pass the ownership flag mentioned above.
> I can post a patch for either of the above.  I welcome any other ideas from 
> developers to fix this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6302) Preempt reducers after a configurable timeout irrespective of headroom

2016-04-08 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233001#comment-15233001
 ] 

Wangda Tan commented on MAPREDUCE-6302:
---

Done, committed to branch-2.6 and branch-2.7.

> Preempt reducers after a configurable timeout irrespective of headroom
> --
>
> Key: MAPREDUCE-6302
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: mai shurong
>Assignee: Karthik Kambatla
>Priority: Critical
> Fix For: 2.8.0, 2.7.3, 2.6.5
>
> Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, 
> MAPREDUCE-6302.branch-2.6.0001.patch, MAPREDUCE-6302.branch-2.7.0001.patch, 
> log.txt, mr-6302-1.patch, mr-6302-2.patch, mr-6302-3.patch, mr-6302-4.patch, 
> mr-6302-5.patch, mr-6302-6.patch, mr-6302-7.patch, mr-6302-prelim.patch, 
> mr-6302_branch-2.patch, queue_with_max163cores.png, 
> queue_with_max263cores.png, queue_with_max333cores.png
>
>
> I submitted a big job, which has 500 maps and 350 reduces, to a 
> queue (fair scheduler) with 300 max cores. When the big MapReduce job is 
> running 100% of its maps, the 300 reduces occupy the 300 max cores in the queue. 
> Then a map fails and retries, waiting for a core, while the 300 reduces 
> are waiting for the failed map to finish, so a deadlock occurs. As a result, the 
> job is blocked, and later jobs in the queue cannot run because there are no 
> available cores in the queue.
> I think there is a similar issue for the memory of a queue.
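
For context, a sketch of how a job might opt into the configurable timeout this issue adds (the property name is my recollection of the patch and should be treated as an assumption; verify it against the committed mapred-default.xml):
{code}
import org.apache.hadoop.conf.Configuration;

public class ReducerPreemptionConfig {
  public static void main(String[] args) {
    // Assumed property name: preempt reducers after the given delay
    // (in seconds) regardless of headroom.
    Configuration conf = new Configuration();
    conf.setInt("mapreduce.job.reducer.unconditional-preempt.delay.sec", 300);
  }
}
{code}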



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6302) Preempt reducers after a configurable timeout irrespective of headroom

2016-04-08 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated MAPREDUCE-6302:
--
Fix Version/s: 2.6.5
   2.7.3

> Preempt reducers after a configurable timeout irrespective of headroom
> --
>
> Key: MAPREDUCE-6302
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: mai shurong
>Assignee: Karthik Kambatla
>Priority: Critical
> Fix For: 2.8.0, 2.7.3, 2.6.5
>
> Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, 
> MAPREDUCE-6302.branch-2.6.0001.patch, MAPREDUCE-6302.branch-2.7.0001.patch, 
> log.txt, mr-6302-1.patch, mr-6302-2.patch, mr-6302-3.patch, mr-6302-4.patch, 
> mr-6302-5.patch, mr-6302-6.patch, mr-6302-7.patch, mr-6302-prelim.patch, 
> mr-6302_branch-2.patch, queue_with_max163cores.png, 
> queue_with_max263cores.png, queue_with_max333cores.png
>
>
> I submitted a big job, which has 500 maps and 350 reduces, to a 
> queue (fair scheduler) with 300 max cores. When the big MapReduce job is 
> running 100% of its maps, the 300 reduces occupy the 300 max cores in the queue. 
> Then a map fails and retries, waiting for a core, while the 300 reduces 
> are waiting for the failed map to finish, so a deadlock occurs. As a result, the 
> job is blocked, and later jobs in the queue cannot run because there are no 
> available cores in the queue.
> I think there is a similar issue for the memory of a queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails

2016-04-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232997#comment-15232997
 ] 

Eric Badger commented on MAPREDUCE-6658:


TestContainerManagerSecurity is tracked by 
[YARN-4342|https://issues.apache.org/jira/browse/YARN-4342]. Sorry for the 
typo. 

> TestMRJobs fails
> 
>
> Key: MAPREDUCE-6658
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Eric Badger
> Attachments: MAPREDUCE-6658.001.patch
>
>
> TestMRJobs#testJobWithChangePriority fails.
> {noformat}
> Running org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec 
> <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs
> testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs)  Time 
> elapsed: 21.477 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6302) Preempt reducers after a configurable timeout irrespective of headroom

2016-04-08 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232987#comment-15232987
 ] 

Wangda Tan commented on MAPREDUCE-6302:
---

Apologies, I forgot to backport the patches to the maintenance releases. Doing it now.

> Preempt reducers after a configurable timeout irrespective of headroom
> --
>
> Key: MAPREDUCE-6302
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: mai shurong
>Assignee: Karthik Kambatla
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, 
> MAPREDUCE-6302.branch-2.6.0001.patch, MAPREDUCE-6302.branch-2.7.0001.patch, 
> log.txt, mr-6302-1.patch, mr-6302-2.patch, mr-6302-3.patch, mr-6302-4.patch, 
> mr-6302-5.patch, mr-6302-6.patch, mr-6302-7.patch, mr-6302-prelim.patch, 
> mr-6302_branch-2.patch, queue_with_max163cores.png, 
> queue_with_max263cores.png, queue_with_max333cores.png
>
>
> I submitted a big job, which has 500 maps and 350 reduces, to a 
> queue (fair scheduler) with 300 max cores. When the big MapReduce job is 
> running 100% of its maps, the 300 reduces occupy the 300 max cores in the queue. 
> Then a map fails and retries, waiting for a core, while the 300 reduces 
> are waiting for the failed map to finish, so a deadlock occurs. As a result, the 
> job is blocked, and later jobs in the queue cannot run because there are no 
> available cores in the queue.
> I think there is a similar issue for the memory of a queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5397) AM crashes because Webapp failed to start on multi node cluster

2016-04-08 Thread Joshua Snyder (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232983#comment-15232983
 ] 

Joshua Snyder commented on MAPREDUCE-5397:
--

I've seen this exact error when users consumed all the inodes in the /tmp file 
system.  Is there any way to have Jetty extract to a particular directory other 
than /tmp?
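
For what it's worth, Jetty unpacks webapps under {{java.io.tmpdir}} when no explicit temp directory is configured, so redirecting that property for the AM's JVM should move the extraction directory; a sketch (the path is an example, and setting it via the AM command opts is my assumption):
{code}
import org.apache.hadoop.conf.Configuration;

public class AmTmpDirConfig {
  public static void main(String[] args) {
    // Point the MR AM's JVM at a temp directory other than /tmp.
    // (Example path; -Xmx matches the property's usual default.)
    Configuration conf = new Configuration();
    conf.set("yarn.app.mapreduce.am.command-opts",
        "-Xmx1024m -Djava.io.tmpdir=/grid/0/tmp");
  }
}
{code}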

> AM crashes because Webapp failed to start on multi node cluster
> ---
>
> Key: MAPREDUCE-5397
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5397
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jian He
> Attachments: MRAppMasterlog.txt, log.txt
>
>
> I set up a 12-node cluster and tried submitting jobs, but got this exception.
> The job is able to succeed after the AM crashes and retries a few times (2 or 3).
> {code}
> 2013-07-12 18:56:28,438 INFO [main] org.mortbay.log: Extract 
> jar:file:/grid/0/dev/jhe/hadoop-2.1.0-beta/share/hadoop/yarn/hadoop-yarn-common-2.1.0-beta.jar!/webapps/mapreduce
>  to /tmp/Jetty_0_0_0_0_43554_mapreduceljbmlg/webapp
> 2013-07-12 18:56:28,528 WARN [main] org.mortbay.log: Failed startup of 
> context 
> org.mortbay.jetty.webapp.WebAppContext@2726b2{/,jar:file:/grid/0/dev/jhe/hadoop-2.1.0-beta/share/hadoop/yarn/hadoop-yarn-common-2.1.0-beta.jar!/webapps/mapreduce}
> java.io.FileNotFoundException: 
> /tmp/Jetty_0_0_0_0_43554_mapreduceljbmlg/webapp/webapps/mapreduce/.keep 
> (No such file or directory)
>   at java.io.FileOutputStream.open(Native Method)
>   at java.io.FileOutputStream.(FileOutputStream.java:194)
>   at java.io.FileOutputStream.(FileOutputStream.java:145)
>   at org.mortbay.resource.JarResource.extract(JarResource.java:215)
>   at 
> org.mortbay.jetty.webapp.WebAppContext.resolveWebApp(WebAppContext.java:974)
>   at 
> org.mortbay.jetty.webapp.WebAppContext.getWebInf(WebAppContext.java:832)
>   at 
> org.mortbay.jetty.webapp.WebInfConfiguration.configureClassLoader(WebInfConfiguration.java:62)
>   at 
> org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:489)
>   at 
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>   at 
> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
>   at 
> org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
>   at 
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>   at 
> org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
>   at org.mortbay.jetty.Server.doStart(Server.java:224)
>   at 
> org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
>   at org.apache.hadoop.http.HttpServer.start(HttpServer.java:684)
>   at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:211)
>   at 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService.serviceStart(MRClientService.java:134)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:101)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1019)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1394)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1390)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails

2016-04-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232840#comment-15232840
 ] 

Eric Badger commented on MAPREDUCE-6658:


TestMiniYarnClusterNodeUtilization is tracked by 
[YARN-4453|https://issues.apache.org/jira/browse/YARN-4453]

TestMiniYarnClusterNodeUtilization is tracked by 
[YARN-4342|https://issues.apache.org/jira/browse/YARN-4342]

It concerns me a little that both of these tests failed twice in a row with 
the addition of my patch. However, I am unable to reproduce the failure on my 
local Mac or Linux boxes, even under high-CPU load (similar to Jenkins). 

> TestMRJobs fails
> 
>
> Key: MAPREDUCE-6658
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Eric Badger
> Attachments: MAPREDUCE-6658.001.patch
>
>
> TestMRJobs#testJobWithChangePriority fails.
> {noformat}
> Running org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec 
> <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs
> testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs)  Time 
> elapsed: 21.477 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232778#comment-15232778
 ] 

Varun Saxena commented on MAPREDUCE-6513:
-

To fix the checkstyle issue I would need to change the indentation of 
surrounding code that does not otherwise need to change, so I have left it as it 
is.

Regarding checking the priority as compared to the rescheduled event: the 
priority is set in RMContainerAllocator, and TestMRApp uses a custom 
allocator, so we cannot check it there.
We can, however, check the ContainerRequestEvent and see whether the flag for an 
earlier failed map task attempt is set. If it is set, RMContainerAllocator will set 
the priority of the next map task to 5.
And we have coverage in TestRMContainerAllocator for that part of the flow.

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch, 
> MAPREDUCE-6513.03.patch
>
>
> While a job with many tasks was in progress, one node became unstable 
> due to an OS issue. After the node became unstable, the status of the maps on 
> this node changed to the KILLED state.
> The maps that were running on the unstable node were rescheduled, and all 
> are in the scheduled state waiting for the RM to assign containers. Ask requests 
> for the maps were seen until the node became good again (all of those failed); 
> there are no ask requests after that. But the AM keeps on preempting the 
> reducers (it's recycling).
> Finally the reducers are waiting for the mappers to complete, and the mappers 
> never got containers.
> My question is:
> 
> Why were map requests not sent by the AM once the node recovered?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5937) hadoop/mapred job -history shows the counters twice in the output.

2016-04-08 Thread Andres Perez (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232603#comment-15232603
 ] 

Andres Perez commented on MAPREDUCE-5937:
-

The thing is that TestCounters#testLegacyGetGroupsNames actually expects 
to find duplicates (because it expects to find the group name plus the legacy 
name of the same group), so modifying AbstractCounters#getGroupNames so it 
doesn't return duplicates will always make that test fail.
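
Given that constraint, one alternative is to de-duplicate only on the display side; a minimal sketch (hypothetical helper, not the actual patch):
{code}
import java.util.LinkedHashSet;
import java.util.Set;

final class CounterDisplayUtil {
  // Drop duplicate group names while preserving first-seen order, leaving
  // AbstractCounters#getGroupNames (and the legacy-name test) untouched.
  static Iterable<String> dedupGroupNames(Iterable<String> groupNames) {
    Set<String> unique = new LinkedHashSet<>();
    for (String name : groupNames) {
      unique.add(name);
    }
    return unique;
  }
}
{code}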

> hadoop/mapred job -history  shows the counters twice in the 
> output.
> -
>
> Key: MAPREDUCE-5937
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5937
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.2.0, 2.7.2
>Reporter: Jinghui Wang
>Assignee: Andres Perez
>  Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5937-branch-2.7-02.patch, 
> MAPREDUCE-5937-branch-2.7.2.002.patch, MAPREDUCE-5937.patch, 
> job_history_cli_sample.out
>
>
> The HistoryViewer#printCounters method uses AbstractCounters#getGroupNames, which 
> includes legacy groups and can cause duplicates in the CLI output.
> See the attached example output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails

2016-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232479#comment-15232479
 ] 

Hadoop QA commented on MAPREDUCE-6658:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 53s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 48s {color} 
| {color:red} hadoop-yarn-server-tests in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 45s {color} 
| {color:red} hadoop-yarn-server-tests in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 37s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.yarn.server.TestMiniYarnClusterNodeUtilization |
|   | hadoop.yarn.server.TestContainerManagerSecurity |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.server.TestMiniYarnClusterNodeUtilization |
|   | hadoop.yarn.server.TestContainerManagerSecurity |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12797375/MAPREDUCE-6658.001.patch
 |
| JIRA Issue | MAPREDUCE-6658 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | 

[jira] [Commented] (MAPREDUCE-6671) Incorrect error message while setting dfs.block.size to wrong value

2016-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232419#comment-15232419
 ] 

Hadoop QA commented on MAPREDUCE-6671:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} MAPREDUCE-6671 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12797741/MAPREDUCE-6671.1.patch
 |
| JIRA Issue | MAPREDUCE-6671 |
| Console output | 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6423/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Incorrect error message while setting dfs.block.size to wrong value
> ---
>
> Key: MAPREDUCE-6671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6671
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
> Attachments: MAPREDUCE-6671.1.patch
>
>
> Execute in Hive
> {code}
> hive> SET dfs.block.size=3200; 
> hive> select count(*) from test;
> {code}
> See logs
> {code}
> Query ID = vagrant_20160408135656_fd1937b3-b330-4d54-842a-0f3ec544ceea
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=
> Starting Job = job_1460123221842_0001, Tracking URL = 
> http://cdh-master:8088/proxy/application_1460123221842_0001/
> Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1460123221842_0001
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 
> 1
> 2016-04-08 13:57:18,494 Stage-1 map = 0%,  reduce = 0%
> 2016-04-08 13:58:06,821 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_1460123221842_0001 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1460123221842_0001_m_00 (and more) from job 
> job_1460123221842_0001
> Task with the most failures(4): 
> -
> Task ID:
>   task_1460123221842_0001_m_00
> URL:
>   
> http://cdh-master:8088/taskdetails.jsp?jobid=job_1460123221842_0001=task_1460123221842_0001_m_00
> -
> Diagnostic Messages for this Task:
> Exception from container-launch.
> Container id: container_1460123221842_0001_01_05
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1: 
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
>   at org.apache.hadoop.util.Shell.run(Shell.java:478)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
> {code}
> We need a more informative error message here.
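
The kind of up-front validation that would surface a clearer message (an illustrative sketch; the names and threshold are assumptions, not the attached patch):
{code}
import java.io.IOException;

final class BlockSizeCheck {
  // Fail fast with a descriptive message instead of letting the task die
  // with a bare "Container exited with a non-zero exit code 1".
  static void checkBlockSize(long blockSize, long minBlockSize) throws IOException {
    if (blockSize < minBlockSize) {
      throw new IOException("Requested block size " + blockSize
          + " is below the configured minimum of " + minBlockSize);
    }
  }
}
{code}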



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6671) Incorrect error message while setting dfs.block.size to wrong value

2016-04-08 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated MAPREDUCE-6671:

Description: 
Execute in Hive

{code}
hive> SET dfs.block.size=3200; 
hive> select count(*) from test;
{code}

See logs

{code}
Query ID = vagrant_20160408135656_fd1937b3-b330-4d54-842a-0f3ec544ceea
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Starting Job = job_1460123221842_0001, Tracking URL = 
http://cdh-master:8088/proxy/application_1460123221842_0001/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1460123221842_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-04-08 13:57:18,494 Stage-1 map = 0%,  reduce = 0%
2016-04-08 13:58:06,821 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_1460123221842_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1460123221842_0001_m_00 (and more) from job 
job_1460123221842_0001

Task with the most failures(4): 
-
Task ID:
  task_1460123221842_0001_m_00

URL:
  
http://cdh-master:8088/taskdetails.jsp?jobid=job_1460123221842_0001=task_1460123221842_0001_m_00
-
Diagnostic Messages for this Task:
Exception from container-launch.
Container id: container_1460123221842_0001_01_05
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
{code}

We need a more informative error message here.

  was:
Execute in Hive

{code}
hive> SET dfs.block.size=3200; 
hive> select count(*) from test;
{code}

Query ID = vagrant_20160408135656_fd1937b3-b330-4d54-842a-0f3ec544ceea
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Starting Job = job_1460123221842_0001, Tracking URL = 
http://cdh-master:8088/proxy/application_1460123221842_0001/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1460123221842_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-04-08 13:57:18,494 Stage-1 map = 0%,  reduce = 0%
2016-04-08 13:58:06,821 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_1460123221842_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1460123221842_0001_m_00 (and more) from job 
job_1460123221842_0001

Task with the most failures(4): 
-
Task ID:
  task_1460123221842_0001_m_00

URL:
  
http://cdh-master:8088/taskdetails.jsp?jobid=job_1460123221842_0001=task_1460123221842_0001_m_00
-
Diagnostic Messages for this Task:
Exception from container-launch.
Container id: container_1460123221842_0001_01_05
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at 

[jira] [Updated] (MAPREDUCE-6671) Incorrect error message while setting dfs.block.size to wrong value

2016-04-08 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated MAPREDUCE-6671:

Attachment: MAPREDUCE-6671.1.patch

> Incorrect error message while setting dfs.block.size to wrong value
> ---
>
> Key: MAPREDUCE-6671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6671
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
> Attachments: MAPREDUCE-6671.1.patch
>
>
> Execute in Hive
> {code}
> hive> SET dfs.block.size=3200; 
> hive> select count(*) from test;
> {code}
> Query ID = vagrant_20160408135656_fd1937b3-b330-4d54-842a-0f3ec544ceea
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1460123221842_0001, Tracking URL = 
> http://cdh-master:8088/proxy/application_1460123221842_0001/
> Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1460123221842_0001
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 
> 1
> 2016-04-08 13:57:18,494 Stage-1 map = 0%,  reduce = 0%
> 2016-04-08 13:58:06,821 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_1460123221842_0001 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1460123221842_0001_m_00 (and more) from job 
> job_1460123221842_0001
> Task with the most failures(4): 
> -
> Task ID:
>   task_1460123221842_0001_m_00
> URL:
>   
> http://cdh-master:8088/taskdetails.jsp?jobid=job_1460123221842_0001&tipid=task_1460123221842_0001_m_00
> -
> Diagnostic Messages for this Task:
> Exception from container-launch.
> Container id: container_1460123221842_0001_01_05
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1: 
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
>   at org.apache.hadoop.util.Shell.run(Shell.java:478)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6671) Incorrect error message while setting dfs.block.size to wrong value

2016-04-08 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated MAPREDUCE-6671:

Status: Patch Available  (was: In Progress)

> Incorrect error message while setting dfs.block.size to wrong value
> ---
>
> Key: MAPREDUCE-6671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6671
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
> Attachments: MAPREDUCE-6671.1.patch
>
>
> Execute in Hive
> {code}
> hive> SET dfs.block.size=3200; 
> hive> select count(*) from test;
> {code}
> Query ID = vagrant_20160408135656_fd1937b3-b330-4d54-842a-0f3ec544ceea
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1460123221842_0001, Tracking URL = 
> http://cdh-master:8088/proxy/application_1460123221842_0001/
> Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1460123221842_0001
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 
> 1
> 2016-04-08 13:57:18,494 Stage-1 map = 0%,  reduce = 0%
> 2016-04-08 13:58:06,821 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_1460123221842_0001 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1460123221842_0001_m_00 (and more) from job 
> job_1460123221842_0001
> Task with the most failures(4): 
> -
> Task ID:
>   task_1460123221842_0001_m_00
> URL:
>   
> http://cdh-master:8088/taskdetails.jsp?jobid=job_1460123221842_0001&tipid=task_1460123221842_0001_m_00
> -
> Diagnostic Messages for this Task:
> Exception from container-launch.
> Container id: container_1460123221842_0001_01_05
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1: 
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
>   at org.apache.hadoop.util.Shell.run(Shell.java:478)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (MAPREDUCE-6671) Incorrect error message while setting dfs.block.size to wrong value

2016-04-08 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-6671 started by Oleksiy Sayankin.
---
> Incorrect error message while setting dfs.block.size to wrong value
> ---
>
> Key: MAPREDUCE-6671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6671
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>
> Execute in Hive
> {code}
> hive> SET dfs.block.size=3200; 
> hive> select count(*) from test;
> {code}
> Query ID = vagrant_20160408135656_fd1937b3-b330-4d54-842a-0f3ec544ceea
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1460123221842_0001, Tracking URL = 
> http://cdh-master:8088/proxy/application_1460123221842_0001/
> Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1460123221842_0001
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 
> 1
> 2016-04-08 13:57:18,494 Stage-1 map = 0%,  reduce = 0%
> 2016-04-08 13:58:06,821 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_1460123221842_0001 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1460123221842_0001_m_00 (and more) from job 
> job_1460123221842_0001
> Task with the most failures(4): 
> -
> Task ID:
>   task_1460123221842_0001_m_00
> URL:
>   
> http://cdh-master:8088/taskdetails.jsp?jobid=job_1460123221842_0001&tipid=task_1460123221842_0001_m_00
> -
> Diagnostic Messages for this Task:
> Exception from container-launch.
> Container id: container_1460123221842_0001_01_05
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1: 
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
>   at org.apache.hadoop.util.Shell.run(Shell.java:478)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Container exited with a non-zero exit code 1
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6671) Incorrect error message while setting dfs.block.size to wrong value

2016-04-08 Thread Oleksiy Sayankin (JIRA)
Oleksiy Sayankin created MAPREDUCE-6671:
---

 Summary: Incorrect error message while setting dfs.block.size to 
wrong value
 Key: MAPREDUCE-6671
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6671
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Oleksiy Sayankin
Assignee: Oleksiy Sayankin


Execute in Hive

{code}
hive> SET dfs.block.size=3200; 
hive> select count(*) from test;
{code}

Query ID = vagrant_20160408135656_fd1937b3-b330-4d54-842a-0f3ec544ceea
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1460123221842_0001, Tracking URL = 
http://cdh-master:8088/proxy/application_1460123221842_0001/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1460123221842_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2016-04-08 13:57:18,494 Stage-1 map = 0%,  reduce = 0%
2016-04-08 13:58:06,821 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_1460123221842_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1460123221842_0001_m_00 (and more) from job 
job_1460123221842_0001

Task with the most failures(4): 
-
Task ID:
  task_1460123221842_0001_m_00

URL:
  
http://cdh-master:8088/taskdetails.jsp?jobid=job_1460123221842_0001&tipid=task_1460123221842_0001_m_00
-
Diagnostic Messages for this Task:
Exception from container-launch.
Container id: container_1460123221842_0001_01_05
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails

2016-04-08 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232375#comment-15232375
 ] 

Eric Badger commented on MAPREDUCE-6658:


[~eepayne], [~jlowe], [~kasha], can one of you review this patch? It's a very 
small change that will fix some recurring test failures.
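
For reviewers' context, the assertion below compares the job's priority right 
after it is changed; if that races with propagation of the change, one 
plausible remedy is to poll with a timeout. A sketch under that assumption 
(hypothetical helper, not necessarily what the attached patch does):

{code}
// Hypothetical sketch: wait for a priority change to propagate instead of
// asserting immediately after Job#setPriority().
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobPriority;

public class PriorityTestUtil {
  public static void waitForPriority(Job job, JobPriority expected,
      long timeoutMs) throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      // Job#getStatus() fetches a fresh JobStatus from the cluster.
      if (expected.equals(job.getStatus().getPriority())) {
        return;
      }
      Thread.sleep(100);  // brief back-off before re-checking
    }
    throw new AssertionError("Job priority did not become " + expected
        + " within " + timeoutMs + " ms");
  }
}
{code}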

> TestMRJobs fails
> 
>
> Key: MAPREDUCE-6658
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Eric Badger
> Attachments: MAPREDUCE-6658.001.patch
>
>
> TestMRJobs#testJobWithChangePriority fails.
> {noformat}
> Running org.apache.hadoop.mapreduce.v2.TestMRJobs
> Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec 
> <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs
> testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs)  Time 
> elapsed: 21.477 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232088#comment-15232088
 ] 

Hadoop QA commented on MAPREDUCE-6513:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s 
{color} | {color:red} 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: 
patch generated 1 new + 548 unchanged - 1 fixed = 549 total (was 549) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 10s 
{color} | {color:green} hadoop-mapreduce-client-app in the patch passed with 
JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 47s 
{color} | {color:green} hadoop-mapreduce-client-app in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 40s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12797710/MAPREDUCE-6513.03.patch
 |
| JIRA Issue | MAPREDUCE-6513 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 81a3b046ca4a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6513:

Status: Patch Available  (was: Open)

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch, 
> MAPREDUCE-6513.03.patch
>
>
> While a job with many tasks was in progress, one node became unstable due to 
> an OS issue. After the node became unstable, the status of the map on that 
> node changed to KILLED.
> The maps that were running on the unstable node are rescheduled, and they all 
> sit in the SCHEDULED state waiting for the RM to assign containers. Ask 
> requests for the maps were seen until the node became healthy again (all of 
> those failed), but there are no ask requests after that. Meanwhile the AM 
> keeps preempting the reducers (it keeps recycling them).
> In the end the reducers wait for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> why did the AM not send the map requests again once the node recovered?
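
The invariant at stake can be shown with a toy sketch (hypothetical types, not 
the actual RMContainerAllocator code): when an attempt is killed because its 
node went bad, the container request for the rescheduled attempt must be 
re-added to the ask list, otherwise the AM never requests the map again while 
it keeps preempting reducers.

{code}
// Toy model of the ask bookkeeping; all names here are illustrative.
import java.util.HashMap;
import java.util.Map;

class AskTracker {
  private final Map<String, Integer> pendingAsks = new HashMap<>();

  // Called when a map attempt is scheduled: record one outstanding ask.
  void addAsk(String taskId) {
    pendingAsks.merge(taskId, 1, Integer::sum);
  }

  // Called when the RM allocates a container for the task.
  void onContainerAllocated(String taskId) {
    pendingAsks.merge(taskId, -1, Integer::sum);
  }

  // Called when an attempt is killed because its node went bad: the ask must
  // be re-added for the rescheduled attempt, or the RM never allocates a new
  // container and the job hangs exactly as described above.
  void onAttemptKilled(String taskId) {
    addAsk(taskId);
  }

  // Outstanding asks for a task; must stay > 0 while it has no container.
  int pending(String taskId) {
    return pendingAsks.getOrDefault(taskId, 0);
  }
}
{code}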



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6513:

Attachment: MAPREDUCE-6513.03.patch

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch, 
> MAPREDUCE-6513.03.patch
>
>
> While a job with many tasks was in progress, one node became unstable due to 
> an OS issue. After the node became unstable, the status of the map on that 
> node changed to KILLED.
> The maps that were running on the unstable node are rescheduled, and they all 
> sit in the SCHEDULED state waiting for the RM to assign containers. Ask 
> requests for the maps were seen until the node became healthy again (all of 
> those failed), but there are no ask requests after that. Meanwhile the AM 
> keeps preempting the reducers (it keeps recycling them).
> In the end the reducers wait for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> why did the AM not send the map requests again once the node recovered?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6513:

Status: Open  (was: Patch Available)

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch
>
>
> While a job with many tasks was in progress, one node became unstable due to 
> an OS issue. After the node became unstable, the status of the map on that 
> node changed to KILLED.
> The maps that were running on the unstable node are rescheduled, and they all 
> sit in the SCHEDULED state waiting for the RM to assign containers. Ask 
> requests for the maps were seen until the node became healthy again (all of 
> those failed), but there are no ask requests after that. Meanwhile the AM 
> keeps preempting the reducers (it keeps recycling them).
> In the end the reducers wait for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> why did the AM not send the map requests again once the node recovered?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232011#comment-15232011
 ] 

Hadoop QA commented on MAPREDUCE-6513:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: 
patch generated 2 new + 548 unchanged - 1 fixed = 550 total (was 549) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 9s 
{color} | {color:green} hadoop-mapreduce-client-app in the patch passed with 
JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 48s 
{color} | {color:green} hadoop-mapreduce-client-app in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 23s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12797702/MAPREDUCE-6513.02.patch
 |
| JIRA Issue | MAPREDUCE-6513 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 568c8ea75ff0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6513:

Status: Patch Available  (was: Open)

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch
>
>
> While a job with many tasks was in progress, one node became unstable due to 
> an OS issue. After the node became unstable, the status of the map on that 
> node changed to KILLED.
> The maps that were running on the unstable node are rescheduled, and they all 
> sit in the SCHEDULED state waiting for the RM to assign containers. Ask 
> requests for the maps were seen until the node became healthy again (all of 
> those failed), but there are no ask requests after that. Meanwhile the AM 
> keeps preempting the reducers (it keeps recycling them).
> In the end the reducers wait for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> why did the AM not send the map requests again once the node recovered?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6513:

Attachment: MAPREDUCE-6513.02.patch

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch
>
>
> While a job with many tasks was in progress, one node became unstable due to 
> an OS issue. After the node became unstable, the status of the map on that 
> node changed to KILLED.
> The maps that were running on the unstable node are rescheduled, and they all 
> sit in the SCHEDULED state waiting for the RM to assign containers. Ask 
> requests for the maps were seen until the node became healthy again (all of 
> those failed), but there are no ask requests after that. Meanwhile the AM 
> keeps preempting the reducers (it keeps recycling them).
> In the end the reducers wait for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> why did the AM not send the map requests again once the node recovered?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6513:

Attachment: (was: MAPREDUCE-6513.02.patch)

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch
>
>
> While a job with many tasks was in progress, one node became unstable due to 
> an OS issue. After the node became unstable, the status of the map on that 
> node changed to KILLED.
> The maps that were running on the unstable node are rescheduled, and they all 
> sit in the SCHEDULED state waiting for the RM to assign containers. Ask 
> requests for the maps were seen until the node became healthy again (all of 
> those failed), but there are no ask requests after that. Meanwhile the AM 
> keeps preempting the reducers (it keeps recycling them).
> In the end the reducers wait for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> why did the AM not send the map requests again once the node recovered?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-04-08 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated MAPREDUCE-6513:

Attachment: MAPREDUCE-6513.02.patch

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch
>
>
> While a job with many tasks was in progress, one node became unstable due to 
> an OS issue. After the node became unstable, the status of the map on that 
> node changed to KILLED.
> The maps that were running on the unstable node are rescheduled, and they all 
> sit in the SCHEDULED state waiting for the RM to assign containers. Ask 
> requests for the maps were seen until the node became healthy again (all of 
> those failed), but there are no ask requests after that. Meanwhile the AM 
> keeps preempting the reducers (it keeps recycling them).
> In the end the reducers wait for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> why did the AM not send the map requests again once the node recovered?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)