[jira] [Updated] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6775:
-
Resolution: Invalid
Status: Resolved  (was: Patch Available)

Resolved this as it will be handled in the original issue.

> Fix MapReduce failures caused by default RPC engine changing
> 
>
> Key: MAPREDUCE-6775
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6775
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: MAPREDUCE-6775-v1.patch
>
>
> HADOOP-13218 changed the default RPC engine, which isn't appropriate because 
> MAPREDUCE-6706, the issue that would let TaskUmbilicalProtocol use 
> ProtobufRPCEngine, isn't solved yet.
> [~jlowe] reported the following errors:
> {noformat}
> 2016-09-07 17:51:56,296 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.reflect.UndeclaredThrowableException
>   at com.sun.proxy.$Proxy10.getTask(Unknown Source)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:137)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters for request. Method: [getTask], Expected: 2, Actual: 1
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:199)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   ... 2 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15471879#comment-15471879
 ] 

Hadoop QA commented on MAPREDUCE-6775:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 44s {color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 36m 32s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ipc.TestRPCWaitForProxy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12827447/MAPREDUCE-6775-v1.patch |
| JIRA Issue | MAPREDUCE-6775 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux e80b785c3282 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f414d5e |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6707/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6707/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6707/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6707/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.



> Fix MapReduce failures caused by default RPC engine changing
> 

[jira] [Updated] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6775:
-
Status: Patch Available  (was: Open)

> Fix MapReduce failures caused by default RPC engine changing
> 
>
> Key: MAPREDUCE-6775
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6775
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: MAPREDUCE-6775-v1.patch
>
>
> HADOOP-13218 changed the default RPC engine, which isn't appropriate because 
> MAPREDUCE-6706, the issue that would let TaskUmbilicalProtocol use 
> ProtobufRPCEngine, isn't solved yet.
> [~jlowe] reported the following errors:
> {noformat}
> 2016-09-07 17:51:56,296 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.reflect.UndeclaredThrowableException
>   at com.sun.proxy.$Proxy10.getTask(Unknown Source)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:137)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters for request. Method: [getTask], Expected: 2, Actual: 1
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:199)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   ... 2 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6775:
-
Attachment: MAPREDUCE-6775-v1.patch

Provided a fix to change the default RPC engine back.
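
For context, Hadoop's RPC layer also allows pinning the engine per protocol. A 
minimal sketch of forcing the Writable engine for the umbilical protocol (an 
illustration of the compatibility constraint, not necessarily what the 
attached patch does):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.ipc.WritableRpcEngine;
import org.apache.hadoop.mapred.TaskUmbilicalProtocol;

public class UmbilicalEngineSketch {
  // Pin TaskUmbilicalProtocol to WritableRpcEngine so client proxies and the
  // server agree on the wire format regardless of the global default engine.
  public static Configuration pinWritableEngine(Configuration conf) {
    RPC.setProtocolEngine(conf, TaskUmbilicalProtocol.class,
        WritableRpcEngine.class);
    return conf;
  }
}
{code}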

> Fix MapReduce failures caused by default RPC engine changing
> 
>
> Key: MAPREDUCE-6775
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6775
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: MAPREDUCE-6775-v1.patch
>
>
> HADOOP-13218 changed the default RPC engine, which isn't appropriate because 
> MAPREDUCE-6706, the issue that would let TaskUmbilicalProtocol use 
> ProtobufRPCEngine, isn't solved yet.
> [~jlowe] reported the following errors:
> {noformat}
> 2016-09-07 17:51:56,296 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.reflect.UndeclaredThrowableException
>   at com.sun.proxy.$Proxy10.getTask(Unknown Source)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:137)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters for request. Method: [getTask], Expected: 2, Actual: 1
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:199)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   ... 2 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Kai Zheng (JIRA)
Kai Zheng created MAPREDUCE-6775:


 Summary: Fix MapReduce failures caused by default RPC engine 
changing
 Key: MAPREDUCE-6775
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6775
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0-alpha2
Reporter: Kai Zheng
Assignee: Kai Zheng


HADOOP-13218 changed the default RPC engine, which isn't appropriate because 
MAPREDUCE-6706, the issue that would let TaskUmbilicalProtocol use 
ProtobufRPCEngine, isn't solved yet.

[~jlowe] reported the following errors:
{noformat}
2016-09-07 17:51:56,296 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.reflect.UndeclaredThrowableException
	at com.sun.proxy.$Proxy10.getTask(Unknown Source)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:137)
Caused by: com.google.protobuf.ServiceException: Too many or few parameters for request. Method: [getTask], Expected: 2, Actual: 1
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:199)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	... 2 more
{noformat}
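
For readers of the trace, a hedged sketch of the arity mismatch behind 
"Expected: 2, Actual: 1": ProtobufRpcEngine's client-side Invoker assumes 
every proxied method takes exactly two arguments, (RpcController, Message), 
while the Writable-style umbilical methods declare their own parameter lists. 
The types below are simplified stand-ins for the real org.apache.hadoop.mapred 
classes:
{code}
import java.io.IOException;

// Simplified stand-ins for org.apache.hadoop.mapred.JvmContext / JvmTask.
class JvmContext { }
class JvmTask { }

interface TaskUmbilicalProtocolSketch {
  // One argument, as in the real protocol. A ProtobufRpcEngine proxy
  // rejects this call shape with "Too many or few parameters for request".
  JvmTask getTask(JvmContext context) throws IOException;
}
{code}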



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6765) MR should not schedule container requests in cases where reducer or mapper containers demand resource larger than the maximum supported

2016-09-07 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15471227#comment-15471227
 ] 

Daniel Templeton commented on MAPREDUCE-6765:
-

I share Dijkstra's dislike of multiple exit points from a function, especially 
in this case, where the code is explicitly structured as a series of 
conditionals to avoid having multiple exit points. In both cases, could you 
please put the subsequent few lines of code in an _else_ instead of adding the 
returns?
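
A generic sketch of the requested restructuring (the method and names here are 
invented for illustration; this is not the MAPREDUCE-6765 code):
{code}
public class ExitPointSketch {
  // Early-return form: the validation adds a second exit point.
  void scheduleWithReturn(long requested, long max) {
    if (requested > max) {
      failJob();
      return; // extra exit point
    }
    scheduleRequest();
  }

  // Requested form: fold the follow-on lines into an else so the method
  // keeps a single exit point at the end.
  void scheduleWithElse(long requested, long max) {
    if (requested > max) {
      failJob();
    } else {
      scheduleRequest();
    }
  }

  void failJob() { /* stub */ }
  void scheduleRequest() { /* stub */ }
}
{code}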

> MR should not schedule container requests in cases where reducer or mapper 
> containers demand resource larger than the maximum supported
> ---
>
> Key: MAPREDUCE-6765
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6765
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.7.2
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: mapreduce6765.001.patch
>
>
> When mapper or reducer containers request resources larger than the 
> maxResourceRequest in the cluster, the job is to be killed. In such cases, it 
> is unnecessary to still schedule container requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-09-07 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15471152#comment-15471152
 ] 

Karthik Kambatla commented on MAPREDUCE-6638:
-

bq. MR does not seem to have any safe store to persist the encryption key 
across job attempts
One could store the encrypted key in KMS. Once stored, we could do one of the 
following:
# Tasks have a way to fetch this key directly.
# Leave the tasks as-is, but augment the AM to recover this key as part of 
recovery (sketched below).
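
A minimal sketch of that second option, assuming some safe store exists; 
{{SafeKeyStore}} and {{fetchShuffleKey}} are hypothetical, while the 
{{TokenCache}} setter mirrors the getter already used in the recovery check:
{code}
import java.io.IOException;
import org.apache.hadoop.mapreduce.security.TokenCache;
import org.apache.hadoop.security.Credentials;

public class KeyRecoverySketch {
  /** Hypothetical safe store (e.g. backed by KMS); not a real Hadoop API. */
  interface SafeKeyStore {
    byte[] fetchShuffleKey(String jobId) throws IOException;
  }

  // During AM recovery, re-register the shuffle secret into the job
  // credentials so recovered tasks can still fetch map output.
  static void recoverShuffleKey(SafeKeyStore store, String jobId,
      Credentials jobCredentials) throws IOException {
    byte[] key = store.fetchShuffleKey(jobId);
    TokenCache.setShuffleSecretKey(key, jobCredentials);
  }
}
{code}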

bq. Not quite sure about when to make something like this inline or a tiny 
method.
It is okay to pull this into a separate method. That said, I feel this method 
could do with improved logging. For instance, when we don't recover, the method 
dumps a bunch of information that is hard for end-users to parse. Instead, I 
would like something like this:
{code}
boolean attemptRecovery = true;
boolean recoveryEnabled = getConfig().getBoolean(
    MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE,
    MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE_DEFAULT);
if (!recoveryEnabled) {
  LOG.info("Not attempting to recover. Recovery disabled. To enable " +
      "recovery, set " + MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE);
  attemptRecovery = false;
}

boolean recoverySupportedByCommitter = isRecoverySupported();
if (!recoverySupportedByCommitter) {
  LOG.info("Not attempting to recover. Recovery is not supported by " +
      "committer. Use an OutputCommitter that allows recovery.");
  attemptRecovery = false;
}

int numReduceTasks = getConfig().getInt(MRJobConfig.NUM_REDUCES, 0);
boolean shuffleKeyValidForRecovery =
    TokenCache.getShuffleSecretKey(jobCredentials) != null;
if (numReduceTasks > 0 && !shuffleKeyValidForRecovery) {
  LOG.info("Not attempting to recover. Shuffle key is not valid for " +
      "recovery.");
  attemptRecovery = false;
}

boolean recoverySucceeded = true;
if (attemptRecovery) {
  LOG.info("Attempting to recover.");
  try {
    parsePreviousJobHistory();
  } catch (IOException e) {
    LOG.warn("Unable to parse prior job history, aborting recovery", e);
    recoverySucceeded = false;
  }
}

if (!attemptRecovery || !recoverySucceeded) {
  // Get the amInfos anyways whether recovery is enabled or not
  // (assumed completion; this line was truncated to "am" in the original)
  amInfos = readJustAMInfos();
}
{code}
Alternatively, processRecovery could return true if recovery succeeded or if 
it is the first attempt, and false otherwise.
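
That alternative might look like the sketch below; the helper methods are 
hypothetical stand-ins, and only parsePreviousJobHistory comes from the 
snippet above:
{code}
import java.io.IOException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public abstract class RecoverySketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(RecoverySketch.class);

  abstract boolean isFirstAttempt();
  abstract boolean recoveryEnabled();
  abstract boolean recoverySupportedByCommitter();
  abstract boolean shuffleKeyValidForRecovery();
  abstract void parsePreviousJobHistory() throws IOException;

  /** @return true if recovery succeeded or this is the first attempt. */
  boolean processRecovery() {
    if (isFirstAttempt()) {
      return true; // nothing to recover on the first attempt
    }
    if (!recoveryEnabled() || !recoverySupportedByCommitter()
        || !shuffleKeyValidForRecovery()) {
      return false;
    }
    try {
      parsePreviousJobHistory();
      return true;
    } catch (IOException e) {
      LOG.warn("Unable to parse prior job history, aborting recovery", e);
      return false;
    }
  }
}
{code}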




> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-09-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-6638:

Priority: Major  (was: Critical)

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-09-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-6638:

Summary: Do not attempt to recover jobs if encrypted spill is enabled  
(was: Jobs with encrypted spills don't recover if the AM goes down)

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
>Priority: Critical
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

2016-09-07 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-6638:

Issue Type: Improvement  (was: Bug)

> Do not attempt to recover jobs if encrypted spill is enabled
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
>Priority: Critical
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6628) Potential memory leak in CryptoOutputStream

2016-09-07 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470583#comment-15470583
 ] 

Mariappan Asokan commented on MAPREDUCE-6628:
-

Chris, thanks for looking at the patch.  I agree with you.  I will create a new 
unit test for {{CryptoOutputStream}} and upload a new patch.

> Potential memory leak in CryptoOutputStream
> ---
>
> Key: MAPREDUCE-6628
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6628
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.6.4
>Reporter: Mariappan Asokan
>Assignee: Mariappan Asokan
> Attachments: MAPREDUCE-6628.001.patch, MAPREDUCE-6628.002.patch, 
> MAPREDUCE-6628.003.patch, MAPREDUCE-6628.004.patch, MAPREDUCE-6628.005.patch, 
> MAPREDUCE-6628.006.patch, MAPREDUCE-6628.007.patch
>
>
> There is a potential memory leak in {{CryptoOutputStream.java}}.  It 
> allocates two direct byte buffers ({{inBuffer}} and {{outBuffer}}) that get 
> freed when the {{close()}} method is called.  Most of the time, {{close()}} 
> is called.  However, when writing to the intermediate Map output file or 
> the spill files in {{MapTask}}, {{close()}} is never called, since doing so 
> would close the underlying stream, which is not desirable.  There is a single 
> underlying physical stream that contains multiple logical streams, one per 
> partition of Map output.
> By default, the amount of memory allocated per byte buffer is 128 KB, so 
> the total memory allocated is 256 KB.  This may not sound like much.  
> However, if the number of partitions (or number of reducers) is large (in 
> the hundreds) and/or there are spill files created in {{MapTask}}, this can 
> grow into a few hundred MB.
> I can think of two ways to address this issue:
> h2. Possible Fix - 1
> According to JDK documentation:
> {quote}
> The contents of direct buffers may reside outside of the normal 
> garbage-collected heap, and so their impact upon the memory footprint of an 
> application might not be obvious.  It is therefore recommended that direct 
> buffers be allocated primarily for large, long-lived buffers that are subject 
> to the underlying system's native I/O operations.  In general it is best to 
> allocate direct buffers only when they yield a measurable gain in program 
> performance.
> {quote}
> It is not clear to me whether there is any benefit of allocating direct byte 
> buffers in {{CryptoOutputStream.java}}.  In fact, there is a slight CPU 
> overhead in moving data from {{outBuffer}} to a temporary byte array as per 
> the following code in {{CryptoOutputStream.java}}.
> {code}
> /*
>  * If underlying stream supports {@link ByteBuffer} write in future, needs
>  * refine here. 
>  */
> final byte[] tmp = getTmpBuf();
> outBuffer.get(tmp, 0, len);
> out.write(tmp, 0, len);
> {code}
> Even if the underlying stream supports direct byte buffer IO (or direct IO in 
> OS parlance), it is not clear whether it will yield any measurable 
> performance gain.
> The fix would be to allocate a {{ByteBuffer}} on the heap for {{inBuffer}} 
> and wrap a byte array in a {{ByteBuffer}} for {{outBuffer}}.  By the way, the 
> {{inBuffer}} and {{outBuffer}} have to be {{ByteBuffer}}s, as demanded by the 
> {{encrypt()}} method in {{Encryptor}}.
> h2. Possible Fix - 2
> Assuming that we want to keep the buffers as direct byte buffers, we can 
> create a new constructor for {{CryptoOutputStream}} and pass a boolean flag 
> {{ownOutputStream}} to indicate whether the underlying stream will be owned 
> by {{CryptoOutputStream}}. If it is true, then calling the {{close()}} method 
> will close the underlying stream.  Otherwise, when {{close()}} is called, 
> only the direct byte buffers will be freed and the underlying stream will 
> not be closed.
> The scope of changes for this fix will be somewhat wider.  We need to modify 
> {{MapTask.java}}, {{CryptoUtils.java}}, and {{CryptoFSDataOutputStream.java}} 
> as well to pass the ownership flag mentioned above.
> I can post a patch for either of the above.  I welcome any other ideas from 
> developers to fix this issue.
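
A minimal sketch of what the Fix 1 allocation change could look like (the 
names follow the description above; this is not the actual CryptoOutputStream 
code):
{code}
import java.nio.ByteBuffer;

public class HeapBufferSketch {
  // Heap-backed buffers in place of ByteBuffer.allocateDirect(bufferSize);
  // both still satisfy the ByteBuffer type that Encryptor#encrypt demands.
  static ByteBuffer inBuffer(int bufferSize) {
    return ByteBuffer.allocate(bufferSize);       // heap-allocated
  }

  static ByteBuffer outBuffer(int bufferSize) {
    return ByteBuffer.wrap(new byte[bufferSize]); // wraps a plain byte[]
  }
}
{code}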



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6638) Jobs with encrypted spills don't recover if the AM goes down

2016-09-07 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470532#comment-15470532
 ] 

Haibo Chen commented on MAPREDUCE-6638:
---

[~ka...@cloudera.com] Thanks for your reviews! I have left this jira to do 1), 
so I will create another jira that does 2) after this is done. But as I said in 
my previous comment, MR does not seem to have any safe store to persist the 
encryption key across job attempts (not sure how much effort it would take to 
introduce one).

bq. Can the boolean condition be simplified to numReduceTasks <= 0 || 
shuffleKeyValidForRecovery || !spillEncrypted?
Not quite sure about when to make something like this inline or a tiny method. 
I was thinking of using the method name as a sort of comment. Any guideline on 
this?
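
For illustration, the two styles under discussion might look like this 
(condition names taken from the review comment; the wrapper class is 
hypothetical):
{code}
public abstract class ConditionStyleSketch {
  abstract int numReduceTasks();
  abstract boolean shuffleKeyValidForRecovery();
  abstract boolean spillEncrypted();

  // Inline: one boolean expression, documented by a descriptive local.
  boolean canRecoverInline() {
    boolean recoverable = numReduceTasks() <= 0
        || shuffleKeyValidForRecovery() || !spillEncrypted();
    return recoverable;
  }

  // Tiny method: the method name itself serves as the comment.
  boolean shuffleKeyUsableOrNotNeeded() {
    return numReduceTasks() <= 0
        || shuffleKeyValidForRecovery() || !spillEncrypted();
  }
}
{code}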

> Jobs with encrypted spills don't recover if the AM goes down
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
>Priority: Critical
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6638) Jobs with encrypted spills don't recover if the AM goes down

2016-09-07 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15469707#comment-15469707
 ] 

Karthik Kambatla commented on MAPREDUCE-6638:
-

We should have two JIRAs for this - (1) Avoid recovering an AM if encrypted 
spill is enabled, and (2) Support recovering an AM even when encrypted spill is 
enabled. I am fine with using this JIRA for either, but we should file the 
other one too. 

Comments on the patch:
# The comment for the variable can be simplified to be clear. For instance, 
"When intermediate data is encrypted, recovering the job requires access to the 
key. Until the encryption key is persisted, we should avoid recovery attempts."
# Can the boolean condition be simplified to {{numReduceTasks <= 0 || 
shuffleKeyValidForRecovery || !spillEncrypted}}? 
# I assume the new method {{recovered()}} is for tests only. Should we annotate 
it @VisibleForTesting? 


> Jobs with encrypted spills don't recover if the AM goes down
> 
>
> Key: MAPREDUCE-6638
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 2.7.2
>Reporter: Karthik Kambatla
>Assignee: Haibo Chen
>Priority: Critical
> Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be 
> recovered if the AM fails. We should store the key some place safe so they 
> can actually be recovered. If there is no "safe" place, at least we should 
> restart the job by re-running all mappers/reducers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org