[jira] [Commented] (MAPREDUCE-6542) HistoryViewer uses SimpleDateFormat, but SimpleDateFormat is not thread-safe

2015-12-10 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052083#comment-15052083
 ] 

Daniel Templeton commented on MAPREDUCE-6542:
-

[~piaoyu zhang], that patch looks pretty good to me.  I'll take a closer look 
later, but nice work!

> HistoryViewer uses SimpleDateFormat, but SimpleDateFormat is not thread-safe
> -
>
> Key: MAPREDUCE-6542
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6542
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Affects Versions: 2.2.0, 2.7.1
> Environment: CentOS6.5 Hadoop  
>Reporter: zhangyubiao
>Assignee: zhangyubiao
> Attachments: MAPREDUCE-6542-v2.patch, MAPREDUCE-6542-v3.patch, 
> MAPREDUCE-6542-v4.patch, MAPREDUCE-6542-v5.patch, MAPREDUCE-6542-v6.patch, 
> MAPREDUCE-6542-v7.patch, MAPREDUCE-6542-v8.patch, MAPREDUCE-6542.patch
>
>
> I used SimpleDateFormat to parse the JobHistory file, as shown below:
> {code}
> private static final SimpleDateFormat dateFormat =
>     new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
>
> public static String getJobDetail(JobInfo job) {
>   StringBuffer jobDetails = new StringBuffer("");
>   SummarizedJob ts = new SummarizedJob(job);
>   jobDetails.append(job.getJobId().toString().trim()).append("\t");
>   jobDetails.append(job.getUsername()).append("\t");
>   jobDetails.append(job.getJobname().replaceAll("\\n", "")).append("\t");
>   jobDetails.append(job.getJobQueueName()).append("\t");
>   jobDetails.append(job.getPriority()).append("\t");
>   jobDetails.append(job.getJobConfPath()).append("\t");
>   jobDetails.append(job.getUberized()).append("\t");
>   jobDetails.append(dateFormat.format(job.getSubmitTime())).append("\t");
>   jobDetails.append(dateFormat.format(job.getLaunchTime())).append("\t");
>   jobDetails.append(dateFormat.format(job.getFinishTime())).append("\t");
>   return jobDetails.toString();
> }
> {code}
> But when I queried the submitTime and launchTime in Hive and compared them 
> with the times in the JobHistory file, I found that the submitTime and 
> launchTime were wrong.
> Finally, I changed to FastDateFormat to format the times, and the times 
> became correct.
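
For reference, a minimal sketch of the FastDateFormat approach the reporter
describes, assuming Apache Commons Lang (which Hadoop already ships) is on the
classpath; the wrapper class below is hypothetical:

{code}
import org.apache.commons.lang.time.FastDateFormat;

public class JobDetailFormatter {
  // Unlike SimpleDateFormat, FastDateFormat is immutable and thread-safe,
  // so a single shared instance can format timestamps from many threads.
  private static final FastDateFormat DATE_FORMAT =
      FastDateFormat.getInstance("yyyy-MM-dd HH:mm:ss");

  public static String format(long millis) {
    // Drop-in replacement for the SimpleDateFormat.format calls above.
    return DATE_FORMAT.format(millis);
  }
}
{code}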





[jira] [Commented] (MAPREDUCE-6542) HistoryViewer uses SimpleDateFormat, but SimpleDateFormat is not thread-safe

2015-12-10 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052051#comment-15052051
 ] 

zhangyubiao commented on MAPREDUCE-6542:


Attached MAPREDUCE-6542-v8.patch for review.






[jira] [Commented] (MAPREDUCE-6542) HistoryViewer uses SimpleDateFormat, but SimpleDateFormat is not thread-safe

2015-12-10 Thread zhangyubiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052050#comment-15052050
 ] 

zhangyubiao commented on MAPREDUCE-6542:


Thanks a lot to [~templedf] for the patient guidance. 






[jira] [Updated] (MAPREDUCE-6542) HistoryViewer uses SimpleDateFormat, but SimpleDateFormat is not thread-safe

2015-12-10 Thread zhangyubiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangyubiao updated MAPREDUCE-6542:
---
Attachment: MAPREDUCE-6542-v8.patch






[jira] [Commented] (MAPREDUCE-6567) mapreduce ACLs documentation shows incorrect syntax

2015-12-10 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051772#comment-15051772
 ] 

Daniel Templeton commented on MAPREDUCE-6567:
-

The syntax proposed in the patch should not work.  The accepted syntax is: 
{{user1[,user2[...[,userN]]] group1[,group2[...[,groupN]]]}}.  If the entire 
user or group section is {{*}}, it allows all access, e.g. {{user *}} or {{* 
group}} or just {{*}}.  See 
https://www.codatlas.com/github.com/apache/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/AccessControlList.java?line=105.
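
To make the accepted forms concrete, a small hedged demo against the
AccessControlList constructor linked above; the user and group names are made
up:

{code}
import org.apache.hadoop.security.authorize.AccessControlList;

public class AclSyntaxDemo {
  public static void main(String[] args) {
    // Format is "users groups", separated by a single space.
    AccessControlList usersOnly    = new AccessControlList("alice,bob");
    AccessControlList userAndGroup = new AccessControlList("alice admins");
    AccessControlList groupsOnly   = new AccessControlList(" admins"); // leading space: groups only
    AccessControlList everyone     = new AccessControlList("*");       // wildcard allows all
    System.out.println(everyone.isAllAllowed());   // prints: true
    System.out.println(groupsOnly.getGroups());    // prints: [admins]
  }
}
{code}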

> mapreduce ACLs documentation shows incorrect syntax
> ---
>
> Key: MAPREDUCE-6567
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6567
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Reporter: Dustin Cote
>Assignee: Dustin Cote
>Priority: Minor
> Attachments: MAPREDUCE-6567-1.patch
>
>
> The description of the mapreduce.job.acl-* properties in mapred-default.xml 
> shows the wrong syntax:
> {quote}
> For specifying a list of users and groups the format to use is "user1,user2 
> group1,group". If set to '*', it allows all users/groups to modify this job.
> {quote}
> This doesn't actually work.  The syntax that does work is:
> {quote}
> For specifying a list of users and groups the format to use is "user1,user2 
> group1,* group". If set to '*', it allows all users/groups to modify this job.
> {quote}
> The difference is that to give all members of a group permissions for an ACL, 
> the specification must be '* group', not just 'group'.





[jira] [Assigned] (MAPREDUCE-6397) MAPREDUCE makes many endian-dependent assumptions

2015-12-10 Thread Alan Burlison (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Burlison reassigned MAPREDUCE-6397:


Assignee: Alan Burlison

> MAPREDUCE makes many endian-dependent assumptions
> -
>
> Key: MAPREDUCE-6397
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6397
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 2.7.0
> Environment: Any big-endian platform
>Reporter: Alan Burlison
>Assignee: Alan Burlison
>
> MAPREDUCE native code contains multiple uses of the bswap and bswap64 
> assembler functions (from 
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib/primitives.h).
> primitives.h contains neither a SPARC implementation of bswap nor a 
> platform-independent C fallback implementation. In addition, byte swaps are 
> nearly always made without checking whether the platform is big- or 
> little-endian; the assumption hard-coded throughout the source seems to be 
> that the platform is little-endian. This most likely means that the MapReduce 
> native code is currently non-portable to big-endian platforms. The code needs 
> to be examined carefully to determine which byte swaps are correct on all 
> platforms and which are endian-dependent.
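
As a language-neutral illustration of the problem (not a fix for the native
code): the portable pattern is to declare the byte order of the data instead of
unconditionally swapping. A small Java sketch of that idea:

{code}
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
  public static void main(String[] args) {
    byte[] wire = {0x00, 0x00, 0x00, 0x2A};  // big-endian 42 "on the wire"
    ByteBuffer buf = ByteBuffer.wrap(wire);
    buf.order(ByteOrder.BIG_ENDIAN);   // state the data's byte order explicitly
    System.out.println(buf.getInt());  // 42 on x86, SPARC, and ARM alike
    // An unconditional swap (the bswap approach) is correct only when the
    // host happens to be little-endian.
  }
}
{code}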





[jira] [Updated] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop

2015-12-10 Thread archit neema (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

archit neema updated MAPREDUCE-3418:

Assignee: John George  (was: archit neema)

> If map output is not found, shuffle runs in tight loop
> --
>
> Key: MAPREDUCE-3418
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3418
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.23.0, 2.3.0
>Reporter: John George
>Assignee: John George
>
> Sharad Agarwal bumped into this while simulating fetch failures. 
> He removed the map output directory, and the shuffle then ran in a tight 
> loop, throwing:
> 2011-06-01 09:02:20,511 WARN org.apache.hadoop.mapreduce.task.reduce.Fetcher: Invalid map id 
> java.lang.IllegalArgumentException: TaskAttemptId string : TTP/1.1 500 Internal Server Error
> Content-Type: text/plain; charset=UTF is not properly formed
> at org.apache.hadoop.mapreduce.TaskAttemptID.forName(TaskAttemptID.java:174)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:284)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
> Fetch failure is not triggered.
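
The parse failure is easy to reproduce in isolation. A hedged sketch
(hypothetical class) of what the Fetcher effectively does when the HTTP error
body arrives where a map ID should be:

{code}
import org.apache.hadoop.mapreduce.TaskAttemptID;

public class ForNameDemo {
  public static void main(String[] args) {
    // A well-formed attempt ID parses fine:
    TaskAttemptID ok = TaskAttemptID.forName("attempt_200707121733_0003_m_000005_0");
    System.out.println(ok.getJobID());
    // Feeding it the start of an HTTP 500 response, as the Fetcher does here,
    // throws IllegalArgumentException -- and since no fetch failure is
    // recorded, the shuffle just retries in a tight loop.
    TaskAttemptID bad = TaskAttemptID.forName("TTP/1.1 500 Internal Server Error");
  }
}
{code}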





[jira] [Assigned] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop

2015-12-10 Thread archit neema (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

archit neema reassigned MAPREDUCE-3418:
---

Assignee: archit neema  (was: sahitya pavurala)






[jira] [Commented] (MAPREDUCE-5903) If Kerberos authentication is enabled, the MapReduce job fails in the reducer phase

2015-12-10 Thread archit neema (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050931#comment-15050931
 ] 

archit neema commented on MAPREDUCE-5903:
-

Refer to https://issues.apache.org/jira/browse/MAPREDUCE-3418

> If Kerberos authentication is enabled, the MapReduce job fails in the reducer 
> phase
> 
>
> Key: MAPREDUCE-5903
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5903
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.4.0
> Environment: hadoop: 2.4.0.2.1.2.0
>Reporter: Victor Kim
>Priority: Critical
>  Labels: shuffle
>
> I have a 3-node cluster configuration: 1 ResourceManager and 3 NodeManagers; 
> Kerberos is enabled, and I have hdfs, yarn, and mapred principals/keytabs. 
> The ResourceManager and NodeManagers run under the yarn user, using the yarn 
> Kerberos principal. 
> Use case 1: WordCount, job submitted using the yarn UGI (i.e. the superuser, 
> the one with a Kerberos principal on all boxes). Result: job successfully 
> completed.
> Use case 2: WordCount, job submitted using LDAP user impersonation via the 
> yarn UGI. Result: map tasks complete successfully, but the reduce task fails 
> with a ShuffleError caused by java.io.IOException: Exceeded 
> MAX_FAILED_UNIQUE_FETCHES (see the stack trace below).
> The use case with user impersonation used to work on earlier versions, 
> without YARN (with JT&TT).
> I found a similar issue with Kerberos auth involved here: 
> https://groups.google.com/forum/#!topic/nosql-databases/tGDqs75ACqQ
> And here https://issues.apache.org/jira/browse/MAPREDUCE-4030 it's marked as 
> resolved, which is not the case when Kerberos authentication is enabled.
> The exception trace from YarnChild JVM:
> 2014-05-21 12:49:35,687 FATAL [fetcher#3] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Shuffle failed with too many fetch failures and insufficient progress!
> 2014-05-21 12:49:35,688 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3
> at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:416)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:323)
> at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:245)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
> at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)





[jira] [Commented] (MAPREDUCE-3418) If map output is not found, shuffle runs in tight loop

2015-12-10 Thread archit neema (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050915#comment-15050915
 ] 

archit neema commented on MAPREDUCE-3418:
-

I reproduced this issue at my end.
Reason:
Hadoop was previously running in an unsecured (non-Kerberos) environment, and I 
ran MapReduce jobs. An appcache directory was created inside 
yarn//nm/usercache/username/, and its access rights were automatically set to 
drwx--x---.

After switching Hadoop from the unsecured to the secure environment, I was not 
able to run any MapReduce jobs.

Workaround:
Go to yarn//nm/usercache/username/, delete the appcache directory, restart the 
YARN service, and re-run the failed jobs. The appcache directory will be 
recreated automatically with drwx--S--- access rights.

Possible solution:
When switching from an unsecured to a secured environment, access rights should 
be checked on all the directories.









[jira] [Updated] (MAPREDUCE-5485) Allow repeating job commit by extending OutputCommitter API

2015-12-10 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-5485:
--
Release Note: 
Previously, an MR job would fail if the AM was restarted for some reason (such 
as a node failure) while it was committing the job, regardless of whether the 
AM had reached its maximum number of attempts.
This improvement adds a new API, isCommitJobRepeatable(), to the 
OutputCommitter interface to indicate whether the job's committer can run 
commitJob again if the previous commit work was interrupted by NM/AM failures, 
etc. An OutputCommitter instance that supports repeatable job commit (like 
FileOutputCommitter with algorithm 2) allows the AM to continue commitJob() 
after an AM restart as a new attempt.
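
A minimal sketch of how a committer opts in, assuming the
isCommitJobRepeatable() API this change adds; the subclass below is
illustrative, not part of the patch:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;

// Hypothetical committer whose commitJob is idempotent, so a restarted AM
// may safely repeat the commit (cf. FileOutputCommitter algorithm 2).
public class RepeatableCommitter extends FileOutputCommitter {
  public RepeatableCommitter(Path outputPath, TaskAttemptContext context)
      throws IOException {
    super(outputPath, context);
  }

  @Override
  public boolean isCommitJobRepeatable(JobContext context) throws IOException {
    return true;  // only safe if commitJob() can run more than once
  }
}
{code}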

> Allow repeating job commit by extending OutputCommitter API
> ---
>
> Key: MAPREDUCE-5485
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5485
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 2.1.0-beta
>Reporter: Nemon Lou
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.7.3
>
> Attachments: MAPREDUCE-5485-demo-2.patch, MAPREDUCE-5485-demo.patch, 
> MAPREDUCE-5485-v1.patch, MAPREDUCE-5485-v2.patch, MAPREDUCE-5485-v3.1.patch, 
> MAPREDUCE-5485-v3.patch, MAPREDUCE-5485-v4.1.patch, MAPREDUCE-5485-v4.patch, 
> MAPREDUCE-5485-v5-branch-2.7.patch, MAPREDUCE-5485-v5.patch
>
>
> There is a chance that the MRAppMaster crashes during job commit, or that a 
> NodeManager restart causes the committing AM to exit due to container expiry. 
> In these cases, the job will fail.
> However, some jobs can redo the commit, so failing the job is unnecessary.
> Letting clients tell the AM whether to allow a commit redo is a better choice.
> This idea comes from Jason Lowe's comments in MAPREDUCE-4819.





[jira] [Updated] (MAPREDUCE-6436) JobHistory cache issue

2015-12-10 Thread Ryu Kobayashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryu Kobayashi updated MAPREDUCE-6436:
-
Assignee: Kai Sasaki  (was: Ryu Kobayashi)

> JobHistory cache issue
> --
>
> Key: MAPREDUCE-6436
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6436
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ryu Kobayashi
>Assignee: Kai Sasaki
> Attachments: MAPREDUCE-6436.1.patch, MAPREDUCE-6436.2.patch, 
> stacktrace1.txt, stacktrace2.txt, stacktrace3.txt
>
>
> Problem: 
> HistoryFileManager.addIfAbsent produces a large amount of logging if the 
> number of cached entries whose age is less than mapreduce.jobhistory.max-age-ms 
> becomes far larger than mapreduce.jobhistory.joblist.cache.size.
> Example:
> For example, if the cache contains 5 entries in total and 10,000 entries
> newer than mapreduce.jobhistory.max-age-ms where
> mapreduce.jobhistory.joblist.cache.size is 2, the 
> HistoryFileManager.addIfAbsent method produces 5 - 2 = 3 lines of the 
> "Waiting to remove  from JobListCache because it is not in done yet" message.
> A stack trace is attached.
> Impact:
> In addition to large disk consumption, this issue blocks JobHistory.getJob 
> for a long time and slows job execution down significantly, because getJob is 
> called by RPC handlers such as 
> HistoryClientService.HSClientProtocolHandler.getJobReport.
> This happens because HistoryFileManager.UserLogDir.scanIfNeeded
> eventually calls HistoryFileManager.addIfAbsent in a synchronized block. When
> multiple threads call scanIfNeeded simultaneously, one of them acquires the 
> lock and the other threads are blocked until the first thread completes the 
> long-running HistoryFileManager.addIfAbsent call.
> Solution: 
> * Reduce the amount of logging so that HistoryFileManager.addIfAbsent doesn't
> take too long.
> * Good to have if possible: HistoryFileManager.UserLogDir.scanIfNeeded skips
>   scanning if another thread is already scanning. This changes the semantics 
>   of some HistoryFileManager methods (such as getAllFileInfo and getFileInfo)
>   because scanIfNeeded may keep outdated state.
> * Good to have if possible: make scanIfNeeded asynchronous so that RPC calls 
>   are not blocked by a loop at a scale of tens of thousands of entries.
>  
> The patch implements the first item.
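
To make the Impact section concrete, a hedged illustration (hypothetical class,
not the real HistoryFileManager code) of why per-entry logging inside the lock
stalls every other caller:

{code}
public class ScanSketch {
  // One thread at a time: while this loop emits tens of thousands of log
  // lines, every other caller of scanIfNeeded() -- including RPC handlers
  // like getJobReport -- blocks waiting for the lock.
  public synchronized void scanIfNeeded() {
    for (int i = 0; i < 30000; i++) {  // hypothetical number of excess entries
      System.out.println("Waiting to remove job_" + i + " from JobListCache");
    }
  }
}
{code}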





[jira] [Commented] (MAPREDUCE-6417) MapReduceClient's primitives.h is toxic and should be extirpated

2015-12-10 Thread Alan Burlison (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050369#comment-15050369
 ] 

Alan Burlison commented on MAPREDUCE-6417:
--

Yes, I have. Except at very small sizes (less than around 8 bytes), 
primitives.h is slower. And even if it were faster, that wouldn't matter, 
because it is incorrect.

> MapReduceClient's primitives.h is toxic and should be extirpated
> 
>
> Key: MAPREDUCE-6417
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6417
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Alan Burlison
>Assignee: Alan Burlison
>Priority: Blocker
> Attachments: MAPREDUCE-6417.001.patch
>
>
> MapReduceClient's primitives.h attempts to provide optimised versions of 
> standard library memory copy and comparison functions. It has been the 
> subject of several portability-related bugs:
> * HADOOP-11505 hadoop-mapreduce-client-nativetask uses bswap where be32toh is 
> needed, doesn't work on non-x86
> * HADOOP-11665 Provide and unify cross platform byteorder support in native 
> code
> * MAPREDUCE-6397 MAPREDUCE makes many endian-dependent assumptions
> * HADOOP-11484 hadoop-mapreduce-client-nativetask fails to build on ARM 
> AARCH64 due to x86 asm statements
> At present it only works on x86 and ARM64, as it lacks definitions of bswap 
> and bswap64 for any other platforms.
> However, it has even more serious problems on non-x86 architectures; for 
> example, on SPARC simple_memcpy simply doesn't work at all:
> {code}
> $ cat bang.cc
> #include <string.h>
> #define SIMPLE_MEMCPY
> #include "primitives.h"
> int main(int argc, char **argv)
> {
>     char b1[9];
>     char b2[9];
>     simple_memcpy(b2, b1, sizeof(b1));
> }
> $ gcc -o bang bang.cc && ./bang
> Bus Error (core dumped)
> {code}
> That's because simple_memcpy does pointer fiddling that results in misaligned 
> accesses, which are illegal on SPARC.
> fmemcmp is also broken. Even if a definition of bswap is provided, on 
> big-endian architectures the result is simply wrong because of its 
> unconditional use of bswap:
> {code}
> $ cat thud.cc
> #include <stdio.h>
> #include <string.h>
> #include "primitives.h"
> int main(int argc, char **argv)
> {
>     char a[] = { 0,1,2,0 };
>     char b[] = { 0,2,1,0 };
>     printf("%lld %d\n", fmemcmp(a, b, sizeof(a)), memcmp(a, b, sizeof(a)));
> }
> $ g++ -o thud thud.cc && ./thud
> 65280 -1
> {code}
> And in addition fmemcmp suffers from the same misalignment issues as 
> simple_memcpy and coredumps on SPARC when asked to compare odd-sized buffers.
> primitives.h provides the following functions:
> * bswap - used in 12 files in MRC but as HADOOP-11505 points out, mostly 
> incorrectly as it takes no account of platform endianness
> * bswap64 - used in 4 files in MRC, same comments as per bswap apply
> * simple_memcpy - used in 3 files in MRC, should be replaced with the 
> standard memcpy
> * fmemcmp - used in 1 file, should be replaced with the standard memcmp
> * fmemeq - used in 1 file, should be replaced with the standard memcmp
> * frmemeq - not used at all, should just be removed
> *Summary*: primitives.h should simply be deleted and replaced with the 
> standard memory copy & compare functions, or with thin wrappers around them 
> where the APIs are different.





[jira] [Commented] (MAPREDUCE-6436) JobHistory cache issue

2015-12-10 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050335#comment-15050335
 ] 

Kai Sasaki commented on MAPREDUCE-6436:
---

[~zxu] Sorry for bothering you again. Could you review this?



