[jira] [Commented] (MAPREDUCE-4683) We need to fix our build to create/distribute hadoop-mapreduce-client-core-tests.jar
[ https://issues.apache.org/jira/browse/MAPREDUCE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281074#comment-15281074 ]

Akira AJISAKA commented on MAPREDUCE-4683:
------------------------------------------

Hi [~jianhe], can we target this to trunk? This fix is needed for MAPREDUCE-4253.

> We need to fix our build to create/distribute hadoop-mapreduce-client-core-tests.jar
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4683
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4683
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: build
>            Reporter: Arun C Murthy
>            Assignee: Akira AJISAKA
>            Priority: Critical
>              Labels: BB2015-05-TBR
>         Attachments: MAPREDUCE-4683.patch
>
>
> We need to fix our build to create/distribute
> hadoop-mapreduce-client-core-tests.jar, need this before MAPREDUCE-4253

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6108) ShuffleError OOM while reserving memory by MergeManagerImpl
[ https://issues.apache.org/jira/browse/MAPREDUCE-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281053#comment-15281053 ]

Wangda Tan commented on MAPREDUCE-6108:
---------------------------------------

[~kasha], [~vinodkv], is this still an issue in the existing code base? Can we close it as not reproducible if it cannot be reproduced?

Thanks,

> ShuffleError OOM while reserving memory by MergeManagerImpl
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-6108
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6108
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Dongwook Kwon
>            Priority: Critical
>
> Shuffle hits an OOM issue from time to time, such as the one reported in this email:
> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-dev/201408.mbox/%3ccabwxxjnk-on0xtrmurijd8sdgjjtamsvqw2czpm3oekj3ym...@mail.gmail.com%3E
[jira] [Resolved] (MAPREDUCE-6099) Adding getSplits(JobContext job, List stats) to mapreduce CombineFileInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He resolved MAPREDUCE-6099.
--------------------------------
    Resolution: Won't Fix

Closing, as Jason mentioned.

> Adding getSplits(JobContext job, List stats) to mapreduce CombineFileInputFormat
> --------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6099
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6099
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 2.4.1
>            Reporter: Pankit Thapar
>            Priority: Critical
>         Attachments: MAPREDUCE-6099.patch
>
>
> Currently we have getSplits(JobContext job) in CombineFileInputFormat.
> This API does not give the client the freedom to create a list of file statuses
> itself and then create splits on the resulting List stats.
> The client might be able to perform some filtering on its end on the file
> sets in the input paths. For the reasons above, it would be a good idea to
> have getSplits(JobContext, List).
> Please let me know what you think about this.
[jira] [Updated] (MAPREDUCE-4758) jobhistory web ui not showing correct # failed reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated MAPREDUCE-4758:
-------------------------------
    Target Version/s: 2.9.0  (was: 2.8.0)
            Priority: Major  (was: Critical)

This is an improvement to the UI. It is unlikely to get done, so moving it out.

> jobhistory web ui not showing correct # failed reducers
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-4758
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4758
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver, webapps
>    Affects Versions: 0.23.4
>            Reporter: Thomas Graves
>
> We had a job fail due to a reducer failing 4 times. Unfortunately the job
> history UI didn't show this particular failed reducer, which led to
> confusion as to why the job failed.
> This reducer failed to launch all 4 task attempts with a Token Expiration
> error, and the jobhistory file only gets an event when the task attempt
> transitions to launched. The webapp JobInfo object only counts the task
> attempts in the jobhistory file to display under the "Attempt Type" table, so
> since this task didn't have an attempt with it, it did not show up on the UI.
> We need to reconcile the task list with the task attempts, or also show more
> stats for the tasks vs. task attempts.
[jira] [Updated] (MAPREDUCE-4683) We need to fix our build to create/distribute hadoop-mapreduce-client-core-tests.jar
[ https://issues.apache.org/jira/browse/MAPREDUCE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated MAPREDUCE-4683:
-------------------------------
    Resolution: Won't Fix
        Status: Resolved  (was: Patch Available)

I guess this could break existing scripts, so closing.

> We need to fix our build to create/distribute hadoop-mapreduce-client-core-tests.jar
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4683
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4683
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: build
>            Reporter: Arun C Murthy
>            Assignee: Akira AJISAKA
>            Priority: Critical
>              Labels: BB2015-05-TBR
>         Attachments: MAPREDUCE-4683.patch
>
>
> We need to fix our build to create/distribute
> hadoop-mapreduce-client-core-tests.jar, need this before MAPREDUCE-4253
[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time
[ https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280963#comment-15280963 ]

Jian He commented on MAPREDUCE-6513:
------------------------------------

It looks like TaskAttemptKillEvent will be sent twice for each mapper.

The first time is at the code below in RMContainerAllocator#handleUpdatedNodes; JobImpl will in turn send the TaskAttemptKillEvent for each mapper on the unusable node.
{code}
// send event to the job to act upon completed tasks
eventHandler.handle(new JobUpdatedNodesEvent(getJob().getID(),
    updatedNodes));
{code}
The second time is at this code in the same method:
{code}
// If map, reschedule next task attempt.
boolean rescheduleNextAttempt = (i == 0) ? true : false;
eventHandler.handle(new TaskAttemptKillEvent(tid,
    "TaskAttempt killed because it ran on unusable node"
    + taskAttemptNodeId, rescheduleNextAttempt));
}
{code}
This is how it has been for a long time; I am not sure why. With the new change, will this cause more container requests to get scheduled?

> MR job got hanged forever when one NM unstable for some time
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-6513
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster, resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Bob.zhao
>            Assignee: Varun Saxena
>            Priority: Critical
>         Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch, MAPREDUCE-6513.03.patch, MAPREDUCE-6513.3.branch-2.8.patch, MAPREDUCE-6513.3_1.branch-2.7.patch, MAPREDUCE-6513.3_1.branch-2.8.patch
>
>
> While a job with many tasks was in progress, one node became unstable
> due to an OS issue. After the node became unstable, the maps on this node
> changed to the KILLED state.
> Currently, maps that were running on the unstable node are rescheduled; all
> of them are in the scheduled state and wait for the RM to assign containers.
> Ask requests for maps were seen until the node was good again (all of those
> failed); there were no ask requests after that. But the AM keeps on preempting
> the reducers (it's recycling).
> In the end, the reducers are waiting for the mappers to complete and the
> mappers did not get containers.
> My question is:
> Why did the AM not send map requests once the node had recovered?
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280928#comment-15280928 ]

Hadoop QA commented on MAPREDUCE-6657:
--------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s {color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: patch generated 2 new + 16 unchanged - 0 fixed = 18 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 3s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 27s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 46s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800866/mapreduce6657.005.patch |
| JIRA Issue | MAPREDUCE-6657 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 95c33ef8963a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280712#comment-15280712 ]

Haibo Chen commented on MAPREDUCE-6657:
---------------------------------------

Sorry for misunderstanding your previous comments, [~djp]. Do you think we should create a subclass of RetriableException for this instead? The message is derived from an instance method, this.nn.getRole(), and string matching is probably not the cleanest way. If so, I can file a follow-up JIRA in HDFS and update isNameNodeStillNotStarted() once we have the new 'NameNodeNotStartedException' that extends RetriableException.

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup.
> When NameNode is in safe mode, it will keep retrying for a configurable time
> period. However, it should also keep retrying if the name node is in start
> state. Safe mode does not happen until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown
> when the NN is in its internal service startup phase. We should add the check
> for this specific exception in isBecauseSafeMode() to account for that.
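Haibo Chen's suggestion above, a dedicated exception type instead of string matching, might look roughly like the following. This is only a sketch under stated assumptions: `RetriableException` here is a local stand-in so the example compiles without Hadoop on the classpath, `NameNodeNotStartedException` is the hypothetical class named in the comment, and `TypedStartupCheck` is an invented holder for the check. In a real cluster the exception would cross an RPC boundary, so the client-side check would likely need to unwrap it first; that step is omitted here.

```java
import java.io.IOException;

// Local stand-in for Hadoop's RetriableException (an assumption for this sketch).
class RetriableException extends IOException {
    RetriableException(String message) {
        super(message);
    }
}

// Hypothetical subclass proposed in the comment; it is not an existing HDFS class.
class NameNodeNotStartedException extends RetriableException {
    NameNodeNotStartedException(String role) {
        // The message text can now change freely without breaking callers.
        super(role + " still not started");
    }
}

// Hypothetical holder for the client-side check.
final class TypedStartupCheck {
    private TypedStartupCheck() {}

    // Decides by exception type, not by message contents.
    static boolean isNameNodeStillNotStarted(Exception ex) {
        return ex instanceof NameNodeNotStartedException;
    }
}
```

With this shape, a plain `RetriableException` carrying the same text is no longer treated as a startup condition, which is the point of moving from message matching to type matching.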
[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen updated MAPREDUCE-6657:
----------------------------------
    Attachment:     (was: mapreduce6657.006.patch)

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup.
> When NameNode is in safe mode, it will keep retrying for a configurable time
> period. However, it should also keep retrying if the name node is in start
> state. Safe mode does not happen until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown
> when the NN is in its internal service startup phase. We should add the check
> for this specific exception in isBecauseSafeMode() to account for that.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280564#comment-15280564 ]

Junping Du commented on MAPREDUCE-6657:
---------------------------------------

Thanks for updating the patch, [~haibochen]. What my comment above was trying to say is that we should define the static string where the exception gets thrown. In this case, we should also change NameNodeRpcServer.java:
{noformat}
  private void checkNNStartup() throws IOException {
    if (!this.nn.isStarted()) {
      throw new RetriableException(this.nn.getRole() + " still not started");
    }
  }
{noformat}
If we define a static string in HDFS and use it on both sides (NameNodeRpcServer and HistoryFileManager), we can make sure we won't hit this issue again if the exception string is updated in the future.

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, mapreduce6657.006.patch
>
>
> Job history server will try to create a history directory in HDFS on startup.
> When NameNode is in safe mode, it will keep retrying for a configurable time
> period. However, it should also keep retrying if the name node is in start
> state. Safe mode does not happen until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown
> when the NN is in its internal service startup phase. We should add the check
> for this specific exception in isBecauseSafeMode() to account for that.
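The shared-constant idea from Junping Du's comment above can be sketched as follows. The names are assumptions for illustration only: `RetriableException` is a local stand-in so the example compiles on its own, and `NameNodeMessages`, `StartupGuard`, and `StartupCheck` are hypothetical; where such a constant would actually live in HDFS is not settled in the thread.

```java
import java.io.IOException;

// Local stand-in for Hadoop's RetriableException (an assumption for this sketch).
class RetriableException extends IOException {
    RetriableException(String message) {
        super(message);
    }
}

// Hypothetical home for the shared message text.
final class NameNodeMessages {
    static final String STILL_NOT_STARTED = " still not started";

    private NameNodeMessages() {}
}

// Throwing side: builds the message from the shared constant.
final class StartupGuard {
    private StartupGuard() {}

    static void checkStartup(boolean started, String role) throws IOException {
        if (!started) {
            throw new RetriableException(role + NameNodeMessages.STILL_NOT_STARTED);
        }
    }
}

// Checking side: matches against the same constant instead of a copied literal.
final class StartupCheck {
    private StartupCheck() {}

    static boolean isNameNodeStillNotStarted(Exception ex) {
        return ex instanceof RetriableException
                && ex.getMessage() != null
                && ex.getMessage().contains(NameNodeMessages.STILL_NOT_STARTED);
    }
}
```

Because both sides read the same constant, rewording the message is a one-line change that cannot silently break the check.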
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280516#comment-15280516 ]

Haibo Chen commented on MAPREDUCE-6657:
---------------------------------------

Thanks very much for your review, [~djp]. I have updated the patch according to your comments.

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, mapreduce6657.006.patch
>
>
> Job history server will try to create a history directory in HDFS on startup.
> When NameNode is in safe mode, it will keep retrying for a configurable time
> period. However, it should also keep retrying if the name node is in start
> state. Safe mode does not happen until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown
> when the NN is in its internal service startup phase. We should add the check
> for this specific exception in isBecauseSafeMode() to account for that.
[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen updated MAPREDUCE-6657:
----------------------------------
    Attachment: mapreduce6657.006.patch

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, mapreduce6657.006.patch
>
>
> Job history server will try to create a history directory in HDFS on startup.
> When NameNode is in safe mode, it will keep retrying for a configurable time
> period. However, it should also keep retrying if the name node is in start
> state. Safe mode does not happen until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown
> when the NN is in its internal service startup phase. We should add the check
> for this specific exception in isBecauseSafeMode() to account for that.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280426#comment-15280426 ]

Junping Du commented on MAPREDUCE-6657:
---------------------------------------

Thanks for the patch, [~haibochen]. Hard-coding the message string in the check is very fragile:
{noformat}
+    return ex.toString().contains("SafeModeException") ||
+        (ex instanceof RetriableException && ex.getMessage().contains(
+            "NameNode still not started"));
{noformat}
If HDFS changes the exception message to something else in the future, e.g. "Namenode not start yet.", the issue will come up again. Instead, we should define the message as a static string.
Everything else looks fine.

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup.
> When NameNode is in safe mode, it will keep retrying for a configurable time
> period. However, it should also keep retrying if the name node is in start
> state. Safe mode does not happen until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown
> when the NN is in its internal service startup phase. We should add the check
> for this specific exception in isBecauseSafeMode() to account for that.
[jira] [Commented] (MAPREDUCE-6693) Job history entry missing when JOB name is of mapreduce.jobhistory.jobname.limit length
[ https://issues.apache.org/jira/browse/MAPREDUCE-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280373#comment-15280373 ]

Kousuke Saruta commented on MAPREDUCE-6693:
-------------------------------------------

On second thought, only
{code}
if (encodedString.length() < limitLength)
{code}
should be changed to
{code}
if (encodedString.length() <= limitLength)
{code}
and
{code}
index + increase > limitLength
{code}
should be kept.
The reason is that if we have
{code}
if (encodedString.length() <= limitLength) {
  return encodedString;
}
{code}
the size of strBytes is at least limitLength + 1 past that point, which means index limitLength is always valid. So even if index + increase is equal to limitLength, it's safe.

> Job history entry missing when JOB name is of mapreduce.jobhistory.jobname.limit length
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6693
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6693
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Ajith S
>            Priority: Critical
>
> The job history entry is missing when the job name is exactly
> {{mapreduce.jobhistory.jobname.limit}} characters long.
> {noformat}
> 2016-05-10 06:51:00,674 DEBUG [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Interrupting Event Handling thread
> 2016-05-10 06:51:00,674 DEBUG [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Waiting for Event Handling thread to complete
> 2016-05-10 06:51:00,674 ERROR [eventHandlingThread] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[eventHandlingThread,5,main] threw an Exception.
> java.lang.ArrayIndexOutOfBoundsException: 50
>       at org.apache.hadoop.mapreduce.v2.jobhistory.FileNameIndexUtils.trimURLEncodedString(FileNameIndexUtils.java:326)
>       at org.apache.hadoop.mapreduce.v2.jobhistory.FileNameIndexUtils.getDoneFileName(FileNameIndexUtils.java:86)
>       at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processDoneFiles(JobHistoryEventHandler.java:1147)
>       at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:635)
>       at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$1.run(JobHistoryEventHandler.java:341)
>       at java.lang.Thread.run(Thread.java:745)
> 2016-05-10 06:51:00,675 DEBUG [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Shutting down timer for Job MetaInfo for job_1462840033869_0009 history file hdfs://hacluster:9820/staging-dir/dsperf/.staging/job_1462840033869_0009/job_1462840033869_0009_1.jhist
> 2016-05-10 06:51:00,675 DEBUG [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Shutting down timer Job MetaInfo for job_1462840033869_0009 history file hdfs://hacluster:9820/staging-dir/dsperf/.staging/job_1462840033869_0009/job_1462840033869_0009_1.jhist
> 2016-05-10 06:51:00,676 DEBUG [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Closing Writer
> {noformat}
> Looks like the 50-character check is going wrong.
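The boundary argument in Kousuke Saruta's comment above can be checked with a small stand-alone model. This is a simplified sketch, not the Hadoop implementation: `TrimSketch` and `trimEncoded` are hypothetical names, the input is assumed to be well-formed with a non-negative limit, and every `%XX` escape is treated as an indivisible three-character unit, whereas the real `trimURLEncodedString` works on a byte array and may group escapes differently.

```java
final class TrimSketch {
    private TrimSketch() {}

    /**
     * Trims a percent-encoded string to at most {@code limit} characters
     * without cutting a %XX escape in half. Assumes limit >= 0.
     */
    static String trimEncoded(String encoded, int limit) {
        // With "<=", a string of exactly `limit` characters returns here.
        // With "<", it would fall through, and the loop below would end up
        // reading index `limit`, one past the end of such a string.
        if (encoded.length() <= limit) {
            return encoded;
        }
        // From here on encoded.length() >= limit + 1, so charAt(limit) is valid.
        int index = 0;
        while (true) {
            int increase = (encoded.charAt(index) == '%') ? 3 : 1;
            if (index + increase > limit) {
                break; // the next unit would cross the limit
            }
            index += increase; // may land exactly on `limit`, which is still in range
        }
        return encoded.substring(0, index);
    }
}
```

For example, `TrimSketch.trimEncoded("ab%20cd", 5)` keeps the whole escape and returns `"ab%20"`, while `TrimSketch.trimEncoded("abc%E3%81", 5)` stops before the escape that would not fit and returns `"abc"`.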
[jira] [Commented] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280287#comment-15280287 ]

Jason Lowe commented on MAPREDUCE-6558:
---------------------------------------

Thanks, [~wilfreds]! Patch looks good overall.

I think we can significantly reduce the size of the testcase file since the problem occurs early in it. I noticed that if we cut the file down to just 530 records instead of 20,000 records and compress with bzip2 -1 it still catches the failure but is only a 10K binary rather than a 409K binary.

> multibyte delimiters with compressed input files generate duplicate records
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6558
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, mrv2
>    Affects Versions: 2.7.2
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>         Attachments: MAPREDUCE-6558.1.patch
>
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record
> duplications as shown in different junit tests. The number of duplicated
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445 which is the compressed file length.
> The file is a bzip2 file with 100k blocks and a total of 11 blocks
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280098#comment-15280098 ]

Daniel Templeton commented on MAPREDUCE-6657:
---------------------------------------------

OK. Latest patch looks good to me. [~rkanter]?

> job history server can fail on startup when NameNode is in start phase
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6657
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup.
> When NameNode is in safe mode, it will keep retrying for a configurable time
> period. However, it should also keep retrying if the name node is in start
> state. Safe mode does not happen until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown
> when the NN is in its internal service startup phase. We should add the check
> for this specific exception in isBecauseSafeMode() to account for that.