[jira] [Commented] (MAPREDUCE-6652) Add configuration property to prevent JHS from loading jobs with a task count greater than X
[ https://issues.apache.org/jira/browse/MAPREDUCE-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376094#comment-15376094 ]

Haibo Chen commented on MAPREDUCE-6652:

I ran it on my local laptop with mvn test, and it was fine. This is the output: "Final Memory: 118M/821M". Running TestAppPage, I got "117M/806M", so there is no real difference.

> Add configuration property to prevent JHS from loading jobs with a task count greater than X
>
> Key: MAPREDUCE-6652
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6652
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobhistoryserver
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Attachments: mapreduce6652.001.patch, mapreduce6652.002.patch, mapreduce6652.003.patch, mapreduce6652.004.patch, mapreduce6652.005.patch, mapreduce6652.007.branch2.patch, mapreduce6652.007.patch, mapreduce6652.008.branch2.patch, mapreduce6652.008.patch
>
> Jobs with a large number of tasks can have job history files that are large in size and resource-consuming (mainly memory) to parse in the Job History Server. If there are many such jobs, the job history server can very easily hang.
> It would be a good usability feature if we added a new config property that could be set to X, where the JHS wouldn't load the details for a job with more than X tasks. The job would still show up on the list of jobs page, but clicking on it would give a warning message that the job is too big, instead of actually loading the job. This way we can prevent users from loading a job that's way too big for the JHS, which currently makes the JHS hang. The default value can be -1 so that it's disabled.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
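The behaviour proposed in the MAPREDUCE-6652 description (skip loading a job with more than X tasks, with -1 meaning disabled) can be sketched as follows. This is only an illustration under assumptions: the class `JobLoadGuard`, its methods, and the warning text are hypothetical names invented here, not the JobHistoryServer code or the contents of any attached patch, and treating every negative limit as "disabled" is an assumption that goes slightly beyond the stated default of -1.

```java
/**
 * Hypothetical sketch of the task-count guard proposed in MAPREDUCE-6652.
 * None of these names come from Hadoop; they exist only for illustration.
 */
final class JobLoadGuard {
    /** Sentinel meaning "no limit", matching the proposed default of -1. */
    static final int DISABLED = -1;

    private final int maxTasks;

    JobLoadGuard(int maxTasks) {
        this.maxTasks = maxTasks;
    }

    /** Returns true if the full job details may be parsed and loaded. */
    boolean mayLoad(int totalTasks) {
        // Assumption: any negative limit disables the check entirely.
        // A job with exactly maxTasks tasks is still loaded; only "more than X" is blocked.
        return maxTasks < 0 || totalTasks <= maxTasks;
    }

    /** Returns the text a job page could show for a job with this many tasks. */
    String describe(String jobId, int totalTasks) {
        if (mayLoad(totalTasks)) {
            return "Loading details for " + jobId;
        }
        return "Job " + jobId + " has " + totalTasks
            + " tasks, more than the configured limit of " + maxTasks
            + "; details were not loaded.";
    }

    public static void main(String[] args) {
        JobLoadGuard guard = new JobLoadGuard(1000);
        System.out.println(guard.describe("job_0001", 200));
        System.out.println(guard.describe("job_0002", 50000));
    }
}
```

The point of the design is that the decision needs only the task count, which is cheap to obtain, so the expensive parse of the history file can be skipped before any memory is committed.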
[jira] [Commented] (MAPREDUCE-6652) Add configuration property to prevent JHS from loading jobs with a task count greater than X
[ https://issues.apache.org/jira/browse/MAPREDUCE-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376068#comment-15376068 ]

Robert Kanter commented on MAPREDUCE-6652:

[~haibochen], can you take a look at {{TestHsJobBlock}}? The first time I ran it, it timed out. The second time, it really slowed down my computer after a while (eating all my RAM, I suppose), and I had to ctrl-c it, which wasn't easy.
[jira] [Commented] (MAPREDUCE-6652) Add configuration property to prevent JHS from loading jobs with a task count greater than X
[ https://issues.apache.org/jira/browse/MAPREDUCE-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376061#comment-15376061 ]

Hadoop QA commented on MAPREDUCE-6652:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} MAPREDUCE-6652 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817836/mapreduce6652.008.branch2.patch |
| JIRA Issue | MAPREDUCE-6652 |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6613/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (MAPREDUCE-6652) Add configuration property to prevent JHS from loading jobs with a task count greater than X
[ https://issues.apache.org/jira/browse/MAPREDUCE-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376021#comment-15376021 ]

Robert Kanter commented on MAPREDUCE-6652:

+1
[jira] [Updated] (MAPREDUCE-6652) Add configuration property to prevent JHS from loading jobs with a task count greater than X
[ https://issues.apache.org/jira/browse/MAPREDUCE-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen updated MAPREDUCE-6652:

Attachment: mapreduce6652.008.branch2.patch
[jira] [Commented] (MAPREDUCE-6724) Unsafe conversion from long to int in MergeManagerImpl.unconditionalReserve()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375783#comment-15375783 ]

Daniel Templeton commented on MAPREDUCE-6724:

Thanks for the patch, [~haibochen]! A couple of comments:

{code}
final long totalReduceMem = 8L * 1024 * 1024 * 1024;
final float shuffleInputBuf = 1.0f;
final float reduceInputBuf = 1.0f;
final MergeManagerImpl mgr = createMergeManager(
    totalReduceMem, shuffleInputBuf, 0.95f, reduceInputBuf);
{code}

It would be better to be consistent in the parameters. Everything is a variable except for the shuffle memory limit.

{code}
final long totalReduceMem = 1L;
final float shuffleInputBuf = 1.0f;
final float shuffleMemLimit =
    singleShuffleLimitConfiged / (float) totalReduceMem;
final MergeManagerImpl mgr = createMergeManager(
    totalReduceMem, shuffleInputBuf, shuffleMemLimit, 1.0f);
{code}

Same comment here, except with the reducer input buffer.

> Unsafe conversion from long to int in MergeManagerImpl.unconditionalReserve()
>
> Key: MAPREDUCE-6724
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6724
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Attachments: mapreduce6724.001.patch, mapreduce6724.002.patch, mapreduce6724.003.patch
>
> When shuffle is done in memory, MergeManagerImpl converts the requested size to an int to allocate an instance of InMemoryMapOutput. This results in an overflow if the requested size is bigger than Integer.MAX_VALUE and eventually causes the reducer to fail.
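The failure mode named in the MAPREDUCE-6724 description, a long size narrowed to an int, can be reproduced without Hadoop. The sketch below is an assumption-laden illustration only: `SafeBufferSize` and its methods are hypothetical names, and the thread does not show how the patch itself handles oversized requests. It contrasts a plain cast with a range check and with `Math.toIntExact`, which is standard since Java 8.

```java
/**
 * Hypothetical illustration of the long-to-int overflow described in MAPREDUCE-6724.
 * This is not MergeManagerImpl code; the names are invented for the example.
 */
final class SafeBufferSize {
    /** Plain narrowing cast: silently keeps only the low 32 bits of the value. */
    static int uncheckedCast(long requestedSize) {
        return (int) requestedSize;
    }

    /** Returns true if the size can be used as a Java array length without wrapping. */
    static boolean fitsInInt(long requestedSize) {
        return requestedSize >= 0 && requestedSize <= Integer.MAX_VALUE;
    }

    /** Converts the size, or throws ArithmeticException if it does not fit in an int. */
    static int checkedCast(long requestedSize) {
        return Math.toIntExact(requestedSize);
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024;
        // The unchecked cast wraps around to a negative number.
        System.out.println("unchecked: " + uncheckedCast(threeGiB));
        System.out.println("fits in int: " + fitsInInt(threeGiB));
        try {
            checkedCast(threeGiB);
        } catch (ArithmeticException e) {
            System.out.println("checked cast rejected the size: " + e.getMessage());
        }
    }
}
```

A caller that tests `fitsInInt` first can decide what to do with an oversized request before any allocation is attempted, rather than failing later on a negative length.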
[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO
[ https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374421#comment-15374421 ]

mingleizhang commented on MAPREDUCE-6729:

Thanks to Kai for the help. I've done this.

> Accurately compute the test execute time in DFSIO
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: benchmarks, performance, test
> Affects Versions: 2.9.0
> Reporter: mingleizhang
> Assignee: mingleizhang
> Priority: Minor
> Labels: performance, test
> Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch
>
> When DFSIO is used as a distributed I/O benchmark, and especially when it writes or reads a large number of files, the reported values can be imprecise. The problem is that the existing code deletes files before running a job, and the deletion happens inside the timed region. That adds extra time and, when there are many files, distorts the measured execution time and the reported throughput. We need to replace or improve this so that it no longer happens.
> {code}
> public static void testWrite() throws Exception {
>   FileSystem fs = cluster.getFileSystem();
>   long tStart = System.currentTimeMillis();
>   // this line of code will cause extra time consumption
>   // because of fs.delete(*,*) by the writeTest method
>   bench.writeTest(fs);
>   long execTime = System.currentTimeMillis() - tStart;
>   bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
> }
> private void writeTest(FileSystem fs) throws IOException {
>   Path writeDir = getWriteDir(config);
>   fs.delete(getDataDir(config), true);
>   fs.delete(writeDir, true);
>   runIOTest(WriteMapper.class, writeDir);
> }
> {code}
> [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
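The measurement problem described in MAPREDUCE-6729 is that cleanup runs inside the timed window. One general way to avoid that, sketched below without any Hadoop dependency, is to run the cleanup before the clock is read. This is an assumption, not the approach of the attached patches, whose contents are not shown in this thread; `TimedBenchmark`, `timeExcludingCleanup`, and the injected clock are hypothetical names, and the two lambdas in `main` merely stand in for the `fs.delete(...)` calls and `runIOTest(...)`.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch: time a workload while keeping its cleanup step
 * outside the measured window. Not taken from TestDFSIO.
 */
final class TimedBenchmark {
    /**
     * Runs the cleanup first, then measures only the workload.
     * The clock is passed in so that the behaviour can be checked deterministically.
     */
    static long timeExcludingCleanup(Runnable cleanup, Runnable workload,
                                     LongSupplier clockMillis) {
        cleanup.run();                          // not measured
        long start = clockMillis.getAsLong();
        workload.run();                         // measured
        return clockMillis.getAsLong() - start;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long elapsed = timeExcludingCleanup(
            () -> sleep(50),                    // stands in for deleting old files
            () -> sleep(20),                    // stands in for the I/O test itself
            System::currentTimeMillis);
        System.out.println("measured ms (workload only): " + elapsed);
    }
}
```

Passing the clock as a parameter is a testing convenience rather than a requirement; in a real benchmark `System.currentTimeMillis` or `System.nanoTime` would be supplied.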