[ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833354#action_12833354 ]
Chris Douglas commented on MAPREDUCE-1408: ------------------------------------------ * All the fields in {{StatsCollector}} can be final * Jobs pushed in {{StatsCollector::add}} do not need to be retained in a list, since now only its size is used. For the current purpose, it is sufficient to keep only a counter * Instead of fixing {{MAX_JOBS_COMPLETED_IN_POLL_INTERVAL}} to 1, it would make sense to let this parameter be configurable. {{SERIAL}} submission should set it to 1 but {{STRESS}} can be less aggressive by default. It would make sense to let this be defined by the enum, as the polling interval is in the current patch * Excluding {{SERIAL}} where the max jobs per interval is implicit, letting the {{STRESS}} and {{REPLAY}} define defaults for configurable parameters rather than fixed values may make sense (for both max jobs/interval and the interval period) * The default of 1000ms for {{STRESS}} is too low * Setting the interrupt flag is unnecessary if that is the condition for shutting down the thread- as in {{StatsCollector.Collector::run}} and {{JobFactory::run}}. The only reason to reset the flag is if the current context does not know how to handle the interrupt, but its caller might and it cannot throw {{InterruptedException}}. Returning from {{run}} is correct in this case. * The {{StatListener}} interface, the new {{JobFactory}} subclasses, and {{StatsCollector}} should be package-private. If {{JobFactory}} is made an abstract base, it may make sense to make it public * {{StatsCollector::join}} should respect the timeout provided, not wait forever * This block in {{StressReaderThread::run}}: {noformat} + synchronized (lock) { + while (loadStatus.isOverloaded) { + //Wait while JT is overloaded. + try { + lock.wait(); + } catch (InterruptedException ie) { + Thread.currentThread().interrupt(); + } + } + } {noformat} Does not seem correct. If interrupted while waiting, {{InterruptedException}} will be thrown, caught, the flag reset, causing another exception to be thrown. If interrupted, the thread should exit. A similar pattern is in {{StatsCollector.Collector::run}} * As with {{SerialJobFactory}}, {{StressJobFactory}} should be submitting jobs to be run immediately, not at the interval recorded in the trace. The submission time in the trace can be disregarded entirely. * {{StatsCollector::addClusterStatsObservers}} has no callers; how does one subscribe to the statistics updates from the JT? * {{LoadCalculator::checkLoadAndGetSlotsAvailable}} should use commons logging at debug level, rather than {{System.out}}. Similarly, many of these statements are non-trivial to construct and should be guarded by {{LOG.isDebugEnabled()}} * Log message such as: {{LOG.error(ie.getCause(), ie)}} Should instead include a message and the exception (which will also include its cause) * The {{LoadCalculator}} methods can probably be part of {{StressJobFactory}} rather than static methods in a separate class > Allow customization of job submission policies > ---------------------------------------------- > > Key: MAPREDUCE-1408 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: contrib/gridmix > Reporter: rahul k singh > Attachments: 1408-1.patch > > > Currently, gridmix3 replay job submission faithfully. For evaluation > purposes, it would be great if we can support other job submission policies > such as sequential job submission, or stress job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.