[ 
https://issues.apache.org/jira/browse/MAPREDUCE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753309#action_12753309
 ] 

Hong Tang commented on MAPREDUCE-966:
-------------------------------------

Proposed changes:
- Isolate the tools that depend on Rumen behind three simple interfaces:
## JobStory (describing a MapReduce job).
## ClusterStory (describing the cluster setup, topology, etc.).
## JobStoryProducer (producing a sequence of jobs).
 
Accordingly, ZombieJob adapts a LoggedJob to JobStory, and ZombieCluster adapts 
LoggedNetworkTopology to ClusterStory (indirectly through 
AbstractClusterStory). Finally, ZombieJobProducer reads Rumen traces and 
produces a sequence of JobStory instances.
- Encapsulate the JSON parsing logic within Rumen and remove the Parser 
class. Two reader classes, JobTraceReader and ClusterTopologyReader, are added 
to parse JSON-encoded LoggedJob and LoggedNetworkTopology data. The interface 
no longer throws JSON-specific exceptions.
- Improve the sanity checks in ZombieJob, and fill in made-up data where the 
source data are missing or invalid.
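
To make the adapter idea above concrete, here is a minimal sketch of how a 
JobStory/JobStoryProducer pair might look. It is not the code in the patch: 
the method sets are trimmed down, and LoggedJobSketch, ZombieJobSketch, 
ZombieJobProducerSketch and the placeholder values are hypothetical names 
made up for illustration. ClusterStory and the JSON readers are left out.

```java
import java.util.Iterator;

/** Simplified stand-in for JobStory (assumed, reduced method set). */
interface JobStory {
    String getName();
    int getNumberMaps();
    int getNumberReduces();
}

/** Simplified stand-in for JobStoryProducer. */
interface JobStoryProducer {
    /** Returns the next job, or null once the trace is exhausted. */
    JobStory getNextJob();
}

/** Hypothetical parsed trace record; any field may be missing or invalid. */
final class LoggedJobSketch {
    final String name;       // may be null if the trace lacks it
    final int totalMaps;     // may be negative if the trace is corrupt
    final int totalReduces;  // may be negative if the trace is corrupt

    LoggedJobSketch(String name, int totalMaps, int totalReduces) {
        this.name = name;
        this.totalMaps = totalMaps;
        this.totalReduces = totalReduces;
    }
}

/** Adapts a logged record to JobStory and fills in made-up values. */
final class ZombieJobSketch implements JobStory {
    private final LoggedJobSketch job;

    ZombieJobSketch(LoggedJobSketch job) {
        this.job = job;
    }

    @Override
    public String getName() {
        // Assumption: a fixed placeholder replaces a missing name.
        return (job.name == null || job.name.isEmpty()) ? "(unknown)" : job.name;
    }

    @Override
    public int getNumberMaps() {
        // Assumption: invalid (negative) counts are clamped to zero.
        return Math.max(0, job.totalMaps);
    }

    @Override
    public int getNumberReduces() {
        return Math.max(0, job.totalReduces);
    }
}

/** Produces JobStory instances from an already-parsed trace. */
final class ZombieJobProducerSketch implements JobStoryProducer {
    private final Iterator<LoggedJobSketch> trace;

    ZombieJobProducerSketch(Iterator<LoggedJobSketch> trace) {
        this.trace = trace;
    }

    @Override
    public JobStory getNextJob() {
        return trace.hasNext() ? new ZombieJobSketch(trace.next()) : null;
    }
}
```

A tool written against this shape would loop on getNextJob() until it 
returns null and would only ever see JobStory, so the trace format and its 
parser could change without touching the tool.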


> Rumen interface improvement
> ---------------------------
>
>                 Key: MAPREDUCE-966
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-966
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: Hong Tang
>            Assignee: Hong Tang
>
> Rumen could expose a cleaner interface to simplify the integration with other 
> tools.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
