[
https://issues.apache.org/jira/browse/GIRAPH-13?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581473#comment-13581473
]
Eli Reisman commented on GIRAPH-13:
-----------------------------------
Hey, one more idea to throw out there regarding the IO format issues with
YARN. What do you think of this:
Since some of our internals are pretty tightly bound to some MRv1 classes, we
can do the refactoring and wrapping already discussed above to hide this
dependency. Another approach I might explore is simply to have a generic task
runner (one that owns GraphTaskManager and replaces GraphMapper in our YARN
impl) that instantiates the TaskAttemptContext and other Hadoop MRv1 classes,
populates them with the info they need to run the job (taken from the
giraphConfiguration and/or the YARN classes that report some of the same data
to the running job), and hands them off to the Giraph code that expects
these objects. Since this activity is self-contained in the runner class, no
platform-dependent setup code (for YARN, Mesos, whoever) has to know anything
about the runner's internals: it just creates the runner, hands it the data it
needs, sets it running on the right compute nodes, etc.
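To make the shape of that runner concrete, here is a minimal Java sketch. It
is only an illustration under stated assumptions, not the actual patch: it
deliberately avoids the real Hadoop and Giraph classes, so StubTaskContext
(standing in for TaskAttemptContext and friends), GraphTask (standing in for
the Giraph code that expects those objects), GenericTaskRunner, and the
"example.workers" key are all hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an MRv1 context object such as TaskAttemptContext.
// A real runner would instantiate the Hadoop class here instead.
final class StubTaskContext {
    private final Map<String, String> conf;
    private final int taskId;

    StubTaskContext(Map<String, String> conf, int taskId) {
        this.conf = new HashMap<>(conf); // defensive copy of the job settings
        this.taskId = taskId;
    }

    String get(String key) {
        return conf.get(key);
    }

    int getTaskId() {
        return taskId;
    }
}

// Hypothetical stand-in for existing Giraph code that expects a context object.
interface GraphTask {
    String run(StubTaskContext context);
}

// The platform-neutral runner: platform setup code only constructs it and calls
// run(). Building the MRv1-style context stays inside this class.
final class GenericTaskRunner {
    private final Map<String, String> conf;
    private final int taskId;
    private final GraphTask task;

    GenericTaskRunner(Map<String, String> conf, int taskId, GraphTask task) {
        this.conf = conf;
        this.taskId = taskId;
        this.task = task;
    }

    String run() {
        // Populate the context from the data the platform handed over...
        StubTaskContext context = new StubTaskContext(conf, taskId);
        // ...and hand it to the code that expects it.
        return task.run(context);
    }
}

public class RunnerSketch {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("example.workers", "4"); // hypothetical configuration key
        GraphTask task = ctx ->
            "task " + ctx.getTaskId() + " workers=" + ctx.get("example.workers");
        System.out.println(new GenericTaskRunner(conf, 0, task).run());
    }
}
```

The point of the sketch is the boundary: whatever launches the task (YARN,
Mesos, or anything else) touches only the GenericTaskRunner constructor and
run(), never the context type.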
This is a tiny bit hacky, but it gets the job done with minimal changes to
existing code, allows future JIRAs to do more extensive refactors, and does
not hide the fact that we will still carry dependencies on the Hadoop JARs
for as long as we support MRv1 too, so we will have access to these classes to
instantiate even on Mesos or YARN. I am not entirely sure this approach is
possible, but it's one I have toyed with as an alternative to doing the full
"wrap all MRv1 IO objects" approach.
Any opinions? I will be exploring the options for the IO dilemma in detail
later in the week and will post my findings/opinions as I survey the
landscape. I just need to get the rest of the YARN job setup code done today
and post that patch first...
> Port Giraph to YARN
> -------------------
>
> Key: GIRAPH-13
> URL: https://issues.apache.org/jira/browse/GIRAPH-13
> Project: Giraph
> Issue Type: New Feature
> Reporter: Jakob Homan
> Assignee: Eli Reisman
> Attachments: GIRAPH-13-1.patch, GIRAPH-13-2.patch
>
>
> Now that YARN (aka MR2 aka MAPREDUCE-279) has been merged into the Hadoop
> trunk, we should think about what it would take to separate out the graph
> processing bits of Giraph from the MR1-specific code so as to take advantage
> of the less MR-centric aspects of YARN, while still supporting both over the
> medium term.