[ https://issues.apache.org/jira/browse/GIRAPH-214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401176#comment-13401176 ]
Eli Reisman commented on GIRAPH-214:
------------------------------------
Looked at the HiveConf; that's a much cleaner solution. I could refactor to
emulate that somewhat, but the way the conf stuff is accessed in Giraph is kind
of all over the place and not entirely uniform in style.
Again, my approach was to be pretty basic: just get the stuff out of
GiraphJob and encapsulate it in a very basic way. Once the next patch is up,
it will work and would represent a first step, from which I think a
discussion is warranted as to what we really want it to look like, how it
should be referenced, and how references to it should be obtained in the code
that uses it. Then more dramatic steps to really clean up and homogenize the
API and/or access patterns could be put in place. Any advice/ideas?
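For illustration only, here is a rough sketch of what that first step could
look like. This is not the patch. MiniConf is a hypothetical stand-in for
Hadoop's Configuration so the snippet compiles without Hadoop on the
classpath, VertexLike and MyVertex are made-up placeholder types, and the key
string is assumed. The point is just that the typed setters live on the conf
object instead of on GiraphJob.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
class MiniConf {
  private final Map<String, String> props = new HashMap<>();

  /** Store a class name under a key after checking it against a type. */
  void setClass(String name, Class<?> theClass, Class<?> xface) {
    if (!xface.isAssignableFrom(theClass)) {
      throw new IllegalArgumentException(
          theClass.getName() + " is not a " + xface.getName());
    }
    props.put(name, theClass.getName());
  }

  /** Stored value, or null if the key was never set. */
  String get(String name) {
    return props.get(name);
  }
}

/** Hypothetical marker type standing in for Giraph's vertex base class. */
interface VertexLike { }

/** Hypothetical user vertex, only here so the sketch has something to set. */
class MyVertex implements VertexLike { }

/** Sketch of a GiraphConf: the typed setters move here from GiraphJob. */
class GiraphConfSketch extends MiniConf {
  /** Assumed key name, for illustration. */
  static final String VERTEX_CLASS = "giraph.vertexClass";

  /** Set the vertex class (required). */
  final void setVertexClass(Class<?> vertexClass) {
    setClass(VERTEX_CLASS, vertexClass, VertexLike.class);
  }

  /** Name of the configured vertex class, or null if unset. */
  String getVertexClassName() {
    return get(VERTEX_CLASS);
  }
}
```

With something like this, callers would write conf.setVertexClass(...) on the
conf object directly rather than going through job.getConfiguration().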
It does seem like a lot of different modules get access to the conf and use it
in different ways. Could a single per-node (per-JVM) static instance that is
reachable by anyone be better? Is there something about the original Hadoop
Configuration object that makes this too unwieldy? Seems like everyone and his
uncle has access to this thing as it is anyway and not everyone comes by that
access in the same way. Maybe trading easy (very public, static utility class)
access for clarity in the code is a good trade?
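To make the static-instance question concrete, here is a minimal sketch of a
per-JVM holder. The class name GiraphConfHolder is hypothetical, a plain
Map<String, String> stands in for the Hadoop Configuration, and the
set-once behavior is my assumption about how we would want it to act, not
something in the patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical per-JVM holder; the Map stands in for Configuration. */
final class GiraphConfHolder {
  private static final AtomicReference<Map<String, String>> CONF =
      new AtomicReference<>();

  private GiraphConfHolder() { }

  /** Install the conf once per JVM; a second call is rejected. */
  static void init(Map<String, String> conf) {
    Map<String, String> copy =
        Collections.unmodifiableMap(new HashMap<>(conf));
    if (!CONF.compareAndSet(null, copy)) {
      throw new IllegalStateException("conf already initialized");
    }
  }

  /** Reachable from anywhere in the JVM, but only after init(). */
  static Map<String, String> get() {
    Map<String, String> conf = CONF.get();
    if (conf == null) {
      throw new IllegalStateException("conf not initialized");
    }
    return conf;
  }
}
```

The upside is that every module gets the conf the same way. The cost is a
hidden global dependency, which could get awkward if more than one job (or a
unit test suite) ever shares a JVM.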
I'm very open to suggestions at this point before I log a bunch more time
pursuing a direction that may or may not appeal to everyone else.
> GiraphJob should have configuration split out of it to be cleaner (GiraphConf)
> ------------------------------------------------------------------------------
>
> Key: GIRAPH-214
> URL: https://issues.apache.org/jira/browse/GIRAPH-214
> Project: Giraph
> Issue Type: Bug
> Reporter: Avery Ching
> Assignee: Eli Reisman
> Priority: Minor
> Attachments: GIRAPH-214-1.patch
>
>
> Currently all the configuration for Giraph is part of GiraphJob, making
> things messy for GiraphJob.
> It would be better if we added a GiraphConf (similar to Hive) that is
> responsible for handling configuration of the Job.
> i.e.
> public class GiraphConf extends Configuration....
> To simplify config, we should make get/set methods for as many of the
> parameters as possible.
> We are targeting configuration such as
> /**
>  * Set the vertex class (required)
>  *
>  * @param vertexClass Runs vertex computation
>  */
> public final void setVertexClass(Class<?> vertexClass) {
>   getConfiguration().setClass(VERTEX_CLASS, vertexClass, BasicVertex.class);
> }
>
> /**
>  * Set the vertex input format class (required)
>  *
>  * @param vertexInputFormatClass Determines how graph is input
>  */
> public final void setVertexInputFormatClass(
>     Class<?> vertexInputFormatClass) {
>   getConfiguration().setClass(VERTEX_INPUT_FORMAT_CLASS,
>       vertexInputFormatClass,
>       VertexInputFormat.class);
> }
>
> /**
>  * Set the vertex output format class (optional)
>  *
>  * @param vertexOutputFormatClass Determines how graph is output
>  */
> public final void setVertexOutputFormatClass(
>     Class<?> vertexOutputFormatClass) {
>   getConfiguration().setClass(VERTEX_OUTPUT_FORMAT_CLASS,
>       vertexOutputFormatClass,
>       VertexOutputFormat.class);
> }
>
> /**
>  * Set the vertex combiner class (optional)
>  *
>  * @param vertexCombinerClass Determines how vertex messages are combined
>  */
> public final void setVertexCombinerClass(Class<?> vertexCombinerClass) {
>   getConfiguration().setClass(VERTEX_COMBINER_CLASS,
>       vertexCombinerClass,
>       VertexCombiner.class);
> }
>
> /**
>  * Set the graph partitioner class (optional)
>  *
>  * @param graphPartitionerFactoryClass Determines how the graph is
>  *        partitioned
>  */
> public final void setGraphPartitionerFactoryClass(
>     Class<?> graphPartitionerFactoryClass) {
>   getConfiguration().setClass(GRAPH_PARTITIONER_FACTORY_CLASS,
>       graphPartitionerFactoryClass,
>       GraphPartitionerFactory.class);
> }
>
> /**
>  * Set the vertex resolver class (optional)
>  *
>  * @param vertexResolverClass Determines how vertex mutations are resolved
>  */
> public final void setVertexResolverClass(Class<?> vertexResolverClass) {
>   getConfiguration().setClass(VERTEX_RESOLVER_CLASS,
>       vertexResolverClass,
>       VertexResolver.class);
> }
>
> /**
>  * Set the worker context class (optional)
>  *
>  * @param workerContextClass Determines what code is executed on each
>  *        worker before and after each superstep and computation
>  */
> public final void setWorkerContextClass(Class<?> workerContextClass) {
>   getConfiguration().setClass(WORKER_CONTEXT_CLASS,
>       workerContextClass,
>       WorkerContext.class);
> }
> ...etc.