date:20150325

[jira] [Commented] (GIRAPH-1000) Multi Output support

2015-03-25 Thread Lukas Nalezenec (JIRA)


[ 
https://issues.apache.org/jira/browse/GIRAPH-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379631#comment-14379631
 ] 

Lukas Nalezenec commented on GIRAPH-1000:
-

FYI:
It's already possible to write multiple outputs in Giraph using WorkerContext.
It's not ideal, since you have to take care of failed tasks manually, but it works.

See SimpleVertexWithWorkerContext.java in the giraph-examples project.
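
One way to read "take care of failed tasks manually" is sketched below: each attempt writes to its own temporary file and publishes it with a rename only after the write has finished, so a task that dies midway leaves no partial file under the final name. This is only a sketch under assumptions. The class SideOutput, the method writeCommitted, and the "_tmp_" naming are hypothetical, local files stand in for HDFS, and nothing here is Giraph API or taken from SimpleVertexWithWorkerContext.java.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

/** Hypothetical helper: local files stand in for HDFS; this is not Giraph API. */
class SideOutput {
  /**
   * Writes the lines to an attempt-private temporary file and moves it to the
   * final name only after the write has completed.
   */
  static Path writeCommitted(Path dir, String name, int attempt, List<String> lines)
      throws IOException {
    Files.createDirectories(dir);
    // Each attempt gets its own temporary file, so retries never collide.
    Path tmp = dir.resolve("_tmp_" + name + "_attempt" + attempt);
    Path dst = dir.resolve(name);
    Files.write(tmp, lines, StandardCharsets.UTF_8);
    // A failed attempt never reaches this line, so readers never see partial output.
    return Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
  }
}
```

In a real job something like this would presumably be called from a WorkerContext hook such as postApplication(), and leftover "_tmp_" files from failed attempts would still need to be cleaned up separately.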

 Multi Output support
 

 Key: GIRAPH-1000
 URL: https://issues.apache.org/jira/browse/GIRAPH-1000
 Project: Giraph
  Issue Type: Improvement
  Components: bsp, conf and scripts, graph
Affects Versions: 1.0.0, 1.1.0, 1.2.0-SNAPSHOT
Reporter: Alessio Arleo
  Labels: features

 Hadoop natively supports multiple outputs. The objective is to extend Giraph
 to support multiple output formats during a single Giraph run.
 According to the official Hadoop apidocs*, the pattern for taking advantage of
 multiple outputs is the following:
 - Modify the job submission
 - Modify the reducer class to write to the different declared outputs
 Since Giraph jobs are executed as mappers, this approach (or at least its
 second part) is probably not feasible, so further investigation is necessary.
 *https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (GIRAPH-1000) Multi Output support

2015-03-25 Thread Sergey Edunov (JIRA)

[
https://issues.apache.org/jira/browse/GIRAPH-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379892#comment-14379892
]

Sergey Edunov commented on GIRAPH-1000:
---

That would be a great addition to Giraph!
I was thinking about it a while ago. It seems we can implement it in a way
similar to the multiple input format. See for example
org.apache.giraph.io.formats.multi.MultiVertexInputFormat and the other classes
in the same package. This is essentially a wrapper around a list of inputs
that provides the same API as a single input format does.
In the same way, we can have wrappers around VertexOutputFormat and
EdgeOutputFormat that provide the same APIs, and then just plug them in.
We also need this feature, so I'll be happy to help.
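
The wrapper idea can be illustrated with a minimal, self-contained sketch. It assumes a simplified writer interface: SimpleVertexWriter, InMemoryWriter, and MultiWriter are hypothetical names made up for this illustration, not Giraph's VertexOutputFormat or VertexWriter API, and the sketch leaves out everything Hadoop-specific such as task contexts and output committers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a per-vertex writer; NOT Giraph's VertexWriter API. */
interface SimpleVertexWriter {
  void writeVertex(String vertexId, String value);
  void close();
}

/** Collects lines in memory so the sketch runs without Hadoop or Giraph. */
class InMemoryWriter implements SimpleVertexWriter {
  final List<String> lines = new ArrayList<>();
  final String separator;
  boolean closed = false;

  InMemoryWriter(String separator) {
    this.separator = separator;
  }

  @Override
  public void writeVertex(String vertexId, String value) {
    lines.add(vertexId + separator + value);
  }

  @Override
  public void close() {
    closed = true;
  }
}

/** Wrapper that exposes the single-writer API and forwards every call to all delegates. */
class MultiWriter implements SimpleVertexWriter {
  private final List<SimpleVertexWriter> delegates;

  MultiWriter(List<? extends SimpleVertexWriter> delegates) {
    this.delegates = new ArrayList<>(delegates);
  }

  @Override
  public void writeVertex(String vertexId, String value) {
    for (SimpleVertexWriter writer : delegates) {
      writer.writeVertex(vertexId, value);
    }
  }

  @Override
  public void close() {
    for (SimpleVertexWriter writer : delegates) {
      writer.close();
    }
  }
}
```

Because MultiWriter implements the same interface as a single writer, code that expects one writer can be handed the wrapper unchanged, which is the "just plug them in" property described above.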


[jira] [Commented] (GIRAPH-1000) Multi Output support

2015-03-25 Thread Lukas Nalezenec (JIRA)

[
https://issues.apache.org/jira/browse/GIRAPH-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379707#comment-14379707
]

Lukas Nalezenec commented on GIRAPH-1000:
-

I have never used Hadoop MultipleOutputs. I evaluated it when it was new, but
it was hard to unit test. We decided to replace it in MapReduce with our own
internal implementation. In my humble opinion, MultipleOutputs is badly
designed. Just my two cents.

I think there is not much documentation on Giraph internals. You have to read
the source code. The code is well written and you will learn a lot. I don't know
much about these parts of Giraph, but where I do know something I will help you.

