I think I understand what you are trying to do: a kind of distributed
logging for debugging. Such a feature could definitely be
useful. Aggregators might be able to do what you want; combined with something
like https://issues.apache.org/jira/browse/GIRAPH-10, perhaps writing not just
at the end of the application but after each superstep, they might be able
to accomplish it.
Feel free to take a crack at the issue...let's see what interfaces make
On 9/26/11 7:03 AM, Claudio Martella wrote:
I'm really just trying to emit "results" into an HDFS file at
different moments of the computation. I'm thinking of
functionality like log.debug(), to give an example, where all the
messages are collected from different workers at different supersteps.
At the moment I've implemented this:
which I assign to each vertex at preApplication() and close from each
vertex at postApplication(). I'm not super happy about this solution.
During this weekend though, I thought I might use an Aggregator to
send my ResultSet object and use the Aggregator to write to disk. That
would be a nice design and I could contribute the JIRA about storing
What do you think?
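
The aggregator idea above can be sketched roughly as follows. This is only an illustration in plain Java: `LogAggregator`, `aggregate`, and `drain` are hypothetical names, not Giraph's Aggregator interface, and the sketch assumes all vertices of one worker share a single instance. Vertices add tagged lines during a superstep, and a single writer drains the batch once the superstep ends.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an aggregator; this is NOT Giraph's Aggregator interface. */
final class LogAggregator {
    // Lines collected since the last drain, in arrival order.
    private final List<String> pending = new ArrayList<>();

    /** Called by vertices during a superstep to contribute one log line. */
    synchronized void aggregate(long superstep, String vertexId, String message) {
        pending.add(superstep + "\t" + vertexId + "\t" + message);
    }

    /** Called once per superstep by a single writer; returns the batch and clears it. */
    synchronized List<String> drain() {
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }
}
```

Whoever calls `drain()` would then write the returned lines to one file per superstep, which sidesteps the need to append to an existing HDFS file.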
On Fri, Sep 23, 2011 at 1:40 AM, Avery Ching <avery.ch...@gmail.com> wrote:
This is more of a limitation stemming from the fact that files are immutable in HDFS.
Any more insight on what you're trying to do? Perhaps we can think of a
more general way to address the issue.
On 9/22/11 10:31 AM, Claudio Martella wrote:
Thanks, yes it does. The question, though, is how to share the
file handle between the vertices on the same node. I could open the
file in preApplication() and close it in postApplication(), but
I would potentially end up with as many files as vertices in the
Do you have any idea on this side? Maybe share somehow the handle and a
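
One way to share a handle among the vertices in a JVM is a reference-counted static writer, sketched below. This is a minimal illustration under stated assumptions, not anything from Giraph: `SharedWorkerLog` and its methods are hypothetical, it assumes every vertex of a worker runs in the same JVM, and it writes to a local file via `java.nio.file.Files` as a stand-in for an HDFS output stream.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: one reference-counted writer shared by every vertex in a JVM. */
final class SharedWorkerLog {
    private static BufferedWriter writer; // guarded by the class lock
    private static int users;             // number of vertices currently holding the log

    private SharedWorkerLog() {}

    /** Call from each vertex's preApplication(); only the first call opens the file. */
    static synchronized void acquire(Path file) throws IOException {
        if (users == 0) {
            // Assumption: a local file stands in for an HDFS stream here.
            writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        }
        users++;
    }

    /** Appends one line; safe to call from several threads. */
    static synchronized void write(String line) throws IOException {
        if (writer == null) {
            throw new IllegalStateException("log is not open");
        }
        writer.write(line);
        writer.newLine();
    }

    /** Call from each vertex's postApplication(); only the last call closes the file. */
    static synchronized void release() throws IOException {
        if (users == 0) {
            throw new IllegalStateException("release() without matching acquire()");
        }
        if (--users == 0) {
            writer.close();
            writer = null;
        }
    }
}
```

With this shape the number of files is bounded by the number of worker JVMs rather than the number of vertices, provided each worker is given a distinct file name.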
On Thu, Sep 22, 2011 at 4:07 PM, Avery Ching <ach...@apache.org> wrote:
There are some methods in Vertex (i.e. preApplication(), preSuperstep(),
postApplication(), postSuperstep()) that can be overridden to do anything
you like, for instance writing some data out to an HDFS file. We have an open
issue on outputting Aggregator values that is unassigned, if you'd like to
take a look at it as well.
Hope this helps,
On 9/22/11 7:34 AM, Claudio Martella wrote:
I need to emit some Text to HDFS once in a while. This
doesn't necessarily happen at the end of the computation, and I might
need to emit something more complex than just the VertexValue, so I'd
like more control than the VertexWriter gives me.
What do you suggest I do to obtain a handle to an HDFS file (it
can be in parts as well) to write to?
Is there any code I can start looking at?