I'm really just trying to emit "results" into an HDFS file at different moments of the computation. I'm thinking of functionality like log.debug(), for example, where all the messages are collected from different workers at different supersteps. At the moment I've implemented this:
https://github.com/claudiomartella/graffiti/blob/master/src/main/java/org/acaro/graffiti/processing/GraffitiEmitter.java
which I assign to each vertex at preApplication() and close from each
vertex at postApplication(). I'm not super happy with this solution.

Over the weekend, though, I thought I might use an Aggregator to send my
ResultSet object and have the Aggregator write to disk. That would be a
nice design, and I could contribute to the JIRA about storing Aggregator
results.

What do you think?

On Fri, Sep 23, 2011 at 1:40 AM, Avery Ching <[email protected]> wrote:
> This is more of a limitation of the fact that files are immutable in HDFS.
> Any more insight on what you're trying to do? Perhaps we can think of a
> more general way to address the issue.
>
> Avery
>
> On 9/22/11 10:31 AM, Claudio Martella wrote:
>>
>> Hi Avery,
>>
>> thanks, yes it does. The question, though, would be how to share the
>> file handle between the vertices on the same node. I could open the
>> file in preApplication() and close it in postApplication(), but
>> I would potentially end up with as many files as vertices in the
>> graph.
>>
>> Do you have any ideas on this side? Maybe somehow share the handle and a
>> lock?
>>
>> On Thu, Sep 22, 2011 at 4:07 PM, Avery Ching<[email protected]> wrote:
>>>
>>> There are some methods in Vertex (i.e. preApplication(), preSuperstep(),
>>> postApplication(), postSuperstep()) that can be overridden to do anything
>>> you like, for instance write out some data to an HDFS file. We have an
>>> open issue on outputting Aggregator values that is unassigned if you'd
>>> like to take a look at it as well
>>> (https://issues.apache.org/jira/browse/GIRAPH-10).
>>>
>>> Hope this helps,
>>>
>>> Avery
>>>
>>> On 9/22/11 7:34 AM, Claudio Martella wrote:
>>>>
>>>> Hello list,
>>>>
>>>> I need to emit some Text to HDFS once in a while. This
>>>> doesn't necessarily happen at the end of the computation, and I might
>>>> need to emit something more complex than just the VertexValue, so I'd
>>>> like more control than what the VertexWriter gives me.
>>>>
>>>> What do you suggest I do to obtain a handle to an HDFS file (it
>>>> can be in parts as well) to write to?
>>>> Is there any code I can start looking at?
>>>>
>>>> Thanks!
>>>> Claudio
>>>>
>>>
>>
>>
>
>

--
Claudio Martella
[email protected]
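
For reference, the "share the handle and a lock" idea from the thread could look roughly like the sketch below. It is only an illustration under stated assumptions: SharedEmitter and its acquire(), emit() and release() methods are hypothetical names, not part of Giraph, and the sketch assumes that all vertices on a worker run in the same JVM. It writes to a plain java.io.Writer; in a real job the opener would presumably wrap an HDFS output stream instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.function.Supplier;

/** Hypothetical helper: one reference-counted writer per JVM, shared by all vertices. */
final class SharedEmitter {
    private static final Object LOCK = new Object();
    private static Writer writer; // shared handle, guarded by LOCK
    private static int users;     // number of vertices currently holding the handle

    private SharedEmitter() {}

    /** Intended to be called from preApplication(); only the first caller opens the writer. */
    static void acquire(Supplier<Writer> opener) {
        synchronized (LOCK) {
            if (users == 0) {
                writer = opener.get();
            }
            users++;
        }
    }

    /** Appends one line; the lock keeps lines from different vertices from interleaving. */
    static void emit(String line) {
        synchronized (LOCK) {
            if (writer == null) {
                throw new IllegalStateException("emit() called before acquire()");
            }
            try {
                writer.write(line);
                writer.write('\n');
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Intended to be called from postApplication(); the last caller closes the writer. */
    static void release() {
        synchronized (LOCK) {
            if (users == 0) {
                throw new IllegalStateException("release() called without acquire()");
            }
            if (--users == 0) {
                try {
                    writer.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                } finally {
                    writer = null;
                }
            }
        }
    }
}
```

With something like this, each vertex would call SharedEmitter.acquire(...) in preApplication(), SharedEmitter.emit(...) whenever it has a result, and SharedEmitter.release() in postApplication(). That gives one file per worker JVM rather than one per vertex. Since HDFS allows only a single writer per file, each worker would still need its own output file, for example with a worker or task id in the file name.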
