I will be using Giraph/Hadoop for other use cases anyway, and I don't want to install and maintain Storm just for the real-time streaming use case.
I am also thinking of adding real-time logs to HBase and having Giraph pick up the incremental feeds from HBase based on timestamp.

On 1/4/12, Avery Ching <ach...@apache.org> wrote:
> Interesting idea. You could actually implement the code to load the new
> input data in preSuperstep(). If the input data is resilient (i.e.
> stored on HDFS), then the system would inherit Giraph's reliability
> guarantees. Implementing an external trigger to stop the application
> wouldn't be too difficult (i.e. dump a file stamp or something and
> check for it every n supersteps). Still, as I'm not that familiar with
> Storm, what would be the advantages of this over Storm?
>
> Avery
>
> On 1/3/12 5:30 PM, prasenjit mukherjee wrote:
>> As Jake mentioned, you can have continuous processing by making the
>> mappers in Giraph stop based on an external condition (i.e. when
>> specifically asked to do so), and one can call voteToHalt() only if
>> that condition is satisfied.
>>
>> Additionally, the VertexInputSource can be modified to read from a
>> continuous input (like ActiveMQ or even a port), potentially outside
>> of HDFS.
>>
>>
>> On 1/3/12, Sebastian Schelter <s...@apache.org> wrote:
>>> Hi Prasen,
>>>
>>> Storm is supposed to process a continuous stream of data while Giraph is
>>> a parallel batch processing platform. I think these are inherently
>>> different systems and one cannot easily be transformed into the other.
>>>
>>> -sebastian
>>>
>>> On 03.01.2012 17:51, prasenjit mukherjee wrote:
>>>> I have a use case which maps perfectly to the open source
>>>> implementation of Storm (by the Twitter team). I think Giraph can be
>>>> easily modified to have an implementation simulating Storm's use
>>>> cases. Just curious if anybody had similar thoughts.
>>>>
>>>> -Prasen
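
For what it's worth, the "check for a file stamp every n supersteps" trigger mentioned in the thread could look roughly like the following. This is only a minimal sketch under my own assumptions: StopTrigger, forMarkerFile and shouldHalt are hypothetical names, not Giraph classes or methods, and the marker is checked on the local filesystem purely for illustration (with the marker on HDFS, the probe would presumably go through Hadoop's FileSystem API instead).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper (not part of Giraph): polls an external stop
 * condition every N supersteps so a long-running job can be told to finish.
 */
final class StopTrigger {
    private final long checkEvery;              // poll interval, in supersteps
    private final BooleanSupplier stopRequested; // the external condition
    private boolean halted = false;             // latched once a stop is seen

    StopTrigger(long checkEvery, BooleanSupplier stopRequested) {
        if (checkEvery <= 0) {
            throw new IllegalArgumentException("checkEvery must be positive");
        }
        this.checkEvery = checkEvery;
        this.stopRequested = stopRequested;
    }

    /** Stop condition backed by the existence of a marker ("stamp") file. */
    static StopTrigger forMarkerFile(Path marker, long checkEvery) {
        return new StopTrigger(checkEvery, () -> Files.exists(marker));
    }

    /**
     * Returns true once a stop has been requested. The external condition
     * is only probed when superstep is a multiple of checkEvery, so the
     * check stays cheap; after the first positive probe the answer stays true.
     */
    boolean shouldHalt(long superstep) {
        if (!halted && superstep % checkEvery == 0) {
            halted = stopRequested.getAsBoolean();
        }
        return halted;
    }
}
```

The intent is that shouldHalt(superstep) would be consulted from preSuperstep(), and vertices would call voteToHalt() only once it returns true. How exactly that gets wired into Giraph is left out here, since I have not tried it.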