Definitely keep us up to date with your progress. Don't hesitate to file and/or fix JIRAs =)


On 1/3/12 6:13 PM, prasenjit mukherjee wrote:
I will be using giraph/hadoop for other use cases anyways, and I don't
want to install/maintain Storm just for the real-time streaming use case.

I am also thinking of adding real-time logs to HBase and having Giraph
pick up the incremental feeds from HBase based on timestamp.
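
The timestamp-based pickup described above could look roughly like the sketch below. It is only an illustration: IncrementalFeed and LogEntry are hypothetical names, and an in-memory list stands in for the HBase table, so no HBase or Giraph API is used.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: hand out only the entries newer than the last one seen. */
class IncrementalFeed {

    /** A log entry plus its write timestamp (stand-in for an HBase cell). */
    static final class LogEntry {
        final long timestamp;
        final String payload;

        LogEntry(long timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    private long watermark; // highest timestamp already handed out

    IncrementalFeed(long initialWatermark) {
        this.watermark = initialWatermark;
    }

    /** Returns entries with timestamp > watermark, then advances the watermark. */
    List<LogEntry> nextBatch(List<LogEntry> store) {
        List<LogEntry> fresh = new ArrayList<>();
        long maxSeen = watermark;
        for (LogEntry e : store) {
            if (e.timestamp > watermark) {
                fresh.add(e);
                maxSeen = Math.max(maxSeen, e.timestamp);
            }
        }
        watermark = maxSeen;
        return fresh;
    }
}
```

One caveat of this approach: an entry that arrives late with a timestamp at or below the current watermark is never picked up, so writers and readers would need to agree on how timestamps are assigned.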

On 1/4/12, Avery Ching wrote:
Interesting idea.  You could actually implement the code to load the new
input data in preSuperstep().  If the input data is resilient (i.e.
stored on HDFS), then the system would inherit Giraph's reliability
guarantees.  Implementing an external trigger to stop the application
wouldn't be too difficult (e.g. dump a file stamp or something and
check for it every n supersteps).  Still, as I'm not that familiar with
Storm, what would be the advantages of this over Storm?
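
A minimal sketch of the "file stamp" trigger mentioned above, under stated assumptions: StopMarkerCheck is a hypothetical helper rather than part of Giraph, and java.nio on the local filesystem stands in for HDFS. A real job would presumably call something like shouldStop() from the per-superstep hook and look the marker up on HDFS instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical helper, not part of Giraph: decides whether a long-running
 * job should stop by looking for a marker file every N supersteps.
 */
class StopMarkerCheck {
    private final Path marker;        // assumed location of the "stop" file
    private final long checkInterval; // check every N supersteps

    StopMarkerCheck(Path marker, long checkInterval) {
        if (checkInterval <= 0) {
            throw new IllegalArgumentException("checkInterval must be positive");
        }
        this.marker = marker;
        this.checkInterval = checkInterval;
    }

    /** Returns true only on a checking superstep where the marker exists. */
    boolean shouldStop(long superstep) {
        if (superstep % checkInterval != 0) {
            return false; // skip the filesystem lookup on most supersteps
        }
        return Files.exists(marker);
    }

    public static void main(String[] args) throws IOException {
        Path marker = Files.createTempFile("giraph-stop", ".marker");
        Files.delete(marker); // start without the marker
        StopMarkerCheck check = new StopMarkerCheck(marker, 5);
        System.out.println(check.shouldStop(5));  // marker absent
        Files.createFile(marker);                 // external trigger fires
        System.out.println(check.shouldStop(7));  // not a checking superstep
        System.out.println(check.shouldStop(10)); // checking superstep, marker present
        Files.delete(marker);
    }
}
```

Checking only every N supersteps keeps the filesystem lookups cheap, at the cost of the job running up to N - 1 extra supersteps after the marker appears.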


On 1/3/12 5:30 PM, prasenjit mukherjee wrote:
As Jake mentioned, you can have continuous processing by making the
mappers in Giraph stop based on an external condition (i.e. when
specifically asked to do so), and one can call voteToHalt() only if
that condition is satisfied.

Additionally, the VertexInputFormat can be modified to read from a
continuous input (like ActiveMQ or even a port), potentially outside
of HDFS.

On 1/3/12, Sebastian Schelter wrote:
Hi Prasen,

Storm is supposed to process a continuous stream of data while Giraph is
a parallel batch processing platform. I think these are inherently
different systems and one cannot easily be transformed into the other.


On 03.01.2012 17:51, prasenjit mukherjee wrote:
I have a use case which maps perfectly to the open source
implementation of Storm (by the Twitter team). I think Giraph can be
easily modified to have an implementation simulating Storm's use
cases. Just curious if anybody has had similar thoughts.

