Hi Yuval

Thanks for responding. Here is what I have in mind: I was thinking of
aggregating the data in memory on an hourly basis and persisting it every
hour. If, for any reason, the machine holding the hourly aggregated data goes
down, I want the tuples for the missing hour to be replayed from my queue.
Any suggestions?
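
To make the idea concrete, here is a rough sketch in plain Java of what I
mean. It is not Storm API; HourlyBatcher, Snapshot, batchId, add and flush are
names I made up for illustration. It assumes every tuple carries an event
timestamp in milliseconds and a monotonically increasing queue offset, and
that the snapshot returned by flush is written to the store in one atomic
step, so that after a crash the consumer could resume from lastOffset + 1 of
the last persisted hour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: groups tuples into hourly batches.
class HourlyBatcher {
    private static final long HOUR_MS = 3_600_000L;

    // What would be persisted for one hour: the aggregate plus the last
    // queue offset it covers, so a replay can start right after it.
    static final class Snapshot {
        final long hourId;
        final long count;
        final long lastOffset;

        Snapshot(long hourId, long count, long lastOffset) {
            this.hourId = hourId;
            this.count = count;
            this.lastOffset = lastOffset;
        }
    }

    private final Map<Long, Long> countsByHour = new HashMap<>();
    private final Map<Long, Long> lastOffsetByHour = new HashMap<>();

    // Batch id = number of whole hours since the Unix epoch (UTC).
    static long batchId(long eventTimeMs) {
        return Math.floorDiv(eventTimeMs, HOUR_MS);
    }

    // Aggregate one tuple in memory (here the aggregate is just a count).
    void add(long eventTimeMs, long queueOffset) {
        long id = batchId(eventTimeMs);
        countsByHour.merge(id, 1L, Long::sum);
        lastOffsetByHour.merge(id, queueOffset, Long::max);
    }

    // Remove and return the aggregate for one hour, or null if the hour
    // is unknown or has already been flushed.
    Snapshot flush(long hourId) {
        Long count = countsByHour.remove(hourId);
        Long lastOffset = lastOffsetByHour.remove(hourId);
        if (count == null || lastOffset == null) {
            return null;
        }
        return new Snapshot(hourId, count, lastOffset);
    }
}
```

Anything still in memory when the machine dies is lost, which is why the
offset would have to be stored together with the aggregate. The sketch also
assumes tuples arrive roughly in timestamp order; late tuples for an hour
that was already flushed would need separate handling.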

Regards
Nipun



On Tue, Oct 14, 2014 at 4:33 PM, Yuval Oren <[email protected]> wrote:

> Nipun,
>
> That seems to be contrary to the typical Storm pattern of continuous
> processing. Is there a reason you can’t continuously read new data? That
> might also scale better.
>
> --
> Yuval Oren
> *N3TWORK*
>
> On Oct 14, 2014, at 8:52 AM, Nipun Batra <[email protected]> wrote:
>
> Hi
>
> I have a never-ending data feed and I want to define batches on an hourly
> basis, i.e. set one batch ID for all the tuples arriving in a particular
> hour. If I write my own custom spout, how do I set the batch ID / Tx ID?
>
> Later the data feed will be consumed from a Kafka topic. If I plan to use
> the Kafka spout, is there again a way to set the batch ID or Tx ID by hour?
>
> I have looked at many examples but have not been able to find this. I would
> appreciate it if you could point me in the right direction or to an example
> of a custom spout that sets the batch ID.
>
> I apologize if this has already been asked; I looked around but found
> nothing.
>
> Thank you in advance
> Nipun
>
>
>
