Hey, 

So I’m contemplating using Storm to run rather complicated analyses on 
user-submitted data (received through either HTTP or WebSockets). 
Storm seems perfect for the multi-stage processing that I need, and its 
real-time nature would fit the type of interactions I require. 
Furthermore, many steps would involve analyses already written in Python and R, 
so using bolts for those would be great.

However, hooking up Storm behind an HTTP layer like Ring (optionally with 
http-kit) seems non-trivial. 

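I first thought of pushing the messages onto a core.async queue and having a 
Spout consume them. But I realise this might fail in a cluster, since a 
core.async channel lives inside a single JVM while the Spout may run on a 
different node. 
So the current thinking is: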

* HTTP request -> create a job & push it onto a Kafka jobs topic 
* Inform the user about the created job, including a (WebSocket) URL to 
listen on for results
* Storm consumes from the Kafka jobs topic 
* Final results go to bolts that publish them to a Kafka results topic 
* The server listens on the results topic & pushes each result to the 
appropriate job (i.e. notifies the user on the job URL) 
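
To make the flow above concrete, here is a minimal Python sketch. It is not a 
real integration: in-memory queues stand in for the two Kafka topics and a plain 
function stands in for the Storm topology. All names (`submit_job`, 
`run_topology_once`, `dispatch_results`, the `/jobs/<id>/results` URL) are made up 
for illustration.

```python
import queue
import uuid

# Assumption: in-memory stand-ins for the Kafka "jobs" and "results" topics.
jobs_topic = queue.Queue()
results_topic = queue.Queue()

# job id -> results delivered so far to that job's (WebSocket) listener
listeners = {}


def submit_job(payload):
    """HTTP handler stand-in: create a job, publish it, return the job info."""
    job_id = str(uuid.uuid4())
    listeners[job_id] = []
    jobs_topic.put({"job_id": job_id, "payload": payload})
    # Hypothetical URL scheme for the results WebSocket.
    return {"job_id": job_id, "results_url": "/jobs/" + job_id + "/results"}


def run_topology_once(analyse):
    """Storm stand-in: consume one job, analyse it, publish the result."""
    job = jobs_topic.get_nowait()
    results_topic.put({"job_id": job["job_id"], "result": analyse(job["payload"])})


def dispatch_results():
    """Server-side consumer stand-in: route each result to its job's listener."""
    while not results_topic.empty():
        message = results_topic.get_nowait()
        listeners[message["job_id"]].append(message["result"])
```

The only point of the sketch is that the job id travels with the message through 
both topics, so the server can match each result to the right job URL.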

But to be honest … this seems like a bit of a hassle to set up. It would require 
servers/developers to set up Kafka, Storm and all related dependencies. 
It’s a lot of “stuff” just to get it running, which might hamper developer 
adoption at our shop.

Is there a simpler way of getting this going, or does this seem to be the most 
appropriate way?

Many thanks,
Joël
