Hey,

So I’m contemplating using Storm for doing rather complicated analyses on user-submitted data (arriving either through HTTP or WebSockets). Storm seems perfect for the multi-stage processing that I need, and its real-time nature fits the type of interactions I require. Furthermore, many steps would involve already-written analyses in Python and R, so using bolts for those would be great.
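For reference, the multi-stage wiring itself is quite compact in Storm’s Clojure DSL. A minimal sketch, assuming pre-1.0 Storm (backtype.storm namespaces); the bolt, field names and parallelism are made up for illustration, and the Python/R steps would really be plugged in as multi-lang ShellBolts rather than a defbolt:

    (ns analysis.topology
      (:use [backtype.storm clojure]))

    ;; One stage of the pipeline. Already-written Python/R analyses
    ;; would instead be wired in as backtype.storm.task.ShellBolt
    ;; instances speaking Storm's multi-lang protocol.
    (defbolt analyze-bolt ["job-id" "result"] [tuple collector]
      (let [job-id (.getStringByField tuple "job-id")
            data   (.getStringByField tuple "data")]
        ;; placeholder for the actual analysis step
        (emit-bolt! collector [job-id (str "analysed: " data)] :anchor tuple)
        (ack! collector tuple)))

    (defn mk-topology [jobs-spout]
      ;; jobs-spout would be storm-kafka's KafkaSpout reading the jobs topic
      (topology
       {"jobs"    (spout-spec jobs-spout)}
       {"analyze" (bolt-spec {"jobs" :shuffle} analyze-bolt :p 4)}))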
However, hooking Storm up behind an HTTP server like Ring (optionally with http-kit) seems non-trivial. My first thought was to push the messages onto a core.async queue and have a Spout consume them, but I realise this would fail in a cluster. So the current thinking is (sketched in code below):

* HTTP request -> create a job & push it onto a Kafka “jobs” topic
* Inform the user about the created job, including a (WebSocket) URL to listen on for results
* Storm consumes from Kafka
* End results are pushed to bolts that publish them on a Kafka “results” topic
* Make the server listen on the results topic & push results to the appropriate jobs (i.e. notify the user on the job URL)

But to be honest … this seems a bit of a hassle to set up. It would require the server/developers to set up Kafka, Storm and all related dependencies. It’s a lot of “stuff” just to get it running, which might hamper developer adoption at our shop. Is there a simpler way of getting this going, or does this seem to be the most appropriate way?

Many thanks,
Joël
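Concretely, the first two steps needn’t be much code. A sketch with http-kit and the Kafka Java producer; the broker address, topic name, port and response shape are just illustrative, and routing and error handling are left out:

    (ns job-api.core
      (:require [org.httpkit.server :as http])
      (:import [org.apache.kafka.clients.producer KafkaProducer ProducerRecord]
               [java.util Properties UUID]))

    ;; Hypothetical broker address; serializers are the stock String ones.
    (def producer
      (KafkaProducer.
       (doto (Properties.)
         (.put "bootstrap.servers" "localhost:9092")
         (.put "key.serializer" "org.apache.kafka.common.serialization.StringSerializer")
         (.put "value.serializer" "org.apache.kafka.common.serialization.StringSerializer"))))

    (defn submit-job
      "Ring handler: create a job, push it onto the jobs topic,
       and tell the client where to listen for results."
      [req]
      (let [job-id  (str (UUID/randomUUID))
            payload (slurp (:body req))]
        ;; Key by job-id so all messages for a job land in one partition.
        (.send producer (ProducerRecord. "jobs" job-id payload))
        {:status  202
         :headers {"Content-Type" "application/json"}
         :body    (str "{\"job-id\":\"" job-id
                       "\",\"results-url\":\"/results/" job-id "\"}")}))

    (defn -main []
      (http/run-server submit-job {:port 8080}))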
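And the last step, again only as a sketch: a registry of open WebSocket channels keyed by job id, plus a loop that polls the results topic and forwards each record to the matching channel. This assumes the newer Kafka consumer API, assumes the Storm bolts key the results messages by job id, and ignores threading and consumer-rebalancing concerns:

    (ns job-api.results
      (:require [org.httpkit.server :as http])
      (:import [org.apache.kafka.clients.consumer KafkaConsumer]
               [java.util Properties]))

    ;; job-id -> open WebSocket channel
    (def channels (atom {}))

    (defn results-handler
      "WebSocket endpoint; the job id is assumed to arrive as a
       :job-id param (routing middleware left out for brevity)."
      [req]
      (http/with-channel req channel
        (let [job-id (get-in req [:params :job-id])]
          (swap! channels assoc job-id channel)
          (http/on-close channel (fn [_] (swap! channels dissoc job-id))))))

    (defn consume-results!
      "Poll the results topic and forward each record to the channel
       registered under its key (the job id)."
      []
      (let [consumer (KafkaConsumer.
                      (doto (Properties.)
                        (.put "bootstrap.servers" "localhost:9092")
                        (.put "group.id" "job-api")
                        (.put "key.deserializer" "org.apache.kafka.common.serialization.StringDeserializer")
                        (.put "value.deserializer" "org.apache.kafka.common.serialization.StringDeserializer")))]
        (.subscribe consumer ["results"])
        (loop []
          (doseq [record (.poll consumer 1000)]
            (when-let [ch (@channels (.key record))]
              (http/send! ch (.value record))))
          (recur))))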
