I am using Storm 0.9. I think the issue is that storm.py has problems deserializing a byte array. When I encode the data as a base64 string, I have no problems and it deserializes fine. Of course, I would like to avoid this extra overhead if possible. All my binary objects are relatively small, 200-300k max.
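For reference, here is a minimal standalone sketch of the base64 workaround and of the overhead it adds. It assumes Python 3 and uses only the standard library; it does not touch storm.py, and the field name "image" and the helper names are hypothetical, chosen only for illustration.

```python
import base64
import json


def encode_payload(raw: bytes) -> str:
    """Wrap raw bytes in a JSON message as base64 text.

    The "image" key is a hypothetical field name, not part of Storm.
    """
    return json.dumps({"image": base64.b64encode(raw).decode("ascii")})


def decode_payload(message: str) -> bytes:
    """Recover the original bytes from a message built by encode_payload."""
    return base64.b64decode(json.loads(message)["image"])


def json_accepts_raw_bytes(raw: bytes) -> bool:
    """Check whether the standard json module will serialize bytes directly."""
    try:
        json.dumps({"image": raw})
        return True
    except TypeError:
        return False


# 1024 bytes covering every possible byte value, standing in for an image.
raw = bytes(range(256)) * 4

message = encode_payload(raw)
restored = decode_payload(message)

# Ratio of encoded size to raw size: base64 emits 4 characters per 3 bytes.
overhead = len(base64.b64encode(raw)) / len(raw)
```

The round trip is lossless, but the encoded payload is roughly a third larger than the raw bytes, which is the overhead I would like to avoid.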
On Sunday, January 12, 2014, 李家宏 wrote:

> Hi Farchtchi,
>
> Which Storm version are you using?
> If the tuple is not serialized, then there is no need to use a JSON parser to parse the received tuple. I guess so.
>
> Regards
>
> 2014/1/11 Ruhollah Farchtchi <[email protected]>
>
> Yes, I read that in the docs. However, when receiving the byte array in storm.py, it throws a JSON error when trying to parse the tuples. I didn't have time to look into it further, as I am new to Storm and Python.
>
> On Saturday, January 11, 2014, 李家宏 wrote:
>
> There is no need to serialize binary data; just send it as is.
> Since storm-0.9.0 uses the Kryo serializer by default to serialize tuple values, I guess we can skip this serialization step.
>
> Regards
>
> 2014/1/10 Jon Logan <[email protected]>
>
> You're going to run into issues if you have large tuples, because they are buffered in memory. I would suggest moving the data to an external channel, like Redis, and only passing metadata through Storm.
>
> Your other solution is to use quirky things like reflection to prevent your application from running out of memory when tuples are buffered.
>
> On Fri, Jan 10, 2014 at 8:49 AM, Ruhollah Farchtchi <[email protected]> wrote:
>
> I am using Storm to process small (< 100k) image files. I don't have a real-time requirement as yet, and my bottleneck is more in the image processing than in message passing between bolts. I am using the Clojure DSL and the Python bolt. Everything I've put together right now is very much a prototype, so my next steps are some further processing and integration. Passing byte arrays didn't work well, so I have had to encode/decode into base64, as the JSON parsers on the Python side didn't seem to like byte arrays. I plan to go back and perhaps redo the integration with a native C++ bolt; however, I believe that there are other ways to do this integration as well. As with Wilson, I'm interested in whether anyone else is using Storm to process binary payloads and what they have found works.
>
> Thanks,
>
> Ruhollah
>
> Ruhollah Farchtchi
> [email protected]
>
> On Thu, Jan 9, 2014 at 10:24 PM, Lochlainn Wilson <[email protected]> wrote:
>
> Hi all,
>
> I am new to Storm and have been tasked with determining whether it is feasible for us to use Apache Storm in my company. I have of course configured the sample projects and have been poking around. A red flag is raised by the "stream processing" style JSON parsing.
>
> I am considering using Storm with real-time image processing bolts in C++. Packaging binary data into JSON (by escaping it) looks like it will be slow and expensive. Is there a better way? Does anyone have experience processing large streams of binary data through Storm?
>
> How did it go?
>
> Regards,
>
> Lochlainn
>
> --
> ======================================================
> Gvain
> Email: [email protected]

--
Ruhollah Farchtchi
[email protected]
