Totally agree. I'd switch back to Python in a second if I could. Might be worth taking a look at the pluggable serializer.
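
For anyone who wants to sanity-check the JSON cost discussed below on their own hardware, here is a rough sketch rather than a real benchmark. It times one JSON encode plus decode in Python only, so it ignores the Java side and the pipe I/O between the two processes. The message fields follow the general shape of a multilang tuple message, but the values are made up, and the helper names (`round_trip`, `usec_per_round_trip`, `implied_usec_per_tuple`) are hypothetical. The last helper just redoes the arithmetic from Larry's numbers: if all 12 cores were saturated at 25k tuples/second, that is about 480 usec of core time per tuple, which lines up with "several hundred usec".

```python
import json
import timeit

# Illustrative payload shaped like a multilang tuple message.
# The values are invented for this sketch, not taken from a real topology.
message = {
    "id": "-6955786537413359385",
    "comp": "1",
    "stream": "default",
    "task": 9,
    "tuple": ["some word", 42, 3.14],
}


def round_trip(msg):
    """Encode to JSON text and decode it again (Python side only)."""
    return json.loads(json.dumps(msg))


def usec_per_round_trip(msg, number=20000):
    """Average wall-clock microseconds for one encode + decode."""
    seconds = timeit.timeit(lambda: round_trip(msg), number=number)
    return seconds / number * 1e6


def implied_usec_per_tuple(cores, tuples_per_second):
    """Core time per tuple, assuming every core is fully busy."""
    return float(cores) / tuples_per_second * 1e6


if __name__ == "__main__":
    print("measured round trip: %.1f usec" % usec_per_round_trip(message))
    print("implied by 12 cores at 25k tuples/s: %.0f usec"
          % implied_usec_per_tuple(12, 25000))
```

If the measured figure comes out far below the implied one, most of the cost is probably somewhere other than Python's JSON codec (the Java side, the pipes, or the bolt logic itself), which would be worth knowing before swapping serializers.
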

On Fri, May 30, 2014 at 5:14 PM, Andrew Montalenti <[email protected]> wrote:

> For one thing, a recently accepted Storm pull request has made this
> serialization pluggable and someone has already implemented a protobuf
> variety. We plan to investigate alternative serialization options for
> multilang once we get the other tooling out of the way.
>
> For another, it is true the overhead for serialization is non-trivial, but
> the overhead also tends to be a constant factor applied to data size, and
> machines are cheap while programming time is expensive. Storm and Python's
> data analysis and data integration libraries are a pretty powerful combo
> worth the performance penalty.
> On May 30, 2014 1:42 PM, "Larry Palmer" <[email protected]> wrote:
>
>> We had experimented with Storm/Python 6 months ago or so, but found the
>> JSON serialization/deserialization overhead was quite high, on the order of
>> several hundred usec per tuple every time it transitioned from Java to
>> Python or vice versa, limiting total throughput on a 12 core server to
>> around 25k tuples/second. Considered trying to switch to a different
>> serializer but ended up just doing everything in Java instead.
>>
>> Is that still the case, or perhaps has the speed been improved?
>>
>>
>> On Thu, May 29, 2014 at 10:06 PM, Andrew Montalenti <[email protected]>
>> wrote:
>>
>>> We are building a new Storm and Python interop option that is called
>>> streamparse:
>>>
>>> https://github.com/Parsely/streamparse
>>>
>>> It includes a heavily rewritten Storm interop library and a command line
>>> tool, sparse, for managing local and remote Storm clusters. The idea is to
>>> make Storm projects as easy to build and manage in Python as RQ or Celery
>>> projects.
>>>
>>> It currently has support for running local clusters in a single command,
>>> managing virtualenvs on remote worker machines, submitting topologies,
>>> listing/killing topologies, and tailing remote log files.
>>> The multilang layer also has better support for logging and
>>> exception/error handling. Multiple topologies can be built from a single
>>> codebase and multiple remote Storm clusters can be supported via a simple
>>> JSON configuration file.
>>>
>>> We are already using it for production topologies atop Storm 0.9.1 and
>>> Storm 0.8. We welcome contributions and if you join our mailing list, feel
>>> free to make requests. We continue to develop it actively and in an open
>>> manner.
>>>
>>> -Andrew Montalenti
>>> CTO, Parse.ly
>>> On May 29, 2014 6:35 PM, "Ashu Goel" <[email protected]> wrote:
>>>
>>>> (the reason being that we are still running Python 2.6 but Petrel is
>>>> only compatible with 2.7)
>>>> On May 29, 2014, at 2:48 PM, Ashu Goel <[email protected]> wrote:
>>>>
>>>> Awesome! I’m looking more into using the storm.thrift to define a
>>>> non-JVM DSL… does anyone have any working examples of this? Python
>>>> preferred but any example will do. The wiki is a bit confusing...
>>>> On May 28, 2014, at 1:54 PM, FRANCISCO JESUS GOMEZ RODRIGUEZ <
>>>> [email protected]> wrote:
>>>>
>>>> Ashu, take a look at this project: http://github.com/AirSage/Petrel
>>>>
>>>> Write, submit, debug and monitor in Python.
>>>>
>>>> @ffranz
>>>> On 28/05/2014 22:49, Ashu Goel <[email protected]> wrote:
>>>> Any examples where the entire infra is written in Python (including
>>>> topology)? Or is that not possible?
>>>> On May 28, 2014, at 1:33 PM, Dilpreet Singh <[email protected]>
>>>> wrote:
>>>>
>>>>
>>>> https://github.com/apache/incubator-storm/tree/master/examples/storm-starter
>>>>
>>>> The WordCountTopology contains an example Python bolt.
>>>>
>>>> Regards,
>>>> Dilpreet
>>>>
>>>>
>>>> On Thu, May 29, 2014 at 1:59 AM, Ashu Goel <[email protected]> wrote:
>>>>
>>>>> Does anyone have a good example program/instructions of using Python
>>>>> with Storm? I can’t seem to find anything concrete online.
>>>>>
>>>>> Thanks,
>>>>> Ashu Goel
