Totally agree. I'd switch back to Python in a second if I could. Might be
worth taking a look at the pluggable serializer.
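
Before swapping serializers, it may be worth measuring what the JSON step costs on your own hardware. Below is a rough sketch, not the code Storm or any of the libraries in this thread actually use. It frames a message as JSON followed by a line reading "end", which is how the multilang protocol delimits messages, and times a round trip. The function names and the sample tuple values are made up for illustration, and the measured number will vary by machine and payload. The last helper just back-computes the per-tuple cost implied by the figures Larry quotes below (12 cores, around 25k tuples/second).

```python
import json
import time


def encode_message(msg):
    """Frame a dict as JSON followed by a line reading 'end'."""
    return json.dumps(msg) + "\nend\n"


def decode_message(framed):
    """Inverse of encode_message: strip the 'end' line and parse the JSON."""
    body, sep, _ = framed.rpartition("\nend\n")
    if not sep:
        raise ValueError("missing 'end' terminator")
    return json.loads(body)


def roundtrip_usec(msg, iterations=10000):
    """Average microseconds for one encode + decode of msg."""
    start = time.perf_counter()
    for _ in range(iterations):
        decode_message(encode_message(msg))
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6


def implied_usec_per_tuple(cores, tuples_per_sec):
    """Per-tuple cost in core-microseconds if every core is saturated."""
    return cores * 1e6 / tuples_per_sec


if __name__ == "__main__":
    # Hypothetical tuple; field values are illustrative only.
    sample = {"id": "1", "comp": "1", "stream": "default",
              "task": 9, "tuple": ["snow white and the seven dwarfs", 3]}
    print("JSON round trip: %.1f usec/tuple" % roundtrip_usec(sample))
    print("Implied by 12 cores at 25k tuples/s: %.0f usec/tuple"
          % implied_usec_per_tuple(12, 25000))
```

This only times the serialization itself in one process. The real path also pays for pipe I/O between the JVM and the Python subprocess, so treat the result as a lower bound.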


On Fri, May 30, 2014 at 5:14 PM, Andrew Montalenti <[email protected]>
wrote:

> For one thing, a recently accepted Storm pull request has made this
> serialization pluggable and someone has already implemented a protobuf
> variety. We plan to investigate alternative serialization options for
> multilang once we get the other tooling out of the way.
>
> For another, it is true that the serialization overhead is nontrivial, but
> it also tends to be a constant factor applied to data size, and machines
> are cheap while programming time is expensive. Storm combined with Python's
> data analysis and data integration libraries is a pretty powerful combo,
> worth the performance penalty.
> On May 30, 2014 1:42 PM, "Larry Palmer" <[email protected]> wrote:
>
>> We experimented with Storm/Python about 6 months ago, but found the
>> JSON serialization/deserialization overhead was quite high, on the order of
>> several hundred usec per tuple every time it transitioned from Java to
>> Python or vice versa, limiting total throughput on a 12-core server to
>> around 25k tuples/second. We considered switching to a different
>> serializer but ended up just doing everything in Java instead.
>>
>> Is that still the case, or perhaps has the speed been improved?
>>
>>
>> On Thu, May 29, 2014 at 10:06 PM, Andrew Montalenti <[email protected]>
>> wrote:
>>
>>> We are building a new Storm and Python interop option that is called
>>> streamparse:
>>>
>>> https://github.com/Parsely/streamparse
>>>
>>> It includes a heavily rewritten Storm interop library and a command line
>>> tool, sparse, for managing local and remote Storm clusters. The idea is to
>>> make Storm projects as easy to build and manage in Python as RQ or Celery
>>> projects.
>>>
>>> It currently has support for running local clusters in a single command,
>>> managing virtualenvs on remote worker machines, submitting topologies,
>>> listing/killing topologies, and tailing remote log files. The multilang
>>> layer also has better support for logging and exception/error handling.
>>> Multiple topologies can be built from a single codebase and multiple remote
>>> Storm clusters can be supported via a simple JSON configuration file.
>>>
>>> We are already using it for production topologies atop Storm 0.9.1 and
>>> Storm 0.8. We welcome contributions and if you join our mailing list, feel
>>> free to make requests. We continue to develop it actively and in an open
>>> manner.
>>>
>>> -Andrew Montalenti
>>> CTO, Parse.ly
>>> On May 29, 2014 6:35 PM, "Ashu Goel" <[email protected]> wrote:
>>>
>>>> (The reason is that we are still running Python 2.6, but Petrel is
>>>> only compatible with 2.7.)
>>>> On May 29, 2014, at 2:48 PM, Ashu Goel <[email protected]> wrote:
>>>>
>>>> Awesome! I’m looking more into using storm.thrift to define a
>>>> non-JVM DSL… Does anyone have any working examples of this? Python
>>>> preferred, but any example will do. The wiki is a bit confusing...
>>>> On May 28, 2014, at 1:54 PM, FRANCISCO JESUS GOMEZ RODRIGUEZ <
>>>> [email protected]> wrote:
>>>>
>>>> Ashu, take a look at this project: http://github.com/AirSage/Petrel
>>>>
>>>> Write, submit, debug and monitor in Python.
>>>>
>>>> @ffranz
>>>> On 28/05/2014 22:49, Ashu Goel <[email protected]> wrote:
>>>> Any examples where the entire infra is written in Python (including
>>>> the topology)? Or is that not possible?
>>>>  On May 28, 2014, at 1:33 PM, Dilpreet Singh <[email protected]>
>>>> wrote:
>>>>
>>>>
>>>> https://github.com/apache/incubator-storm/tree/master/examples/storm-starter
>>>>
>>>> The WordCountTopology contains an example Python bolt.
>>>>
>>>>  Regards,
>>>> Dilpreet
>>>>
>>>>
>>>> On Thu, May 29, 2014 at 1:59 AM, Ashu Goel <[email protected]> wrote:
>>>>
>>>>> Does anyone have a good example program or instructions for using Python
>>>>> with Storm? I can’t seem to find anything concrete online.
>>>>>
>>>>> Thanks,
>>>>> Ashu Goel
>>>>
>>
