I have been using Storm with Python since Storm 0.9.2, and I'm interested in multi-lang performance improvements.
There is a pull request (https://github.com/apache/storm/pull/1136) for multi-lang performance improvements that was opened a year ago but has not been merged yet. It uses a MessagePack serializer to replace the default JSON serializer. There was also a mail on 2016/1/3 mentioning a Python shell bolt performance issue; a MessagePack benchmark result was given in that mail.

I agree with @HeartSaVioR that we should do Python optimization first.

2017-05-13 13:23 GMT+08:00 Jungtaek Lim <[email protected]>:

> I'd like to see other multi-lang users' voices as well.
>
> I guess many users are using Streamparse, so Streamparse users may be
> able to report how much the performance difference is. If Streamparse uses
> a non-default serde to reduce the performance hit, Storm could even adopt
> it as the default serde, but that would require breaking backward
> compatibility.
>
> Btw, IMHO, it might be worth focusing optimization on fewer languages,
> e.g. supporting only Python (as data scientists are familiar with it) as a
> second language and applying Python-specific optimizations. We may also
> need to support a non-Java language for the new Streams API, and that
> might not be easy with the current multi-lang approach. A PySpark-like
> approach would be reasonable.
>
> We could still support multi-lang, but without outstanding improvement.
>
> Would like to hear opinions on my proposal, too.
>
> - Jungtaek Lim (HeartSaVioR)
>
> On Sat, May 13, 2017 at 9:46 AM, Mauro Giusti <[email protected]> wrote:
>
>> *My PC:*
>>
>> My PC is an 8-core Xeon E5 with 16 GB of RAM; when the test starts, only
>> 8 GB of memory is occupied.
>>
>> I increased the memory of the Java VM to 4 GB, and it only uses 1 GB
>> when the test runs.
>>
>> *The Topology:*
>>
>> On my PC, I have three Spouts in Mono and one Bolt in Mono.
>>
>> The topology is described in Flux – so I have basically zero code in
>> Java; it is all in Flux .yaml + .NET with Mono.
>>
>> All the messages use SHUFFLE, and there is one worker only (my PC).
>>
>> I run in local mode, and I also have a Docker container where I deployed
>> this.
>>
>> *Topology details:*
>>
>> The Spouts read from an internal service; I collect about 60,000–70,000
>> records each minute.
>>
>> The Bolt reads from the three Spouts and aggregates in memory using
>> SQLite: records are added to SQLite as they arrive, then every 30 seconds
>> SQLite runs an aggregation and emits the data to an instance of Redis
>> cache (via another Bolt hop).
>>
>> To test with Java, I replaced the Bolt with a simple Java Bolt that only
>> logged every 10,000 records.
>>
>> To compare with Mono, I created an empty .NET Bolt and did the same.
>>
>> *My Tests:*
>>
>> The Flux topology is attached, as are the Java class I used to test and
>> the .NET Bolt.
>>
>> Again, the Spouts are .NET classes that emit 65K rows per minute.
>>
>> The log files are attached; you can see how much time it takes for the
>> Bolt to consume 10,000 records:
>>
>> Inter-Language.txt is on my PC using the Mono debug bolt; each 10,000
>> records take around 4.5 seconds.
>>
>> Java.txt is on my PC using Java (TransformEchoBolt.java); each 10,000
>> records take around 0.7 seconds.
>>
>> Linux.txt is on the Docker container (still on my PC, but using Docker
>> for Windows in Linux containers mode), using Mono but on Linux this time –
>> the results are comparable with Mono on Windows (4.5 seconds per 10,000
>> records).
>>
>> I also tried calling the Windows exe directly on Windows in local mode,
>> bypassing Mono – the results were not pretty: 15 seconds per 10,000
>> records (NetExe.txt).
>>
>> *Results:*
>>
>> I know I can scale out and partition the data, but the amount of
>> processing did not seem to require that.
>>
>> Maybe one issue is that the object I am moving has 11 fields?
>>
>> I can try to create a mini-repro if the dev team is interested –
>> hopefully this might help find what the bottleneck is.
>>
>> Thanks for your attention -
>>
>> Mauro.
>>
>> *From:* P. Taylor Goetz [mailto:[email protected]]
>> *Sent:* Friday, May 12, 2017 4:55 PM
>> *To:* [email protected]; [email protected]
>> *Subject:* Re: Performance of Multi-Lang protocol
>>
>> Adding dev@ mailing list...
>>
>> There is definitely a performance hit, but it shouldn't be as drastic as
>> you describe.
>>
>> Can you share some of your environment characteristics?
>>
>> I've been looking at the Apache Arrow project (full disclosure: I'm a
>> PMC member) as a means for improved performance (it would essentially
>> remove the performance hit for serialize/deserialize operations). This is
>> particularly relevant to multi-lang, but could also apply to same-machine
>> inter-worker communication.
>>
>> At this point I don't feel Arrow is at production-level maturity, but it
>> is getting close. I definitely feel it's worth exploring at the PoC level.
>>
>> -Taylor
>>
>> On May 12, 2017, at 6:56 PM, Mauro Giusti <[email protected]> wrote:
>>
>> Hi –
>>
>> We are using multi-lang to pass data between Storm and Mono.
>>
>> We observe a 6x time increase when messages go from spout to bolt if the
>> bolt is in Mono vs. being in Java:
>>
>> Java can process 10,000 records in 0.7 seconds, while Mono requires 4.5
>> seconds.
>>
>> The Mono bolt was an empty one created with the Storm.Net.Adapter
>> library (https://github.com/ziyunhx/storm-net-adapter).
>>
>> This is on a single-machine topology – we are still in the dev phase and
>> using this solution for now.
>>
>> Is this expected? Should we try to minimize multi-lang and inter-process
>> communication, or is this a problem with my specific scenario (Mono
>> and/or single machine)?
>>
>> Thank you –
>>
>> Mauro.

--
Thanks
Zhechao Ma
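P.S. For anyone new to the serde discussion above: the multi-lang protocol frames every message as a JSON object followed by a line containing only "end", so each tuple crossing the JVM/shell-component boundary pays a JSON encode/decode cost — which is what the MessagePack pull request aims to reduce. A minimal sketch of that framing in Python (the command name and tuple values here are just illustrative samples, not taken from Mauro's topology):

```python
import json

def encode_message(msg):
    """Frame a dict as a multi-lang message: JSON body, then an 'end' line."""
    return json.dumps(msg) + "\nend\n"

def decode_message(framed):
    """Recover the dict from a framed multi-lang message string."""
    body, _sep, _rest = framed.rpartition("\nend\n")
    return json.loads(body)

# A sample emit-style message with an 11-field tuple, mirroring the
# 11-field object mentioned in the thread (values are made up).
tuple_msg = {"command": "emit", "tuple": list(range(11))}

framed = encode_message(tuple_msg)
assert decode_message(framed) == tuple_msg
```

Every emitted tuple goes through this text round-trip (plus the pipe I/O), which is why an "empty" shell bolt still shows a large per-record overhead compared to a pure Java bolt.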
