I have been using Storm with Python since Storm 0.9.2, and I'm interested in multi-lang performance improvements.
There is a pull request (https://github.com/apache/storm/pull/1136) for multi-lang performance improvements that was opened a year ago but has not been merged yet. It uses a MessagePack serializer to replace the default JSON serializer. There was also a mail on 2016/1/3 mentioning a Python shell bolt performance issue; a MessagePack benchmark result was given in that mail.

I agree with @HeartSaVioR that we should do Python optimization first.

2017-05-13 13:23 GMT+08:00 Jungtaek Lim <[email protected]>:

> I'd like to see other multi-lang users' voices as well.
>
> I guess many users are using Streamparse, so Streamparse users may be
> able to report how much the performance difference is. If Streamparse uses
> a non-default serde to reduce the performance hit, Storm could even adopt
> it as the default serde, but that would require breaking backward
> compatibility.
>
> Btw, IMHO, it might be worth focusing optimization on fewer languages,
> e.g. supporting only Python (as data scientists are familiar with it) as a
> second language and applying Python-specific optimizations. We may also
> need to support a non-Java language for the new Streams API, and that
> might not be easy with the current multi-lang approach. A PySpark-like
> approach would be reasonable.
>
> We could still support multi-lang, but without outstanding improvement.
>
> Would like to hear opinions on my proposal, too.
>
> - Jungtaek Lim (HeartSaVioR)
>
> On Sat, May 13, 2017 at 9:46 AM, Mauro Giusti <[email protected]> wrote:
>
>> *My PC:*
>>
>> My PC is an 8-core Xeon E5 with 16 GB of RAM; when the test starts, only
>> 8 GB of memory is occupied.
>>
>> I increased the memory of the Java VM to 4 GB, and it only uses 1 GB
>> when the test runs.
>>
>> *The Topology:*
>>
>> On my PC, I have three Spouts in Mono and one Bolt in Mono.
>>
>> The topology is described in Flux – so I have basically zero code in
>> Java; it is all in Flux .yaml + .NET with Mono.
>>
>> All the messages use SHUFFLE, and there is one worker only (my PC).
>>
>> I run in local mode, and I also have a Docker container where I deployed
>> this.
>>
>> *Topology details:*
>>
>> The Spouts read from an internal service; I collect about 60,000–70,000
>> records each minute.
>>
>> The Bolt reads from the three Spouts and aggregates in memory using
>> SQLite: records are added to SQLite as they arrive, then every 30 seconds
>> SQLite runs an aggregation and emits the data to an instance of Redis
>> cache (via another Bolt hop).
>>
>> To test with Java, I replaced the Bolt with a simple Java Bolt that only
>> logged every 10,000 records.
>>
>> To compare with Mono, I created an empty .NET Bolt and did the same.
>>
>> *My Tests:*
>>
>> The Flux topology is attached, as are the Java class I used to test and
>> the .NET Bolt.
>>
>> Again, the Spouts are .NET classes that emit 65K rows per minute.
>>
>> The log files are attached; you can see how much time it takes for the
>> Bolt to consume 10,000 records:
>>
>> Inter-Language.txt is on my PC using the Mono debug bolt; each 10,000
>> records take around 4.5 seconds.
>>
>> Java.txt is on my PC using Java (TransformEchoBolt.java); each 10,000
>> records take around 0.7 seconds.
>>
>> Linux.txt is on the Docker container (still on my PC, but using Docker
>> for Windows in Linux containers mode), using Mono but on Linux this time –
>> the results are comparable with Mono on Windows (4.5 seconds per 10,000
>> records).
>>
>> I also tried calling the Windows exe directly on Windows in local mode,
>> bypassing Mono – the results were not pretty: 15 seconds per 10,000
>> records (NetExe.txt).
>>
>> *Results:*
>>
>> I know I can scale out and partition the data, but the amount of
>> processing did not seem to require that.
>>
>> Maybe one issue is that the object I am moving has 11 fields?
>>
>> I can try to create a mini-repro if the dev team is interested –
>> hopefully this might help find what the bottleneck is.
>>
>> Thanks for your attention -
>>
>> Mauro.
>>
>> *From:* P. Taylor Goetz [mailto:[email protected]]
>> *Sent:* Friday, May 12, 2017 4:55 PM
>> *To:* [email protected]; [email protected]
>> *Subject:* Re: Performance of Multi-Lang protocol
>>
>> Adding dev@ mailing list...
>>
>> There is definitely a performance hit, but it shouldn't be as drastic as
>> you describe.
>>
>> Can you share some of your environment characteristics?
>>
>> I've been looking at the Apache Arrow project (full disclosure: I'm a
>> PMC member) as a means for improved performance (it would essentially
>> remove the performance hit for serialize/deserialize operations). This is
>> particularly relevant to multi-lang, but could also apply to same-machine
>> inter-worker communication.
>>
>> At this point I don't feel Arrow is at production-level maturity, but it
>> is getting close. I definitely feel it's worth exploring at the PoC level.
>>
>> -Taylor
>>
>> On May 12, 2017, at 6:56 PM, Mauro Giusti <[email protected]> wrote:
>>
>> Hi –
>>
>> We are using multi-lang to pass data between Storm and Mono.
>>
>> We observe a 6x time increase when messages go from spout to bolt if the
>> bolt is in Mono vs. being in Java:
>>
>> Java can process 10,000 records in 0.7 seconds, while Mono requires 4.5
>> seconds.
>>
>> The Mono bolt was an empty one created with the Storm.Net.Adapter
>> library (https://github.com/ziyunhx/storm-net-adapter).
>>
>> This is on a single-machine topology – we are still in the dev phase and
>> using this solution for now.
>>
>> Is this expected? Should we try to minimize multi-lang and inter-process
>> communication, or is this a problem with my specific scenario (Mono
>> and/or single machine)?
>>
>> Thank you –
>>
>> Mauro.

--
Thanks
Zhechao Ma
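P.S. For anyone new to the serde discussion above: the multi-lang protocol frames every message as a JSON object followed by a line containing only "end", so each tuple crossing the JVM/shell-component boundary pays a JSON encode/decode cost — which is what the MessagePack pull request aims to reduce. A minimal sketch of that framing in Python (the command name and tuple values here are just illustrative samples, not taken from Mauro's topology):

```python
import json

def encode_message(msg):
    """Frame a dict as a multi-lang message: JSON body, then an 'end' line."""
    return json.dumps(msg) + "\nend\n"

def decode_message(framed):
    """Recover the dict from a framed multi-lang message string."""
    body, _sep, _rest = framed.rpartition("\nend\n")
    return json.loads(body)

# A sample emit-style message with an 11-field tuple, mirroring the
# 11-field object mentioned in the thread (values are made up).
tuple_msg = {"command": "emit", "tuple": list(range(11))}

framed = encode_message(tuple_msg)
assert decode_message(framed) == tuple_msg
```

Every emitted tuple goes through this text round-trip (plus the pipe I/O), which is why an "empty" shell bolt still shows a large per-record overhead compared to a pure Java bolt.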
