Thank you for your reply Jonas, I'll give that a try.

----------------------------------
Trude Gentenaar
Research&Development
----------------------------------
SemLab
Zuidpoolsingel 14-A
2408 ZE Alphen a/d Rijn
The Netherlands
T: +31 172 494 777
E: [email protected]
W: http://www.semlab.nl

On Mon, May 3, 2021 at 10:38 AM Jonas Krauss <[email protected]> wrote:

> Hi Trude,
>
> one important difference between Storm 1 and 2, from my experience when
> upgrading, is the handling of shuffleGrouping. In Storm 1 it was a "strict"
> shuffle grouping, meaning tuples were distributed evenly across workers, no
> matter where the workers are located physically. In Storm 2, when Storm
> recognizes that shuffle grouping would send a tuple to a worker not running
> in the same JVM, it prefers to keep the tuple inside the current worker,
> even if that effectively means that only one worker receives tuples.
>
> Have you examined the behavior? Could it be that in Storm 2 only one
> worker is receiving tuples when you configure two workers?
>
> To my knowledge, you can recreate the old behavior of Storm 1 in Storm 2
> by setting TOPOLOGY_DISABLE_LOADAWARE_MESSAGING to true.
>
> Regards
>
> Jonas
>
> On Mon, May 3, 2021 at 10:23 AM Trude Gentenaar <[email protected]> wrote:
>
>> Hello all,
>>
>> After upgrading the Storm platform, our topology is running approximately
>> 100% slower on the same machine and with the same memory and threading
>> settings, i.e. it takes twice as long on the same test set.
>>
>> The topology processes documents of varying lengths. The documents
>> are split into sentences. Further processing is done by bolts that operate
>> on either 'document-level' or 'sentence-level'. Bolts that process
>> sentences are set to a higher parallelism. In Storm version 1.2.0 we found
>> optimal performance when running 2 workers on a single server, with
>> the document-based bolts having their parallelism set to 2 and the sentence
>> bolts having their parallelism set to 8. Worker-xmx is set to 2048mb. This
>> configuration takes twice as long on Storm 2.2.0. When running the topology
>> on 1 worker and with all parallelism set to 1, the speed returns to nearly
>> that of 1.2.0.
>>
>> Further performance tuning has also been attempted, but to no avail. This
>> is not the behaviour that we expected of the new platform. Can anyone shed
>> some light on this situation, or perhaps let us know if our expectations
>> were wrong?
>>
>> Thanks in advance,
>>
>> Trude
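
For anyone reading this thread later, here is a minimal sketch of how the setting Jonas mentions might be applied. It is not taken from the topology discussed above. To stay runnable without Storm on the classpath, it builds a plain `Map` using the string keys that, as far as I know, sit behind `Config.TOPOLOGY_WORKERS` and `Config.TOPOLOGY_DISABLE_LOADAWARE_MESSAGING` (`org.apache.storm.Config` is itself a `HashMap<String, Object>`). The class name `LoadAwareConfigSketch` and the method `buildConf` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class LoadAwareConfigSketch {
    // Assumed string values of Config.TOPOLOGY_WORKERS and
    // Config.TOPOLOGY_DISABLE_LOADAWARE_MESSAGING in org.apache.storm.Config.
    static final String TOPOLOGY_WORKERS = "topology.workers";
    static final String TOPOLOGY_DISABLE_LOADAWARE_MESSAGING =
            "topology.disable.loadaware.messaging";

    // Builds a topology configuration for the given number of workers with
    // load-aware messaging switched off, as suggested in the reply above.
    static Map<String, Object> buildConf(int workers) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(TOPOLOGY_WORKERS, workers);
        conf.put(TOPOLOGY_DISABLE_LOADAWARE_MESSAGING, true);
        return conf;
    }

    public static void main(String[] args) {
        // Two workers, matching the setup described in the original question.
        Map<String, Object> conf = buildConf(2);
        conf.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In a real topology the same two entries would presumably go into the `Config` object passed to `StormSubmitter.submitTopology`, using the `Config` constants instead of the literal strings. Whether this restores the 1.2.0 throughput in this particular case is something only a test run can show.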
