Hello all. We're using Storm with DRPC + Trident and ZeroMQ. While testing our topology we ran into very high memory usage, and the workers were eventually killed.
We have 3 machines and 3 workers. The topology emits tree-like tuples, and the last function aggregates them into one. Naming the functions A through E, the fan-out is: A:100 -> B:1 (total 100) -> C:20,000 (total 200,000) -> D:1 (total 200,000) -> E:1 (aggregate). C populates data (10 items per populating step) and emits to D sequentially, so we gave D higher parallelism (about 10x or more). Each function is repartitioned with shuffle.

At first, the workers were killed. It's not an OOME: we run each worker with -Xmx1g, yet memory rises to 5~6 GB, so it is likely native memory, and we suspected ZeroMQ. C -> D emits large tuples (about 100k each) quickly, which could lead to a problem when tuples are sent outside the worker. (Am I right? Or does ZeroMQ also handle inter-thread messages?)

So we removed the shuffle between C and D (topology.optimize is true). Now C and D are recognized as one bolt (shown as number-C-D in the Storm UI). But the modified topology shows the same behavior: it uses lots of memory (similar to before) and is killed.

I'm wondering what changes when we group functions into one bolt, C-D. We expected that when C emits a message, D (same executor or same worker) executes it, so there is no worker-to-worker interaction between C and D. Is that expectation wrong? If so, please explain this behavior for grouped functions.

Thanks in advance!

Regards.
Jungtaek Lim (HeartSaVioR)

--
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior
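For what it's worth, a back-of-the-envelope sketch of the in-flight data volume (assuming "about 100k" means roughly 100 KB per tuple, which is an assumption on my part) lines up with the 5~6 GB per worker we observe if all 200,000 C -> D tuples can be buffered at once, e.g. in ZeroMQ's outbound queues in native memory:

```java
public class BufferEstimate {
    public static void main(String[] args) {
        long tuples = 200_000L;            // C emits ~20,000 per B output, 100 B outputs (from the post)
        long tupleBytes = 100L * 1024;     // ~100 KB per tuple (assumption: "100k" = 100 KB)
        int workers = 3;                   // 3 machines, 3 workers

        long totalBytes = tuples * tupleBytes;
        double totalGiB = totalBytes / (1024.0 * 1024 * 1024);
        double perWorkerGiB = totalGiB / workers;

        // If nothing throttles the batch, every worker may hold its share
        // of all in-flight tuples in serialized form outside the JVM heap.
        System.out.printf("total in-flight: %.1f GiB, per worker: %.1f GiB%n",
                totalGiB, perWorkerGiB);
        // prints "total in-flight: 19.1 GiB, per worker: 6.4 GiB"
    }
}
```

This is only a rough consistency check, not a diagnosis; it just shows the observed growth is plausible if the C -> D tuples are buffered rather than consumed as fast as they are produced.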
