RE: Bolt slow down the emit speed of spout

2016-10-19 Thread Chien Le
Yes, the parallelism of reads from a Kafka topic is limited by the number of 
partitions. There are other things you can tweak with KafkaSpout parameters, such as 
increasing batch read sizes, that might help each individual KafkaSpout's 
performance.

-Chien
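
For reference, here is a minimal sketch of that kind of tuning, assuming the 
storm-kafka SpoutConfig API used with Storm 1.0.x; the zkRoot, consumer id, byte 
sizes and wait time below are illustrative, not values taken from this thread:

    import org.apache.storm.kafka.BrokerHosts;
    import org.apache.storm.kafka.KafkaSpout;
    import org.apache.storm.kafka.SpoutConfig;
    import org.apache.storm.kafka.StringScheme;
    import org.apache.storm.kafka.ZkHosts;
    import org.apache.storm.spout.SchemeAsMultiScheme;

    public class TunedKafkaSpoutFactory {
        // Builds a KafkaSpout whose per-poll batch sizes are raised above the
        // defaults, so each spout executor does fewer, larger fetches from the brokers.
        public static KafkaSpout build(String zkConnect, String topic) {
            BrokerHosts hosts = new ZkHosts(zkConnect);
            SpoutConfig cfg = new SpoutConfig(hosts, topic, "/kafka_spout", topic + "_reader");
            cfg.scheme = new SchemeAsMultiScheme(new StringScheme());
            cfg.fetchSizeBytes = 8 * 1024 * 1024;   // bytes fetched per partition per request
            cfg.bufferSizeBytes = 8 * 1024 * 1024;  // consumer socket buffer
            cfg.fetchMaxWait = 100;                 // ms to wait when a partition has little data
            return new KafkaSpout(cfg);
        }
    }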

From: Chen Junfeng [mailto:k-2f...@hotmail.com]
Sent: Tuesday, October 18, 2016 8:25 PM
To: Jungtaek Lim <kabh...@gmail.com>; user@storm.apache.org
Subject: Re: Bolt slow down the emit speed of spout

So the only way to improve KafkaSpout performance is to increase the partition count 
of the specific topic, in order to raise the parallelism of the KafkaSpout?

Sent from the Mail app for Windows 10 <https://go.microsoft.com/fwlink/?LinkId=550986>

Re: Bolt slow down the emit speed of spout

2016-10-18 Thread Jungtaek Lim
You need to make it clear whether you're using multiple 'topics' or multiple
'partitions'. There's no way to read a single partition in parallel; that's by
design in Kafka. If you mean topics, you can increase the parallelism of the spout
up to the partition count.
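
A minimal sketch of what that looks like when wiring the topology, assuming the
storm-kafka KafkaSpout; the component name and the way the partition count is
supplied are illustrative:

    import org.apache.storm.kafka.KafkaSpout;
    import org.apache.storm.topology.TopologyBuilder;

    public class SpoutParallelismExample {
        // Sets the spout parallelism hint equal to the topic's partition count,
        // so every partition gets its own consumer; executors beyond the
        // partition count would simply sit idle.
        public static TopologyBuilder wire(KafkaSpout spout, int partitionCount) {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("kafka_spout", spout, partitionCount);
            return builder;
        }
    }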


Re: Bolt slow down the emit speed of spout

2016-10-18 Thread Chen Junfeng
Thanks for your reply.

Maybe I am getting close to the reason. I found the CPU usage of the spout worker is 
really very high, about 200% to 400%, via the 'top' command. But KafkaSpout in Storm 
does not seem to support using multiple spout instances to read one topic.

Sent from the Mail app for Windows 10 <https://go.microsoft.com/fwlink/?LinkId=550986>

Re: Bolt slow down the emit speed of spout

2016-10-18 Thread Jungtaek Lim
Could you change your bolt to a no-op that just acks, and run again? That would make
your benchmark clearer.
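
A minimal sketch of such a bolt, i.e. one that does nothing except acknowledge every
tuple (the class name is illustrative):

    import java.util.Map;

    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;

    public class AckOnlyBolt extends BaseRichBolt {
        private OutputCollector collector;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void execute(Tuple tuple) {
            // No processing at all: just ack so the spout's pending count keeps draining.
            collector.ack(tuple);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Emits nothing downstream.
        }
    }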

And if you picked your numbers from the first 10 minutes, could you pick them from
another 10-minute window instead? That gets rid of worker/executor deployment and
startup time, and JVM warm-up.

- Jungtaek Lim (HeartSaVioR)

Re: Bolt slow down the emit speed of spout

2016-10-18 Thread Chen Junfeng
I used ‘dstat -a’, which measures CPU, memory, disk and network. Disk should not be 
the bottleneck, as my topology doesn’t do much disk I/O; log.info only writes a tuple 
to disk every 1000 tuples.

Sent from the Mail app for Windows 10 <https://go.microsoft.com/fwlink/?LinkId=550986>


RE: Bolt slow down the emit speed of spout

2016-10-18 Thread Chien Le
dstat -d is measuring your disk throughput, yes? Did you mean dstat -n?

Otherwise, dstat -d showing 300MB/s doesn't eliminate disk as your bottleneck. 
I would try using iostat -x 1 and see what your %util and svc times are to 
truly eliminate disk as the bottleneck.

-Chien


Bolt slow down the emit speed of spout

2016-10-18 Thread Chen Junfeng

I found a tough but interesting issue with Storm 1.0.2 that I'd like to share with you.
First, let me introduce my system structure. My topology has 471 workers, 6 Kafka 
topics and some processor bolts. Backpressure has been disabled in the topology by

conf.put(Config.TOPOLOGY_BACKPRESSURE_ENABLE, false);
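
For context, a rough sketch of how that setting sits in the submission code; the
topology name is illustrative, and only the worker count and the backpressure flag
come from this thread:

    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.topology.TopologyBuilder;

    public class SubmitWithoutBackpressure {
        // Submits an already-wired topology (6 KafkaSpouts plus processing bolts)
        // with automatic backpressure turned off.
        public static void submit(String name, TopologyBuilder builder) throws Exception {
            Config conf = new Config();
            conf.setNumWorkers(471);                              // worker count described above
            conf.put(Config.TOPOLOGY_BACKPRESSURE_ENABLE, false); // disable backpressure
            StormSubmitter.submitTopology(name, conf, builder.createTopology());
        }
    }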

If I submit the KafkaSpout only, the statistics from the Storm UI are as below:
Topology stats

Window      | Emitted   | Transferred | Complete latency (ms) | Acked     | Failed
10m 0s      | 875161429 | 0           | 0                     | 875141691 |
3h 0m 0s    | 927117120 | 0           | 0                     | 927117320 |
1d 0h 0m 0s | 927117120 | 0           | 0                     | 927117320 |
All time    | 927117120 | 0           | 0                     | 927117320 |

Spouts (10m 0s)

Id               | Executors | Tasks | Emitted   | Transferred | Complete latency (ms) | Acked     | Failed
Spout_JSCTGW_144 | 2         | 2     | 128701261 | 0           | 0                     | 128702858 | 0
Spout_JSCTGW_145 | 2         | 2     | 162347454 | 0           | 0                     | 162347639 | 0
Spout_JSCTGW_146 | 2         | 2     | 135598494 | 0           | 0                     | 135598608 | 0
Spout_JSCTGW_147 | 2         | 2     | 128307822 | 0           | 0                     | 128306102 | 0
Spout_JSCTGW_148 | 2         | 2     | 160369513 | 0           | 0                     | 160360423 | 0
Spout_JSCTGW_149 | 2         | 2     | 159836885 | 0           | 0                     | 159826061 | 0



In 10 minutes, the 6 KafkaSpouts read more than 800 million tuples from the Kafka 
cluster.
Then I submit the topology with a simple PrintBolt, and the 10-minute stats are as 
follows:
Spouts (10m 0s)

Id               | Executors | Tasks | Emitted   | Transferred | Complete latency (ms) | Acked     | Failed | Error Host | Error Port
Spout_JSCTGW_144 | 2         | 2     | 113599789 | 113599789   | 0                     | 113513630 | 0      |            |
Spout_JSCTGW_145 | 2         | 2     | 122501522 | 122501575   | 0                     | 122659659 | 0      |            |
Spout_JSCTGW_146 | 2         | 2     | 91308915  | 91308915    | 0                     | 91308652  | 0      |            |
Spout_JSCTGW_147 | 2         | 2     | 105029568 | 105029568   | 0                     | 104968275 | 0      |            |
Spout_JSCTGW_148 | 2         | 2     | 115889172 | 115889178   | 0                     | 115890165 | 0      |            |
Spout_JSCTGW_149 | 2         | 2     | 115185591 | 115185591   | 0                     | 115185638 | 0      |            |

Showing 1 to 6 of 6 entries — total acked: 663526019

Bolts (10m 0s)

Id        | Executors | Tasks | Emitted | Transferred | Capacity (last 10m) | Execute latency (ms) | Executed  | Process latency (ms) | Acked
PrintBolt | 240       | 240   | 0       | 0           | 0.241               | 0.041                | 665063824 | 0                    | 665,036,902


The PrintBolt contains nothing but a simple log.info call, yet the total number of 
tuples emitted by the spouts decreases to about 600 million. All of the machines have 
32 cores, but the system load average never exceeds 15. I have also checked the 
network load: my network adapter has 10 Gb bandwidth, and the dstat -d command shows 
the maximum receive or send rate is 300 MB/s at most, which doesn't reach the limit of 
the adapter. The total number of executors is less than the total number of workers, 
so that should not be the problem either.
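
For reference, a minimal sketch of a PrintBolt along the lines described here (acking
every tuple and calling log.info only once per 1000 tuples); this is an illustrative
reconstruction, not the actual class from this topology:

    import java.util.Map;

    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class PrintBolt extends BaseRichBolt {
        private static final Logger LOG = LoggerFactory.getLogger(PrintBolt.class);
        private OutputCollector collector;
        private long count = 0;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void execute(Tuple tuple) {
            if (++count % 1000 == 0) {
                LOG.info("tuple #{}: {}", count, tuple);  // one log line per 1000 tuples
            }
            collector.ack(tuple);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Terminal bolt: emits nothing downstream.
        }
    }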

Then I did some more tests.
Only one KafkaSpout reading one topic:

Spouts (10m 0s)

Id               | Executors | Tasks | Emitted   | Transferred | Complete latency (ms) | Acked     | Failed
Spout_JSCTGW_145 | 2         | 2     | 148918096 | 0           | 0                     | 148917805 | 0


One KafkaSpout and one data-processing bolt:

Spouts (10m 0s)

Id               | Executors | Tasks | Emitted   | Transferred | Complete latency (ms) | Acked     | Failed
Spout_JSCTGW_145 | 2         | 2     | 106481751 | 106481751   | 0                     | 106367801 | 0

Showing 1 to 1 of 1 entries

Bolts (10m 0s)

Id             | Executors | Tasks | Emitted   | Transferred | Capacity (last 10m) | Execute latency (ms) | Executed
AtomPluginBolt | 1000      | 1000  | 132752203 | 0           | 0.019               | 0.067                | 106519630



Even when I read only one Kafka topic, the emit speed still slows down.

Does anyone else encounter the same problem?