Thanks for sharing your benchmark.
Hortonworks does similar benchmark with 1.0.1, so I guess you might also
want to read this page, too.
http://hortonworks.com/blog/microbenchmarking-storm-1-0-performance/

The cost of routing and transferring tuple, and handling ACK is not free
(some works are held in the same thread which emits the tuple), so what
you're seeing is natural. Normally this doesn't make problems since it's
already fast enough to meet the requirement, but it heavily affects the result
of micro benchmark.

As you may see from the page, from v1.0.1 performance is greatly improved.
I think there're still more rooms to improve performance, but what area
Storm is really lacking is easier usage. Storm doesn't have high-level API
(Storm has Trident but it forces users to live with micro-batch.) and SQL
support (recently it has been trends of streaming frameworks). Ideally if
we can address both of things together it should be great, but if we can't,
IMHO, improving usage is a top priority.

Thanks,
Jungtaek Lim (HeartSaVioR)

2016년 10월 18일 (화) 오후 3:30, Chen Junfeng <k-2f...@hotmail.com>님이 작성:



I found a tough but interesting issue about Storm 1.0.2 to share with you.

Firstly I introduce my system structure. My topology has 471 workers, 6
kafka topic and some processor bolt. Backpressure has been disabled in
topology by

conf.put(Config.*TOPOLOGY_BACKPRESSURE_ENABLE*, false);



If I submit KafkaSpout only, the statistic data from Storm UI is as below:

Topology stats

Window

Emitted

Transferred

Complete latency (ms)

Acked

Failed

10m 0s

875161429

0

0

875141691

3h 0m 0s

927117120

0

0

927117320

1d 0h 0m 0s

927117120

0

0

927117320

All time

927117120

0

0

927117320

Spouts (10m 0s)

Search:

Id

Executors

Tasks

Emitted

Transferred

Complete latency (ms)

Acked

Failed

Spout_JSCTGW_144

2

2

128701261

0

0

128702858

0

Spout_JSCTGW_145

2

2

162347454

0

0

162347639

0

Spout_JSCTGW_146

2

2

135598494

0

0

135598608

0

Spout_JSCTGW_147

2

2

128307822

0

0

128306102

0

Spout_JSCTGW_148

2

2

160369513

0

0

160360423

0

Spout_JSCTGW_149

2

2

159836885

0

0

159826061

0





In 10 minutes time, 6 kafkaSpout read more than 800 million tuple from
kafka cluster.

Then I submit the topology with a simple PrintBolt, the 10min stat data is
as following:

Spouts (10m 0s)

Search:

Id

Executors

Tasks

Emitted

Transferred

Complete latency (ms)

Acked

Failed

Error Host

Error Port

Spout_JSCTGW_144

2

2

113599789

113599789

0

113513630

0

Spout_JSCTGW_145

2

2

122501522

122501575

0

122659659

0

Spout_JSCTGW_146

2

2

91308915

91308915

0

91308652

0

Spout_JSCTGW_147

2

2

105029568

105029568

0

104968275

0

Spout_JSCTGW_148

2

2

115889172

115889178

0

115890165

0

Spout_JSCTGW_149

2

2

115185591

115185591

0

115185638

0

Showing 1 to 6 of 6 entries

*663526019*

Bolts (10m 0s)

Search:

Id

Executors

Tasks

Emitted

Transferred

Capacity (last 10m)

Execute latency (ms)

Executed

Process latency (ms)

Acked

PrintBolt

240

240

0

0

0.241

0.041

665063824

0

665,036,902



The printbolt contains nothing but a simple log.info method, *but the total
number of emitted tuple of spout decrease to about 600 million*. All of
machines have 32 cores but the system average load never exceed 15. Also I
have checked the network load. My network adapter has 10Gb bandwidth and
the dstat -d command shows the max rec or send speed is 300MB/s at most ,
which doesn’t reach the limit of network adapter. The total number of
executors is less than that of total workers, so performance should not be
problem.



Then I do some more test,

Only one kafkaspout reading on topic



Spouts (10m 0s)

Search:

Id

Executors

Tasks

Emitted

Transferred

Complete latency (ms)

Acked

Failed

Spout_JSCTGW_145

2

2

148918096

0

0

148917805

0



One kafkaspout and one data processing bolt



Spouts (10m 0s)

Search:

Id

Executors

Tasks

Emitted

Transferred

Complete latency (ms)

Acked

Failed

Spout_JSCTGW_145

2

2

106481751

106481751

0

106367801

0

Showing 1 to 1 of 1 entries

Bolts (10m 0s)

Search:

Id

Executors

Tasks

Emitted

Transferred

Capacity (last 10m)

Execute latency (ms)

Executed

AtomPluginBolt

1000

1000

132752203

0

0.019

0.067

106519630





Even I only read one kafka topic, the emit speed is also slowed down.



Does anyone meet the same problem?

Reply via email to