[ 
https://issues.apache.org/jira/browse/STORM-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated STORM-1772:
-------------------------------
    Description: 
It would be very useful to have some simple reference topologies included with 
Storm that can be used to measure performance, both by devs during development 
(to start with) and perhaps also on a real Storm cluster (subsequently).

To start with, the goal is to put the focus on the performance characteristics 
of individual building blocks such as specific bolts, spouts, grouping 
options, queues, etc. So it is initially biased towards micro-benchmarking, but 
subsequently we could add higher-level ones too.

Although there is a Storm benchmarking tool (originally written by Intel?) that 
can be used, and I have personally used it, it's better for this to be 
integrated into Storm proper and also maintained by devs as Storm evolves.

On a side note, in some instances I have noticed (to my surprise) that the perf 
numbers change when topologies written for the Intel benchmark are rewritten 
without the required wrappers so that they run directly under Storm.

Have a few topologies in mind for measuring each of these:

# *Queuing and Spout Emit Performance:* A topology with a Generator Spout but 
no bolts.
# *Queuing & Grouping performance*:   Generator Spout -> A grouping method -> 
DevNull Bolt
# *Hdfs Bolt:*    Generator Spout ->  Hdfs Bolt
# *Hdfs Spout:*   Hdfs Spout ->  DevNull Bolt
# *Kafka Spout:*   Kafka Spout ->  DevNull Bolt 
# *Simple Data Movement*: Kafka Spout -> Hdfs Bolt

Shall add these for Storm core first. Then we can have the same for Trident 
also.
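
As a rough illustration of what the "Queuing & Grouping performance" item is 
meant to measure, here is a minimal sketch that does not use any Storm APIs. It 
is a plain-JDK stand-in, not the proposed topology: a generator thread plays 
the role of the Generator Spout, an ArrayBlockingQueue stands in for the 
inter-component queue, and the calling thread acts as the DevNull Bolt by 
discarding everything it receives. The class name, method names, and queue 
choice are all assumptions made for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Framework-free stand-in for "Generator Spout -> queue -> DevNull Bolt". */
public class QueueThroughputSketch {
    // Sentinel object telling the consumer that the generator has finished.
    private static final Object END = new Object();

    /** Outcome of one run: tuples consumed and elapsed wall-clock time. */
    public static final class Result {
        public final long tuples;
        public final long elapsedNanos;

        Result(long tuples, long elapsedNanos) {
            this.tuples = tuples;
            this.elapsedNanos = elapsedNanos;
        }

        public double tuplesPerSecond() {
            return elapsedNanos == 0
                    ? Double.POSITIVE_INFINITY
                    : tuples * 1e9 / elapsedNanos;
        }
    }

    /** Pushes tupleCount items through a bounded queue and discards them. */
    public static Result run(long tupleCount, int queueCapacity) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(queueCapacity);

        // "Generator spout": emits a fixed payload as fast as the queue allows.
        Thread generator = new Thread(() -> {
            try {
                for (long i = 0; i < tupleCount; i++) {
                    queue.put("tuple");
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            long start = System.nanoTime();
            generator.start();

            // "DevNull bolt": takes each item and drops it, counting as it goes.
            long consumed = 0;
            while (queue.take() != END) {
                consumed++;
            }
            long elapsed = System.nanoTime() - start;

            generator.join();
            return new Result(consumed, elapsed);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
    }

    public static void main(String[] args) {
        Result r = run(1_000_000, 1024);
        System.out.printf("%d tuples in %.1f ms (%.0f tuples/s)%n",
                r.tuples, r.elapsedNanos / 1e6, r.tuplesPerSecond());
    }
}
```

A real reference topology would replace the thread and queue with an actual 
spout, a grouping, and a bolt, and would need warm-up runs before any number it 
reports could be trusted.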

  was:
Would be very useful to have some simple reference topologies included with 
Storm that can be used to measure performance that can be used both by devs 
during development (to start with) and perhaps also on a real storm cluster 
(subsequently). 

To start with, the goal is to put the focus on the performance characteristics 
of individual building blocks such as specific bolts, spouts, grouping 
options, queues, etc. So, initially biased towards micro-benchmarking but 
subsequently we could add higher level ones too.

Although there is a storm benchmarking tool (originally written by Intel?) that 
can be used, and i have personally used, its better for this to be integrated 
into Storm proper and also maintained by devs as storm evolves. 

On a side note, in some instances I have noticed (to my surprise) that the perf 
numbers change when topologies written for the Intel benchmark are rewritten 
without the required wrappers so that they run directly under Storm.

Have a few topologies in mind for measuring each of these:

# *Queuing and Spout Emit Performance:* A topology with a Generator Spout but 
no bolts.
# *Queuing & Grouping performance*:   Generator Spout -> A grouping method -> 
DevNull Bolt
# *Hdfs Bolt:*    Generator Spout ->  Hdfs Bolt
# *Hdfs Spout:*   Hdfs Spout ->  DevNull Bolt
# *Kafka Spout:*   Kafka Spout ->  DevNull Bolt 
# *Simple Data Movement*: Kafka Spout -> Hdfs Bolt

Shall add these for Storm core first. Then we can have the same for Trident 
also.


> Create topologies for measuring performance
> -------------------------------------------
>
>                 Key: STORM-1772
>                 URL: https://issues.apache.org/jira/browse/STORM-1772
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Roshan Naik
>
> It would be very useful to have some simple reference topologies included 
> with Storm that can be used to measure performance, both by devs during 
> development (to start with) and perhaps also on a real Storm cluster 
> (subsequently).
> To start with, the goal is to put the focus on the performance 
> characteristics of individual building blocks such as specific bolts, 
> spouts, grouping options, queues, etc. So it is initially biased towards 
> micro-benchmarking, but subsequently we could add higher-level ones too.
> Although there is a Storm benchmarking tool (originally written by Intel?) 
> that can be used, and I have personally used it, it's better for this to be 
> integrated into Storm proper and also maintained by devs as Storm evolves.
> On a side note, in some instances I have noticed (to my surprise) that the 
> perf numbers change when topologies written for the Intel benchmark are 
> rewritten without the required wrappers so that they run directly under 
> Storm.
> Have a few topologies in mind for measuring each of these:
> # *Queuing and Spout Emit Performance:* A topology with a Generator Spout but 
> no bolts.
> # *Queuing & Grouping performance*:   Generator Spout -> A grouping method -> 
> DevNull Bolt
> # *Hdfs Bolt:*    Generator Spout ->  Hdfs Bolt
> # *Hdfs Spout:*   Hdfs Spout ->  DevNull Bolt
> # *Kafka Spout:*   Kafka Spout ->  DevNull Bolt 
> # *Simple Data Movement*: Kafka Spout -> Hdfs Bolt
> Shall add these for Storm core first. Then we can have the same for Trident 
> also.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
