If you've built it on top of storm then "external" modules may be a good
location.

On Thu, May 28, 2015 at 10:42 AM, Matthias J. Sax <
[email protected]> wrote:

> Hi Bobby,
>
> I never thought about it. But if the community is interested in it, I
> would be happy to contribute it. :)
>
> However, I am not super familiar with the actual structure of Storm's
> code base and would need some pointers to integrate it into the system
> correctly and cleanly.
>
> I claim to understand the internals of Storm quite well; however, I
> have more of a user perspective on the system so far.
>
> If I am to work on it, it might be a good idea to open a JIRA issue and
> assign it to me; we can take it from there?
>
>
> -Matthias
>
>
>
> On 05/28/2015 03:20 PM, Bobby Evans wrote:
> > Have you thought about contributing this back to storm itself?  From
> > what I have read and a quick pass through the code, it looks like, from
> > a user perspective, you replace one builder with another.  From a code
> > perspective, it looks like you replace the fields grouping with one
> > that understands the batching semantics, and wrap the bolts/spouts with
> > batch/unbatch logic.  This feels like something that could easily fit
> > into storm with minor modification and give users more control over
> > latency vs. throughput in their topologies.  Making it an official part
> > of storm, too, would allow us to update the metrics system to
> > understand the batching and display results on a per-tuple basis
> > instead of on a per-batch basis.
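The wrapping described above can be sketched roughly as follows. This is a minimal, Storm-free illustration of the batch/unbatch idea only; all names here (`TupleBatcher`, `flushSize`, `unbatch`) are made up for illustration and are not the actual Aeolus or Storm API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the batch/unbatch logic (not Aeolus code).
public class TupleBatcher {
    private final int flushSize;
    private List<Object> current = new ArrayList<>();

    public TupleBatcher(int flushSize) {
        this.flushSize = flushSize;
    }

    // Sender side: buffer a tuple; once the batch is full, hand it back
    // so the caller can emit it downstream as a single "fat" tuple.
    public List<Object> add(Object tuple) {
        current.add(tuple);
        if (current.size() >= flushSize) {
            List<Object> full = current;
            current = new ArrayList<>();
            return full;
        }
        return null; // still buffering
    }

    // Receiver side: unwrap a batch into individual tuples so the user
    // bolt keeps its tuple-by-tuple view of the stream.
    public static List<Object> unbatch(List<Object> batch) {
        return new ArrayList<>(batch);
    }
}
```

A batching-aware fields grouping would additionally have to keep one open batch per downstream task, so that all tuples hashed to the same task end up in the same batch.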
> >  - Bobby
> >
> >
> >
> >      On Thursday, May 28, 2015 5:54 AM, Matthias J. Sax <
> [email protected]> wrote:
> >
> >
> >  Hi Manu,
> >
> > please find a simple benchmark evaluation on Storm 0.9.3 using the
> > following links (it's too much content to attach to this email).
> >
> > https://www2.informatik.hu-berlin.de/~saxmatti/storm-aeolus-benchmark/batchingBenchmark-spout-batching-0.pdf
> >
> > The file shows the results for batch sizes 0 to 4. You can replace the
> > trailing "0" with values up to 16 to get the results for higher batch
> > sizes.
> >
> > What you can basically observe is that the maximum achieved data rate
> > in the non-batching case is about 250,000 tuples per second (tps),
> > while a batch size of about 30 increases it to 2,000,000 tps (with
> > high fluctuation that decreases with even higher batch sizes).
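That order-of-magnitude jump is consistent with a simple amortization model: if every emitted message carries a fixed overhead (serialization, framing, ack bookkeeping) and every tuple a fixed processing cost, batching spreads the per-message overhead over the whole batch. The sketch below is a back-of-envelope model with assumed cost constants, not measured Aeolus numbers:

```java
// Back-of-envelope amortization model; the cost constants used by the
// caller are assumptions for illustration, NOT benchmark measurements.
public class BatchingModel {
    // Modeled throughput in tuples per second, given a per-tuple
    // processing cost p, a fixed per-message overhead h (both in
    // microseconds), and a batch size b. Batching b tuples into one
    // message amortizes h over b tuples.
    static double throughputTps(double pMicros, double hMicros, int batchSize) {
        double microsPerTuple = pMicros + hMicros / batchSize;
        return 1_000_000.0 / microsPerTuple;
    }
}
```

With assumed costs p = 0.4µs and h = 3.6µs, batch size 1 gives 250,000 tps while batch size 30 gives roughly 1.9 million tps, the same order of improvement the benchmark shows.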
> >
> > The benchmark uses a single spout (dop=1) and a single bolt (dop=1)
> > and measures the output/input rate (in tps) as well as the network
> > traffic (in KB/s) for different batch sizes.
> >
> > The spout emits simple single-attribute tuples (type Integer) and is
> > configured to emit at a dedicated (stable) output rate. We did
> > multiple runs in the benchmark, combining different output rates (from
> > 200,000 tps to 2,000,000 tps in steps of 200,000) with different batch
> > sizes (from 1 to 80).
> >
> > Each run used a differently configured spout output rate and
> > consists of 4 plots showing the measured network traffic and
> > output/input rate for spout and bolt. The plots might be hard to read
> > (they were designed for ourselves only, not for publication). If you
> > have questions about them, please let me know.
> >
> > We ran the experiment on our local cluster. Each node has two 2GHz
> > Xeon E5-2620 CPUs with 6 cores and 24GB of main memory. The nodes are
> > connected via 1Gbit Ethernet (10Gbit switch).
> >
> > The code and scripts for running the benchmark are on GitHub, too;
> > please refer to the Maven module "monitoring". So you should be able
> > to run the benchmark on your own hardware.
> >
> > -Matthias
> >
> >
> >
> > On 05/28/2015 08:44 AM, Manu Zhang wrote:
> >> Hi Matthias,
> >>
> >> The project looks interesting. Do you have any detailed performance
> >> data compared with the latest Storm versions (0.9.3 / 0.9.4)?
> >>
> >> Thanks,
> >> Manu Zhang
> >>
> >> On Tue, May 26, 2015 at 11:52 PM, Matthias J. Sax <
> >> [email protected]> wrote:
> >>
> >>> Dear Storm community,
> >>>
> >>> we would like to share our project Aeolus with you. While the
> >>> project is not finished, our first component --- a transparent
> >>> batching layer --- is available now.
> >>>
> >>> Aeolus' batching component is a transparent layer that can increase
> >>> Storm's throughput by an order of magnitude while keeping
> >>> tuple-by-tuple processing semantics. Batching happens transparently
> >>> to the system and the user code; thus, it can be used without
> >>> changing existing code.
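The "without changing existing code" point can be illustrated with a small sketch: the user's per-tuple handler stays untouched, and the layer wraps it so it can consume whole batches. The names below are again purely illustrative, not the Aeolus API:

```java
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch (not the Aeolus API): wrap an unchanged per-tuple
// handler so it can consume a whole batch; the user code still sees
// exactly one tuple at a time.
public class TransparentUnbatch {
    static <T> Consumer<List<T>> unbatching(Consumer<T> perTupleHandler) {
        return batch -> batch.forEach(perTupleHandler);
    }
}
```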
> >>>
> >>> Aeolus is available under the Apache License 2.0, and we would be
> >>> happy about any feedback. If you would like to try it out, you can
> >>> download Aeolus from our git repository:
> >>>         https://github.com/mjsax/aeolus
> >>>
> >>>
> >>> Happy hacking,
> >>>   Matthias
> >>>
> >>>
> >>
> >
> >
> >
> >
>
>
