You can see more here: https://github.com/apache/storm/tree/master/external
On Wed, Jun 3, 2015 at 4:43 PM, Matthias J. Sax <[email protected]> wrote:

> Thanks for the input.
>
> Currently, everything is written in Java (I am not familiar with Clojure
> -- maybe a good way to get started with it, though ;)). Nathan just
> mentioned that the code could be included as "external" modules. Thus, it
> might be easiest to put it there. What are those external modules Nathan
> is referring to?
>
> I am just wondering how deep the integration into the system should be.
> If a deeper integration is the better solution, we should follow this
> path.
>
> You are the experts. What is the better solution?
>
> -Matthias
>
>
> On 06/03/2015 09:19 PM, Bobby Evans wrote:
> > Sorry I didn't respond sooner, things are rather busy :). You should be
> > able to file the JIRA yourself if you want to; it is open to anyone.
> > Storm has not documented the code base very well. The core part of
> > Storm is in the storm-core sub-project. It contains both Java and
> > Clojure code, and the Clojure code is where most everything happens.
> > The daemons are located under storm-core/src/clj/backtype/storm/daemon;
> > worker.clj and executor.clj are probably the places where you would
> > want to update metrics and routing. The code that creates the topology
> > is in Java.
> >
> > - Bobby
> >
> >
> > On Thursday, May 28, 2015 9:46 AM, Matthias J. Sax <[email protected]> wrote:
> >
> > Hi Bobby,
> >
> > I never thought about it. But if the community is interested in it, I
> > would be happy to contribute it. :)
> >
> > However, I am not super familiar with the actual structure of Storm's
> > code base, and I would need some pointers to integrate it into the
> > system correctly and nicely.
> >
> > I claim to understand the internals of Storm quite well; however, so
> > far I have had more of a user perspective on the system.
> >
> > If I should work on it, it might be a good idea to open a JIRA, assign
> > it to me, and take it from there?
> >
> > -Matthias
> >
> >
> > On 05/28/2015 03:20 PM, Bobby Evans wrote:
> >> Have you thought about contributing this back to Storm itself? From
> >> what I have read and a quick pass through the code, it looks like,
> >> from a user perspective, you replace one builder with another. From a
> >> code perspective, it looks like you replace the fields grouping with
> >> one that understands the batching semantics, and wrap the bolts/spouts
> >> with batch/unbatch logic. This feels like something that could easily
> >> fit into Storm with minor modification and give users more control
> >> over latency vs. throughput in their topologies. Making it an official
> >> part of Storm would also allow us to update the metrics system to
> >> understand the batching and display results on a per-tuple basis
> >> instead of on a per-batch basis.
> >>
> >> - Bobby
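To make the batch/unbatch wrapping described above concrete, here is a minimal Java sketch against the Storm 0.9.x API. The class name, constructor, and batching policy are invented for illustration and are not Aeolus' actual code; a matching unbatch wrapper on the receiving side would be needed to restore the per-tuple view the user code sees.

    // Hypothetical sketch -- not Aeolus' actual code. A wrapper bolt that
    // buffers incoming tuple values and forwards them downstream as a
    // single "batch" tuple once the configured batch size is reached.
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    public class BatchingBolt extends BaseRichBolt {
        private final int batchSize;
        private transient OutputCollector collector;
        private transient List<Object> buffer;

        public BatchingBolt(int batchSize) {
            this.batchSize = batchSize;
        }

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.buffer = new ArrayList<Object>(batchSize);
        }

        @Override
        public void execute(Tuple input) {
            buffer.add(input.getValue(0));
            if (buffer.size() >= batchSize) {
                // One emitted tuple carries the whole batch; an unbatch
                // wrapper on the receiving side would restore the
                // tuple-by-tuple semantics for the user code.
                collector.emit(new Values(new ArrayList<Object>(buffer)));
                buffer.clear();
            }
            collector.ack(input);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("batch"));
        }
    }

A size-only flush like this trades latency for throughput; a real implementation would presumably also flush on a timer so that slow streams do not stall.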
> >>
> >> On Thursday, May 28, 2015 5:54 AM, Matthias J. Sax <[email protected]> wrote:
> >>
> >> Hi Manu,
> >>
> >> please find a simple benchmark evaluation on Storm 0.9.3 using the
> >> following link (it is too much content to attach to this email):
> >>
> >> https://www2.informatik.hu-berlin.de/~saxmatti/storm-aeolus-benchmark/batchingBenchmark-spout-batching-0.pdf
> >>
> >> The file shows the results for batch sizes 0 to 4. You can replace the
> >> last "0" by values up to 16 to get results for higher batch sizes.
> >>
> >> What you can basically observe is that the maximum achieved data rate
> >> in the non-batching case is about 250,000 tuples per second (tps),
> >> while a batch size of about 30 increases it to 2,000,000 tps (with
> >> high fluctuation, which decreases with even higher batch sizes).
> >>
> >> The benchmark uses a single spout (dop=1) and a single bolt (dop=1)
> >> and measures the output/input rate (in tps) as well as the network
> >> traffic (in KB/s) for different batch sizes.
> >>
> >> The spout emits simple single-attribute tuples (type Integer) and is
> >> configured to emit at a dedicated (stable) output rate. We did
> >> multiple runs, combining different output rates (from 200,000 tps to
> >> 2,000,000 tps in steps of 200,000) with different batch sizes (from 1
> >> to 80).
> >>
> >> Each run used a different configured spout output rate and consists of
> >> 4 plots showing the measured network traffic and the output/input rate
> >> for spout and bolt. The plots might be hard to read (they were
> >> designed for ourselves only, not for publishing). If you have
> >> questions about them, please let me know.
> >>
> >> We ran the experiment on our local cluster. Each node has two 2GHz
> >> Xeon E5-2620 CPUs (6 cores each) and 24GB main memory. The nodes are
> >> connected via 1Gbit Ethernet (10Gbit switch).
> >>
> >> The code and scripts for running the benchmark are on GitHub, too.
> >> Please refer to the Maven module "monitoring"; you should be able to
> >> run the benchmark on your own hardware.
> >>
> >> -Matthias
> >>
> >>
> >> On 05/28/2015 08:44 AM, Manu Zhang wrote:
> >>> Hi Matthias,
> >>>
> >>> The project looks interesting. Do you have any detailed performance
> >>> data compared with the latest Storm versions (0.9.3 / 0.9.4)?
> >>>
> >>> Thanks,
> >>> Manu Zhang
> >>>
> >>> On Tue, May 26, 2015 at 11:52 PM, Matthias J. Sax <[email protected]> wrote:
> >>>
> >>>> Dear Storm community,
> >>>>
> >>>> we would like to share our project Aeolus with you. While the
> >>>> project is not finished, our first component --- a transparent
> >>>> batching layer --- is available now.
> >>>>
> >>>> Aeolus' batching component is a transparent layer that can increase
> >>>> Storm's throughput by an order of magnitude while keeping
> >>>> tuple-by-tuple processing semantics. Batching happens transparently
> >>>> to the system and the user code; thus, it can be used without
> >>>> changing existing code.
> >>>>
> >>>> Aeolus is available under the Apache License 2.0, and we would be
> >>>> happy about any feedback. If you would like to try it out, you can
> >>>> download Aeolus from our git repository:
> >>>> https://github.com/mjsax/aeolus
> >>>>
> >>>> Happy hacking,
> >>>> Matthias
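As a rough illustration of the benchmark setup from the email above (a dop=1 spout emitting single-Integer tuples at a configured stable rate), a rate-limited spout might look as follows. The class name and the rate-limiting logic are assumptions; the actual benchmark code is in the "monitoring" Maven module of the Aeolus repository.

    // Hypothetical sketch of the benchmark's data source: a spout that
    // emits single-Integer tuples at a fixed target rate. The real code
    // lives in the "monitoring" module of the Aeolus repository.
    import java.util.Map;

    import backtype.storm.spout.SpoutOutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichSpout;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Values;

    public class FixedRateSpout extends BaseRichSpout {
        private final long tuplesPerSecond;
        private transient SpoutOutputCollector collector;
        private long emitted;
        private long startNanos;

        public FixedRateSpout(long tuplesPerSecond) {
            this.tuplesPerSecond = tuplesPerSecond;
        }

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
            this.startNanos = System.nanoTime();
        }

        @Override
        public void nextTuple() {
            // Emit only while we are below the target rate for the time
            // elapsed so far; otherwise do nothing and let Storm call us
            // again, which keeps the output rate stable on average.
            double elapsedSeconds = (System.nanoTime() - startNanos) / 1e9;
            if (emitted < (long) (elapsedSeconds * tuplesPerSecond)) {
                collector.emit(new Values((int) (emitted % Integer.MAX_VALUE)));
                emitted++;
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("value"));
        }
    }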

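Finally, a minimal wiring sketch combining the two hypothetical classes above, using batch size 30 (the sweet spot reported in the benchmark email) and a local cluster. Wired by hand like this, batching is visible to the topology author; the point of Aeolus' transparent layer is that its builder performs the equivalent wrapping behind the scenes, so existing topology code stays unchanged.

    // Wiring sketch using the two hypothetical classes above: a dop=1
    // spout at 200,000 tps feeding a dop=1 batching bolt with batch size
    // 30. The measurement sink of the real benchmark is omitted.
    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.topology.TopologyBuilder;

    public class BatchingTopology {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("source", new FixedRateSpout(200000L), 1);
            builder.setBolt("batcher", new BatchingBolt(30), 1)
                   .shuffleGrouping("source");

            Config conf = new Config();
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("batching-sketch", conf, builder.createTopology());
        }
    }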