Regardless of what IBM does, there is clearly a lot that Storm is not doing
for performance. This is largely by design, for simplicity. For instance,
Storm could use reflection and bytecode engineering to merge bolts while
still allowing rearrangement and rebalancing. Likewise, when throughput
gets high enough that some batching is worthwhile, technologies like
empirical schema, column orientation, bit weaving and zero-serialization
formats[1,2] could all be used to move records.  Together, these would give
several orders of magnitude speedup. The fact that these have not yet been
worth looking at for Storm users indicates that the virtues and challenges
of Storm probably live elsewhere.

The right thing to do is stick to Storm's knitting and make it the best
that it can be within its own self-concept.  There is no need to ignore
other offerings, but consideration of alternatives should be limited to
finding ways to improve Storm.

As I see it, the thing driving Storm adoption is reasonably flexible
parallel execution of streams with a very simple programming and fail-over
model.  Performance is not a major issue for most users, since a million
tuples per second is more than most people need and is pretty easy to
achieve with Storm.

[1] See Apache Drill for empirical schema, on-the-fly code generation and
a columnar zero-serialization format

[2] See h2o for magical compression of columnar data with resulting monster
speed




On Mon, May 12, 2014 at 12:09 PM, Klausen Schaefersinho <
[email protected]> wrote:

> Hi,
>
> my guess is that 40k are per CPU or so... for sure not for an entire
> cluster.
>
>
> On Mon, May 12, 2014 at 4:46 PM, Marc Vaillant <[email protected]> wrote:
>
>> To play devil's advocate, if you believe the stream performance gains,
>> then the 40k will likely pay for itself in needing to deploy a fraction
>> of the resources for the same throughput.
>>
>> On Mon, May 12, 2014 at 09:02:53AM -0400, John Welcher wrote:
>> > Hi
>> >
>> > Streams also cost 40,000 US while Storm is free.
>> >
>> > John
>> >
>> >
>> > On Mon, May 12, 2014 at 3:49 AM, Klausen Schaefersinho <
>> > [email protected]> wrote:
>> >
>> >     Hi,
>> >
>> >     I found some interesting comparison of IBM Stream and Storm:
>> >
>> >     https://www.ibmdw.net/streamsdev/2014/04/22/streams-apache-storm/
>> >
>> >     It also includes an interesting comparison between ZeroMQ and the
>> >     Netty Performance.
>> >
>> >
>> >     Cheers,
>> >
>> >     Klaus
>> >
>> >
>>
>
>
