Here is a snippet for the re-architecture. Keeping it short here, as follow-up blogs will have more details:
1) New High-Performance Core: Storm 2.0 introduces a new core designed to push the boundaries of throughput, latency, and energy consumption while maintaining backward compatibility. It features a leaner threading model, a blazing-fast messaging subsystem, and a lightweight back-pressure model. The new engine was motivated by the observation that existing hardware remains capable of performing much better than what the best streaming engines deliver. Storm 2.0 is the first streaming engine capable of breaking the 1 microsecond latency barrier for transfers between two operators. It can sustain very high throughputs while also delivering better energy efficiency. Details on the new architecture and its performance will be covered in upcoming blogs.

On Wednesday, January 23, 2019, 10:01:56 AM PST, Stig Rohde Døssing <stigdoess...@gmail.com> wrote:

We should probably also highlight in the release notes that Java 8 is now the minimum. Here is the blurb for storm-kafka-client:

# Kafka integration changes

## Removal of storm-kafka

The most significant change to Storm's Kafka integration since 1.x is that storm-kafka has been removed. The module was deprecated a while back, due to Kafka's deprecation of the underlying client library. Users will have to move to the storm-kafka-client module, which uses Kafka's `kafka-clients` library for integration.

For the most part, the migration to storm-kafka-client is straightforward. The documentation for storm-kafka-client contains a helpful mapping between the old and new spout configurations.

If you are using any of the storm-kafka spouts, you will need to migrate offset checkpoints to the new spout, to avoid the new spout starting from scratch on your partitions. You can find a helper tool to do this at https://github.com/apache/storm/tree/master/external/storm-kafka-migration. You should stop your topology, run the migration tool, then redeploy your topology with the storm-kafka-client spout.
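To make the migration concrete, here is a minimal sketch of wiring up the new storm-kafka-client spout. The broker address, topic name, and component ids are placeholders, and the snippet assumes the 2.0 `KafkaSpoutConfig` builder API; treat it as an illustration under those assumptions, not a drop-in replacement for any particular storm-kafka setup.

```java
import org.apache.storm.kafka.spout.KafkaSpout;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.topology.TopologyBuilder;

public class KafkaSpoutMigrationSketch {
    public static void main(String[] args) {
        // Hypothetical broker address and topic name, for illustration only.
        KafkaSpoutConfig<String, String> spoutConfig =
            KafkaSpoutConfig.builder("kafka-broker:9092", "my-topic").build();

        TopologyBuilder builder = new TopologyBuilder();
        // Replace the old storm-kafka spout with the kafka-clients-based KafkaSpout.
        builder.setSpout("kafka-spout", new KafkaSpout<>(spoutConfig), 1);
        // ... attach bolts and submit the topology as usual ...
    }
}
```

Remember that this only replaces the spout wiring; the offset checkpoints still need to be migrated with the storm-kafka-migration tool before redeploying, as described above.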
## Move to using the KafkaConsumer.assign API

storm-kafka-client in 1.x allowed you to use Kafka's own mechanism to manage which spout tasks were responsible for which partitions. This mechanism was a poor fit for Storm and was deprecated in 1.2.0. It has been removed entirely in 2.0 (https://issues.apache.org/jira/browse/STORM-2542).

The storm-kafka-client Subscription interface has also been removed. It offered too limited control over the subscription behavior, and has been replaced with the TopicFilter and ManualPartitioner interfaces. Unless you were using a custom Subscription implementation, this will likely not affect you. If you were using a custom Subscription, the storm-kafka-client documentation describes how to customize assignment: https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md#manual-partition-assigment-advanced

## Other highlights

* The KafkaBolt now allows you to specify a callback that will be called when a batch is written to Kafka (https://issues.apache.org/jira/browse/STORM-3175).
* The FirstPollOffsetStrategy behavior has been made consistent between the non-Trident and Trident spouts. It is now always the case that EARLIEST/LATEST only take effect on topology redeploy, and not when a worker restarts (https://issues.apache.org/jira/browse/STORM-2990).
* storm-kafka-client now has a transactional non-opaque Trident spout (https://issues.apache.org/jira/browse/STORM-2974).
* There is a new examples module for storm-kafka-client at https://github.com/apache/storm/tree/master/examples/storm-kafka-client-examples
* Deprecated methods in KafkaSpoutConfig have been removed. If you are using one of the deprecated methods, check the Javadoc for the latest 1.2.x release, which describes the replacement for each method.

On Wed, Jan 23, 2019 at 15:54, P. Taylor Goetz <ptgo...@gmail.com> wrote:

> If you need to format (e.g. code examples, etc.) then markdown is fine. So is plain text.
>
> -Taylor
>
> > On Jan 23, 2019, at 5:24 AM, Stig Rohde Døssing <stigdoess...@gmail.com> wrote:
> >
> > I'll write something for 5 - Kafka related changes.
> >
> > We have dropped Druid support, so 13 should be only Kinesis.
> >
> > Which format should the blurbs be written in? (Markdown?)
> >
> > On Wed, Jan 23, 2019 at 08:57, Roshan Naik <roshan_n...@yahoo.com.invalid> wrote:
> >
> >> I like Taylor's suggestion of collectively contributing small blurbs on features for the release announcement. That's the first thing people look at on hearing a release announcement. The JIRA list is not very understandable at first glance.
> >>
> >> Based on suggestions so far and a quick scan of the JIRAs in the release notes, here is a draft list in no particular order. I am sure I am missing a few important ones. This can be pruned or modified as needed:
> >>
> >> 1- Re-architecture - [Roshan]
> >> 2- Windowing enhancements
> >> 3- SQL enhancements
> >> 4- Metrics
> >> 5- Kafka related changes
> >> 6- Security (nimbus admin groups, delegation tokens, optional impersonation)
> >> 7- PMML (Machine Learning) support.
> >> 8- Streams API
> >> 9- Module restructuring & dependency mitigation
> >> 10- Java porting
> >> 11- DRPC cmd line
> >> 12- Lambda support
> >> 13- New spouts: Kinesis & Druid?
> >> 14- Changes to deployment and CLI submission
> >> 15- RAS changes
> >> 16- Trident enhancements
> >> 17- New admin cmds to debug cluster state
> >> 18- ... others?
> >>
> >> Please pick the topics you can contribute blurbs for. I have put my name against one. It will help Taylor aggregate them and do the necessary final edits.
> >>
> >> -Roshan
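As a supplement to the FirstPollOffsetStrategy item in the Kafka blurb above, here is a hedged sketch of pinning the strategy explicitly. It assumes the Storm 2.0 storm-kafka-client API, where (to my understanding) FirstPollOffsetStrategy is a top-level enum in `org.apache.storm.kafka.spout` rather than nested in KafkaSpoutConfig as in 1.x; the broker address and topic name are placeholders.

```java
import org.apache.storm.kafka.spout.FirstPollOffsetStrategy;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;

public class OffsetStrategySketch {
    public static void main(String[] args) {
        // Hypothetical broker/topic names, for illustration only.
        KafkaSpoutConfig<String, String> config =
            KafkaSpoutConfig.builder("kafka-broker:9092", "my-topic")
                // With STORM-2990, EARLIEST takes effect only on topology
                // redeploy; a plain worker restart resumes from the
                // committed offsets instead of rewinding.
                .setFirstPollOffsetStrategy(FirstPollOffsetStrategy.EARLIEST)
                .build();
    }
}
```

The same setter applies to the Trident spout configuration, which is the point of the consistency fix: both spout flavors now interpret EARLIEST/LATEST the same way.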