I can understand why some would not want to mix the two APIs, since each stands for a different concept. In my own experience, though, I have found the Streamlet API limiting in some cases. For example, I couldn't find a way to implement a specific grouping between Streamlets when I wanted fine-grained control over which data was sent to which instance of a Streamlet (of course, hiding this is probably part of the abstraction). I like the low-level control you get with the spout and bolt implementations, and I think it would be nice to have the flexibility to take that fine-grained control when you need it while using the Streamlet API.

On Wed, Sep 19, 2018 at 12:22 PM Ning Wang <[email protected]> wrote:

> Hi, all,
>
> We had a discussion in this PR, but I feel it would be good to gather
> more thoughts from other devs/users as well.
>
> https://github.com/apache/incubator-heron/pull/3029#pullrequestreview-156614156
>
> During Twitter's internal onboarding of the Streamlet API, I started to
> consider supporting low-level Bolts and Spouts in the Streamlet API. I
> totally understand the concerns that Neng and Jerry raised in the PR: that
> the Streamlet API is not pure with Bolt/Spout support because it would
> expose low-level things. However, I still feel that the advantages of this
> support far outweigh the disadvantages. The following are my comments in
> the PR:
>
> ========
>
> Here are my thoughts:
>
> Streamlet is not really the abstraction. My feeling is that Streamlet is
> good at the DAG layer but not flexible enough at the low level (operators).
> I would think of it like Scala vs. Java (not the same, just a rough idea).
> Scala has a nice functional API, but it would be pretty useless in real
> life if procedural code were not allowed/supported.
>
> Two reasons:
>
> 1. Migration is one major reason. There are quite a few existing
>    topologies written in the low-level API (for Heron and Storm). Streamlet
>    is only friendly to new users; existing code such as KafkaSpout in Storm
>    (it is a spout, but the same issue applies) and some ML bolts has to be
>    rewritten to gain the readability/maintainability advantages.
> 2. Bolts/Spouts are more flexible. They can do a lot more than a function
>    provided through the Streamlet API (initialization, config, checkpoint,
>    etc.). For example, stateful processing and component configs are not
>    currently supported by Streamlet, and if we add those features, users
>    will likely have to provide extra functions as parameters, so the
>    Streamlet API would become more and more complicated. The Streamlet API
>    will evolve, but supporting Bolt/Spout could give us a lot of room to
>    design a clean API.
>
> ========
>
