@bill As far as I know, people are still doing the same thing with Summingbird. And absolutely `Yes`, we should provide some mechanism for specifying resources for operators.
@sanjeev Thanks for the clarification. Based on my experience with Twitter topologies, users want to specify resources even for simple operators when the topology is really critical to them. So I personally prefer a sufficiently generic way for users to specify resources for any operator.

On Thu, Sep 21, 2017 at 16:05 Sanjeev Kulkarni <[email protected]> wrote:

> Neng,
> https://github.com/twitter/heron/pull/2334 provides this abstraction.
> The issue, however, is as follows. In the Spout/Bolt world, every component
> is explicitly named by the topology writer, and thus all resources can be
> specified on a per-component basis. In the DSL world, however, a) the
> operators themselves don't have names and b) optimizations can squish
> several operators into a single physical operator. One possibility would be
> to optionally add a name to the operator (like map(mapfn, name)), but that
> seems too cumbersome/kludgy.
>
> On Thu, Sep 21, 2017 at 3:57 PM, Neng Lu <[email protected]> wrote:
>
> > Just adding some thoughts here: for ordinary Heron topologies, the
> > definition of a Heron job and the request for resources for each
> > component are separated: `TopologyBuilder` for job definition, `Config`
> > for resource requirements.
> >
> > In the DSL case, if we could do something similar that separates DSL job
> > creation from the resource request, it would be really good. With this
> > separation, people have the flexibility of providing different configs
> > for the same job.
> >
> > On Wed, Sep 20, 2017 at 1:48 PM, Sanjeev Kulkarni <[email protected]> wrote:
> >
> > > Hi folks,
> > > One of the great features of the lower-level spout/bolt interface in
> > > Heron is the ability to specify the resources needed on a per-component
> > > basis. This feature is very helpful for tuning large topologies and is
> > > heavily used inside Twitter.
> > > Currently the DSL does not have this flexibility. I wanted to get
> > > opinions about how we can add this.
> > > There are probably several ways to do it. I'm listing a few approaches
> > > that have come to my mind. Please feel free to add more.
> > >
> > > 1) Currently some of our operators are simple (like the flatMap, map,
> > > and filter operators), while others are a little more complicated (like
> > > transform, where users can perform setup/cleanup). We can take the
> > > approach of adding the ability to specify resources only for complex
> > > operators. Thus transform could have two variants: the current one,
> > > which just takes a transform function, and another that takes a
> > > resource parameter as well. The other operators (map/flatMap/filter,
> > > etc.) would remain the same. The advantage is that the interface
> > > explosion is minimal and controlled. The con is that if you need to
> > > control the resources of a particular operator, you are forced to use
> > > transform.
> > >
> > > 2) Another approach would be to add a variant that takes a Resource
> > > parameter to all operators. The pro is that this gives fine-grained
> > > control over all operators. The con is the interface blow-up.
> > >
> > > Thoughts?
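
To make the two options from the original proposal concrete, here is a rough Java sketch of what the overloads might look like. Everything in it is hypothetical: `Resources`, `Streamlet`, `source()`, and the default values are illustrative stand-ins and not the actual Heron DSL API. The operators only record their name and requested resources; no data is processed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch only; none of these types are the real Heron DSL API.
public class ResourceSketch {

  // Hypothetical per-operator resource request (CPU cores and RAM in MB).
  static final class Resources {
    static final Resources DEFAULT = new Resources(1.0, 512);
    final double cpu;
    final long ramMb;

    Resources(double cpu, long ramMb) {
      this.cpu = cpu;
      this.ramMb = ramMb;
    }
  }

  // Records each operator in the chain with the resources requested for it.
  static final class Streamlet<T> {
    final List<String> ops;
    final List<Resources> resources;

    private Streamlet(List<String> ops, List<Resources> resources) {
      this.ops = ops;
      this.resources = resources;
    }

    static <T> Streamlet<T> source() {
      return new Streamlet<>(new ArrayList<>(), new ArrayList<>());
    }

    // Returns a new streamlet with one more operator appended to the chain.
    private <R> Streamlet<R> append(String op, Resources r) {
      List<String> newOps = new ArrayList<>(ops);
      newOps.add(op);
      List<Resources> newRes = new ArrayList<>(resources);
      newRes.add(r);
      return new Streamlet<>(newOps, newRes);
    }

    // Existing-style operators: no way to request resources.
    <R> Streamlet<R> map(Function<T, R> fn) {
      return append("map", Resources.DEFAULT);
    }

    Streamlet<T> filter(Predicate<T> p) {
      return append("filter", Resources.DEFAULT);
    }

    <R> Streamlet<R> transform(Function<T, R> fn) {
      return append("transform", Resources.DEFAULT);
    }

    // Approach 1: only the complex operator gets a resource-aware variant.
    <R> Streamlet<R> transform(Function<T, R> fn, Resources r) {
      return append("transform", r);
    }

    // Approach 2: every operator gets a resource-aware variant as well.
    <R> Streamlet<R> map(Function<T, R> fn, Resources r) {
      return append("map", r);
    }

    Streamlet<T> filter(Predicate<T> p, Resources r) {
      return append("filter", r);
    }
  }

  public static void main(String[] args) {
    Resources big = new Resources(2.0, 2048);

    // Approach 1: resources can only be attached via transform.
    Streamlet<Integer> one = Streamlet.<String>source()
        .map(s -> s.length())
        .transform(n -> n * 2, big);

    // Approach 2: resources can be attached to any operator.
    Streamlet<Integer> two = Streamlet.<String>source()
        .map(s -> s.length(), big)
        .filter(n -> n > 3, big);

    for (int i = 0; i < one.ops.size(); i++) {
      System.out.println("approach 1: " + one.ops.get(i) + " ramMb=" + one.resources.get(i).ramMb);
    }
    for (int i = 0; i < two.ops.size(); i++) {
      System.out.println("approach 2: " + two.ops.get(i) + " ramMb=" + two.resources.get(i).ramMb);
    }
  }
}
```

The sketch shows the trade-off discussed in the thread: approach 1 adds a single overload, while approach 2 doubles the number of methods for every operator. It does not address the naming or operator-fusion concerns raised in the replies.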
