Why is it bad to keep some state per command session? It's easy to know when the state can be discarded -- as soon as we're done reading the file or when the socket closes.
I think it's fairly intuitive and readable; it's routine for e.g., scripts to modify interpreter state. In straight-line code, which is all we're ever going to have in the control protocol, this is very easy to reason about. --Ari On Tue, Sep 21, 2010 at 11:17 AM, Bill Graham <[email protected]> wrote: > +1 on staying stateless. > > I think the challenge we're facing is that we're trying to support a > syntax that is simple and readable and can be done with a single line > (i.e. for the initial_adaptors file, the telnet API, the command line, > etc), but the configs can potentially be not-so-simple. > > For example, here's how you might configure the JMS adaptor which used > dependency injection. That's a lot for a single line and there's > nowhere to add new global configs in front of the adaptor specific > configs without breaking things. > > add jms.JMSAdaptor jms-events > failover:(tcp://jms-host.foo.com:61616,tcp://jms-host.foo.com:61616) > -q some.queue.name -s "id_type IN ('162')" -x > org.apache.hadoop.chukwa.datacollection.adaptor. > jms.JMSMessagePropertyTransformer -p > "event_time,id_type,id,srcurl,xref,xrq,title -r event_time,id_type,id" > 0 > > What if we were to adopt a few flags into the syntax: > > add [name =] <adaptor_class_name> <datatype> [--tags <tags>] > [--adaptor-params <adaptor specific params>|--adaptor-config-file > <file>] > <initial offset> > > The '--*' flags could be reserved. This would allow us to keep with a > one-line syntax where that approach works, but allow for expansion. > Also, if an adaptor config got to complex, those configs could be > specified in a file if needed. > > > On Mon, Sep 20, 2010 at 9:52 PM, Jerome Boulon <[email protected]> wrote: >> Hi, >> If I had to implement this, I will add an extra parameter >> (?extraParams=xyz). >> The adaptorImp will be the only one responsible for parsing this adaptor’s >> specific info. >> I don’t think that we could/should add new complexity in the parsing. >> The same think should be done for getCurrentStatus(), a public result, that >> is the same for all adaptors in order to know if the adaptors is working or >> not and a private section that will give extra information. >> >> Also, moving to a json input should simplify everything. >> /Jerome. >> >> On 9/20/10 5:15 PM, "Bill Graham" <[email protected]> wrote: >> >> I'd like to hear Ari's take on this, but this does feel a bit hacky to >> me. Plus, it would put the responsibility of parsing tags on each >> adaptor impl and would require a refactor of how each one currently >> parses args. >> >> Actually, we might be able to intercept the call to parseArgs in >> AbstractAdaptor and pull out the tags if they exist and pass the rest >> to the subclass, which would be none the wiser. Not the cleanest, but >> at lease not as intrusive on the adaptor implementations. >> >> Ari, also what about the getCurrentStatus() method? I'd think all the >> impls would somehow need to incorporate tags into that response as >> well, since AFAIR that's what's used to do Adaptor SerDe with the >> checkpoints file. >> >> >> On Mon, Sep 20, 2010 at 3:57 PM, Eric Yang <[email protected]> wrote: >>> Hi Bill, >>> >>> This might be hacky but it should be possible to have adaptor specific >>> params to include tags. Ari, what do you think? >>> >>> Regards, >>> Eric >>> >>> On 9/20/10 2:58 PM, "Bill Graham" <[email protected]> wrote: >>> >>> Hi, >>> >>> In CHUKWA-515 we discussed the possibility being able to add an >>> adaptor bound to a given cluster: >>> >>> >>> https://issues.apache.org/jira/browse/CHUKWA-515?focusedCommentId=12905811&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12905811 >>> >>> I can actually see this being useful, especially now that it's easier >>> to add/remove agents with the Adaptor REST API. Looking into the code >>> it doesn't seem like it would be that hard to do, but I want to make >>> sure I'm not overlooking anything. >>> >>> It seems like we could support this with a few small changes: >>> >>> - Add the concept of tags to the Adaptor interface. >>> - AbstractAdator would support a getTags method which would return the >>> union of tags set on the Adaptor and the default tags on the >>> DataFactory. >>> - Internal tag implementations on each would change to store tags in >>> maps, instead of concat'ed strings. This would allow for a "last in >>> wins" type of functionality so tags could be overriden. This assumes >>> of course that there should never be more than one of the same tag key >>> value, which I _assume_ is the case. >>> - The ChunkImpl constructor will call getTags on the agent, instead of >>> getDefaultTags the data factory. >>> >>> The trickiest part as I see it is figuring out how to change the add >>> adaptor string syntax in ChukwaAgetn.processAddCommandE in a way that >>> both makes sense and doesn't break things. In it's current form it >>> doesn't have room for easy expansion except at the end of the line: >>> >>> add [name =] <adaptor_class_name> <datatype> <adaptor specific params> >>> <initial offset> >>> >>> Any thoughts or suggestions? There's also a potential gotcha with all >>> the impls of Adaptor.parseArgs either breaking or needing to >>> change.... >>> >>> thanks, >>> Bill >>> >>> >> >> >> > -- Ari Rabkin [email protected] UC Berkeley Computer Science Department
