Re: [DISCUSS] Turning off indexing writers feature discussion

2017-01-14 Thread Kyle Richardson
I'm +1 on the current proposal. I like Nick's syntax and agree with Jon's 
enabled property. I also like the idea of a path property for HDFS.

-Kyle

> On Jan 14, 2017, at 10:51 AM, Casey Stella  wrote:
> 
> I'm +1 on an explicit enabled property and a filter (or when) property. I
> think we are zeroing in on a decent design, so that is good.
> 
> To recap, what I am +1 on is Nick's proposed syntax with the following
> modifications:
> 1. An explicit enabled field
> 2. A default on for unspecified to match current semantics
> 
> Casey
>> On Sat, Jan 14, 2017 at 10:45 zeo...@gmail.com  wrote:
>> 
>> This has the additional benefit of doing something like below when you want
>> to temporarily disable the hdfs writer, but don't want to remove the
>> settings.  This removes the need to store the path and batchSize (and many
>> additional settings) somewhere else so they can be brought back in when you
>> want to re-enable it, which is a nice workflow attribute for the end user:
>> 
>> {
>>   'elasticsearch': {
>>  'enabled': 'true',
>>  'index': 'foo',
>>  'batchSize': 100,
>>},
>>   'hdfs': {
>>  'enabled': 'false',
>>  'path': '/foo/bar/...',
>>  'batchSize': 100,
>>},
>>   'solr': {
>>  'enabled': 'false'
>>}
>> }
>> 
>> Jon
>> 
>>> On Sat, Jan 14, 2017 at 9:24 AM zeo...@gmail.com  wrote:
>>> 
>>> I similarly have a concern there because I prefer being as explicit as
>>> possible, which makes things easier to pick up for new users.  Using my
>>> example from earlier, this could look like specifying when(false), but an
>>> even better and more obvious approach may be to use enabled(false).  So
>> the
>>> current simple default would be:
>>> 
>>> {
>>>   'elasticsearch': { 'enabled': 'true' },
>>>   'hdfs': { 'enabled': 'true' },
>>>   'solr': { 'enabled': 'false' }
>>> }
>>> 
>>> And to use ES with some overrides but not HDFS or solr it would look
>> like:
>>> 
>>> {
>>>   'elasticsearch': {
>>>  'enabled': 'true',
>>>  'index': 'foo',
>>>  'batchSize': 100
>>>},
>>>   'hdfs': {
>>>  'enabled': 'false'
>>>},
>>>   'solr': {
>>>  'enabled': 'false'
>>>}
>>> }
>>> 
>>> Jon
>>> 
>>> On Fri, Jan 13, 2017 at 10:21 PM Casey Stella 
>> wrote:
>>> 
>>> One thing that I thought of that I very strenuously do not like in Nick's
>>> proposal is that if a writer config is not specified then it is turned
>> off
>>> (I think; if I misunderstood let me know). In the situation where we
>> have a
>>> new sensor, right now if there are no index config and no enrichment
>>> config, it still passes through to the index using defaults. In this new
>>> scheme it would not. This changes the default semantics for the system
>> and
>>> I think it changes it for the worse.
>>> 
>>> I would strongly prefer an on-by-default indexing config as we have now.
 On Fri, Jan 13, 2017 at 17:13 Casey Stella  wrote:
 
 One thing that I really like about Nick's suggestion is that it allows
 writer-specific configs in a clear and simple way.  It is more complex
>>> for
 the default case (all writers write to indices named the same thing
>> with
>>> a
 fixed batch size), which I do not like, but maybe it's worth the
>>> compromise
 to make it less complex for the advanced case.
 
 Thanks a lot for the suggestion, Nick, it's interesting;  I'm beginning
>>> to
 lean your way.
 
 On Fri, Jan 13, 2017 at 2:51 PM, zeo...@gmail.com 
 wrote:
 
 I like the suggestions you made, Nick.  The only thing I would add is
>>> that
 it's also nice to see an explicit when(false), as people newer to the
 platform may not know where to expect configs for the different
>> writers.
 Being able to do it either way, which I think is already assumed in
>> your
 model, would make sense.  I would just suggest that, if we support but
>>> are
 disabling a writer, that the platform inserts a default when(false) to
>> be
 explicit.
 
 Jon
 
 On Fri, Jan 13, 2017 at 11:59 AM Casey Stella 
>>> wrote:
 
> Let me noodle on this over the weekend.  Your syntax is looking less
> onerous to me and I like the following statement from Otto: "In the
>>> end,
> each write destination ‘type’ will need its own configuration.  This
>>> is
 an
> extension point."
> 
> I may come around to your way of thinking.
> 
> On Fri, Jan 13, 2017 at 11:57 AM, Otto Fowler <
>> ottobackwa...@gmail.com
 
> wrote:
> 
>> In the end, each write destination ‘type’ will need its own
>> configuration.  This is an extension point.
>> {
>> HDFS:{
>> outputAdapters:[
>> {name: avro,
>> settings:{
>> avro stuff….
>> when:{
>> },
>> {
>> name: sequence file,
>> …..
>> 
>> or some such.
>> 
>> 
>> 

Re: [DISCUSS] Turning off indexing writers feature discussion

2017-01-14 Thread Casey Stella
I'm +1 on an explicit enabled property and a filter (or when) property. I
think we are zeroing in on a decent design, so that is good.

To recap, what I am +1 on is Nick's proposed syntax with the following
modifications:
1. An explicit enabled field
2. A default on for unspecified to match current semantics
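
A minimal sketch of how those two modifications could behave together, written
in Python only for brevity. Nothing here is Metron code: `DEFAULT_ON_WRITERS`,
`is_enabled`, and `enabled_writers` are hypothetical names, the set of writers
that count as on by default is an assumption, and treating the string values
'true'/'false' as booleans is inferred from the examples quoted below.

```python
# Hypothetical sketch, not Metron code.
# Assumption: these are the writers that are on when nothing is specified.
DEFAULT_ON_WRITERS = ("elasticsearch", "hdfs")


def is_enabled(config, writer):
    """Return True unless the writer's config explicitly disables it."""
    writer_config = config.get(writer, {})
    value = writer_config.get("enabled", True)  # unspecified means on
    if isinstance(value, str):
        # The thread's examples write the flag as the strings 'true'/'false'.
        return value.strip().lower() == "true"
    return bool(value)


def enabled_writers(config):
    """List the default-on writers that have not been switched off."""
    return [w for w in DEFAULT_ON_WRITERS if is_enabled(config, w)]


# A sensor with no indexing config at all still reaches every default writer.
print(enabled_writers({}))
# Disabling hdfs explicitly leaves the other writer untouched.
print(enabled_writers({"hdfs": {"enabled": "false", "path": "/foo/bar/..."}}))
```

The point of the sketch is only that a missing writer entry and a missing
'enabled' key both fall back to on, which keeps a brand-new sensor flowing to
the index as it does today.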

Casey
On Sat, Jan 14, 2017 at 10:45 zeo...@gmail.com  wrote:

> This has the additional benefit of doing something like below when you want
> to temporarily disable the hdfs writer, but don't want to remove the
> settings.  This removes the need to store the path and batchSize (and many
> additional settings) somewhere else so they can be brought back in when you
> want to re-enable it, which is a nice workflow attribute for the end user:
>
> {
>'elasticsearch': {
>   'enabled': 'true',
>   'index': 'foo',
>   'batchSize': 100,
> },
>'hdfs': {
>   'enabled': 'false',
>   'path': '/foo/bar/...',
>   'batchSize': 100,
> },
>'solr': {
>   'enabled': 'false'
> }
> }
>
> Jon
>
> On Sat, Jan 14, 2017 at 9:24 AM zeo...@gmail.com  wrote:
>
> > I similarly have a concern there because I prefer being as explicit as
> > possible, which makes things easier to pick up for new users.  Using my
> > example from earlier, this could look like specifying when(false), but an
> > even better and more obvious approach may be to use enabled(false).  So
> the
> > current simple default would be:
> >
> > {
> >'elasticsearch': { 'enabled': 'true' },
> >'hdfs': { 'enabled': 'true' },
> >'solr': { 'enabled': 'false' }
> > }
> >
> > And to use ES with some overrides but not HDFS or solr it would look
> like:
> >
> > {
> >'elasticsearch': {
> >   'enabled': 'true',
> >   'index': 'foo',
> >   'batchSize': 100
> > },
> >'hdfs': {
> >   'enabled': 'false'
> > },
> >'solr': {
> >   'enabled': 'false'
> > }
> > }
> >
> > Jon
> >
> > On Fri, Jan 13, 2017 at 10:21 PM Casey Stella 
> wrote:
> >
> > One thing that I thought of that I very strenuously do not like in Nick's
> > proposal is that if a writer config is not specified then it is turned
> off
> > (I think; if I misunderstood let me know). In the situation where we
> have a
> > new sensor, right now if there are no index config and no enrichment
> > config, it still passes through to the index using defaults. In this new
> > scheme it would not. This changes the default semantics for the system
> and
> > I think it changes it for the worse.
> >
> > I would strongly prefer an on-by-default indexing config as we have now.
> > On Fri, Jan 13, 2017 at 17:13 Casey Stella  wrote:
> >
> > > One thing that I really like about Nick's suggestion is that it allows
> > > writer-specific configs in a clear and simple way.  It is more complex
> > for
> > > the default case (all writers write to indices named the same thing
> with
> > a
> > > fixed batch size), which I do not like, but maybe it's worth the
> > compromise
> > > to make it less complex for the advanced case.
> > >
> > > Thanks a lot for the suggestion, Nick, it's interesting;  I'm beginning
> > to
> > > lean your way.
> > >
> > > On Fri, Jan 13, 2017 at 2:51 PM, zeo...@gmail.com 
> > > wrote:
> > >
> > > I like the suggestions you made, Nick.  The only thing I would add is
> > that
> > > it's also nice to see an explicit when(false), as people newer to the
> > > platform may not know where to expect configs for the different
> writers.
> > > Being able to do it either way, which I think is already assumed in
> your
> > > model, would make sense.  I would just suggest that, if we support but
> > are
> > > disabling a writer, that the platform inserts a default when(false) to
> be
> > > explicit.
> > >
> > > Jon
> > >
> > > On Fri, Jan 13, 2017 at 11:59 AM Casey Stella 
> > wrote:
> > >
> > > > Let me noodle on this over the weekend.  Your syntax is looking less
> > > > onerous to me and I like the following statement from Otto: "In the
> > end,
> > > > each write destination ‘type’ will need its own configuration.  This
> > is
> > > an
> > > > extension point."
> > > >
> > > > I may come around to your way of thinking.
> > > >
> > > > On Fri, Jan 13, 2017 at 11:57 AM, Otto Fowler <
> ottobackwa...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > In the end, each write destination ‘type’ will need its own
> > > > > configuration.  This is an extension point.
> > > > > {
> > > > > HDFS:{
> > > > > outputAdapters:[
> > > > > {name: avro,
> > > > > settings:{
> > > > > avro stuff….
> > > > > when:{
> > > > > },
> > > > > {
> > > > >  name: sequence file,
> > > > >  …..
> > > > >
> > > > > or some such.
> > > > >
> > > > >
> > > > > On January 13, 2017 at 11:51:15, Nick Allen (n...@nickallen.org)
> > > wrote:
> > > > >
> > > > > I will add also that instead of global overrides, like index, we
> > should
> > > > use
> > 

Re: [DISCUSS] Turning off indexing writers feature discussion

2017-01-14 Thread zeo...@gmail.com
This has the additional benefit of letting you do something like the below when
you want to temporarily disable the hdfs writer but don't want to remove its
settings.  It removes the need to store the path and batchSize (and many
additional settings) somewhere else so they can be brought back in when you
want to re-enable it, which is a nice workflow attribute for the end user:

{
   'elasticsearch': {
      'enabled': 'true',
      'index': 'foo',
      'batchSize': 100
   },
   'hdfs': {
      'enabled': 'false',
      'path': '/foo/bar/...',
      'batchSize': 100
   },
   'solr': {
      'enabled': 'false'
   }
}
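
To illustrate the workflow, here is a rough Python sketch of flipping a writer
on or off while leaving its other settings in place. `set_enabled` is a
hypothetical helper, not part of Metron, and it assumes the flag is stored as
the strings 'true'/'false' as in the example above.

```python
import copy


def set_enabled(config, writer, enabled):
    """Return a copy of config with only the writer's 'enabled' flag changed."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    updated.setdefault(writer, {})["enabled"] = "true" if enabled else "false"
    return updated


config = {
    "elasticsearch": {"enabled": "true", "index": "foo", "batchSize": 100},
    "hdfs": {"enabled": "false", "path": "/foo/bar/...", "batchSize": 100},
    "solr": {"enabled": "false"},
}

# Re-enable hdfs later: path and batchSize are still there.
reenabled = set_enabled(config, "hdfs", True)
print(reenabled["hdfs"])
```

Because the flag is the only key that changes, nothing has to be copied out
of the config and pasted back in when the writer is turned on again.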

Jon

On Sat, Jan 14, 2017 at 9:24 AM zeo...@gmail.com  wrote:

> I similarly have a concern there because I prefer being as explicit as
> possible, which makes things easier to pick up for new users.  Using my
> example from earlier, this could look like specifying when(false), but an
> even better and more obvious approach may be to use enabled(false).  So the
> current simple default would be:
>
> {
>'elasticsearch': { 'enabled': 'true' },
>'hdfs': { 'enabled': 'true' },
>'solr': { 'enabled': 'false' }
> }
>
> And to use ES with some overrides but not HDFS or solr it would look like:
>
> {
>'elasticsearch': {
>   'enabled': 'true',
>   'index': 'foo',
>   'batchSize': 100
> },
>'hdfs': {
>   'enabled': 'false'
> },
>'solr': {
>   'enabled': 'false'
> }
> }
>
> Jon
>
> On Fri, Jan 13, 2017 at 10:21 PM Casey Stella  wrote:
>
> One thing that I thought of that I very strenuously do not like in Nick's
> proposal is that if a writer config is not specified then it is turned off
> (I think; if I misunderstood let me know). In the situation where we have a
> new sensor, right now if there are no index config and no enrichment
> config, it still passes through to the index using defaults. In this new
> scheme it would not. This changes the default semantics for the system and
> I think it changes it for the worse.
>
> I would strongly prefer an on-by-default indexing config as we have now.
> On Fri, Jan 13, 2017 at 17:13 Casey Stella  wrote:
>
> > One thing that I really like about Nick's suggestion is that it allows
> > writer-specific configs in a clear and simple way.  It is more complex
> for
> > the default case (all writers write to indices named the same thing with
> a
> > fixed batch size), which I do not like, but maybe it's worth the
> compromise
> > to make it less complex for the advanced case.
> >
> > Thanks a lot for the suggestion, Nick, it's interesting;  I'm beginning
> to
> > lean your way.
> >
> > On Fri, Jan 13, 2017 at 2:51 PM, zeo...@gmail.com 
> > wrote:
> >
> > I like the suggestions you made, Nick.  The only thing I would add is
> that
> > it's also nice to see an explicit when(false), as people newer to the
> > platform may not know where to expect configs for the different writers.
> > Being able to do it either way, which I think is already assumed in your
> > model, would make sense.  I would just suggest that, if we support but
> are
> > disabling a writer, that the platform inserts a default when(false) to be
> > explicit.
> >
> > Jon
> >
> > On Fri, Jan 13, 2017 at 11:59 AM Casey Stella 
> wrote:
> >
> > > Let me noodle on this over the weekend.  Your syntax is looking less
> > > onerous to me and I like the following statement from Otto: "In the
> end,
> > > each write destination ‘type’ will need its own configuration.  This
> is
> > an
> > > extension point."
> > >
> > > I may come around to your way of thinking.
> > >
> > > On Fri, Jan 13, 2017 at 11:57 AM, Otto Fowler  >
> > > wrote:
> > >
> > > > In the end, each write destination ‘type’ will need its own
> > > > configuration.  This is an extension point.
> > > > {
> > > > HDFS:{
> > > > outputAdapters:[
> > > > {name: avro,
> > > > settings:{
> > > > avro stuff….
> > > > when:{
> > > > },
> > > > {
> > > >  name: sequence file,
> > > >  …..
> > > >
> > > > or some such.
> > > >
> > > >
> > > > On January 13, 2017 at 11:51:15, Nick Allen (n...@nickallen.org)
> > wrote:
> > > >
> > > > I will add also that instead of global overrides, like index, we
> should
> > > use
> > > > configuration key names that are more appropriate to the output.
> > > >
> > > > For example, does 'index' really make sense for HDFS? Or would 'path'
> > be
> > > > more appropriate?
> > > >
> > > > {
> > > > 'elasticsearch': {
> > > > 'index': 'foo',
> > > > 'batchSize': 1
> > > > },
> > > > 'hdfs': {
> > > > 'path': '/foo/bar/...',
> > > > 'batchSize': 100
> > > > }
> > > > }
> > > >
> > > > Ok, I've said my piece. Thanks for the effort in summarizing all
> this,
> > > > Casey.
> > > >
> > > >
> > > > On Fri, Jan 13, 2017 at 11:42 AM, Nick Allen 
> > wrote:
> > > >
> > > > > Nick's concerns about my suggestion were that it was overly complex
> > 

Re: [DISCUSS] Turning off indexing writers feature discussion

2017-01-14 Thread zeo...@gmail.com
I similarly have a concern there because I prefer being as explicit as
possible, which makes things easier to pick up for new users.  Using my
example from earlier, this could look like specifying when(false), but an
even better and more obvious approach may be to use enabled(false).  So the
current simple default would be:

{
   'elasticsearch': { 'enabled': 'true' },
   'hdfs': { 'enabled': 'true' },
   'solr': { 'enabled': 'false' }
}

And to use ES with some overrides but not HDFS or solr it would look like:

{
   'elasticsearch': {
      'enabled': 'true',
      'index': 'foo',
      'batchSize': 100
   },
   'hdfs': {
      'enabled': 'false'
   },
   'solr': {
      'enabled': 'false'
   }
}
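
As a rough illustration of how overrides like these might be layered on top of
defaults, here is a small Python sketch. It is not Metron code:
`HYPOTHETICAL_DEFAULTS` and `effective_settings` are made-up names, and the
default values are placeholders, since the real defaults are not stated in
this thread.

```python
# Placeholder defaults; the actual default values are an assumption.
HYPOTHETICAL_DEFAULTS = {"enabled": "true", "batchSize": 1}


def effective_settings(config, writer):
    """Start from the defaults, then apply whatever the writer overrides."""
    merged = dict(HYPOTHETICAL_DEFAULTS)
    merged.update(config.get(writer, {}))
    return merged


config = {
    "elasticsearch": {"enabled": "true", "index": "foo", "batchSize": 100},
    "hdfs": {"enabled": "false"},
    "solr": {"enabled": "false"},
}

# elasticsearch keeps its overrides; hdfs only changes the flag.
print(effective_settings(config, "elasticsearch"))
print(effective_settings(config, "hdfs"))
```

A writer that lists only 'enabled' still picks up every other default, so the
short form and the fully spelled-out form can coexist in one config.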

Jon

On Fri, Jan 13, 2017 at 10:21 PM Casey Stella  wrote:

> One thing that I thought of that I very strenuously do not like in Nick's
> proposal is that if a writer config is not specified then it is turned off
> (I think; if I misunderstood let me know). In the situation where we have a
> new sensor, right now if there are no index config and no enrichment
> config, it still passes through to the index using defaults. In this new
> scheme it would not. This changes the default semantics for the system and
> I think it changes it for the worse.
>
> I would strongly prefer an on-by-default indexing config as we have now.
> On Fri, Jan 13, 2017 at 17:13 Casey Stella  wrote:
>
> > One thing that I really like about Nick's suggestion is that it allows
> > writer-specific configs in a clear and simple way.  It is more complex
> for
> > the default case (all writers write to indices named the same thing with
> a
> > fixed batch size), which I do not like, but maybe it's worth the
> compromise
> > to make it less complex for the advanced case.
> >
> > Thanks a lot for the suggestion, Nick, it's interesting;  I'm beginning
> to
> > lean your way.
> >
> > On Fri, Jan 13, 2017 at 2:51 PM, zeo...@gmail.com 
> > wrote:
> >
> > I like the suggestions you made, Nick.  The only thing I would add is
> that
> > it's also nice to see an explicit when(false), as people newer to the
> > platform may not know where to expect configs for the different writers.
> > Being able to do it either way, which I think is already assumed in your
> > model, would make sense.  I would just suggest that, if we support but
> are
> > disabling a writer, that the platform inserts a default when(false) to be
> > explicit.
> >
> > Jon
> >
> > On Fri, Jan 13, 2017 at 11:59 AM Casey Stella 
> wrote:
> >
> > > Let me noodle on this over the weekend.  Your syntax is looking less
> > > onerous to me and I like the following statement from Otto: "In the
> end,
> > > each write destination ‘type’ will need its own configuration.  This
> is
> > an
> > > extension point."
> > >
> > > I may come around to your way of thinking.
> > >
> > > On Fri, Jan 13, 2017 at 11:57 AM, Otto Fowler  >
> > > wrote:
> > >
> > > > In the end, each write destination ‘type’ will need its own
> > > > configuration.  This is an extension point.
> > > > {
> > > > HDFS:{
> > > > outputAdapters:[
> > > > {name: avro,
> > > > settings:{
> > > > avro stuff….
> > > > when:{
> > > > },
> > > > {
> > > >  name: sequence file,
> > > >  …..
> > > >
> > > > or some such.
> > > >
> > > >
> > > > On January 13, 2017 at 11:51:15, Nick Allen (n...@nickallen.org)
> > wrote:
> > > >
> > > > I will add also that instead of global overrides, like index, we
> should
> > > use
> > > > configuration key names that are more appropriate to the output.
> > > >
> > > > For example, does 'index' really make sense for HDFS? Or would 'path'
> > be
> > > > more appropriate?
> > > >
> > > > {
> > > > 'elasticsearch': {
> > > > 'index': 'foo',
> > > > 'batchSize': 1
> > > > },
> > > > 'hdfs': {
> > > > 'path': '/foo/bar/...',
> > > > 'batchSize': 100
> > > > }
> > > > }
> > > >
> > > > Ok, I've said my piece. Thanks for the effort in summarizing all
> this,
> > > > Casey.
> > > >
> > > >
> > > > On Fri, Jan 13, 2017 at 11:42 AM, Nick Allen 
> > wrote:
> > > >
> > > > > Nick's concerns about my suggestion were that it was overly complex
> > and
> > > > >> hard to grok and that we could dispense with backwards
> compatibility
> > > and
> > > > >> make people do a bit more work on the default case for the
> benefits
> > > of a
> > > > >> simpler advanced case. (Nick, make sure I don't misstate your
> > > position)
> > > > >
> > > > >
> > > > > I will add that, in my mind, the majority case would be a user
> > > > > specifying the outputs, but not things like 'batchSize' or 'when'.
> I
> > > > think
> > > > > in the majority case, the user would accept whatever the default
> > batch
> > > > size
> > > > > is.
> > > > >
> > > > > Here are alternative suggestions for all the examples that you
> > > provided
> > > > > previously.
> > > > >
> > > > > Base Case
> > > > >
> > > > > - The user