Re: [DISCUSS] KIP-269: Substitution Within Configuration Values

2018-04-13 Thread Manikumar
We can limit substitution mechanism only for password config types and JAAS
config.
We may not want to use to for all config properties.

On Sat, Apr 14, 2018 at 9:21 AM, Colin McCabe  wrote:

> On Fri, Apr 13, 2018, at 10:30, Rajini Sivaram wrote:
> > Hi Colin,
> >
> > JAAS configuration can be provided in a separate file, but that has been
> > the cause of various problems in itself. The configuration option
> > `sasl.jaas.config` was added to overcome this. This is already a dynamic
> > configuration option stored in ZooKeeper since we allow listeners to be
> > added dynamically. Going forward, the property should be the preferred
> way
> > to configure SASL. I don't think we should allow any form of
> configuration
> > substitutions for JAAS that don't make sense for regular configs. And if
> we
> > are going to have a substitution mechanism, e.g. for password configs,
> then
> > it makes sense to allow for SSL as well as SASL.
> >
> > So the question really is what forms of substitution, if any, make
> sense. I
> > agree that use of system properties and environment variables are not a
> > good idea, but should references to files be allowed? Sounds like that
> is a
> > bad idea too from your experience. Does it make sense to have a
> extensible
> > substitution mechanism to allow users to integrate with password vaults
> or
> > other sources of config values?
>
> Hmm.  It seems like in order to authenticate with any kind of password
> vault, you would need a JAAS configuration, right?  So you can't really
> store your JAAS config in a password vault, although there may be other
> things you could usefully store there.  That's why I think JAAS really is a
> special case here, worth considering separately.  Storing your private key
> in a local file seems like a reasonable idea; storing the value of
> max.poll.records in a file does not seem that useful or reasonable.
>
> best,
> Colin
>
> >
> >
> > On Fri, Apr 13, 2018 at 5:56 PM, Colin McCabe 
> wrote:
> >
> > > I think we need to be a very careful here.  Configuration complexity
> can
> > > get out of control very quickly.  There are also some conflicting goals
> > > here.
> > >
> > > As much as possible, we want the configuration to be a full
> description of
> > > what the broker is going to do.  If the configuration pulls in
> environment
> > > variables, system properties, local files, or other aspects of the
> local
> > > system environment, it is no longer a complete description  of what the
> > > broker is going to do.  Instead, you have to know about the full UNIX
> > > environment to understand what is going on.  This makes deployments
> less
> > > repeatable and will lead to hard-to-track-down problems if one node
> has a
> > > different set of environment variables than the others, etc.
> > >
> > > We want it to be easy to roll out a new configuration to all brokers
> > > without restarting them all.  We should expect that in the future,
> more and
> > > more configurations will be KIP-226 style dynamic configurations that
> are
> > > stored in ZooKeeper and centrally rolled out to all brokers without a
> > > restart.  If we have to restart processes with different environment or
> > > system properties, or change local files, in order to reconfigure, we
> can't
> > > accomplish this goal.  As much as possible, the centrally managed
> > > configurations should not refer to local system properties.
> > >
> > > Configurations should be loaded efficiently.  But if loading the
> > > configuration requires opening and reading local files, it could get
> > > extremely slow.  I saw this problem firsthand in Hadoop, where invoking
> > > "new Configuration()" causes dozens of XML files to be loaded and
> parsed.
> > > Also, keep in mind that things other than the broker need to load
> > > configurations.  Every client and every tool needs to perform the same
> > > process.
> > >
> > > If configuration keys can reference and include other configuration
> keys,
> > > renaming or deprecating something becomes even harder than it is now.
> And
> > > if one configuration key changes because it is a dynamic configuration
> key,
> > > should all the configuration keys that included that one change as
> well?
> > > This feature simply doesn't work well with our other features.
> > >
> > > It seems like most of these problems can be solved better and more
> easily
> > > outside Kafka.  For example, it's straightforward to write a bash
> script
> > > that examines some environment variables, constructs a Kafka
> configuration
> > > file and then runs the Kafka broker with that file.  This also makes it
> > > straightforward to set configuration keys in tandem, if that's what you
> > > want.
> > >
> > > I think we should focus just on what JAAS needs, which seems very
> > > different than what other configurations need.  In the specific case of
> > > JAAS, it makes sense to consider loading stuff from a separate file, 

Re: [DISCUSS] KIP-269: Substitution Within Configuration Values

2018-04-13 Thread Colin McCabe
On Fri, Apr 13, 2018, at 10:30, Rajini Sivaram wrote:
> Hi Colin,
> 
> JAAS configuration can be provided in a separate file, but that has been
> the cause of various problems in itself. The configuration option
> `sasl.jaas.config` was added to overcome this. This is already a dynamic
> configuration option stored in ZooKeeper since we allow listeners to be
> added dynamically. Going forward, the property should be the preferred way
> to configure SASL. I don't think we should allow any form of configuration
> substitutions for JAAS that don't make sense for regular configs. And if we
> are going to have a substitution mechanism, e.g. for password configs, then
> it makes sense to allow for SSL as well as SASL.
> 
> So the question really is what forms of substitution, if any, make sense. I
> agree that use of system properties and environment variables are not a
> good idea, but should references to files be allowed? Sounds like that is a
> bad idea too from your experience. Does it make sense to have a extensible
> substitution mechanism to allow users to integrate with password vaults or
> other sources of config values?

Hmm.  It seems like in order to authenticate with any kind of password vault, 
you would need a JAAS configuration, right?  So you can't really store your 
JAAS config in a password vault, although there may be other things you could 
usefully store there.  That's why I think JAAS really is a special case here, 
worth considering separately.  Storing your private key in a local file seems 
like a reasonable idea; storing the value of max.poll.records in a file does 
not seem that useful or reasonable.

best,
Colin

> 
> 
> On Fri, Apr 13, 2018 at 5:56 PM, Colin McCabe  wrote:
> 
> > I think we need to be a very careful here.  Configuration complexity can
> > get out of control very quickly.  There are also some conflicting goals
> > here.
> >
> > As much as possible, we want the configuration to be a full description of
> > what the broker is going to do.  If the configuration pulls in environment
> > variables, system properties, local files, or other aspects of the local
> > system environment, it is no longer a complete description  of what the
> > broker is going to do.  Instead, you have to know about the full UNIX
> > environment to understand what is going on.  This makes deployments less
> > repeatable and will lead to hard-to-track-down problems if one node has a
> > different set of environment variables than the others, etc.
> >
> > We want it to be easy to roll out a new configuration to all brokers
> > without restarting them all.  We should expect that in the future, more and
> > more configurations will be KIP-226 style dynamic configurations that are
> > stored in ZooKeeper and centrally rolled out to all brokers without a
> > restart.  If we have to restart processes with different environment or
> > system properties, or change local files, in order to reconfigure, we can't
> > accomplish this goal.  As much as possible, the centrally managed
> > configurations should not refer to local system properties.
> >
> > Configurations should be loaded efficiently.  But if loading the
> > configuration requires opening and reading local files, it could get
> > extremely slow.  I saw this problem firsthand in Hadoop, where invoking
> > "new Configuration()" causes dozens of XML files to be loaded and parsed.
> > Also, keep in mind that things other than the broker need to load
> > configurations.  Every client and every tool needs to perform the same
> > process.
> >
> > If configuration keys can reference and include other configuration keys,
> > renaming or deprecating something becomes even harder than it is now.  And
> > if one configuration key changes because it is a dynamic configuration key,
> > should all the configuration keys that included that one change as well?
> > This feature simply doesn't work well with our other features.
> >
> > It seems like most of these problems can be solved better and more easily
> > outside Kafka.  For example, it's straightforward to write a bash script
> > that examines some environment variables, constructs a Kafka configuration
> > file and then runs the Kafka broker with that file.  This also makes it
> > straightforward to set configuration keys in tandem, if that's what you
> > want.
> >
> > I think we should focus just on what JAAS needs, which seems very
> > different than what other configurations need.  In the specific case of
> > JAAS, it makes sense to consider loading stuff from a separate file, to
> > avoid having credentials stored in the properties file.  (But I thought we
> > already had a way to do that?)
> >
> > best,
> > Colin
> >
> >
> > On Fri, Apr 13, 2018, at 07:16, Rajini Sivaram wrote:
> > > Hi Ron,
> > >
> > > I think we should be able to process substitutions for both static JAAS
> > > configuration file as well as `sasl.jaas.config` property. We load the
> > > configuration using 

Re: [DISCUSS] KIP-269: Substitution Within Configuration Values

2018-04-13 Thread Ron Dagostino
Thanks, Colin, for your comments.  I appreciate the point that substitution
can be over-applied.  I did not have a specific requirement for the
defaultKey= and fromValueOfKey modifiers; I included them because they
intuitively felt like they would be useful and made the feature more
powerful.  Given a lack of a hard requirement and your well-taken point
about dependency graphs potentially causing problems, I wonder if
eliminating the ability to refer to other keys by removing the
defaultKey= and fromValueOfKey modifiers would be wise.  It would keep
things simple while still providing the ability to inject configuration.

How this feature coexists with dynamic configuration is worth more
discussion.  I recently specified that values are not read twice:

"Once an instance of SubstitutableValues retrieves an underlying value and
its calculated value -- whether different from the raw underlying value due
to a substitution or not -- is determined, the instance of
SubstitutableValues will not retrieve the underlying value again; the
calculated value will be used if it is referred to. This means if the
underlying values are expected to change then to see those changes a
new instance of SubstitutableValues must be allocated."

I stated this because of the dependency graph issue you asked about:

"If one configuration key changes because it is a dynamic configuration
key, should all the configuration keys that included that one change as
well?"

My solution was that if anything changed, and you wanted to see those
changes, you had to invalidate everything and start from the beginning
again because the implementation was not going to track the dependencies.

If we eliminate dependencies then I think we have more flexibility in
dealing with the possibility that underlying configuration might change.
It becomes allowable to re-calculate a value, for example.  There are still
some subtleties -- when is a value to be recalculated?  Should it be done
explicitly, so we have to ask for it to be recalculated?  Should we allow a
value to be tagged with a "lifetime" value so a calculated value can be
discarded after a certain amount of time?  I don't know what the right
answer is, but eliminating dependencies does give us more options here.

Ron





On Fri, Apr 13, 2018 at 1:30 PM, Rajini Sivaram 
wrote:

> Hi Colin,
>
> JAAS configuration can be provided in a separate file, but that has been
> the cause of various problems in itself. The configuration option
> `sasl.jaas.config` was added to overcome this. This is already a dynamic
> configuration option stored in ZooKeeper since we allow listeners to be
> added dynamically. Going forward, the property should be the preferred way
> to configure SASL. I don't think we should allow any form of configuration
> substitutions for JAAS that don't make sense for regular configs. And if we
> are going to have a substitution mechanism, e.g. for password configs, then
> it makes sense to allow for SSL as well as SASL.
>
> So the question really is what forms of substitution, if any, make sense. I
> agree that use of system properties and environment variables are not a
> good idea, but should references to files be allowed? Sounds like that is a
> bad idea too from your experience. Does it make sense to have a extensible
> substitution mechanism to allow users to integrate with password vaults or
> other sources of config values?
>
>
> On Fri, Apr 13, 2018 at 5:56 PM, Colin McCabe  wrote:
>
> > I think we need to be a very careful here.  Configuration complexity can
> > get out of control very quickly.  There are also some conflicting goals
> > here.
> >
> > As much as possible, we want the configuration to be a full description
> of
> > what the broker is going to do.  If the configuration pulls in
> environment
> > variables, system properties, local files, or other aspects of the local
> > system environment, it is no longer a complete description  of what the
> > broker is going to do.  Instead, you have to know about the full UNIX
> > environment to understand what is going on.  This makes deployments less
> > repeatable and will lead to hard-to-track-down problems if one node has a
> > different set of environment variables than the others, etc.
> >
> > We want it to be easy to roll out a new configuration to all brokers
> > without restarting them all.  We should expect that in the future, more
> and
> > more configurations will be KIP-226 style dynamic configurations that are
> > stored in ZooKeeper and centrally rolled out to all brokers without a
> > restart.  If we have to restart processes with different environment or
> > system properties, or change local files, in order to reconfigure, we
> can't
> > accomplish this goal.  As much as possible, the centrally managed
> > configurations should not refer to local system properties.
> >
> > Configurations should be loaded efficiently.  But if loading the
> > configuration 

Re: [DISCUSS] KIP-279: Fix log divergence between leader and follower after fast leader fail over

2018-04-13 Thread Anna Povzner
Thanks everyone for the feedback. I will start a voting thread tomorrow
morning if there are no more comments.

Regards,
Anna


On Wed, Apr 11, 2018 at 3:14 PM, Jun Rao  wrote:

> Hi, Anna,
>
> Thanks for the KIP. Looks good to me.
>
> Great point on bounding the cleaning point in a compacted topic by high
> watermark. Filed https://issues.apache.org/jira/browse/KAFKA-6780 to track
> it.
>
> Jun
>
>
> On Thu, Apr 5, 2018 at 12:17 PM, Anna Povzner  wrote:
>
> > Hi,
> >
> >
> > I just created KIP-279 to fix edge cases of log divergence for both clean
> > and unclean leader election configs.
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 279%3A+Fix+log+divergence+between+leader+and+follower+
> > after+fast+leader+fail+over
> >
> >
> > The KIP is basically a follow up to KIP-101, and proposes a slight
> > extension to the replication protocol to fix edge cases where logs can
> > diverge due to fast leader fail over.
> >
> >
> > Feedback and suggestions are welcome!
> >
> >
> > Thanks,
> >
> > Anna
> >
>


Re: [DISCUSS] KIP-286: producer.send() should not block on metadata update

2018-04-13 Thread Ted Yu
Dong:
Yes, that answers my question.

Thanks

On Thu, Apr 12, 2018 at 1:41 AM, Dong Lin  wrote:

> Hey Ted,
>
> Thanks for your comments. With the proposed solution in the KIP, the memory
> is only allocated once for the given message, which is the same as the
> existing implementation. The serialized message will be moved from
> per-topic queue to per-partition queue without incurring additional memory
> overhead. Does this address your question?
>
> Thanks,
> Dong
>
>
> On Wed, Apr 11, 2018 at 9:36 PM, Ted Yu  wrote:
>
> > Looks like per-topic queue is introduced.
> >
> > In terms of memory consumption, how does the KIP allocate memory
> > between per-topic
> > queue and per-partition queue ?
> >
> > Thanks
> >
> > On Wed, Apr 11, 2018 at 8:50 PM, Dong Lin  wrote:
> >
> > > Hi all,
> > >
> > > I have created KIP-286: producer.send() should not block on metadata
> > > update. See
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 286%3A+producer.send%28%29+should+not+block+on+metadata+update
> > > .
> > >
> > > The KIP intends to improve user-experience of producer.send() when
> > metadata
> > > is not available. It is related but different from the previous
> > discussion
> > > in KAFKA-3539 in the sense that user still has the option of letting
> > > producer.send() block on full producer queue.
> > >
> > > Comments are welcome!
> > >
> > > Thanks,
> > > Dong
> > >
> >
>


Re: [DISCUSS] KIP-286: producer.send() should not block on metadata update

2018-04-13 Thread Becket Qin
Thanks for the KIP, Dong.

In the current threading model, compression is done by the user threads,
therefore the producer sender thread can focus on IO. With the proposed
changes, does that mean the producer sender thread will have to do all the
compression as well? Would this become a performance bottleneck?

Thanks,

Jiangjie (Becket) Qin

On Thu, Apr 12, 2018 at 1:41 AM, Dong Lin  wrote:

> Hey Ted,
>
> Thanks for your comments. With the proposed solution in the KIP, the memory
> is only allocated once for the given message, which is the same as the
> existing implementation. The serialized message will be moved from
> per-topic queue to per-partition queue without incurring additional memory
> overhead. Does this address your question?
>
> Thanks,
> Dong
>
>
> On Wed, Apr 11, 2018 at 9:36 PM, Ted Yu  wrote:
>
> > Looks like per-topic queue is introduced.
> >
> > In terms of memory consumption, how does the KIP allocate memory
> > between per-topic
> > queue and per-partition queue ?
> >
> > Thanks
> >
> > On Wed, Apr 11, 2018 at 8:50 PM, Dong Lin  wrote:
> >
> > > Hi all,
> > >
> > > I have created KIP-286: producer.send() should not block on metadata
> > > update. See
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 286%3A+producer.send%28%29+should+not+block+on+metadata+update
> > > .
> > >
> > > The KIP intends to improve user-experience of producer.send() when
> > metadata
> > > is not available. It is related but different from the previous
> > discussion
> > > in KAFKA-3539 in the sense that user still has the option of letting
> > > producer.send() block on full producer queue.
> > >
> > > Comments are welcome!
> > >
> > > Thanks,
> > > Dong
> > >
> >
>


[jira] [Resolved] (KAFKA-2661) Add a unit test for disconnecting idle socket connections

2018-04-13 Thread Manikumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-2661.
--
Resolution: Fixed

SelectorTest.testCloseConnectionInClosingState/testCloseOldestConnection tests 
covers unit test case for idle connections.

> Add a unit test for disconnecting idle socket connections 
> --
>
> Key: KAFKA-2661
> URL: https://issues.apache.org/jira/browse/KAFKA-2661
> Project: Kafka
>  Issue Type: Test
>  Components: core
>Reporter: Jun Rao
>Priority: Major
>
> The logic for disconnecting idle connections is now moved to Selector. We 
> just need to add a unit test to verify that it works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-269: Substitution Within Configuration Values

2018-04-13 Thread Rajini Sivaram
Hi Colin,

JAAS configuration can be provided in a separate file, but that has been
the cause of various problems in itself. The configuration option
`sasl.jaas.config` was added to overcome this. This is already a dynamic
configuration option stored in ZooKeeper since we allow listeners to be
added dynamically. Going forward, the property should be the preferred way
to configure SASL. I don't think we should allow any form of configuration
substitutions for JAAS that don't make sense for regular configs. And if we
are going to have a substitution mechanism, e.g. for password configs, then
it makes sense to allow for SSL as well as SASL.

So the question really is what forms of substitution, if any, make sense. I
agree that use of system properties and environment variables are not a
good idea, but should references to files be allowed? Sounds like that is a
bad idea too from your experience. Does it make sense to have a extensible
substitution mechanism to allow users to integrate with password vaults or
other sources of config values?


On Fri, Apr 13, 2018 at 5:56 PM, Colin McCabe  wrote:

> I think we need to be a very careful here.  Configuration complexity can
> get out of control very quickly.  There are also some conflicting goals
> here.
>
> As much as possible, we want the configuration to be a full description of
> what the broker is going to do.  If the configuration pulls in environment
> variables, system properties, local files, or other aspects of the local
> system environment, it is no longer a complete description  of what the
> broker is going to do.  Instead, you have to know about the full UNIX
> environment to understand what is going on.  This makes deployments less
> repeatable and will lead to hard-to-track-down problems if one node has a
> different set of environment variables than the others, etc.
>
> We want it to be easy to roll out a new configuration to all brokers
> without restarting them all.  We should expect that in the future, more and
> more configurations will be KIP-226 style dynamic configurations that are
> stored in ZooKeeper and centrally rolled out to all brokers without a
> restart.  If we have to restart processes with different environment or
> system properties, or change local files, in order to reconfigure, we can't
> accomplish this goal.  As much as possible, the centrally managed
> configurations should not refer to local system properties.
>
> Configurations should be loaded efficiently.  But if loading the
> configuration requires opening and reading local files, it could get
> extremely slow.  I saw this problem firsthand in Hadoop, where invoking
> "new Configuration()" causes dozens of XML files to be loaded and parsed.
> Also, keep in mind that things other than the broker need to load
> configurations.  Every client and every tool needs to perform the same
> process.
>
> If configuration keys can reference and include other configuration keys,
> renaming or deprecating something becomes even harder than it is now.  And
> if one configuration key changes because it is a dynamic configuration key,
> should all the configuration keys that included that one change as well?
> This feature simply doesn't work well with our other features.
>
> It seems like most of these problems can be solved better and more easily
> outside Kafka.  For example, it's straightforward to write a bash script
> that examines some environment variables, constructs a Kafka configuration
> file and then runs the Kafka broker with that file.  This also makes it
> straightforward to set configuration keys in tandem, if that's what you
> want.
>
> I think we should focus just on what JAAS needs, which seems very
> different than what other configurations need.  In the specific case of
> JAAS, it makes sense to consider loading stuff from a separate file, to
> avoid having credentials stored in the properties file.  (But I thought we
> already had a way to do that?)
>
> best,
> Colin
>
>
> On Fri, Apr 13, 2018, at 07:16, Rajini Sivaram wrote:
> > Hi Ron,
> >
> > I think we should be able to process substitutions for both static JAAS
> > configuration file as well as `sasl.jaas.config` property. We load the
> > configuration using org.apache.kafka.common.security.
> > JaasContext.loadXXXContext() and that would be a good place to do any
> > substitution. The method has access to the producer/consumer/broker
> configs
> > as well in case we want keys to refer to these.
> >
> > On Fri, Apr 13, 2018 at 2:15 PM, Ron Dagostino 
> wrote:
> >
> > > Hi Rajini.  Regarding processing the sasl.jaas.config value up-front,
> there
> > > are a couple of things that occur to me about it.  First, the older
> way of
> > > storing the JAAS config in a separate file is still supported (and is
> at
> > > this time the prevalent mechanism on the broker side since
> sasl.jaas.config
> > > support for brokers was only recently added via KIP 226).  We could
> 

Re: [DISCUSS] KIP-269: Substitution Within Configuration Values

2018-04-13 Thread Colin McCabe
I think we need to be a very careful here.  Configuration complexity can get 
out of control very quickly.  There are also some conflicting goals here.

As much as possible, we want the configuration to be a full description of what 
the broker is going to do.  If the configuration pulls in environment 
variables, system properties, local files, or other aspects of the local system 
environment, it is no longer a complete description  of what the broker is 
going to do.  Instead, you have to know about the full UNIX environment to 
understand what is going on.  This makes deployments less repeatable and will 
lead to hard-to-track-down problems if one node has a different set of 
environment variables than the others, etc.

We want it to be easy to roll out a new configuration to all brokers without 
restarting them all.  We should expect that in the future, more and more 
configurations will be KIP-226 style dynamic configurations that are stored in 
ZooKeeper and centrally rolled out to all brokers without a restart.  If we 
have to restart processes with different environment or system properties, or 
change local files, in order to reconfigure, we can't accomplish this goal.  As 
much as possible, the centrally managed configurations should not refer to 
local system properties.

Configurations should be loaded efficiently.  But if loading the configuration 
requires opening and reading local files, it could get extremely slow.  I saw 
this problem firsthand in Hadoop, where invoking "new Configuration()" causes 
dozens of XML files to be loaded and parsed.  Also, keep in mind that things 
other than the broker need to load configurations.  Every client and every tool 
needs to perform the same process.

If configuration keys can reference and include other configuration keys, 
renaming or deprecating something becomes even harder than it is now.  And if 
one configuration key changes because it is a dynamic configuration key, should 
all the configuration keys that included that one change as well?  This feature 
simply doesn't work well with our other features.

It seems like most of these problems can be solved better and more easily 
outside Kafka.  For example, it's straightforward to write a bash script that 
examines some environment variables, constructs a Kafka configuration file and 
then runs the Kafka broker with that file.  This also makes it straightforward 
to set configuration keys in tandem, if that's what you want.

I think we should focus just on what JAAS needs, which seems very different 
than what other configurations need.  In the specific case of JAAS, it makes 
sense to consider loading stuff from a separate file, to avoid having 
credentials stored in the properties file.  (But I thought we already had a way 
to do that?)

best,
Colin


On Fri, Apr 13, 2018, at 07:16, Rajini Sivaram wrote:
> Hi Ron,
> 
> I think we should be able to process substitutions for both static JAAS
> configuration file as well as `sasl.jaas.config` property. We load the
> configuration using org.apache.kafka.common.security.
> JaasContext.loadXXXContext() and that would be a good place to do any
> substitution. The method has access to the producer/consumer/broker configs
> as well in case we want keys to refer to these.
> 
> On Fri, Apr 13, 2018 at 2:15 PM, Ron Dagostino  wrote:
> 
> > Hi Rajini.  Regarding processing the sasl.jaas.config value up-front, there
> > are a couple of things that occur to me about it.  First, the older way of
> > storing the JAAS config in a separate file is still supported (and is at
> > this time the prevalent mechanism on the broker side since sasl.jaas.config
> > support for brokers was only recently added via KIP 226).  We could state
> > that substitution is only supported via sasl.jaas.config and force people
> > to convert over to get substitution functionality, but that wouldn't be
> > necessary if we let the login module do the substitution later on.
> >
> > The second thing that occurs to me is related to namespacing and creates
> > tension with the first point above.  If we refer to the "fubar" key in the
> > config, is that key a JAAS module option or is it a value in the
> > cluster/producer/consumer config?  It would be very positive if we could
> > eliminate namespacing entirely such that when you reference another key it
> > is always very clear what is being referred to -- i.e. always a key in the
> > cluster/producer/consumer config.  Otherwise the docs have to spell out
> > when it is one versus the other.
> >
> > That is a good point about being able to provide substitution support to
> > SASL/GSSAPI (which relies upon login module code that we do not control) if
> > we choose the simple, consistent way of doing things up front.
> >
> > You asked if there are OAuth use cases that require substitutions to be
> > performed in a login module that cannot be done if the substitution is
> > performed when the configuration is parsed.  I 

Re: [DISCUSS] KIP-269: Substitution Within Configuration Values

2018-04-13 Thread Rajini Sivaram
Hi Ron,

I think we should be able to process substitutions for both static JAAS
configuration file as well as `sasl.jaas.config` property. We load the
configuration using org.apache.kafka.common.security.
JaasContext.loadXXXContext() and that would be a good place to do any
substitution. The method has access to the producer/consumer/broker configs
as well in case we want keys to refer to these.

On Fri, Apr 13, 2018 at 2:15 PM, Ron Dagostino  wrote:

> Hi Rajini.  Regarding processing the sasl.jaas.config value up-front, there
> are a couple of things that occur to me about it.  First, the older way of
> storing the JAAS config in a separate file is still supported (and is at
> this time the prevalent mechanism on the broker side since sasl.jaas.config
> support for brokers was only recently added via KIP 226).  We could state
> that substitution is only supported via sasl.jaas.config and force people
> to convert over to get substitution functionality, but that wouldn't be
> necessary if we let the login module do the substitution later on.
>
> The second thing that occurs to me is related to namespacing and creates
> tension with the first point above.  If we refer to the "fubar" key in the
> config, is that key a JAAS module option or is it a value in the
> cluster/producer/consumer config?  It would be very positive if we could
> eliminate namespacing entirely such that when you reference another key it
> is always very clear what is being referred to -- i.e. always a key in the
> cluster/producer/consumer config.  Otherwise the docs have to spell out
> when it is one versus the other.
>
> That is a good point about being able to provide substitution support to
> SASL/GSSAPI (which relies upon login module code that we do not control) if
> we choose the simple, consistent way of doing things up front.
>
> You asked if there are OAuth use cases that require substitutions to be
> performed in a login module that cannot be done if the substitution is
> performed when the configuration is parsed.  I don't think so, no; the
> timing should not matter.
>
> I hadn't thought about the listener prefix issue.  I don't know that area
> of the code very well, but I have looked enough to guess that the
> underlying "originals" map in AbstractConfig is what we would want to use
> when making a reference to something.  It would eliminate the listener
> prefix namespacing confusion if we always refer to the key as originally
> provided.
>
> I'm willing to go with doing substitution once, up-front, at the
> cluster/producer/consumer config level, and supporting substitution for
> JAAS configs only when provided via sasl.jaas.config.  I'm willing to try
> the coding to introduce it at that level -- tentative given my
> unfamiliarity with the code and its subtleties, but willing to try.  Let me
> chew on it for a day or two and see what I can make happen.  In case you
> want to try as well, you can pull the current implementation (which I think
> is in good shape and might only need cosmetic/stylistic code review changes
> as opposed to wholesale API adjustments) from the KAFKA-6664 branch of
> https://github.com/rondagostino/kafka.git.
>
> Ron
>
> On Fri, Apr 13, 2018 at 7:49 AM, Rajini Sivaram 
> wrote:
>
> > Hi Ron,
> >
> > Thanks for the notes and KIP update.
> >
> > Handling `sasl.jaas.config` as a special case is fine, but it would be
> > better if we can do any substitutions before we create a `Configuration`
> > object rather than expect the login module to do the substitution. That
> > way, we will have a consistent substitution format for all login modules
> > including built-in ones. And since we have users who already have their
> own
> > login modules (before KIP-86), they will benefit from substitution too
> > without adding additional code to the login module, But you have thought
> > about this more in the context of OAuth, so this is more of a question.
> Are
> > there use cases that require substitutions to be performed in a login
> > module that cannot be done if the substitution is performed when the
> > configuration is parsed?
> >
> > The ability to refer to other keys is generally quite useful. But as you
> > said, "*there is a namespacing of sorts going on*". Even with regular
> > configs, we have listener prefix, which is also a "*namespacing of
> sorts"*.
> > Our current config framework doesn't represent these well. As you already
> > noticed before, there is magic that removes prefixes, flattening the
> > namespace. Perhaps that is not an issue if we want to allow references to
> > keys that are in the global namespace (non-listener-prefixed) as well.
> But
> > we probably want to make sure namespaces are handled consistently for `
> > sasl.jaas.config` and other configs.
> >
> >
> >
> > On Tue, Apr 10, 2018 at 3:41 AM, Ron Dagostino 
> wrote:
> >
> > > Hi folks.  I updated KIP 269 to help clarify some of the issues
> mentioned

Re: [DISCUSS] KIP-269: Substitution Within Configuration Values

2018-04-13 Thread Ron Dagostino
Hi Rajini.  Regarding processing the sasl.jaas.config value up-front, there
are a couple of things that occur to me about it.  First, the older way of
storing the JAAS config in a separate file is still supported (and is at
this time the prevalent mechanism on the broker side since sasl.jaas.config
support for brokers was only recently added via KIP 226).  We could state
that substitution is only supported via sasl.jaas.config and force people
to convert over to get substitution functionality, but that wouldn't be
necessary if we let the login module do the substitution later on.

The second thing that occurs to me is related to namespacing and creates
tension with the first point above.  If we refer to the "fubar" key in the
config, is that key a JAAS module option or is it a value in the
cluster/producer/consumer config?  It would be very positive if we could
eliminate namespacing entirely such that when you reference another key it
is always very clear what is being referred to -- i.e. always a key in the
cluster/producer/consumer config.  Otherwise the docs have to spell out
when it is one versus the other.

That is a good point about being able to provide substitution support to
SASL/GSSAPI (which relies upon login module code that we do not control) if
we choose the simple, consistent way of doing things up front.

You asked if there are OAuth use cases that require substitutions to be
performed in a login module that cannot be done if the substitution is
performed when the configuration is parsed.  I don't think so, no; the
timing should not matter.

I hadn't thought about the listener prefix issue.  I don't know that area
of the code very well, but I have looked enough to guess that the
underlying "originals" map in AbstractConfig is what we would want to use
when making a reference to something.  It would eliminate the listener
prefix namespacing confusion if we always refer to the key as originally
provided.

I'm willing to go with doing substitution once, up-front, at the
cluster/producer/consumer config level, and supporting substitution for
JAAS configs only when provided via sasl.jaas.config.  I'm willing to try
the coding to introduce it at that level -- tentative given my
unfamiliarity with the code and its subtleties, but willing to try.  Let me
chew on it for a day or two and see what I can make happen.  In case you
want to try as well, you can pull the current implementation (which I think
is in good shape and might only need cosmetic/stylistic code review changes
as opposed to wholesale API adjustments) from the KAFKA-6664 branch of
https://github.com/rondagostino/kafka.git.

Ron

On Fri, Apr 13, 2018 at 7:49 AM, Rajini Sivaram 
wrote:

> Hi Ron,
>
> Thanks for the notes and KIP update.
>
> Handling `sasl.jaas.config` as a special case is fine, but it would be
> better if we can do any substitutions before we create a `Configuration`
> object rather than expect the login module to do the substitution. That
> way, we will have a consistent substitution format for all login modules
> including built-in ones. And since we have users who already have their own
> login modules (before KIP-86), they will benefit from substitution too
> without adding additional code to the login module, But you have thought
> about this more in the context of OAuth, so this is more of a question. Are
> there use cases that require substitutions to be performed in a login
> module that cannot be done if the substitution is performed when the
> configuration is parsed?
>
> The ability to refer to other keys is generally quite useful. But as you
> said, "*there is a namespacing of sorts going on*". Even with regular
> configs, we have listener prefix, which is also a "*namespacing of sorts"*.
> Our current config framework doesn't represent these well. As you already
> noticed before, there is magic that removes prefixes, flattening the
> namespace. Perhaps that is not an issue if we want to allow references to
> keys that are in the global namespace (non-listener-prefixed) as well. But
> we probably want to make sure namespaces are handled consistently for `
> sasl.jaas.config` and other configs.
>
>
>
> On Tue, Apr 10, 2018 at 3:41 AM, Ron Dagostino  wrote:
>
> > Hi folks.  I updated KIP 269 to help clarify some of the issues mentioned
> > previously.  In particular, I added a new single-method UnderlyingValues
> > interface to make it clear how data is to be provided to
> > SubstitutableValues, and I added information about if/how the underlying
> > values might be re-read in case they can potentially change (basically an
> > instance of SubstitutableValues never re-reads anything, so if the
> > underlying values are expected to change a new instance of
> > SubstitutableValues must be allocated in order to have any chance of
> seeing
> > those changes).  I kept the KIP focused on the same JAAS use case for
> now,
> > but these additions/clarifications should help 

Re: [DISCUSS] KIP-269: Substitution Within Configuration Values

2018-04-13 Thread Rajini Sivaram
Hi Ron,

Thanks for the notes and KIP update.

Handling `sasl.jaas.config` as a special case is fine, but it would be
better if we can do any substitutions before we create a `Configuration`
object rather than expect the login module to do the substitution. That
way, we will have a consistent substitution format for all login modules
including built-in ones. And since we have users who already have their own
login modules (before KIP-86), they will benefit from substitution too
without adding additional code to the login module, But you have thought
about this more in the context of OAuth, so this is more of a question. Are
there use cases that require substitutions to be performed in a login
module that cannot be done if the substitution is performed when the
configuration is parsed?

The ability to refer to other keys is generally quite useful. But as you
said, "*there is a namespacing of sorts going on*". Even with regular
configs, we have listener prefix, which is also a "*namespacing of sorts"*.
Our current config framework doesn't represent these well. As you already
noticed before, there is magic that removes prefixes, flattening the
namespace. Perhaps that is not an issue if we want to allow references to
keys that are in the global namespace (non-listener-prefixed) as well. But
we probably want to make sure namespaces are handled consistently for `
sasl.jaas.config` and other configs.



On Tue, Apr 10, 2018 at 3:41 AM, Ron Dagostino  wrote:

> Hi folks.  I updated KIP 269 to help clarify some of the issues mentioned
> previously.  In particular, I added a new single-method UnderlyingValues
> interface to make it clear how data is to be provided to
> SubstitutableValues, and I added information about if/how the underlying
> values might be re-read in case they can potentially change (basically an
> instance of SubstitutableValues never re-reads anything, so if the
> underlying values are expected to change a new instance of
> SubstitutableValues must be allocated in order to have any chance of seeing
> those changes).  I kept the KIP focused on the same JAAS use case for now,
> but these additions/clarifications should help if we want to expand the
> scope to cluster/producer/consumer configs.
>
> Ron
>
> On Mon, Apr 9, 2018 at 11:22 AM, Ron Dagostino  wrote:
>
> > Hi folks.  Here is a summary of where I think we stand on this KIP and
> > what I believe it means for how we move forward.
> >
> >
> >- There is some desire to use substitution more broadly beyond just
> >JAAS module options.  Specifically, cluster/producer/consumer config
> values
> >such as ssl.keystore.password are places where substitution adds value
> >(dormant KIP 76
> > 76+Enable+getting+password+from+executable+rather+than+
> passing+as+plaintext+in+config+files>
> >was an attempt to add value here in the past).
> >- *More broad review of this KIP is needed given the potential for its
> >expanded scope*
> >- If substitution is applied more broadly, then the sasl.jaas.config
> >value should not have substitution performed on it at the same times
> as
> >other cluster/producer/consumer configs; that value should be passed
> >unchanged to the login module where substitution can be applied later.
> >- There are some adjustments to this KIP that should be made to
> >reflect the possibility of more broad use:
> >
> >
> >1. The use of delimiters that trigger a substitution attempt but that
> >   fail to parse should result in the text being passed through
> unchanged
> >   instead of raising an exception
> >   2. The application of substitution should generally be on an opt-in
> >   basis
> >   3. The implicit fact that substitution was associated with a
> >   namespace of sorts (i.e. saying that a default came from a
> particular key
> >   meant a JAAS module option) needs to be made explicit.  The
> namespace is
> >   defined by the Map that is passed into the SubstitutableValues()
> constructor
> >   4. It is not clear to me if the Map that is passed into the
> >   SubstitutableValues() constructor can be relied upon to contain
> only String
> >   values in the context of cluster/producer/consumer configs.  The
> >   AbstractConfig's so-called "originals" map seems to support values
> of type
> >   String, Boolean, Password, Integer, Short, Long, Number, List, and
> Class.
> >   It is not difficult to support non-String values in the Map that
> is passed
> >   to the SubstitutableValues() constructor, so I can explicitly add
> support
> >   for that.
> >
> > I don't think these changes impact usage in a JAAS context, so they do no
> > harm to the original use case while increasing the potential for more
> broad
> > use of the concept.
> >
> >
> > Ron
> >
> >
> > On Sun, Apr 8, 2018 at 8:46 PM, Ron Dagostino