[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863279#comment-16863279 ]

Adar Dembo commented on KUDU-1948:

Thanks, [~acelyc111]. I think we should leave this open until we plumb the configuration file into clients.

> Client-side configuration of cluster details
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
> Issue Type: New Feature
> Components: client, security
> Affects Versions: 1.3.0
> Reporter: Todd Lipcon
> Assignee: Yingchun Lai
> Priority: Major
>
> In the beginning, Kudu clients were configured with only the address of the
> single Kudu master. This was nice and simple, and there was no need for a
> client "configuration file".
> Then, we added multi-masters, and the client API had to take a list of master
> addresses. This wasn't awful, but started to be a bit aggravating when trying
> to use tools on a multi-master cluster (who wants to type out three long
> hostnames in a 'ksck' command line every time?).
> Now with security, we have a couple more bits of configuration for the
> client. Namely:
> - "require SSL" and "require authentication" booleans -- necessary to prevent
> MITM downgrade attacks
> - custom Kerberos principal -- if the server wants to use a principal other
> than 'kudu/@REALM' then the client needs to know to expect it and fetch
> the appropriate service ticket. (Note this isn't yet supported but would like
> to be!)
> In the future, there are other items that might be best specified as part of
> a client configuration as well (e.g. CA cert for BYO PKI, wire compression
> options, etc).
> For the above use cases it would be nicer to allow the various options to be
> specified in a configuration file rather than adding specific APIs for all
> options.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862737#comment-16862737 ]

Yingchun Lai commented on KUDU-1948:

In 3b58cfb3bd7ff39ed3f7382d8cca5e00d44d9c2d, I added a cluster name resolver for the CLI tools, and in dfd516dd0697e30eb810b249bb87ad9358bb8545, I added some docs for it.
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860131#comment-16860131 ]

Adar Dembo commented on KUDU-1948:

Thanks [~acelyc111]! Could you convert that into a change to a file in {{docs/}}? Maybe to {{administration.adoc}}?
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859818#comment-16859818 ]

Yingchun Lai commented on KUDU-1948:

Now that we have added a config file for the Kudu CLI tools, we can alternatively access a Kudu cluster by name, with a command like "kudu tserver list @cluster_name". The 'cluster_name' is configured in a YAML config file, ${KUDU_CONFIG}/kudurc, whose content looks like:

{code:java}
clusters_info:
  cluster_name1:
    master_addresses: ip1:port1,ip2:port2,ip3:port3
  cluster_name2:
    master_addresses: ip4:port4
{code}

When using the CLI tools, if the master_addresses argument starts with the character '@', the tool treats the rest of the string as a cluster name, parses the config file mentioned above, and uses that cluster's master_addresses value to access the cluster. If the master_addresses argument does NOT start with '@', the tool treats it directly as a list of master addresses, as before.
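The '@' resolution rule described in this comment can be sketched in a few lines of Python. This is only an illustration, not Kudu's actual implementation: the function name `resolve_master_addresses` is hypothetical, and the config is assumed to have already been parsed from the YAML file into nested dictionaries (the YAML parsing itself is left out).

```python
# Hypothetical sketch of '@cluster_name' resolution; not Kudu's actual code.
# Assumes the kudurc YAML has already been parsed into nested dicts.
KUDURC = {
    "clusters_info": {
        "cluster_name1": {"master_addresses": "ip1:port1,ip2:port2,ip3:port3"},
        "cluster_name2": {"master_addresses": "ip4:port4"},
    }
}


def resolve_master_addresses(arg, config):
    """Return the master address list for a CLI master_addresses argument."""
    if not arg.startswith("@"):
        # No '@' prefix: the argument is already a master address list.
        return arg
    cluster_name = arg[1:]
    clusters = config.get("clusters_info", {})
    if cluster_name not in clusters:
        raise KeyError(f"cluster '{cluster_name}' not found in config")
    return clusters[cluster_name]["master_addresses"]


if __name__ == "__main__":
    print(resolve_master_addresses("@cluster_name2", KUDURC))
    print(resolve_master_addresses("ip5:port5,ip6:port6", KUDURC))
```

An unknown cluster name raises an error in this sketch rather than silently falling back to treating the string as an address.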
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851045#comment-16851045 ]

Adar Dembo commented on KUDU-1948:

[~acelyc111] merged a minimal config implementation in 3b58cfb3b. I'm leaving this open because he's going to write some docs that explain the config file format and how it works.

> Client-side configuration of cluster details
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
> Issue Type: New Feature
> Components: client, security
> Affects Versions: 1.3.0
> Reporter: Todd Lipcon
> Assignee: Grant Henke
> Priority: Major
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815242#comment-16815242 ]

Yingchun Lai commented on KUDU-1948:

[~tlipcon] Do you have some advice?
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811697#comment-16811697 ]

Grant Henke commented on KUDU-1948:

I am on board with everything proposed. I think I am okay with a default client config path too, assuming it can be overridden.
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806236#comment-16806236 ]

Adar Dembo commented on KUDU-1948:

[~acelyc111] sorry for not responding earlier; I'm hoping other people chime in so we can see whether there is a rough consensus for your proposal. I for one am on board with YAML parsing, but somewhat hesitant about whether the CLI should automatically opt into the client config, and am curious to hear what others think about it.
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800423#comment-16800423 ]

Yingchun Lai commented on KUDU-1948:

Some concerns about using gflags:
* Using `--flagfile file_path` doesn't seem to shorten the command line much.
* The order of the command line arguments may look odd, e.g.
{code:java}
kudu table rename old_name --flagfile /path/to/cluster_one new_name
{code}

I agree that comments in JSON are either unsupported or hacked in (on my own branch, I do indeed use an extra "comment" field).

So my points are:
* The configuration file is for the CLI tools, not for the client library; we can add a cluster name resolution feature to the tools.
* Do not change the existing command line style, i.e. keep the cluster name as a required argument following the action name.
* Use YAML as the config file format (we would have to introduce a third-party YAML parser for it).
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800185#comment-16800185 ]

Adar Dembo commented on KUDU-1948:

[~acelyc111] couldn't you do cluster name resolution via different flag files? For example:

File named 'cluster_one' with contents:
{noformat}
--master_addresses=host1,host2,host3
{noformat}

File named 'cluster_two' with contents:
{noformat}
--master_addresses=host4,host5,host6
{noformat}

Then switching between clusters becomes:
{noformat}
kudu table list --flagfile /path/to/cluster_one
{noformat}

Or:
{noformat}
kudu table list --flagfile /path/to/cluster_two
{noformat}

From reading through Todd and Dan's past comments, it sounds like there's still some uncertainty as to whether CLI tools should automatically opt into client configs from a well-known location or not. If not, then I think flag files, although clunkier than pure cluster name resolution, get us 80% of the way there. What do you think?

{quote}
We can reuse most code of JsonReader, and introduce a new class JsonFileReader to read configurations from a JSON config file, place it in a path like $KUDU_HOME, so it's not needed to add any new gflags.
{quote}

I would strongly recommend against using JSON for configuration because you can't use comments, and comments are really important for config file maintenance. Some JSON parsers support comments, and there are hacks (i.e. include a "comment" field in objects that is ignored), but by and large support is not universal and therefore rare. This is one of the reasons that Todd originally suggested YAML, and I'd be fine with that or any other format that supports commenting.
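To make the one-flag-file-per-cluster idea concrete, here is a minimal Python sketch of how such a file maps to a master address list. It is a simplified stand-in for illustration, not a reimplementation of gflags' `--flagfile` handling: it only understands `--name=value` lines and skips blank lines and lines starting with `#`. The helper names `read_flagfile` and `write_example` are assumptions made for this example.

```python
import os
import tempfile


def read_flagfile(path):
    """Parse simple '--name=value' lines into a dict (simplified sketch)."""
    flags = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment lines
            name, sep, value = line.lstrip("-").partition("=")
            if sep:
                flags[name] = value
    return flags


def write_example(directory, cluster, hosts):
    """Write a one-flag file for a cluster and return its path."""
    path = os.path.join(directory, cluster)
    with open(path, "w") as f:
        f.write("# masters for " + cluster + "\n")
        f.write("--master_addresses=" + ",".join(hosts) + "\n")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        one = write_example(d, "cluster_one", ["host1", "host2", "host3"])
        print(read_flagfile(one)["master_addresses"])
```

Switching clusters then amounts to pointing at a different file, which is the trade-off discussed in the comment: no new config format is needed, but the file path is still part of every command line.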
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799491#comment-16799491 ]

Yingchun Lai commented on KUDU-1948:

Agree with [~danburkert]. But I want to introduce a simple cluster name resolver for the CLI tools. A CLI tool is an application, right? I think it's reasonable to introduce a simple configuration file for it.

As a Kudu administrator, I find it tedious to type the ip:port of multiple masters whenever I use the CLI tools to access a cluster; the addresses are long and not easy to remember. Instead, it would be easier to use something like:
{code:java}
kudu table list cluster_name
{code}

We could use either a master address list or a cluster name to access a cluster. Of course, there should be a way to distinguish them, e.g.:
* a cluster name should not contain any ':' or ','
* a master address list string must contain ':' or ','
* the port should not be omitted from a master address in the list, even when it is the default port

We can reuse most of the JsonReader code and introduce a new class, JsonFileReader, to read configurations from a JSON config file placed in a path like $KUDU_HOME, so there is no need to add any new gflags.
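The proposed rules for telling a cluster name from a master address list could be checked roughly as follows. This Python sketch only illustrates the proposal; the function names `is_cluster_name` and `describe` are hypothetical, and the check relies on the assumption stated in the proposal that every master address carries an explicit port.

```python
def is_cluster_name(arg):
    """Apply the proposed rule: cluster names contain neither ':' nor ','."""
    return ":" not in arg and "," not in arg


def describe(arg):
    """Label a CLI argument as a cluster name or a master address list."""
    return "cluster name" if is_cluster_name(arg) else "master address list"


if __name__ == "__main__":
    for arg in ["cluster_name", "ip1:port1", "ip1:port1,ip2:port2"]:
        print(arg, "->", describe(arg))
```

Under this rule a bare hostname without a port would be read as a cluster name, which is why the proposal asks for the port to always be spelled out.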
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361544#comment-16361544 ]

Dan Burkert commented on KUDU-1948:

I want to chime in here since I've traditionally played the devil's advocate position that Kudu should _not_ have client configs. The 'guiding principle' behind this argument is that libraries should not include a configuration framework*. A configuration framework should purely be the concern of the end-user application. This argument is muddied somewhat by a combination of factors:
* The split between application and library isn't always clear. In Kudu's case it's clear that the master and tserver processes are applications, while the clients are libraries. The provided CLI tools are less clear, but in this context I consider them applications (and indeed, they already ship with the gflags configuration framework).
* The JVM and associated ecosystem have historically done a poor job at distinguishing between applications and libraries. JAR files are meant to serve both purposes, and as a result they do each badly***.
* Hadoop and associated ecosystem have historically done a poor job at distinguishing between applications and libraries. The Hadoop Configuration class/framework is used pervasively, which leads to a host of issues.

> If supported by kudu-spark this would help reduce the friction to
> reading/writing Kudu data -- just put in your table name and go!

Client configs are often used as a poor substitute for service discovery**. Although not widely recognized as such, Hadoop _already has_ a service discovery component: the Hive MetaStore. It's on the Kudu road map to integrate with the HMS, at which point Spark and other users can discover Kudu tables there, along with the information necessary to connect (e.g. master addresses). Note that the same guiding principle applies to service discovery: only applications should be using it; libraries should never, for instance, have a built-in HMS or Zookeeper or etcd connection.

* In this context, 'configuration framework' means something that picks up config properties from well-known locations on disk, or from the environment, or from a database/zookeeper, or more generally anything not passed explicitly to the library through an API. Not included under 'configuration framework' are APIs for passing configuration into the library, including builders and un-typed map style APIs.

** They are a poor substitute because they are not centrally managed, so changes must be pushed separately to every client configuration copy. Vendors have papered over this by making it easy with the equivalent of a distributed scp, but the fundamental crappiness of the solution remains.

*** This is why, I'm convinced, patterns like DI flourish in Java. They are over-engineered band-aids which address the symptoms of failing to keep the lines between library and application clean.

I'm fully aware of how absurd it is to suggest adding _yet another_ responsibility to the HMS, one which it will inevitably be pretty poor at, but the fact of the matter is that the HMS already serves this role. In my opinion it's better to acknowledge that the HMS serves this role and work towards improving its suitability than to indirectly paper over the issue with client-side configs.
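The distinction drawn in this comment, between passing configuration into a library through an API and having the library pick it up from a well-known location, can be illustrated with a small Python sketch. All names here (`ClientBuilder`, `EXAMPLE_MASTERS`, `main`) are hypothetical and do not correspond to any Kudu client API; the only point is that the application, not the library, decides where configuration comes from.

```python
import os


class ClientBuilder:
    """Library side: accepts configuration only through explicit calls."""

    def __init__(self):
        self._masters = []
        self._require_authentication = False

    def master_addresses(self, addresses):
        self._masters = list(addresses)
        return self

    def require_authentication(self, required):
        self._require_authentication = bool(required)
        return self

    def build(self):
        if not self._masters:
            raise ValueError("at least one master address is required")
        # A real library would connect here; the sketch returns the settings.
        return {
            "masters": self._masters,
            "require_authentication": self._require_authentication,
        }


def main(environ):
    """Application side: owns the decision of where configuration lives."""
    raw = environ.get("EXAMPLE_MASTERS", "")
    masters = [m for m in raw.split(",") if m]
    builder = ClientBuilder().master_addresses(masters)
    return builder.require_authentication(True).build()


if __name__ == "__main__":
    print(main(dict(os.environ, EXAMPLE_MASTERS="host1,host2")))
```

In this arrangement the library never touches the environment or the file system, so swapping the environment variable for a config file or a discovery service only changes `main`.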
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361317#comment-16361317 ]

Jeremy Beard commented on KUDU-1948:

If supported by kudu-spark this would help reduce the friction to reading/writing Kudu data -- just put in your table name and go!

It would be good for Envelope too where there's currently a lot of incentive to roll your own client config file for import by each pipeline config file, in order to avoid hard-coding the Kudu master addresses all over the place.
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319476#comment-16319476 ]

Todd Lipcon commented on KUDU-1948:

https://kudu.apache.org/community.html

> Client-side configuration of cluster details
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
> Issue Type: New Feature
> Components: client, security
> Affects Versions: 1.3.0
> Reporter: Todd Lipcon
> Assignee: Grant Henke
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319352#comment-16319352 ]

Min Du commented on KUDU-1948:

Hi Todd,

Thank you for your prompt response. I agree that this ticket may not be the right place. Could you please provide a link to the user mailing list? (Sorry, I am not familiar with where Kudu issues should be reported.)

Thanks a lot.

Cheers,
Min

> Client-side configuration of cluster details
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
> Issue Type: New Feature
> Components: client, security
> Affects Versions: 1.3.0
> Reporter: Todd Lipcon
> Assignee: Grant Henke
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319342#comment-16319342 ]

Todd Lipcon commented on KUDU-1948:

Hi Min. I think it would be best to ask this question on the user mailing list. This ticket is for cluster-wide configurations, whereas timeouts should be a client-specific setting.

> Client-side configuration of cluster details
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
> Issue Type: New Feature
> Components: client, security
> Affects Versions: 1.3.0
> Reporter: Todd Lipcon
> Assignee: Grant Henke
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933592#comment-15933592 ]

Adar Dembo commented on KUDU-1948:

I have a few questions:
* Does a configuration file preclude client APIs? That is, if there's a file-based mechanism for specifying something like require_authentication, does that mean there's no corresponding API call for it? I'd argue we need both: an API for completeness (and consistency with existing API options like master_addresses) and a config file for simplicity.
* If an option can be specified via both the client API and the config file, which takes precedence? I'd argue that the client API takes precedence.
* require_authentication and require_encryption could be viewed as application-specific. Suppose the server's rpc_authentication is set to 'optional'. This means applications get to choose whether authentication is a requirement for them or not, right?

> Client-side configuration of cluster details
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
> Issue Type: New Feature
> Components: client, security
> Affects Versions: 1.3.0
> Reporter: Todd Lipcon
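To make the precedence question concrete, here is a minimal sketch of the "client API wins over config file" rule suggested in the comment above. The class and method names (SettingResolver, resolve) are hypothetical and exist only for illustration; nothing here is part of the Kudu client API, and the ticket does not settle which source should win.

```java
import java.util.Optional;

// Hypothetical helper, not part of any Kudu client library.
final class SettingResolver {
    private SettingResolver() {}

    /**
     * Picks the effective value of one option.
     * Assumed order: explicit API call, then config file, then built-in default.
     */
    static <T> T resolve(Optional<T> fromApi, Optional<T> fromFile, T defaultValue) {
        // A value set through the API overrides whatever the file says.
        return fromApi.orElseGet(() -> fromFile.orElse(defaultValue));
    }
}
```

For example, resolve(Optional.empty(), Optional.of(true), false) falls back to the file's value. Note that letting an application override an operator-set security option is exactly the policy question raised in the third bullet, so a real design might treat require_authentication differently from other options.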
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details
[ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931961#comment-15931961 ]

Todd Lipcon commented on KUDU-1948:

I chatted offline with [~danburkert] about this for a few minutes last week. Our proposal was something like the following:

- the client builder API would continue to have no "default" behavior. But it would gain a new call, something like:
{code}
new KuduClientBuilder().loadConfigurationForCluster("my-cluster")
{code}

This would have the effect of looking in various locations for a configured cluster called 'my-cluster':
- $KUDUCONFIG
- $HOME/.kudurc
- /etc/kudu/kudurc

These would be some simple files (perhaps YAML) that look like:
{code}
clusters:
  my-cluster:
    masters:
      - foo1.example.com
      - foo2.example.com
      - foo3.example.com
    require_authentication: true
    require_encryption: true
    master_kerberos_principal: "my-custom-master-principal/_HOST@MY_REALM"
    tserver_kerberos_principal: "my-custom-master-principal/_HOST@MY_REALM"
  other-cluster:
    masters:
      - other.example.com
{code}

We also established some guiding principles:
- we should use these files only for configurations that we'd expect the _operator_ to be setting (e.g. security policies) and not for anything we expect that different applications would want to configure differently (e.g. timeouts)
- all configs should be clearly scoped per-cluster (to preserve the ability to do cross-cluster applications without gymnastics)
- these files should _only_ be read from the client, and not from servers
- these files should be referenced only when an API explicitly references them (e.g. the "loadConfigurationForCluster()" API). We should avoid implicit behavior in library code.
-- Command line tools like 'kudu table list' could potentially be more implicit, or they could take a cluster identifier.

All the above is just a brainstorm/draft, subject to change of course. When we get to actually implementing this, we should transfer everything into a Google doc, do the normal design/review process, etc.

> Client-side configuration of cluster details
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
> Issue Type: New Feature
> Components: client, security
> Affects Versions: 1.3.0
> Reporter: Todd Lipcon
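The file lookup described in the proposal could be sketched roughly as follows. This is only an illustration under stated assumptions: the class KuduRcLocator and its methods are hypothetical, and the "first existing file wins, in the order listed" rule is an assumption, since the proposal only says the client would look in "various locations". Parsing the file and selecting the named cluster are left out.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper, not part of any Kudu client library.
final class KuduRcLocator {
    private KuduRcLocator() {}

    /** Candidate paths in the assumed priority order: $KUDUCONFIG, $HOME/.kudurc, /etc/kudu/kudurc. */
    static List<Path> candidates(Map<String, String> env) {
        List<Path> paths = new ArrayList<>();
        String explicit = env.get("KUDUCONFIG");
        if (explicit != null && !explicit.isEmpty()) {
            paths.add(Paths.get(explicit));
        }
        String home = env.get("HOME");
        if (home != null && !home.isEmpty()) {
            paths.add(Paths.get(home, ".kudurc"));
        }
        paths.add(Paths.get("/etc/kudu/kudurc"));
        return paths;
    }

    /** Returns the first candidate accepted by the given existence check, if any. */
    static Optional<Path> firstExisting(Map<String, String> env, Predicate<Path> exists) {
        return candidates(env).stream().filter(exists).findFirst();
    }

    /** Convenience overload that checks the real environment and file system. */
    static Optional<Path> firstExisting() {
        return firstExisting(System.getenv(), p -> Files.isRegularFile(p));
    }
}
```

Passing the environment and the existence check as parameters keeps the lookup explicit, in line with the principle above that library code should avoid implicit behavior, and makes the ordering easy to exercise without touching the real file system.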