[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-06-13 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863279#comment-16863279
 ] 

Adar Dembo commented on KUDU-1948:
--

Thanks, [~acelyc111]. I think we should leave this open until we plumb the 
configuration file into clients.

> Client-side configuration of cluster details
> 
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
>  Issue Type: New Feature
>  Components: client, security
>Affects Versions: 1.3.0
>Reporter: Todd Lipcon
>Assignee: Yingchun Lai
>Priority: Major
>
> In the beginning, Kudu clients were configured with only the address of the 
> single Kudu master. This was nice and simple, and there was no need for a 
> client "configuration file".
> Then, we added multi-masters, and the client API had to take a list of master 
> addresses. This wasn't awful, but started to be a bit aggravating when trying 
> to use tools on a multi-master cluster (who wants to type out three long 
> hostnames in a 'ksck' command line every time?).
> Now with security, we have a couple more bits of configuration for the 
> client. Namely:
> - "require SSL" and "require authentication" booleans -- necessary to prevent 
> MITM downgrade attacks
> - custom Kerberos principal -- if the server wants to use a principal other 
> than 'kudu/@REALM' then the client needs to know to expect it and fetch 
> the appropriate service ticket. (Note this isn't yet supported but would like 
> to be!)
> In the future, there are other items that might be best specified as part of 
> a client configuration as well (e.g. CA cert for BYO PKI, wire compression 
> options, etc).
> For the above use cases it would be nicer to allow the various options to be 
> specified in a configuration file rather than adding specific APIs for all 
> options.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-06-12 Thread Yingchun Lai (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862737#comment-16862737
 ] 

Yingchun Lai commented on KUDU-1948:


In 3b58cfb3bd7ff39ed3f7382d8cca5e00d44d9c2d, I added a cluster name resolver for 
CLI tools, and in dfd516dd0697e30eb810b249bb87ad9358bb8545, I added some docs 
for it.



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-06-10 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860131#comment-16860131
 ] 

Adar Dembo commented on KUDU-1948:
--

Thanks [~acelyc111]! Could you convert that into a change to a file in 
{{docs/}}? Maybe to {{administration.adoc}}?



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-06-10 Thread Yingchun Lai (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859818#comment-16859818
 ] 

Yingchun Lai commented on KUDU-1948:


Now that we have added a config file for the Kudu CLI tools, we can 
alternatively access a Kudu cluster with a command like 
"kudu tserver list @cluster_name".

The 'cluster_name' is configured in a YAML config file, 
${KUDU_CONFIG}/kudurc, whose content looks like:

 
{code:java}
clusters_info:
  cluster_name1:
    master_addresses: ip1:port1,ip2:port2,ip3:port3
  cluster_name2:
    master_addresses: ip4:port4
{code}
 

When using the CLI tools, if the master_addresses argument starts with the 
character '@', the tool treats the rest of the string as a cluster name, looks 
that cluster up in the config file described above, and uses its 
master_addresses value to connect. If the master_addresses argument does NOT 
start with '@', the tool treats it as a list of master addresses, as before.
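
The lookup described above can be sketched in a few lines of Python. This is 
only an illustration, not the actual implementation in the CLI tools: the names 
`kudurc_path` and `resolve_master_addresses` are hypothetical, and the config 
is assumed to have already been parsed from YAML into a nested mapping.

```python
import os


def kudurc_path(environ=os.environ):
    """Return ${KUDU_CONFIG}/kudurc, or None if KUDU_CONFIG is unset (hypothetical helper)."""
    config_dir = environ.get("KUDU_CONFIG")
    return os.path.join(config_dir, "kudurc") if config_dir else None


def resolve_master_addresses(arg, config):
    """Map '@cluster_name' to its master_addresses; pass anything else through."""
    if not arg.startswith("@"):
        # No '@' prefix: treat the argument as master addresses, as before.
        return arg
    name = arg[1:]
    clusters = config.get("clusters_info", {})
    if name not in clusters:
        raise KeyError(f"cluster '{name}' not found in config")
    return clusters[name]["master_addresses"]


# The example config, as it might look after YAML parsing (assumed shape).
CONFIG = {
    "clusters_info": {
        "cluster_name1": {"master_addresses": "ip1:port1,ip2:port2,ip3:port3"},
        "cluster_name2": {"master_addresses": "ip4:port4"},
    }
}
```

With this sketch, `resolve_master_addresses("@cluster_name2", CONFIG)` yields 
the addresses configured for that cluster, while an argument without the '@' 
prefix is returned unchanged.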

 



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-05-29 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851045#comment-16851045
 ] 

Adar Dembo commented on KUDU-1948:
--

[~acelyc111] merged a minimal config implementation in 3b58cfb3b. I'm leaving 
this open because he's going to write some docs that explain the config file 
format and how it works.

> Client-side configuration of cluster details
> 
>
> Key: KUDU-1948
> URL: https://issues.apache.org/jira/browse/KUDU-1948
> Project: Kudu
>  Issue Type: New Feature
>  Components: client, security
>Affects Versions: 1.3.0
>Reporter: Todd Lipcon
>Assignee: Grant Henke
>Priority: Major
>


[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-04-11 Thread Yingchun Lai (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815242#comment-16815242
 ] 

Yingchun Lai commented on KUDU-1948:


[~tlipcon] Do you have some advice?



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-04-07 Thread Grant Henke (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811697#comment-16811697
 ] 

Grant Henke commented on KUDU-1948:
---

I am onboard with everything proposed. I think I am okay with a default client 
config path too assuming it can be overridden. 



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-03-31 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806236#comment-16806236
 ] 

Adar Dembo commented on KUDU-1948:
--

[~acelyc111] sorry for not responding earlier; I'm hoping other people chime in 
so we can see whether there is a rough consensus for your proposal.

I for one am on board with YAML parsing, but somewhat hesitant about whether 
the CLI should automatically opt into the client config, and am curious to hear 
what others think about it.

 



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-03-24 Thread Yingchun Lai (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800423#comment-16800423
 ] 

Yingchun Lai commented on KUDU-1948:


Some concerns about gflags:
 * Using `--flagfile file_path` doesn't seem to shorten the command line much.
 * The order of the command line arguments may look odd, e.g. 
{code:java}
kudu table rename old_name --flagfile /path/to/cluster_one new_name{code}

I agree that comments in JSON are either unsupported or hacked in (on my own 
branch, I do indeed use an extra "comment" field).

So my points are:
 * The configuration file is for the CLI tools, not for the client library; we 
can add a cluster name resolution feature to them.
 * Do not change the existing command line style, i.e. keep the cluster name as 
a required argument following the action name.
 * Use YAML as the config file format (we will have to introduce a third-party 
YAML parser for it).

 



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-03-24 Thread Adar Dembo (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800185#comment-16800185
 ] 

Adar Dembo commented on KUDU-1948:
--

[~acelyc111] couldn't you do cluster name resolution via different flag files? 
For example:

File named 'cluster_one' with contents:
{noformat}
--master_addresses=host1,host2,host3
{noformat}

File named 'cluster_two' with contents:
{noformat}
--master_addresses=host4,host5,host6
{noformat}

Then switching between clusters becomes:
{noformat}
kudu table list --flagfile /path/to/cluster_one
{noformat}

Or:
{noformat}
kudu table list --flagfile /path/to/cluster_two
{noformat}
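
As a rough sketch of how such per-cluster flag files could be generated and 
used, the Python below writes one file per cluster with the contents shown 
above and assembles the matching command line. The helper names 
`write_flagfiles` and `kudu_command` are hypothetical, and nothing here 
actually runs the `kudu` binary.

```python
import os

# Cluster names and master lists taken from the example above.
CLUSTERS = {
    "cluster_one": "host1,host2,host3",
    "cluster_two": "host4,host5,host6",
}


def write_flagfiles(directory, clusters):
    """Write one flag file per cluster and return a name -> path mapping."""
    paths = {}
    for name, masters in clusters.items():
        path = os.path.join(directory, name)
        with open(path, "w") as f:
            f.write(f"--master_addresses={masters}\n")
        paths[name] = path
    return paths


def kudu_command(action, flagfile):
    """Build the argument list for a CLI invocation that uses a flag file."""
    return ["kudu", *action, "--flagfile", flagfile]
```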

From reading through Todd and Dan's past comments, it sounds like there's 
still some uncertainty as to whether CLI tools should automatically opt into 
client configs from a well-known location or not. If not, then I think flag 
files, although clunkier than pure cluster name resolution, get us 80% of the 
way there. What do you think?

{quote}
We can reuse most code of JsonReader, and introduce a new class JsonFileReader 
to read configurations from a JSON config file, place it in a path like 
$KUDU_HOME, so it's not needed to add any new gflags.
{quote}

I would strongly recommend against using JSON for configuration because you 
can't use comments, and comments are really important for config file 
maintenance. Some JSON parsers support comments, and there are hacks (i.e. 
include a "comment" field in objects that is ignored), but by and large comment 
support is not universal and is in fact rare. This is one of the reasons that 
Todd originally suggested YAML, and I'd be fine with that or any other format 
that supports commenting.
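
To illustrate the point, here is a small Python sketch using the standard 
`json` module, which is one of the parsers that rejects comments. The 
`load_config` helper and both sample documents are made up for this example; 
it shows the "comment" field hack working while a real comment fails to parse.

```python
import json


def load_config(text):
    """Parse a JSON object and drop the ignored "comment" field (the hack)."""
    raw = json.loads(text)
    return {key: value for key, value in raw.items() if key != "comment"}


# The hack: a regular field that readers agree to ignore.
WITH_HACK = '{"comment": "production cluster", "master_addresses": "host1,host2,host3"}'

# A real comment: not valid JSON, so json.loads raises json.JSONDecodeError.
WITH_REAL_COMMENT = '{\n  // production cluster\n  "master_addresses": "host1,host2,host3"\n}'
```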




[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2019-03-22 Thread Yingchun Lai (JIRA)


[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799491#comment-16799491
 ] 

Yingchun Lai commented on KUDU-1948:


Agree with [~danburkert].

But I want to introduce a simple cluster name resolver for the CLI tools. A CLI 
tool is an application, right? I think it's reasonable to introduce a simple 
configuration file for it.

As a Kudu administrator, I find it tedious to type out multiple master ip:port 
pairs whenever I use the CLI tools to access a cluster; they are long and hard 
to remember. It would be much easier to use something like:

 
{code:java}
kudu table list cluster_name{code}
We could use either a master address list or a cluster name to access a cluster.

Of course, there should be a way to distinguish them, e.g.:
 * a cluster name must not contain ':' or ','
 * a master address list string must contain ':' or ','
 * the port must not be omitted from a master address, even when it is the 
default port
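
A minimal sketch of that check in Python, assuming exactly the rules listed 
above (the function name `looks_like_master_addresses` is hypothetical):

```python
def looks_like_master_addresses(arg):
    """Return True if arg contains ':' or ',', so it cannot be a cluster name."""
    return ":" in arg or "," in arg
```

A single master given as a bare hostname would be classified as a cluster name 
by this check, which is why the last rule requires the port to always be 
spelled out.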

We can reuse most code of JsonReader, and introduce a new class JsonFileReader 
to read configurations from a JSON config file, place it in a path like 
$KUDU_HOME, so it's not needed to add any new gflags.

 



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2018-02-12 Thread Dan Burkert (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361544#comment-16361544
 ] 

Dan Burkert commented on KUDU-1948:
---

I want to chime in here since I've traditionally played the devil's advocate 
position that Kudu should _not_ have client configs. The 'guiding principle' 
behind this argument is that libraries should not include a configuration 
framework*.  A configuration framework should purely be the concern of the 
end-user application.
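
As a rough illustration of that split, the Python sketch below uses a stand-in 
`Client` class that only knows what is passed to it explicitly, while the 
application decides where configuration comes from. All names here (`Client`, 
`application_main`, the `MYAPP_KUDU_MASTERS` variable) are hypothetical and do 
not correspond to any real Kudu API.

```python
import os


class Client:
    """Stand-in for a client library: configured only through its API."""

    def __init__(self, master_addresses, require_authentication=False):
        self.master_addresses = list(master_addresses)
        self.require_authentication = require_authentication


def application_main(environ=os.environ):
    """The application owns config discovery and passes the result in explicitly."""
    # Hypothetical environment variable chosen by the application, not the library.
    masters = environ.get("MYAPP_KUDU_MASTERS", "localhost").split(",")
    return Client(master_addresses=masters)
```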

This argument is muddied somewhat by a combination of factors:
 * The split between application and library isn't always clear.  In Kudu's 
case it's clear that the master and tserver processes are applications, while 
the clients are libraries.  The provided CLI tools are less clear, but in this 
context I consider them applications (and indeed, they already ship with the 
gflags configuration framework).
 * The JVM and associated ecosystem has historically done a poor job at 
distinguishing between applications and libraries.  JAR files are meant to 
serve both purposes, and as a result they do each badly***.
 * Hadoop and associated ecosystem has historically done a poor job at 
distinguishing between applications and libraries.  The Hadoop Configuration 
class/framework is used pervasively which leads to a host of issues.

> If supported by kudu-spark this would help reduce the friction to 
> reading/writing Kudu data – just put in your table name and go!

Client configs are often used as a poor substitute for service discovery**.  
Although not widely recognized as such, Hadoop _already has_ a service 
discovery component: the Hive MetaStore.  It's on the Kudu road map to 
integrate with the HMS, at which point Spark and other users can discover Kudu 
tables along with the necessary information to connect (eg master addresses) 
there.  Note that the same guiding principle applies to service discovery: only 
applications should be using it; libraries should never, for instance, have a 
built-in HMS or Zookeeper or etcd connection.

 

* In this context, 'configuration framework' means something that picks up 
config properties from well known locations on disk, or from the environment, 
or from a database/zookeeper, or more generally anything not passed explicitly 
to the library through an API.  Not included under 'configuration framework' is 
APIs for passing configuration into the library, including builders and 
un-typed map style APIs.

** They are a poor substitute because they are not centrally managed, so 
changes must be pushed separately to every client configuration copy.  Vendors 
have papered over this by making it easy with the equivalent of a distributed 
scp, but the fundamental crappiness of the solution remains.

*** This is why, I'm convinced, patterns like DI flourish in Java.  They are 
over-engineered band-aids which address the symptoms of failing to keep the 
lines between library and application clean.

 I'm fully aware of how absurd it is to suggest adding _yet another_ 
responsibility to the HMS, one which it will inevitably be pretty poor at, but 
the fact of the matter is that the HMS already serves this role.  In my opinion 
it's better to acknowledge that the HMS serves this role and work towards 
improving its suitability than to indirectly paper over the issue with 
client-side configs.


[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2018-02-12 Thread Jeremy Beard (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361317#comment-16361317
 ] 

Jeremy Beard commented on KUDU-1948:


If supported by kudu-spark this would help reduce the friction to 
reading/writing Kudu data -- just put in your table name and go!

It would be good for Envelope too where there's currently a lot of incentive to 
roll your own client config file for import by each pipeline config file, in 
order to avoid hard-coding the Kudu master addresses all over the place.



[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2018-01-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319476#comment-16319476
 ] 

Todd Lipcon commented on KUDU-1948:
---

https://kudu.apache.org/community.html




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2018-01-09 Thread Min Du (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319352#comment-16319352
 ] 

Min Du commented on KUDU-1948:
--

Hi Todd, 

Thank you for your prompt response. 

I agree that this ticket may not be the right place. Could you please provide a 
link to the user mailing list?
(Sorry, I am not familiar with Kudu's reporting channels.)

Thanks a lot. 

Cheers,
Min




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2018-01-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319342#comment-16319342
 ] 

Todd Lipcon commented on KUDU-1948:
---

Hi Min. I think it would be best to ask this question on the user mailing list. 
This ticket is for cluster-wide configurations whereas timeouts should be a 
client-specific setting.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2017-03-20 Thread Adar Dembo (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933592#comment-15933592
 ] 

Adar Dembo commented on KUDU-1948:
--

I have a few questions:
* Does a configuration file preclude client APIs? That is, if there's a 
file-based mechanism for specifying something like require_authentication, does 
that mean there's no corresponding API call for it? I'd argue we need both; API 
for completeness (and consistency with existing API options like 
master_addresses) and config file for simplicity.
* If an option can be specified via both client API and config file, which 
takes precedence? I'd argue that the client API takes precedence.
* require_authentication and require_encryption could be viewed as 
application-specific. Suppose the server's rpc_authentication is set to 
'optional'. This means applications get to choose whether authentication is a 
requirement for them or not, right?
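
One way to picture the precedence rule suggested in the second question (an 
explicit API call beats the config file, which beats a built-in default) is the 
sketch below. SettingResolver and resolve() are hypothetical names used only for 
illustration; nothing here is an existing Kudu client API.

```java
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of the Kudu client. */
final class SettingResolver {
  private SettingResolver() {}

  /**
   * Picks the effective value of one option: a value set through the client
   * API wins, then the value from the config file, then the built-in default.
   */
  static <T> T resolve(Optional<T> fromApi, Optional<T> fromFile, T builtInDefault) {
    if (fromApi.isPresent()) {
      return fromApi.get();
    }
    return fromFile.orElse(builtInDefault);
  }
}
```

For example, resolve(Optional.of(false), Optional.of(true), false) yields false 
because the application set the option explicitly, while 
resolve(Optional.empty(), Optional.of(true), false) yields true from the file.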





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

2017-03-19 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931961#comment-15931961
 ] 

Todd Lipcon commented on KUDU-1948:
---

I chatted offline with [~danburkert] about this for a few minutes last week. 
Our proposal was something like the following:

- the client builder API would continue to have no "default" behavior. But it 
would gain a new call something like:
{code}
new KuduClientBuilder().loadConfigurationForCluster("my-cluster")
{code}

This would have the effect of looking in various locations for a configured 
cluster called 'my-cluster':
- $KUDUCONFIG
- $HOME/.kudurc
- /etc/kudu/kudurc
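
The lookup order above could be sketched roughly as follows. KudurcLocator is a 
hypothetical name, and the "first existing file wins" rule is an assumption: the 
proposal does not say whether several files would be merged. The environment and 
the file-existence check are passed in so the logic can be exercised without 
touching the real filesystem.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper for illustration only; not part of the Kudu client. */
final class KudurcLocator {
  private KudurcLocator() {}

  /** Candidate paths in the order listed above; unset variables are skipped. */
  static List<String> candidates(Map<String, String> env) {
    List<String> paths = new ArrayList<>();
    String explicit = env.get("KUDUCONFIG");
    if (explicit != null && !explicit.isEmpty()) {
      paths.add(explicit);
    }
    String home = env.get("HOME");
    if (home != null && !home.isEmpty()) {
      paths.add(home + "/.kudurc");
    }
    paths.add("/etc/kudu/kudurc");
    return paths;
  }

  /** Returns the first candidate accepted by {@code exists} (assumed rule). */
  static Optional<String> locate(Map<String, String> env, Predicate<String> exists) {
    return candidates(env).stream().filter(exists).findFirst();
  }
}
```

A caller might pass System.getenv() and a check such as 
path -> Files.isReadable(Paths.get(path)), so that the first readable file in 
the list is the one that gets loaded.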

These would be some simple files (perhaps YAML) that look like:

{code}
clusters:
  my-cluster:
    masters:
      - foo1.example.com
      - foo2.example.com
      - foo3.example.com
    require_authentication: true
    require_encryption: true
    master_kerberos_principal: "my-custom-master-principal/_HOST@MY_REALM"
    tserver_kerberos_principal: "my-custom-master-principal/_HOST@MY_REALM"
  other-cluster:
    masters:
      - other.example.com
{code}
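
As a rough illustration of how a client might select one cluster from a file 
with that layout, here is a minimal sketch. It assumes the file has already been 
parsed into nested maps and lists by some YAML library (none is chosen here). 
ClusterConfig and fromParsed() are hypothetical names, not Kudu classes. Unset 
booleans are kept as null rather than defaulted, since the proposal does not 
define defaults.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory form of one entry under "clusters". */
final class ClusterConfig {
  final List<String> masters;
  final Boolean requireAuthentication; // null means "not set in the file"
  final Boolean requireEncryption;     // null means "not set in the file"

  ClusterConfig(List<String> masters, Boolean requireAuthentication,
                Boolean requireEncryption) {
    this.masters = masters;
    this.requireAuthentication = requireAuthentication;
    this.requireEncryption = requireEncryption;
  }

  /** Looks up one named cluster in the already-parsed document. */
  @SuppressWarnings("unchecked")
  static ClusterConfig fromParsed(Map<String, Object> root, String clusterName) {
    Map<String, Object> clusters = (Map<String, Object>) root.get("clusters");
    Object entry = clusters == null ? null : clusters.get(clusterName);
    if (entry == null) {
      throw new IllegalArgumentException("no such cluster: " + clusterName);
    }
    Map<String, Object> cluster = (Map<String, Object>) entry;
    List<String> masters = (List<String>) cluster.get("masters");
    if (masters == null) {
      masters = Collections.emptyList();
    }
    return new ClusterConfig(
        masters,
        (Boolean) cluster.get("require_authentication"),
        (Boolean) cluster.get("require_encryption"));
  }
}
```

Because every setting lives under a cluster name, an application that talks to 
both 'my-cluster' and 'other-cluster' would simply call fromParsed() twice with 
different names.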

We also established some guiding principles:

- we should use these files only for configurations that we'd expect the 
_operator_ to be setting (e.g. security policies) and not for anything we expect 
that different applications would want to configure differently (e.g. timeouts)
- all configs should be clearly scoped per-cluster (to preserve the ability to 
write cross-cluster applications without gymnastics)
- these files should _only_ be read from the client, and not from servers
- these files should be referenced only when an API explicitly references them 
(e.g. the "loadConfigurationForCluster()" API). We should avoid implicit behavior 
in library code.
-- Command line tools like 'kudu table list' could potentially be more 
implicit, or they could take a cluster identifier.


All the above is just a brainstorm/draft, subject to change of course. When we 
get to actually implementing this we should transfer everything into a Google 
doc, follow the normal design/review process, etc.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)