[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-09 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431361#comment-16431361
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

Updated branch to include additional logging and a Warning for seed lists >= 20

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-06 Thread Romain Hardouin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428457#comment-16428457
 ] 

Romain Hardouin commented on CASSANDRA-14361:
-

{quote}Caching behavior remains the same, given operators relying on hostnames
{quote}
What I meant is that having this feature could motivate operators to use DNS. 
So they must be aware of this setting and set it explicitely. 

I've read Oracle documentation but Java security file is not very explicit:
{noformat}
# default value is forever (FOREVER). For security reasons, this
# caching is made forever when a security manager is set. When a security
# manager is not set, the default behavior in this implementation
# is to cache for 30 seconds.
#
# NOTE: setting this to anything other than the default value can have
#   serious security implications. Do not set it unless
#   you are sure you are not exposed to DNS spoofing attack.
#
#networkaddress.cache.ttl=-1
{noformat}

"{{default value is forever (FOREVER)}}" is misleading.
That's why having CASSANDRA-14364 is nice.

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425814#comment-16425814
 ] 

Jon Haddad commented on CASSANDRA-14361:


{quote}
Would it be worth adding some info/debug level logging around what IPs got 
resolved for each time getSeeds() gets called?
{quote}

Yes, I think that's a good idea.  Maybe WARN if we're getting more than 20 
seeds back.  Under that I don't think the optimization matters, above that, 
it's probably not great.

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425809#comment-16425809
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

I would argue you need to be intentional about your DNS records, but I'm 
willing to concede that it moves config further away from the system which can 
hide the reason for behavior.

Would it be worth adding some info/debug level logging around what IPs got 
resolved for each time getSeeds() gets called?

At least that way the behavior of what gets looked up is observable and the 
answer to any potential WTFs is in the logs. 

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425723#comment-16425723
 ] 

Jon Haddad commented on CASSANDRA-14361:


I agree with you, you could *explicitly* put 1000 seeds in there.  Would you 
expect there to be 1000 potential seeds in your gossip list by putting 1 thing 
in the seed list?  People don't even mostly know about this behavior.  I'm not 
sure it's optimal, and I'm not arguing against changing it, I'm just pointing 
it out and saying it's worth a discussion.  A point against it being, the new 
behavior might be a bit surprising.

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425713#comment-16425713
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

Currently nothing stops users adding as many seeds as they want to the current 
implementation. So this change doesn't make the situation any worse. 

I do agree with you that gossip convergence based on the information provided 
in the seed list needs to be further explored, but I don't think this ticket is 
the right place to address it.

I would also argue that the seed providers only job is to act as an external 
oracle about membership independent of Gossip. Based on the current interface 
contract the seed provider should be able to return 1 or 1000 seeds, then 
whatever consumes that seeds list needs to make the correct decision about how 
to use that list.

However that's just my opinion and one I'm not inclined to argue that 
vigorously :) 

 

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425664#comment-16425664
 ] 

Jon Haddad commented on CASSANDRA-14361:


Well, the behavior can be a bit surprising, and possibly could affect gossip 
convergence.  The seed gossip round is to accelerate convergence time, although 
I've never seen it proven that it works as advertised.  

It might be a good idea to cap the number of seeds we allow gossip to use 
there.  There has been talk of a changing the seed round behavior in 
CASSANDRA-9206, might be worth a read.

Adding [~jasobrown].

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425647#comment-16425647
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

{quote}

JVM property networkaddress.cache.ttl must be set otherwise operators 
will have to do a rolling restart of the cluster each time the seed list 
changes (unless default is not -1 on their platforms).

{quote}

Caching behavior remains the same, given operators relying on hostnames in the 
seed list would need to be aware of this anyway, whether it looks up one IP or 
multiple. Also the JVM will only cache forever by default if a security manager 
is installed. If a security manager is not installed each specific JVM 
implementations can have a potentially different TTL, but a defined TTL 
nonetheless. See 
[https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html].

{quote}

This would also affect the third gossip round: 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L185]

{quote}

Ah, of course.

If you are already using hostnames in the seed list, the behavior would still 
be the same? Other than your list of seeds potentially containing seed IPs you 
might not expect, if you are returning multiple IPs per A record, but not 
relying on the behavior. Resulting in potentially more gossip rounds to hit the 
IP you used to expect?

I've marked this as a 4.0 patch, so I think a change in behavior would be fine. 
I've created another ticket 
https://issues.apache.org/jira/browse/CASSANDRA-14364 to document seed behavior 
with relation to the third gossip round and DNS (including caching) behavior. 

 

 

 

 

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425610#comment-16425610
 ] 

Jon Haddad commented on CASSANDRA-14361:


{quote}
I can't imagine any scenario where that is a sane choice. Even when that choice 
has been made, it only impacts the first startup of Cassandra and would not be 
on any critical path.
{quote}

This would also affect the third gossip round: 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L185

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Romain Hardouin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425244#comment-16425244
 ] 

Romain Hardouin commented on CASSANDRA-14361:
-

JVM property {{networkaddress.cache.ttl}} must be set otherwise operators will 
have to do a rolling restart of the cluster each time the seed list changes 
(unless default is not {{-1}} on their platforms).

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma separated string of IPs or 
> hostnames as the set of Cassandra seeds. hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A,  or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP address per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org