[jira] [Commented] (FLINK-9103) SSL verification on TaskManager when parallelism > 1

2018-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16437853#comment-16437853
 ] 

ASF GitHub Bot commented on FLINK-9103:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5789


> SSL verification on TaskManager when parallelism > 1
> 
>
> Key: FLINK-9103
> URL: https://issues.apache.org/jira/browse/FLINK-9103
> Project: Flink
> Issue Type: Bug
> Components: Docker, Network, Security
> Affects Versions: 1.4.0
> Reporter: Edward Rojas
> Assignee: Edward Rojas
> Priority: Major
> Attachments: job.log, task0.log
>
>
> In dynamic environments like Kubernetes, SSL certificates can be generated 
> to use only DNS names for validating the identity of servers, given that 
> IP addresses can change.
>  
> In these cases, when executing jobs with parallelism set to 1, SSL 
> validation succeeds and the JobManager can communicate with the TaskManager 
> and vice versa.
>  
> But with parallelism set to more than 1, SSL validation fails when 
> TaskManagers communicate with each other, because validation appears to be 
> performed against the IP address:
> Caused by: java.security.cert.CertificateException: No subject alternative names matching IP address 172.xx.xxx.xxx found
>     at sun.security.util.HostnameChecker.matchIP(HostnameChecker.java:168)
>     at sun.security.util.HostnameChecker.match(HostnameChecker.java:94)
>     at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455)
>     at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:436)
>     at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:252)
>     at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)
>     at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1601)
>     ... 21 more
>  
> From the logs, it seems the TaskManagers successfully register their full 
> address with Netty, but the IP is still used.
>  
> Pertinent logs from the JobManager and a TaskManager are attached. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9103) SSL verification on TaskManager when parallelism > 1

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419213#comment-16419213
 ] 

ASF GitHub Bot commented on FLINK-9103:
---

GitHub user EAlexRojas opened a pull request:

https://github.com/apache/flink/pull/5789

[FLINK-9103] Using CanonicalHostName instead of IP for SSL connection on 
NettyClient

## What is the purpose of the change

This pull request makes the NettyClient use the CanonicalHostName instead 
of the IP address for SSL communication. That way, dynamic environments like 
Kubernetes can be fully supported, as certificates with wildcard DNS names can 
be used.
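
To illustrate why a wildcard DNS certificate cannot satisfy a check against a raw IP, here is a minimal sketch of leftmost-label wildcard matching. It is a simplification for illustration only, not the JDK's `HostnameChecker` logic; the class name, method name, and host names are hypothetical.

```java
import java.util.Locale;

public class WildcardSanSketch {

    /**
     * Simplified check: does the host match a SAN dNSName pattern?
     * Only a wildcard that replaces the entire leftmost label ("*.example")
     * is supported here; real implementations handle more cases.
     */
    static boolean matchesDnsName(String host, String pattern) {
        String h = host.toLowerCase(Locale.ROOT);
        String p = pattern.toLowerCase(Locale.ROOT);
        if (!p.startsWith("*.")) {
            // No wildcard: require an exact, case-insensitive match.
            return h.equals(p);
        }
        int firstDot = h.indexOf('.');
        // The wildcard stands for exactly one non-empty label.
        return firstDot > 0 && h.substring(firstDot).equals(p.substring(1));
    }

    public static void main(String[] args) {
        String pattern = "*.flink.example.internal"; // hypothetical SAN entry
        System.out.println(matchesDnsName("taskmanager-1.flink.example.internal", pattern));
        System.out.println(matchesDnsName("172.17.0.5", pattern));
    }
}
```

In this sketch a pod's DNS name matches the wildcard entry while an IP literal never does, which mirrors the failure reported in the issue.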


## Brief change log

- Use CanonicalHostName instead of HostNameAddress to identify the server 
on the NettyClient
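
The change described above can be sketched with plain JDK classes as follows. This is an illustration under assumptions, not the actual diff of this pull request: the class and method names are hypothetical, and the explicit "HTTPS" endpoint identification setting is shown only to make the hostname check visible.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

public class CanonicalHostSketch {

    /** Returns the JVM default SSLContext, wrapping the checked exception. */
    static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Creates a client-mode engine whose peer host is a DNS name, not an IP. */
    static SSLEngine createClientEngine(SSLContext ctx, InetSocketAddress server) {
        InetAddress addr = server.getAddress();
        // Before: addr.getHostAddress() yields an IP literal, so the
        // certificate is checked against iPAddress SAN entries.
        // After: the canonical host name lets dNSName SAN entries match.
        // Note: this may trigger a reverse DNS lookup and falls back to the
        // textual IP if no name can be resolved.
        String peerHost = addr.getCanonicalHostName();

        SSLEngine engine = ctx.createSSLEngine(peerHost, server.getPort());
        engine.setUseClientMode(true);

        // Ask the JDK to verify the server certificate against peerHost.
        SSLParameters params = engine.getSSLParameters();
        params.setEndpointIdentificationAlgorithm("HTTPS");
        engine.setSSLParameters(params);
        return engine;
    }
}
```

The peer host passed to `createSSLEngine` is what the trust manager later compares with the certificate, so resolving it to a name rather than an IP determines which subject alternative names can match.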


## Verifying this change

This change is already covered by existing tests, such as:

NettyClientServerSslTest (org.apache.flink.runtime.io.network.netty)
   - testValidSslConnection
   - testSslHandshakeError 

Also manually verified the change by running a 4-node Kubernetes cluster 
with 1 JobManager and 3 TaskManagers, using wildcard DNS certificates, 
executing a stateful streaming program with parallelism set to 2, and verifying 
that all nodes are able to communicate with each other successfully. 

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency):  no
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): no
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? no
  - If yes, how is the feature documented? not applicable


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/EAlexRojas/flink release-1.4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5789.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5789


commit 202672da7901fe7df912e6a057d6d0c29ccaf0fd
Author: EAlexRojas 
Date:   2018-03-29T14:01:24Z

Using CanonicalHostName instead of IP for SSL connection on NettyClient




> SSL verification on TaskManager when parallelism > 1
> 
>
> Key: FLINK-9103
> URL: https://issues.apache.org/jira/browse/FLINK-9103
> Project: Flink
> Issue Type: Bug
> Components: Docker, Security
> Affects Versions: 1.4.0
> Reporter: Edward Rojas
> Priority: Major
> Attachments: job.log, task0.log


