[jira] [Comment Edited] (SPARK-29884) spark-submit to kuberentes can not parse valid ca certificate
[ https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974621#comment-16974621 ] Jeremy edited comment on SPARK-29884 at 11/14/19 9:34 PM: -- After doing some debugging it seems like this might be in the fabric8 k8s client. It tries to use .kube/config even if it gets all the parameters it needs from arguments. was (Author: jeremyjjbrown): After doing some debugging it seams like this might be in fabric k8s client. I tries to use .kube/config even if it gets all the parameters is needs from arguments. > spark-submit to kuberentes can not parse valid ca certificate > - > > Key: SPARK-29884 > URL: https://issues.apache.org/jira/browse/SPARK-29884 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.4 > Environment: A kuberentes cluster that has been in use for over 2 > years and handles large amounts of production payloads. >Reporter: Jeremy >Priority: Major > > spark submit cannot be used to schedule to kuberentes with oauth token > and cacert > {code:java} > spark-submit \ > --deploy-mode cluster \ > --class org.apache.spark.examples.SparkPi \ > --master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \ > --conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \ > --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \ > --conf > spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt > \ > --conf spark.kubernetes.namespace=here-olp-3dds-sit \ > --conf spark.executor.instances=1 \ > --conf spark.app.name=spark-pi \ > --conf > spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 > \ > --conf > spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 > \ > local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar > {code} > returns > {code:java} > log4j:WARN No appenders could be found for logger > 
(io.fabric8.kubernetes.client.Config). > log4j:WARN Please initialize the log4j system properly. > log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more > info. > Exception in thread "main" > io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred. > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:53) > at > io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:183) > at > org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:84) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) > Caused by: java.security.cert.CertificateException: Could not parse > certificate: java.io.IOException: Empty input > at > 
sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:110) > at > java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:339) > at > io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:104) > at > io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:197) > at > io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:128) > at > io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:122) > at > io.fabric8.kubernetes.client.utils.H
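The root cause in the trace above is `java.io.IOException: Empty input`, i.e. the certificate parser received zero bytes rather than a malformed certificate. Before blaming the client, a quick local sanity check can confirm whether the CA file itself is a readable PEM certificate (the path here is the one from the report; substitute your own):

```shell
# Sanity-check the CA file that was passed to spark-submit.
# Path taken from the report above; substitute your own.
CA=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt

# 1. The file must exist, be non-empty, and contain a PEM certificate block.
test -s "$CA" || echo "CA file is empty or missing"
grep -q "BEGIN CERTIFICATE" "$CA" || echo "no PEM certificate marker found"

# 2. OpenSSL must be able to parse it; a valid cert prints its subject.
openssl x509 -in "$CA" -noout -subject 2>/dev/null \
  || echo "openssl could not parse the certificate"
```

If all three checks pass (as the reporter's curl test below suggests they would), the "Empty input" error points at the client reading a different, empty source — consistent with the kubeconfig-fallback theory in the comment above.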
[jira] [Commented] (SPARK-29884) spark-submit to kuberentes can not parse valid ca certificate
[ https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974621#comment-16974621 ] Jeremy commented on SPARK-29884: After doing some debugging it seems like this might be in the fabric8 k8s client. It tries to use .kube/config even if it gets all the parameters it needs from arguments.
[jira] [Updated] (SPARK-29884) spark-submit to kuberentes can not parse valid ca certificate
[ https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy updated SPARK-29884: --- Summary: spark-submit to kuberentes can not parse valid ca certificate (was: spark-Submit to kuberentes can not parse valid ca certificate)
[jira] [Updated] (SPARK-29884) spark-Submit to kuberentes can not parse valid ca certificate
[ https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy updated SPARK-29884: --- Description: spark submit cannot be used to schedule to kuberentes with oauth token and cacert {code:java} spark-submit \ --deploy-mode cluster \ --class org.apache.spark.examples.SparkPi \ --master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \ --conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \ --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \ --conf spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt \ --conf spark.kubernetes.namespace=here-olp-3dds-sit \ --conf spark.executor.instances=1 \ --conf spark.app.name=spark-pi \ --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \ --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \ local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar {code} returns {code:java} log4j:WARN No appenders could be found for logger (io.fabric8.kubernetes.client.Config). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred. 
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:53) at io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:183) at org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:84) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235) at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.security.cert.CertificateException: Could not parse certificate: java.io.IOException: Empty input at sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:110) at java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:339) at io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:104) at io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:197) at 
io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:128) at io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:122) at io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:78) ... 13 more Caused by: java.io.IOException: Empty input at sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:106) ... 19 more {code} The cacert and token are both valid and work even with curl {code:java} curl --cacert /home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt -H "Authorization: bearer $TOKEN" -v https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com/api/v1/namespaces/here-olp-3dds-sit/pods -o out % Total% Received % Xferd Average Speed TimeTime Time Current Dload Upload Total SpentLeft Speed 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 10.117.233.37:443... * TCP_NODELAY set * Connected to api.borg-dev-1-aws-eu-west-1.k8s.in.here.com (10.117.233.37) port 443 (#0) * ALPN, offering h2 * ALPN, offering http/1.1 * successfully set certificate verify locations: * CAfile: /home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt CApath: none } [5 bytes data] * TLSv1.3 (OUT), TLS handshak
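Since curl accepts the same CA file and token, the debugging comment earlier in this thread (that the fabric8 client falls back to `.kube/config` even when explicit parameters are supplied) suggests two possible workarounds. Both are unverified sketches, not a confirmed fix; the `kubernetes.auth.tryKubeConfig` property is a fabric8 client setting and the host/paths are the reporter's:

```shell
# Unverified workaround sketches for the kubeconfig fallback described in the
# comments above; neither is a confirmed fix for this issue.

# Option 1: hide any local kubeconfig from the client for this invocation,
# so only the explicit spark.kubernetes.* settings can be consulted.
KUBECONFIG=/dev/null \
spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \
  --conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
  --conf spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar

# Option 2: the fabric8 client checks a system property before reading
# ~/.kube/config; it can be passed to the spark-submit JVM via
# SPARK_SUBMIT_OPTS.
SPARK_SUBMIT_OPTS="-Dkubernetes.auth.tryKubeConfig=false" \
spark-submit ... # same arguments as above
```

Either variant would also help narrow the bug down: if submission succeeds with the kubeconfig hidden, the client was indeed reading the wrong certificate source.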
[jira] [Created] (SPARK-29884) spark-Submit to kuberentes can not parse valid ca certificate
Jeremy created SPARK-29884: -- Summary: spark-Submit to kuberentes can not parse valid ca certificate Key: SPARK-29884 URL: https://issues.apache.org/jira/browse/SPARK-29884 Project: Spark Issue Type: Bug Components: Kubernetes Affects Versions: 2.4.4 Environment: A kuberentes cluster that has been in use for over 2 years and handles large amounts of production payloads. Reporter: Jeremy
[jira] [Commented] (SPARK-10408) Autoencoder
[ https://issues.apache.org/jira/browse/SPARK-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007535#comment-16007535 ] Jeremy commented on SPARK-10408: I mentioned this in the PR, but I also want to mention here that I'll do a code review within the next few days. > Autoencoder > --- > > Key: SPARK-10408 > URL: https://issues.apache.org/jira/browse/SPARK-10408 > Project: Spark > Issue Type: Improvement > Components: ML >Affects Versions: 1.5.0 >Reporter: Alexander Ulanov >Assignee: Alexander Ulanov > > Goal: Implement various types of autoencoders > Requirements: > 1)Basic (deep) autoencoder that supports different types of inputs: binary, > real in [0..1]. real in [-inf, +inf] > 2)Sparse autoencoder i.e. L1 regularization. It should be added as a feature > to the MLP and then used here > 3)Denoising autoencoder > 4)Stacked autoencoder for pre-training of deep networks. It should support > arbitrary network layers > References: > 1. Vincent, Pascal, et al. "Extracting and composing robust features with > denoising autoencoders." Proceedings of the 25th international conference on > Machine learning. ACM, 2008. > http://www.iro.umontreal.ca/~vincentp/Publications/denoising_autoencoders_tr1316.pdf > > 2. > http://machinelearning.wustl.edu/mlpapers/paper_files/ICML2011Rifai_455.pdf, > 3. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P.-A. > (2010). Stacked denoising autoencoders: Learning useful representations in a > deep network with a local denoising criterion. Journal of Machine Learning > Research, 11(3371–3408). > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.3484&rep=rep1&type=pdf > 4, 5, 6. Bengio, Yoshua, et al. "Greedy layer-wise training of deep > networks." Advances in neural information processing systems 19 (2007): 153. 
> http://www.iro.umontreal.ca/~lisa/pointeurs/dbn_supervised_tr1282.pdf -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation
[ https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260328#comment-15260328 ] Jeremy commented on SPARK-11834: The upside to this change is that users who set a threshold won't silently clear it by also calling thresholds. Users should never call this anyway, as they'll only be doing binary classification - it's unlikely that anybody has thresholds in their current usage pattern. The main downside that I see is that it would need to be changed back after multiclass support is added. If that event is far away this change may save a few users some confusion, but if it's imminent then it would be work to undo this change quite soon. Regardless, the change is only consequential in a pretty isolated case that involves using the function incorrectly - quite minor either way. > Ignore thresholds in LogisticRegression and update documentation > > > Key: SPARK-11834 > URL: https://issues.apache.org/jira/browse/SPARK-11834 > Project: Spark > Issue Type: Improvement > Components: Documentation, ML >Affects Versions: 1.6.0 >Reporter: Xiangrui Meng >Assignee: Xiangrui Meng >Priority: Minor > > ml.LogisticRegression does not support multiclass yet. So we should ignore > `thresholds` and update the documentation. In the next release, we can do > SPARK-11543.
[jira] [Commented] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation
[ https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241583#comment-15241583 ] Jeremy commented on SPARK-11834: To follow up, both setThreshold() and setThresholds() clear any value in the other threshold, so checkThresholdConsistency() will never be called. So `thresholds` is successfully ignored (though if set, it will clear the value for threshold, and the user will not see this).
[jira] [Comment Edited] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation
[ https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241511#comment-15241511 ] Jeremy edited comment on SPARK-11834 at 4/14/16 5:16 PM: - Looking into this JIRA, checkThresholdConsistency() should be called in getThreshold() in ml.classification.LogisticRegression.scala, but I can set inconsistent values with setThreshold() and setThresholds() and get predictions that run cleanly and that are consistent with the value for setThreshold(). Checking the documentation, it has been updated to reflect that only binary classes are supported. Pictures here: https://goo.gl/wCpJzx was (Author: jeremynixon): Looking into this JIRA, checkThresholdConsistency() and validateParams() are not called in ml.classification.LogisticRegression.scala, and so I can set inconsistent values with setThreshold() and setThresholds() and get predictions that run cleanly and that are consistent with the value for setThreshold(). Checking the documentation, it has been updated to reflect that only binary classes are supported. Pictures here: https://goo.gl/wCpJzx
[jira] [Updated] (SPARK-13706) Python Example for Train Validation Split Missing
[ https://issues.apache.org/jira/browse/SPARK-13706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy updated SPARK-13706: --- Description: An example of how to use TrainValidationSplit in pyspark needs to be added. Should be consistent with the current examples. I'll submit a PR. (was: And example of how to use TrainValidationSplit in pyspark needs to be added. Should be consistent with the current examples. I'll submit a PR.) > Python Example for Train Validation Split Missing > - > > Key: SPARK-13706 > URL: https://issues.apache.org/jira/browse/SPARK-13706 > Project: Spark > Issue Type: Bug > Components: ML, MLlib, PySpark >Reporter: Jeremy >Priority: Minor > Original Estimate: 2h > Remaining Estimate: 2h > > An example of how to use TrainValidationSplit in pyspark needs to be added. > Should be consistent with the current examples. I'll submit a PR.
[jira] [Created] (SPARK-13706) Python Example for Train Validation Split Missing
Jeremy created SPARK-13706: -- Summary: Python Example for Train Validation Split Missing Key: SPARK-13706 URL: https://issues.apache.org/jira/browse/SPARK-13706 Project: Spark Issue Type: Bug Components: ML, MLlib, PySpark Reporter: Jeremy Priority: Minor An example of how to use TrainValidationSplit in pyspark needs to be added. Should be consistent with the current examples. I'll submit a PR.
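For reference, the procedure such an example would demonstrate can be sketched in plain Python (hypothetical helper function; the real class is pyspark.ml.tuning.TrainValidationSplit, which takes a trainRatio parameter): split the data once by ratio, fit each candidate on the train portion, score it on the validation portion, and keep the best.

```python
# Plain-Python sketch of train/validation-split model selection.
# `fit` and `evaluate` are caller-supplied; higher scores are better.
import random

def train_validation_split(data, candidates, fit, evaluate,
                           train_ratio=0.75, seed=0):
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    train, valid = shuffled[:cut], shuffled[cut:]
    best_score, best_model = float("-inf"), None
    for params in candidates:
        model = fit(train, params)       # train one candidate
        score = evaluate(model, valid)   # score on held-out data
        if score > best_score:
            best_score, best_model = score, model
    return best_model, best_score

# Toy usage: pick the constant predictor with the lowest squared error.
data = [3.5, 4.0, 4.2, 4.5, 4.8, 5.0, 5.3, 5.5]
candidates = [1.0, 4.5, 9.0]
fit = lambda train, c: c
evaluate = lambda c, valid: -sum((y - c) ** 2 for y in valid)
model, score = train_validation_split(data, candidates, fit, evaluate)
assert model == 4.5  # the candidate nearest the data always wins here
```

Unlike CrossValidator, this evaluates each candidate on a single held-out split, which is cheaper but noisier; that trade-off is the same one the real TrainValidationSplit makes.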
[jira] [Commented] (SPARK-12877) TrainValidationSplit is missing in pyspark.ml.tuning
[ https://issues.apache.org/jira/browse/SPARK-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178867#comment-15178867 ] Jeremy commented on SPARK-12877: Hi Xiangrui, Commenting! > TrainValidationSplit is missing in pyspark.ml.tuning > > > Key: SPARK-12877 > URL: https://issues.apache.org/jira/browse/SPARK-12877 > Project: Spark > Issue Type: New Feature > Components: PySpark >Affects Versions: 1.6.0 >Reporter: Wojciech Jurczyk > Fix For: 2.0.0 > > > I was investigating progress in SPARK-10759 and I noticed that there is no > TrainValidationSplit class in the pyspark.ml.tuning module. > Java/Scala's examples for SPARK-10759 use > org.apache.spark.ml.tuning.TrainValidationSplit, which is not available from > Python, and this blocks SPARK-10759. > Does the class have a different name in PySpark, maybe? Also, I couldn't find > any JIRA task saying it needs to be implemented. Is it by design that the > TrainValidationSplit estimator is not ported to PySpark? If not, that is, if > the estimator needs porting, then I would like to contribute.
[jira] [Issue Comment Deleted] (SPARK-10759) Missing Python code example in ML Programming guide
[ https://issues.apache.org/jira/browse/SPARK-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy updated SPARK-10759: --- Comment: was deleted (was: Cannot add example for code that doesn't exist.) > Missing Python code example in ML Programming guide > --- > > Key: SPARK-10759 > URL: https://issues.apache.org/jira/browse/SPARK-10759 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.5.0 >Reporter: Raela Wang >Assignee: Apache Spark >Priority: Minor > Labels: starter > > http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-cross-validation > http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-train-validation-split
[jira] [Commented] (SPARK-10759) Missing Python code example in ML Programming guide
[ https://issues.apache.org/jira/browse/SPARK-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149266#comment-15149266 ] Jeremy commented on SPARK-10759: Cannot add example for code that doesn't exist.
[jira] [Created] (SPARK-13312) ML Model Selection via Train Validation Split example uses incorrect data
Jeremy created SPARK-13312: -- Summary: ML Model Selection via Train Validation Split example uses incorrect data Key: SPARK-13312 URL: https://issues.apache.org/jira/browse/SPARK-13312 Project: Spark Issue Type: Bug Components: Documentation Affects Versions: 1.6.0 Reporter: Jeremy Priority: Minor The Model Selection via Train Validation Split example uses classification data for a regression problem, and so returns the appropriate errors when run.
[jira] [Commented] (SPARK-10759) Missing Python code example in ML Programming guide
[ https://issues.apache.org/jira/browse/SPARK-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146336#comment-15146336 ] Jeremy commented on SPARK-10759: It's been a few months, so I've begun to work on this here: https://github.com/apache/spark/compare/master...JeremyNixon:add_py_ex_ml-guide It appears from the documentation that pyspark doesn't have an implementation of train-validation-split, or at least that it's not found in the tuning module like it is in the java and scala docs. Let me know if that's not the case and I'll pull an example from that function into the branch as well. If it doesn't exist, I can create and work on a JIRA requesting it. > Missing Python code example in ML Programming guide > --- > > Key: SPARK-10759 > URL: https://issues.apache.org/jira/browse/SPARK-10759 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.5.0 >Reporter: Raela Wang >Assignee: Lauren Moos >Priority: Minor > Labels: starter > > http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-cross-validation > http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-train-validation-split