[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-11-07 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22087
  
PredictionModel as a super-class of both unsupervised and supervised models seems 
sane to me. Returning a double to unify the signature also seems sane, although 
the thought of casting it might irk people.
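For illustration, a minimal sketch of the kind of unified single-instance API
under discussion (the trait and class names are hypothetical, not actual Spark
classes):

```
import org.apache.spark.ml.linalg.Vector

// Sketch: a common single-instance surface for supervised and unsupervised
// models. Returning Double unifies the signature; a clustering model widens
// its Int cluster id, which callers would then cast back (the "irk" above).
trait SingleInstancePredictor {
  def predict(features: Vector): Double
}

class ExampleClusterModel(centers: Array[Vector]) extends SingleInstancePredictor {
  // nearest-center index, returned as a Double
  override def predict(features: Vector): Double =
    centers.indices.minBy { i =>
      val c = centers(i)
      (0 until c.size).map { j => val t = c(j) - features(j); t * t }.sum
    }.toDouble
}
```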





[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-11-06 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22087
  
This LGTM, but it raises a more general question about the lack of 
single-sample prediction over the entire hierarchy. For example (IMO) there 
should be some kind of single-sample method associated with 
`org.apache.spark.ml.Model`, or `org.apache.spark.ml.Transformer`.





[GitHub] spark issue #22146: [SPARK-24434][K8S] pod template files

2018-10-29 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22146
  
@mccheah integration testing is passing with the latest container selection 
policy; good to merge?





[GitHub] spark issue #22146: [SPARK-24434][K8S] pod template files

2018-10-29 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22146
  
thanks @shaneknapp !





[GitHub] spark issue #22146: [SPARK-24434][K8S] pod template files

2018-10-29 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22146
  
k8s integration test failure appears repeatable:
```
Downloading from central: https://repo.maven.apache.org/maven2/com/squareup/okhttp3/okhttp/3.9.1/okhttp-3.9.1.jar
[WARNING] Failed to write tracking file /home/jenkins/.m2/repository/com/squareup/okhttp3/okhttp/3.9.1/okhttp-3.9.1.jar.lastUpdated
java.io.FileNotFoundException: /home/jenkins/.m2/repository/com/squareup/okhttp3/okhttp/3.9.1/okhttp-3.9.1.jar.lastUpdated (Permission denied)
```





[GitHub] spark issue #22146: [SPARK-24434][K8S] pod template files

2018-10-29 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22146
  
Latest container selection, with default to the first container, LGTM





[GitHub] spark issue #22820: [SPARK-25828][K8S] Bumping Kubernetes-Client version to ...

2018-10-26 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22820
  
merging





[GitHub] spark pull request #22820: [SPARK-25828][K8S][BUILD] Bumping Kubernetes-Clie...

2018-10-25 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22820#discussion_r228321586
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala
 ---
@@ -157,7 +157,10 @@ private[spark] object KubernetesUtils {
   }.getOrElse(Seq(("container state", "N/A")))
   }
 
-  def formatTime(time: Time): String = {
-if (time != null) time.getTime else "N/A"
+  def formatTime(time: String): String = {
--- End diff --

I would keep it, but possibly add a comment about the data type change





[GitHub] spark pull request #22820: [SPARK-25828][K8S][BUILD] Bumping Kubernetes-Clie...

2018-10-25 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22820#discussion_r228321407
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -45,7 +45,8 @@ logs and remains in "completed" state in the Kubernetes 
API until it's eventuall
 
 Note that in the completed state, the driver pod does *not* use any 
computational or memory resources.
 
-The driver and executor pod scheduling is handled by Kubernetes. It is 
possible to schedule the
+The driver and executor pod scheduling is handled by Kubernetes. 
Communication to the Kubernetes API is done via fabric8, and we are 
+currently running kubernetes-client version 
4.1.0. Make sure that any infrastructure additions are weary of 
said version. It is possible to schedule the
--- End diff --

nit: "wary" ?   or "aware of " ?





[GitHub] spark issue #22838: [SPARK-25835][K8s] fix issues with k8s tests

2018-10-25 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22838
  
LGTM - the k8s integration tests are off by default, can be run via 
`-P...`, and are passing via Jenkins 





[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-22 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22433
  
@suryag10, all things being equal, it is considered preferable to provide 
testing for new functionality on the same PR. Are there any logistical problems 
with adding testing here?





[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-22 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22433
  
@suryag10 you were probably encountering github server problems from 
yesterday:
https://status.github.com/messages





[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2018-10-19 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/13440
  
Update: I'm consulting with some teammates about what it might mean to 
also support Bayesian variations on split quality, since there has been a lot 
of interest in the last few years in Bayesian alternatives to more 
traditional p-value-based statistics.





[GitHub] spark pull request #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integr...

2018-10-16 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22608#discussion_r225625939
  
--- Diff: bin/docker-image-tool.sh ---
@@ -71,18 +71,29 @@ function build {
 --build-arg
 base_img=$(image_ref spark)
   )
-  local BASEDOCKERFILE=${BASEDOCKERFILE:-"$IMG_PATH/spark/Dockerfile"}
-  local PYDOCKERFILE=${PYDOCKERFILE:-"$IMG_PATH/spark/bindings/python/Dockerfile"}
-  local RDOCKERFILE=${RDOCKERFILE:-"$IMG_PATH/spark/bindings/R/Dockerfile"}
+  local BASEDOCKERFILE=${BASEDOCKERFILE:-"$IMG_PATH/main/dockerfiles/spark/Dockerfile"}
+  local PYDOCKERFILE=${PYDOCKERFILE:-"$IMG_PATH/main/dockerfiles/spark/bindings/python/Dockerfile"}
+  local KDOCKERFILE=${KDOCKERFILE:-"$IMG_PATH/test/dockerfiles/spark/kerberos/Dockerfile"}
+  local RDOCKERFILE=${RDOCKERFILE:-"$IMG_PATH/main/dockerfiles/spark/bindings/R/Dockerfile"}
 
+  # Spark Base
   docker build $NOCACHEARG "${BUILD_ARGS[@]}" \
 -t $(image_ref spark) \
 -f "$BASEDOCKERFILE" .
 
+  # PySpark
   docker build $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \
 -t $(image_ref spark-py) \
 -f "$PYDOCKERFILE" .
 
+  # The following are optional docker builds for Kerberos Testing
+  docker pull ifilonenko/hadoop-base:latest
--- End diff --

I had also discussed concerns with @ifilonenko re: pulling a "personal" image 
- I'd prefer some other solution here. A docker build seems plausible, but it 
would be best if it could be done once per test run, for efficiency reasons





[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-06 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22608
  
Although this is a large patch, its impact on existing code is small, and 
it is nearly all testing code. Unless the tests themselves are unstable, I'd 
consider this plausible to include with the 2.4 release.





[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-03 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22608
  
Re: hadoop-2.7.3.tgz, is that something Shane needs to install on the 
testing infra to build the images you want?





[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-03 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22608
  
@ifilonenko  can we work with the existing service-account-name config 
parameters for obtaining the resource permissions?





[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-01 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22433
  
If possible, there should be some basic integration testing. Run a thrift 
server command against the minishift cluster used by the other testing.
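A basic smoke test along these lines might be (sketch only; the JDBC endpoint 
and port are assumptions about how the test environment would expose the server):

```
# After the thrift server is up on the test cluster, run a trivial
# query through beeline against its JDBC endpoint.
bin/beeline -u jdbc:hive2://localhost:10000 -e "SHOW TABLES;"
```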





[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-10-01 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22433
  
In the scenario of a cluster-mode submission, what is the command-line 
behavior?  Does the thrift-server script "block" until the thrift server pod is 
shut down?





[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2018-09-22 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/13440
  
I think targeting 3.0 with a refactor makes the most sense.  There's no way 
to do this without making small breaking changes, but slightly larger changes 
could clean up the design.  `ImpurityCalculator` can subsume `Impurity`, and a 
more general rethinking of gain and impurity can be accommodated too.





[GitHub] spark pull request #13440: [SPARK-15699] [ML] Implement a Chi-Squared test s...

2018-09-19 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/13440#discussion_r218997600
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/Gini.scala ---
@@ -71,6 +71,23 @@ object Gini extends Impurity {
   @Since("1.1.0")
   def instance: this.type = this
 
+  /**
+   * :: DeveloperApi ::
+   * p-values for test-statistic measures, unsupported for [[Gini]]
+   */
+  @Since("2.2.0")
+  @DeveloperApi
+  def calculate(calcL: ImpurityCalculator, calcR: ImpurityCalculator): 
Double =
--- End diff --

@srowen @willb 
I've paged the design of the metrics back into my head. In general, `Impurity` 
already uses methods that are only defined on certain impurity sub-classes, and 
so this new method does not change that situation.

My take on the "problem" is that the existing measures are all based on a 
localized concept of "purity" (or impurity) that can be calculated using _only_ 
the data at a single node. Splitting based on statistical tests (p-values) 
breaks that model, since it is making use of a more generalized concept of 
split quality that requires the sample populations of both children from a 
candidate split. A maximally general signature would probably involve the 
parent and both children.

Another kink in the current design is that `ImpurityCalculator` is 
essentially parallel to `Impurity`, and in fact 
`ImpurityCalculator#calculate()` is how impurity measures are currently 
requested. `Impurity` seems somewhat redundant, and might be factored out in 
favor of `ImpurityCalculator`. The current signature `calculate()` might be 
generalized into a more inclusive concept of split quality that expects to make 
use of {parent,left,right}.

Calls to `calculate()` are not very widespread, but threading that change 
through is outside the scope of this particular PR. If people are interested in 
that kind of refactoring I could look into it in the near future, but probably 
not in the next couple of weeks.

That kind of change would also be API-breaking, and so a good target for 3.0.
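To make the shape of that concrete, a rough sketch of the generalized 
signature I have in mind (hypothetical, not the existing API):

```
// Sketch: split quality as a function of the parent and both children,
// rather than a per-node impurity. A test-statistic measure would compute
// a p-value from the two child populations instead.
trait SplitQuality {
  def quality(
      parent: ImpurityCalculator,
      left: ImpurityCalculator,
      right: ImpurityCalculator): Double
}

// The existing gain-based measures fit this framework as the weighted
// decrease in impurity from parent to children:
class GainSplitQuality extends SplitQuality {
  override def quality(
      parent: ImpurityCalculator,
      left: ImpurityCalculator,
      right: ImpurityCalculator): Double = {
    val total = (left.count + right.count).toDouble
    val lw = left.count / total
    val rw = right.count / total
    parent.calculate() - lw * left.calculate() - rw * right.calculate()
  }
}
```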





[GitHub] spark pull request #13440: [SPARK-15699] [ML] Implement a Chi-Squared test s...

2018-09-17 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/13440#discussion_r218252461
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/Impurity.scala ---
@@ -52,6 +52,49 @@ trait Impurity extends Serializable {
   @Since("1.0.0")
   @DeveloperApi
   def calculate(count: Double, sum: Double, sumSquares: Double): Double
+
+  /**
+   * :: DeveloperApi ::
+   * Compute a test-statistic p-value quality measure from left and right 
split populations
+   * @param calcL impurity calculator for the left split population
+   * @param calcR impurity calculator for the right split population
+   * @return The p-value for the null hypothesis; that left and right 
split populations
+   * represent the same distribution
+   * @note Unless overridden this method will fail with an exception, for 
backward compatibility
+   */
+  @Since("2.2.0")
+  @DeveloperApi
+  def calculate(calcL: ImpurityCalculator, calcR: ImpurityCalculator): 
Double
+
+  /**
+   * :: DeveloperApi ::
+   * Determine if this impurity measure is a test-statistic measure
+   * @return True if this is a split quality measure based on a test 
statistic (i.e. returns a
+   * p-value) or false otherwise.
+   * @note Unless overridden this method returns false by default, for 
backward compatibility
+   */
+  @Since("2.2.0")
+  @DeveloperApi
+  def isTestStatistic: Boolean
--- End diff --

Can this be customized or extended externally to spark?  I'm wondering why 
it is public.





[GitHub] spark pull request #13440: [SPARK-15699] [ML] Implement a Chi-Squared test s...

2018-09-17 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/13440#discussion_r218252156
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/Gini.scala ---
@@ -71,6 +71,23 @@ object Gini extends Impurity {
   @Since("1.1.0")
   def instance: this.type = this
 
+  /**
+   * :: DeveloperApi ::
+   * p-values for test-statistic measures, unsupported for [[Gini]]
+   */
+  @Since("2.2.0")
+  @DeveloperApi
+  def calculate(calcL: ImpurityCalculator, calcR: ImpurityCalculator): 
Double =
--- End diff --

I suspect that the generalization is closer to my newer signature 
`val pval = imp.calculate(leftImpurityCalculator, rightImpurityCalculator)`
where you have all the context from the left and right nodes. The existing 
gain-based calculation should fit into this framework, just doing its current 
weighted average of purity gain.





[GitHub] spark pull request #13440: [SPARK-15699] [ML] Implement a Chi-Squared test s...

2018-09-17 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/13440#discussion_r218246415
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/Gini.scala ---
@@ -71,6 +71,23 @@ object Gini extends Impurity {
   @Since("1.1.0")
   def instance: this.type = this
 
+  /**
+   * :: DeveloperApi ::
+   * p-values for test-statistic measures, unsupported for [[Gini]]
+   */
+  @Since("2.2.0")
+  @DeveloperApi
+  def calculate(calcL: ImpurityCalculator, calcR: ImpurityCalculator): 
Double =
--- End diff --

I'll consider if there's a unifying idea here. pval-based metrics require 
integrating information across the new split children, which I believe was not 
the case for existing methods.





[GitHub] spark pull request #13440: [SPARK-15699] [ML] Implement a Chi-Squared test s...

2018-09-17 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/13440#discussion_r218245862
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/ChiSquared.scala ---
@@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.tree.impurity
+
+import org.apache.spark.annotation.{DeveloperApi, Experimental, Since}
+
+/**
+ * :: Experimental ::
+ * Class for calculating Chi Squared as a split quality metric during 
binary classification.
+ */
+@Since("2.2.0")
--- End diff --

I'll update those. 3.0 might be a good target, especially if I can't do 
this without the new `isTestStatistic`.





[GitHub] spark pull request #13440: [SPARK-15699] [ML] Implement a Chi-Squared test s...

2018-09-17 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/13440#discussion_r218245670
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -670,14 +670,32 @@ private[spark] object RandomForest extends Logging {
 val leftImpurity = leftImpurityCalculator.calculate() // Note: This 
equals 0 if count = 0
 val rightImpurity = rightImpurityCalculator.calculate()
 
-val leftWeight = leftCount / totalCount.toDouble
-val rightWeight = rightCount / totalCount.toDouble
+val gain = metadata.impurity match {
+  case imp if (imp.isTestStatistic) =>
+// For split quality measures based on a test-statistic, run the 
test on the
+// left and right sub-populations to get a p-value for the null 
hypothesis
+val pval = imp.calculate(leftImpurityCalculator, 
rightImpurityCalculator)
+// Transform the test statistic p-val into a larger-is-better gain 
value
+Impurity.pValToGain(pval)
+
+  case _ =>
+// Default purity-gain logic:
+// measure the weighted decrease in impurity from parent to the 
left and right
+val leftWeight = leftCount / totalCount.toDouble
+val rightWeight = rightCount / totalCount.toDouble
+
+impurity - leftWeight * leftImpurity - rightWeight * rightImpurity
+}
 
-val gain = impurity - leftWeight * leftImpurity - rightWeight * 
rightImpurity
+// If the impurity being used is a test statistic p-val, apply a 
standard transform into
+// a larger-is-better gain value for the minimum-gain threshold
+val minGain =
+  if (metadata.impurity.isTestStatistic) 
Impurity.pValToGain(metadata.minInfoGain)
+  else metadata.minInfoGain
--- End diff --

The main issue I recall was that all of the existing metrics assume some 
kind of "larger is better" gain, and p-values are "smaller is better."  I'll 
take another pass over it and see if I can push that distinction down so it 
doesn't require exposing new methods.
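For reference, `Impurity.pValToGain` in the diff above is just a monotone flip 
from smaller-is-better to larger-is-better; one simple choice (an assumption 
about this PR's implementation, not confirmed here) would be:

```
// Sketch: map a smaller-is-better p-value onto a larger-is-better gain.
// Any strictly decreasing map works; 1 - p is the simplest, and -log(p)
// is an alternative that spreads out very small p-values.
def pValToGain(pval: Double): Double = 1.0 - pval
```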





[GitHub] spark issue #22433: [SPARK-25442][SQL][K8S] Support STS to run in k8s deploy...

2018-09-16 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22433
  
I'm wondering, is there some reason this isn't supported in cluster mode 
for yarn & mesos? Or put another way, what is the rationale for k8s being added 
as an exception to this rule?





[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

2018-09-12 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22381
  
These new metrics seem useful. Is there a way to provide unit or 
integration testing for them?
Can they be enabled/disabled via metrics.properties files?
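For reference, sink wiring in `metrics.properties` uses entries of the form 
`[instance].sink.[name].[option]`, e.g.:

```
# Report driver metrics to the console every 10 seconds.
driver.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
driver.sink.console.period=10
driver.sink.console.unit=seconds
```

Whether these new application-status metrics get their own enable/disable 
switch is the question above; the sink routing itself works as shown.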





[GitHub] spark issue #22146: [SPARK-24434][K8S] pod template files

2018-08-30 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22146
  
+1 for @skonto's recommendation about CI testing of templates populated with 
larger numbers of fields, and of template errors





[GitHub] spark issue #22294: [SPARK-25287][INFRA] Add up-front check for JIRA_USERNAM...

2018-08-30 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22294
  
cc @srowen 





[GitHub] spark pull request #22294: [SPARK-25287][INFRA] Add up-front check for JIRA_...

2018-08-30 Thread erikerlandson
GitHub user erikerlandson opened a pull request:

https://github.com/apache/spark/pull/22294

[SPARK-25287][INFRA] Add up-front check for JIRA_USERNAME and JIRA_PASSWORD

## What changes were proposed in this pull request?

Add an up-front check that `JIRA_USERNAME` and `JIRA_PASSWORD` have been 
set. If they haven't, ask user if they want to continue. This prevents the JIRA 
state update from failing at the very end of the process because user forgot to 
set these environment variables.

## How was this patch tested?

I ran the script with environment vars set, and unset, to verify it works 
as specified.
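
The merge helper is a Python script; a minimal sketch of the kind of up-front 
check described (variable names are illustrative, not the exact patch):

```
import os
import sys

# Sketch: warn early if JIRA credentials are missing, instead of failing
# after the merge when the script tries to update JIRA state.
if not (os.environ.get("JIRA_USERNAME") and os.environ.get("JIRA_PASSWORD")):
    answer = input("JIRA_USERNAME/JIRA_PASSWORD not set; continue anyway? (y/n): ")
    if answer.strip().lower() != "y":
        sys.exit(1)
```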

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/erikerlandson/spark spark-25287

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22294.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22294


commit 1ed41ddc922cd07f6d6c2384c5aa248699f9ef87
Author: Erik Erlandson 
Date:   2018-08-30T21:52:29Z

[SPARK-25287][INFRA] Add up-front check for JIRA_USERNAME and JIRA_PASSWORD







[GitHub] spark issue #22179: [SPARK-23131][BUILD] Upgrade Kryo to 4.0.2

2018-08-30 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22179
  
For maximum breaking-change forgiveness, the "safe" option would be to punt 
it to 3.0, which seems likely to be the next release after this.





[GitHub] spark issue #22179: [SPARK-23131][BUILD] Upgrade Kryo to 4.0.2

2018-08-30 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22179
  
If it doesn't expose any breaking changes to users, that seems reasonable. 
I assume there would be no integration problems with scala 2.12?





[GitHub] spark issue #22285: [SPARK-25275][K8S] require membership in wheel to run 'su...

2018-08-30 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22285
  
looks like alpine doesn't have PAM installed





[GitHub] spark issue #22285: [SPARK-25275][K8S] require membership in wheel to run 'su...

2018-08-30 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22285
  
jenkins test this please





[GitHub] spark pull request #22285: [SPARK-25275][K8S] require membership in wheel to ...

2018-08-30 Thread erikerlandson
GitHub user erikerlandson opened a pull request:

https://github.com/apache/spark/pull/22285

[SPARK-25275][K8S] require membership in wheel to run 'su' in dockerfiles

## What changes were proposed in this pull request?
Add a PAM configuration to the k8s dockerfile to require membership in 
the wheel group in order to run `su`

## How was this patch tested?
Verify against CI that PAM config succeeds & causes no regressions
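
For context, the standard PAM mechanism here is `pam_wheel`; a sketch of the 
kind of Dockerfile change involved (the exact line and path are assumptions, 
and this presumes a base image with PAM available):

```
# Require wheel-group membership for su by enabling pam_wheel
# in the su PAM stack.
RUN echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su
```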

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/erikerlandson/spark spark-25275

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22285.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22285


commit f29051edb52c038b18592915187dab2955a87d7f
Author: Erik Erlandson 
Date:   2018-08-30T15:02:04Z

[SPARK-25275][K8S] require membership in wheel to run 'su' in dockerfiles







[GitHub] spark pull request #21279: [SPARK-24219][k8s] Improve the docker building sc...

2018-08-28 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21279#discussion_r213317184
  
--- Diff: bin/docker-image-tool.sh ---
@@ -44,15 +44,37 @@ function image_ref {
 function build {
   local BUILD_ARGS
   local IMG_PATH
+  local TMPFOLDER
 
   if [ ! -f "$SPARK_HOME/RELEASE" ]; then
 # Set image build arguments accordingly if this is a source repo and 
not a distribution archive.
+local JARS="${SPARK_HOME}/assembly/target/scala-${SPARK_SCALA_VERSION}/jars"
+TMPFOLDER=`mktemp -q -d examples.XX`
+if [ $? -ne 0 ]; then
+  echo "Cannot create temp folder, exiting..."
+  exit 1
+fi
+
+mkdir -p "${TMPFOLDER}/jars"
--- End diff --

Also, is using `${SPARK_SCALA_VERSION}` preferable to  `scala*`  ?





[GitHub] spark pull request #21279: [SPARK-24219][k8s] Improve the docker building sc...

2018-08-28 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21279#discussion_r213316408
  
--- Diff: bin/docker-image-tool.sh ---
@@ -44,15 +44,37 @@ function image_ref {
 function build {
   local BUILD_ARGS
   local IMG_PATH
+  local TMPFOLDER
 
   if [ ! -f "$SPARK_HOME/RELEASE" ]; then
 # Set image build arguments accordingly if this is a source repo and 
not a distribution archive.
+local JARS="${SPARK_HOME}/assembly/target/scala-${SPARK_SCALA_VERSION}/jars"
+TMPFOLDER=`mktemp -q -d examples.XX`
+if [ $? -ne 0 ]; then
+  echo "Cannot create temp folder, exiting..."
+  exit 1
+fi
+
+mkdir -p "${TMPFOLDER}/jars"
--- End diff --

(nit)
`cp -R "${SPARK_HOME}"/examples/target/scala*/jars "${TMPFOLDER}"` ?





[GitHub] spark pull request #21279: [SPARK-24219][k8s] Improve the docker building sc...

2018-08-28 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21279#discussion_r213314870
  
--- Diff: bin/docker-image-tool.sh ---
@@ -44,15 +44,37 @@ function image_ref {
 function build {
   local BUILD_ARGS
   local IMG_PATH
+  local TMPFOLDER
 
   if [ ! -f "$SPARK_HOME/RELEASE" ]; then
 # Set image build arguments accordingly if this is a source repo and 
not a distribution archive.
+local JARS="${SPARK_HOME}/assembly/target/scala-${SPARK_SCALA_VERSION}/jars"
+TMPFOLDER=`mktemp -q -d examples.XX`
--- End diff --

This will probably work; just wondering if we can use `-p /some/path`, where 
`/some/path` is guaranteed to exist. This could fail in theory if for some 
reason `/tmp` doesn't exist (or doesn't have space?), or if the user's TMPDIR is 
set to something non-writable
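
A sketch of the safer form (the directory choice is illustrative):

```
# Anchor the temp dir under a directory we know exists and is writable,
# rather than relying on /tmp or the caller's TMPDIR.
TMPFOLDER=$(mktemp -d -p "${SPARK_HOME}" examples.XXXXXX) || {
  echo "Cannot create temp folder, exiting..." >&2
  exit 1
}
```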





[GitHub] spark issue #22212: [SPARK-25220] Seperate kubernetes node selector config b...

2018-08-27 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22212
  
I agree there's an argument for keeping this, but an alternative would be 
to leave the original for backward compatibility, deprecate it, and recommend 
people make use of custom pod templates (#22146)





[GitHub] spark pull request #22212: [SPARK-25220] Seperate kubernetes node selector c...

2018-08-27 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22212#discussion_r213127037
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -663,11 +663,21 @@ specific to Spark on Kubernetes.
   
 
 
-  spark.kubernetes.node.selector.[labelKey]
+  spark.kubernetes.driver.selector.[labelKey]
--- End diff --

agreed we should keep it, but recommend annotating it as deprecated





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-24 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212668742
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
 ---
@@ -225,6 +225,18 @@ private[spark] object Config extends Logging {
 "Ensure that major Python version is either Python2 or Python3")
   .createWithDefault("2")
 
+  val KUBERNETES_DRIVER_CONTAINER_NAME =
+ConfigBuilder("spark.kubernetes.driver.containerName")
--- End diff --

I'm sensing some confusion over the purpose of 
`spark.kubernetes.driver.containerName` - is it to select the container out of 
the `containers` array in the template, or is it to *set* that name? Since this 
PR is creating `containerName` spark conf, I'm assuming it is no longer 
hard-coded, and we're exposing it to user control.

IIUC, the current concept is that the user *must* declare a container name 
*and* also must specify it via `spark.kubernetes.driver.containerName`; if not, 
I'm not sure what the default behavior is.

I agree that leaning on convention (spark container is the first one) can 
cause problems, although the same problems would happen if they specify the 
wrong container name in the config. Clear documentation would be important 
regardless.  In general, user supplied templates are a "power-user" feature, 
and some increased opportunities for error are inherent here.
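
For illustration, a driver pod template with a sidecar, where a conf like the 
one proposed here would disambiguate which container is the driver (names and 
images are examples only):

```
# driver-pod-template.yaml (sketch): two containers; something like
#   spark.kubernetes.driver.containerName=spark-kubernetes-driver
# would tell Spark which one to treat as the driver container.
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: spark-kubernetes-driver   # selected as the driver
      image: spark:latest
    - name: istio-proxy               # user-supplied sidecar, left as-is
      image: istio/proxyv2:latest
```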





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-23 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212469389
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
 ---
@@ -225,6 +225,18 @@ private[spark] object Config extends Logging {
 "Ensure that major Python version is either Python2 or Python3")
   .createWithDefault("2")
 
+  val KUBERNETES_DRIVER_CONTAINER_NAME =
+ConfigBuilder("spark.kubernetes.driver.containerName")
--- End diff --

I'm having mixed feelings about the utility of convention vs. configuration 
here, mostly because the purpose of this PR is to avoid adding further spark 
configurations. If the feature leans on the convention that the first container 
in the list is the spark driver/executor, how confusing is that likely to be?





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-23 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212460665
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -775,4 +787,183 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.driver.containerName
+  "spark-kubernetes-driver"
+  
+   This sets the driver container name. If you are specifying a driver 
[pod template](#pod-template), you can match this name to the
--- End diff --

Does it mean "this is the name to look for in a list of containers coming 
in from the template?"





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-23 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212438850
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
 ---
@@ -225,6 +225,18 @@ private[spark] object Config extends Logging {
 "Ensure that major Python version is either Python2 or Python3")
   .createWithDefault("2")
 
+  val KUBERNETES_DRIVER_CONTAINER_NAME =
+ConfigBuilder("spark.kubernetes.driver.containerName")
--- End diff --

@mccheah it raises interesting question: a user might use this feature to 
add containers, e.g. side-car proxies, service mesh, etc. In fact, I'd want 
users to be able to do this. If there *are* multiple containers defined, we may 
need a convention for identifying which one is driver/executor. Alternatives 
might include additive-only, like labels, or disallowing, but supporting this 
seems very desirable.





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-23 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212423082
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -775,4 +787,183 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.driver.containerName
+  "spark-kubernetes-driver"
+  
+   This sets the driver container name. If you are specifying a driver 
[pod template](#pod-template), you can match this name to the
--- End diff --

What does "match this name to the name set in the template" mean?





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-23 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212424515
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -775,4 +787,183 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.driver.containerName
+  "spark-kubernetes-driver"
+  
+   This sets the driver container name. If you are specifying a driver 
[pod template](#pod-template), you can match this name to the
+   driver container name set in the template.
+  
+
+
+  spark.kubernetes.executor.containerName
+  "spark-kubernetes-executor"
+  
+   This sets the executor container name. If you are specifying an 
executor [pod template](#pod-template), you can match this name to the
+   executor container name set in the template.
+  
+
+
+  spark.kubernetes.driver.podTemplateFile
+  (none)
+  
+   Specify the local file that contains the driver [pod 
template](#pod-template). For example
+   
spark.kubernetes.driver.podTemplateFile=/path/to/driver-pod-template.yaml`
+  
+
+
+  spark.kubernetes.executor.podTemplateFile
+  (none)
+  
+   Specify the local file that contains the executor [pod 
template](#pod-template). For example
+   
spark.kubernetes.executor.podTemplateFile=/path/to/executor-pod-template.yaml`
+  
+
+
+
+ Pod template properties
+
+See the below table for the full list of pod specifications that will be 
overwritten by spark.
+
+### Pod Metadata
+
+
+Pod metadata key | Modified value | Description
+
+  name
+  Value of spark.kubernetes.driver.pod.name
+  
+The driver pod name will be overwritten with either the configured or 
default value of
+spark.kubernetes.driver.pod.name. The executor pod names 
will be unaffected.
+  
+
+
+  namespace
+  Value of spark.kubernetes.namespace
+  
+Spark makes strong assumptions about the driver and executor 
namespaces. Both driver and executor namespaces will
+be replaced by this spark conf value.
+  
+
+
+  labels
+  Adds the labels from 
spark.kubernetes.{driver,executor}.label.*
+  
+Spark will add additional labels specified by the spark configuration.
+  
+
+
+  annotations
+  Adds the annotations from 
spark.kubernetes.{driver,executor}.annotation.*
+  
+Spark will add additional labels specified by the spark configuration.
+  
+
+
+
+### Pod Spec
+
+
+Pod spec key | Modified value | Description
+
+  imagePullSecrets
+  Adds image pull secrets from 
spark.kubernetes.container.image.pullSecrets
+  
+Additional pull secrets will be added from the spark configuration to 
both executor pods.
+  
+
+
+  nodeSelector
+  Adds node selectors from 
spark.kubernetes.node.selector.*
+  
+Additional node selectors will be added from the spark configuration 
to both executor pods.
+  
+
+
+  restartPolicy
+  "never"
+  
+Spark assumes that both drivers and executors never restart.
+  
+
+
+  serviceAccount
+  Value of 
spark.kubernetes.authenticate.driver.serviceAccountName
+  
+Spark will override serviceAccount with the value of the 
spark configuration for only
--- End diff --

similar Q to namespace: if no spark-conf is set, will spark's conf default 
override here as well?





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-23 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212423941
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -775,4 +787,183 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.driver.containerName
+  "spark-kubernetes-driver"
+  
+   This sets the driver container name. If you are specifying a driver 
[pod template](#pod-template), you can match this name to the
+   driver container name set in the template.
+  
+
+
+  spark.kubernetes.executor.containerName
+  "spark-kubernetes-executor"
+  
+   This sets the executor container name. If you are specifying an 
executor [pod template](#pod-template), you can match this name to the
+   executor container name set in the template.
+  
+
+
+  spark.kubernetes.driver.podTemplateFile
+  (none)
+  
+   Specify the local file that contains the driver [pod 
template](#pod-template). For example
+   
spark.kubernetes.driver.podTemplateFile=/path/to/driver-pod-template.yaml`
+  
+
+
+  spark.kubernetes.executor.podTemplateFile
+  (none)
+  
+   Specify the local file that contains the executor [pod 
template](#pod-template). For example
+   
spark.kubernetes.executor.podTemplateFile=/path/to/executor-pod-template.yaml`
+  
+
+
+
+ Pod template properties
+
+See the below table for the full list of pod specifications that will be 
overwritten by spark.
+
+### Pod Metadata
+
+
+Pod metadata key | Modified value | Description
+
+  name
+  Value of spark.kubernetes.driver.pod.name
+  
+The driver pod name will be overwritten with either the configured or 
default value of
+spark.kubernetes.driver.pod.name. The executor pod names 
will be unaffected.
+  
+
+
+  namespace
+  Value of spark.kubernetes.namespace
+  
+Spark makes strong assumptions about the driver and executor 
namespaces. Both driver and executor namespaces will
--- End diff --

if the spark conf value for namespace isn't set, can spark use the template 
setting, or will spark's conf default also override the template?





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-23 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212399808
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
 ---
@@ -225,6 +225,18 @@ private[spark] object Config extends Logging {
 "Ensure that major Python version is either Python2 or Python3")
   .createWithDefault("2")
 
+  val KUBERNETES_DRIVER_CONTAINER_NAME =
+ConfigBuilder("spark.kubernetes.driver.containerName")
--- End diff --

If you're saying that manipulating `containers` doesn't give specific 
control over which name is assigned to which container, then I agree





[GitHub] spark pull request #22146: [SPARK-24434][K8S] pod template files

2018-08-23 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/22146#discussion_r212364915
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
 ---
@@ -225,6 +225,18 @@ private[spark] object Config extends Logging {
 "Ensure that major Python version is either Python2 or Python3")
   .createWithDefault("2")
 
+  val KUBERNETES_DRIVER_CONTAINER_NAME =
+ConfigBuilder("spark.kubernetes.driver.containerName")
--- End diff --

Is there any way to do this via the `containers` container array from the 
pod template?





[GitHub] spark issue #22146: [WIP][SPARK-24434][K8S] pod template files

2018-08-20 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22146
  
Thanks @onursatici!
Can you please resolve merge conflicts (rebase?), so CI can build it?





[GitHub] spark issue #22146: [WIP][SPARK-24434][K8S] pod template files

2018-08-20 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/22146
  
Jenkins, OK to test





[GitHub] spark issue #21555: [SPARK-24547][K8S] Allow for building spark on k8s docke...

2018-06-20 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21555
  
LGTM, I am OK to merge.

Most of the automated image build tooling I've seen is custom, but I agree 
w/ Matt that being able to selectively build is worth supporting, via a 
followup PR. It might make it easier to write custom logic using this as an 
abstraction.





[GitHub] spark issue #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s commands

2018-06-19 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21572
  
testing environment is synced again and it's passing, so I'm going to merge





[GitHub] spark issue #21551: [K8S] Fix issue in 'docker-image-tool.sh'

2018-06-19 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21551
  
@fabriziocucci can you close this PR?





[GitHub] spark issue #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s commands

2018-06-19 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21572
  
Jenkins, test this please





[GitHub] spark issue #21551: [K8S] Fix issue in 'docker-image-tool.sh'

2018-06-18 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21551
  
OK, I see it referenced above as commit b8dbfcc. I'd close this, but that 
option is disabled for me. Presumably it happens from the merge-pr 
script.





[GitHub] spark issue #21551: [K8S] Fix issue in 'docker-image-tool.sh'

2018-06-18 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21551
  
@mccheah this isn't showing as merged, you said you ran the merge script?





[GitHub] spark issue #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s commands

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21572
  
please test this





[GitHub] spark issue #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s commands

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21572
  
ok to test





[GitHub] spark pull request #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s comman...

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21572#discussion_r195866275
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -37,11 +37,17 @@ if [ -z "$uidentry" ] ; then
 fi
 
 SPARK_K8S_CMD="$1"
-if [ -z "$SPARK_K8S_CMD" ]; then
-  echo "No command to execute has been provided." 1>&2
-  exit 1
-fi
-shift 1
+case "$SPARK_K8S_CMD" in
+driver | driver-py | executor)
+  shift 1
+  ;;
+"")
+  ;;
+*)
+  echo "No SPARK_K8S_CMD provided: proceeding in pass-through mode..."
--- End diff --

Overall this looks good to me.  Looking at this simplified logic, the 
logging message should probably be more like:

"Non-spark-k8s command provided, proceeding in pass-through mode..."





[GitHub] spark pull request #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s comman...

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21572#discussion_r195811228
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -38,10 +38,10 @@ fi
 
 SPARK_K8S_CMD="$1"
--- End diff --

@tmckayus that seems wrong, since the `CMD` that gets executed after the 
`case` is now missing the first element. Am I missing something?





[GitHub] spark pull request #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s comman...

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21572#discussion_r195804610
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -38,10 +38,10 @@ fi
 
 SPARK_K8S_CMD="$1"
--- End diff --

And it does the right thing w.r.t. `shift` in both cases?





[GitHub] spark pull request #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s comman...

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21572#discussion_r195794051
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -38,10 +38,10 @@ fi
 
 SPARK_K8S_CMD="$1"
--- End diff --

I see two possible pass-through conditions here: one is "empty 
SPARK_K8S_CMD" and the other is "SPARK_K8S_CMD is non-empty but has a non-spark 
command in it". Is that the convention, or is the pass-through case always 
expected to be "empty SPARK_K8S_CMD"?





[GitHub] spark pull request #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s comman...

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21572#discussion_r195793358
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -110,8 +110,7 @@ case "$SPARK_K8S_CMD" in
 ;;
 
   *)
-echo "Unknown command: $SPARK_K8S_CMD" 1>&2
-exit 1
+CMD=("$@")
--- End diff --

Should log a message here too, about "executing in pass-through mode" since 
this is the guaranteed code path for pass through





[GitHub] spark pull request #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s comman...

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21572#discussion_r195787809
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -38,10 +38,10 @@ fi
 
 SPARK_K8S_CMD="$1"
 if [ -z "$SPARK_K8S_CMD" ]; then
-  echo "No command to execute has been provided." 1>&2
-  exit 1
+  echo "No command to execute has been provided. Ignoring spark-on-k8s 
workflow..." 1>&2
--- End diff --

I'd propose:
"No SPARK_K8S_CMD provided: proceeding in pass-through mode..."





[GitHub] spark issue #21572: [SPARK-24534][K8S] Bypass non spark-on-k8s commands

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21572
  
@rimolive there are some significant differences between the head of master 
and your file. I'm wondering if you should get the head of master and re-edit 
from there, because rebasing didn't seem to sync it





[GitHub] spark issue #21572: Bypass non spark-on-k8s commands

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21572
  
@rimolive I think you may need to rebase this to latest head of master, to 
pick up the pyspark updates





[GitHub] spark issue #21572: Bypass non spark-on-k8s commands

2018-06-15 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21572
  
Thanks @rimolive !  Can you please prepend `[SPARK-24534][K8S]` to the 
title of this PR?





[GitHub] spark pull request #21366: [SPARK-24248][K8S][WIP] Use the Kubernetes API to...

2018-05-24 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21366#discussion_r190755750
  
--- Diff: 
resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/DeterministicExecutorPodsEventQueue.scala
 ---
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler.cluster.k8s
+
+import io.fabric8.kubernetes.api.model.Pod
+import scala.collection.mutable
+
+class DeterministicExecutorPodsEventQueue extends ExecutorPodsEventQueue {
+
+  private val eventBuffer = mutable.Buffer.empty[Pod]
+  private val subscribers = mutable.Buffer.empty[(Seq[Pod]) => Unit]
+
+  override def addSubscriber
+      (processBatchIntervalMillis: Long)
+      (onNextBatch: (Seq[Pod]) => Unit): Unit = {
+    subscribers += onNextBatch
+  }
+
+  override def stopProcessingEvents(): Unit = {}
+
+  override def pushPodUpdate(updatedPod: Pod): Unit = eventBuffer += updatedPod
--- End diff --

So events are pods themselves, as opposed to some event structure on pods?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21366: [SPARK-24248][K8S][WIP] Use the Kubernetes API to...

2018-05-21 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21366#discussion_r189735073
  
--- Diff: pom.xml ---
@@ -150,6 +150,7 @@
 
 4.5.4
 4.4.8
+3.0.1
--- End diff --

My take is that the performance is probably not worth the additional 
dependency.
I also noticed that the trove dep is LGPL, which is considered incompatible 
with the Apache license. Although I believe this is not a show-stopper with 
respect to "containerized" dependencies, it probably is for a direct 
dependency in the code.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21366: [SPARK-24248][K8S][WIP] Use the Kubernetes API to...

2018-05-19 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21366#discussion_r189435162
  
--- Diff: pom.xml ---
@@ -150,6 +150,7 @@
 
 4.5.4
 4.4.8
+3.0.1
--- End diff --

it looks like this dep is being taken on for a couple of data structures: are 
these important, or could they be replaced with a Scala Array and HashMap?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21238: [SPARK-24137][K8s] Mount local directories as empty dir ...

2018-05-09 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21238
  
I agree with @mccheah that the potential code reuse is small. Keeping this 
as a separate pod construction step, decoupled from the user-exposed step, is 
cleaner.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21241: [SPARK-24135][K8s] Resilience to init-container e...

2018-05-08 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21241#discussion_r186859389
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala
 ---
@@ -320,50 +322,83 @@ private[spark] class 
KubernetesClusterSchedulerBackend(
 override def eventReceived(action: Action, pod: Pod): Unit = {
   val podName = pod.getMetadata.getName
   val podIP = pod.getStatus.getPodIP
-
+  val podPhase = pod.getStatus.getPhase
   action match {
-case Action.MODIFIED if (pod.getStatus.getPhase == "Running"
+case Action.MODIFIED if (podPhase == "Running"
 && pod.getMetadata.getDeletionTimestamp == null) =>
   val clusterNodeName = pod.getSpec.getNodeName
   logInfo(s"Executor pod $podName ready, launched at 
$clusterNodeName as IP $podIP.")
   executorPodsByIPs.put(podIP, pod)
 
-case Action.DELETED | Action.ERROR =>
+case Action.MODIFIED if (podPhase == "Init:Error" || podPhase == 
"Init:CrashLoopBackoff")
+  && pod.getMetadata.getDeletionTimestamp == null =>
   val executorId = getExecutorId(pod)
-  logDebug(s"Executor pod $podName at IP $podIP was at $action.")
-  if (podIP != null) {
-executorPodsByIPs.remove(podIP)
+  failedInitExecutors.add(executorId)
+  if (failedInitExecutors.size >= executorMaxInitErrors) {
+val errorMessage = s"Aborting Spark application because 
$executorMaxInitErrors" +
+  s" executors failed to start. The maximum number of allowed 
startup failures is" +
+  s" $executorMaxInitErrors. Please contact your cluster 
administrator or increase" +
+  s" your setting of 
${KUBERNETES_EXECUTOR_MAX_INIT_ERRORS.key}."
+logError(errorMessage)
+
KubernetesClusterSchedulerBackend.this.scheduler.sc.stopInNewThread()
--- End diff --

Leaning on configured minimum executors could be construed as a way to let 
the user express application-dependent throughput minimums. If the use case is 
terabyte-scale, then setting minimum executors appropriately would address 
that.  A %-threshold is also reasonable, it just adds a new knob.

Requesting new executors indefinitely seems plausible, as long as failed 
pods don't accumulate. Having a submission hang indefinitely while kube churns 
may be confusing behavior, at least if users are not familiar with kube and 
hoping to treat it transparently. Maybe mesos backend policies could provide an 
analogy?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21241: [SPARK-24135][K8s] Resilience to init-container e...

2018-05-08 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21241#discussion_r186855467
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -561,6 +561,13 @@ specific to Spark on Kubernetes.
 This is distinct from spark.executor.cores: it is only 
used and takes precedence over spark.executor.cores for specifying 
the executor pod cpu request if set. Task 
 parallelism, e.g., number of tasks an executor can run concurrently is 
not affected by this.
 
+
+  spark.kubernetes.executor.maxInitFailures
+  10
+  
+Maximum number of times executors are allowed to fail with an 
Init:Error state before failing the application. Note that Init:Error failures 
should not be caused by Spark itself because Spark does not attach 
init-containers to pods. Init-containers can be attached by the cluster itself. 
Users should check with their cluster administrator if these kinds of failures 
to start the executor pod occur frequently.
--- End diff --

As long as it's relatively easy to extend, generalizing on a case-by-case 
basis should be OK.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21241: [SPARK-24135][K8s] Resilience to init-container e...

2018-05-08 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21241#discussion_r186846341
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -561,6 +561,13 @@ specific to Spark on Kubernetes.
 This is distinct from spark.executor.cores: it is only 
used and takes precedence over spark.executor.cores for specifying 
the executor pod cpu request if set. Task 
 parallelism, e.g., number of tasks an executor can run concurrently is 
not affected by this.
 
+
+  spark.kubernetes.executor.maxInitFailures
+  10
+  
+Maximum number of times executors are allowed to fail with an 
Init:Error state before failing the application. Note that Init:Error failures 
should not be caused by Spark itself because Spark does not attach 
init-containers to pods. Init-containers can be attached by the cluster itself. 
Users should check with their cluster administrator if these kinds of failures 
to start the executor pod occur frequently.
--- End diff --

I'm in agreement w/ @foxish about designing for wider categories of error, 
which can be future-proofed


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21241: [SPARK-24135][K8s] Resilience to init-container e...

2018-05-08 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21241#discussion_r186839692
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala
 ---
@@ -320,50 +322,83 @@ private[spark] class 
KubernetesClusterSchedulerBackend(
 override def eventReceived(action: Action, pod: Pod): Unit = {
   val podName = pod.getMetadata.getName
   val podIP = pod.getStatus.getPodIP
-
+  val podPhase = pod.getStatus.getPhase
   action match {
-case Action.MODIFIED if (pod.getStatus.getPhase == "Running"
+case Action.MODIFIED if (podPhase == "Running"
 && pod.getMetadata.getDeletionTimestamp == null) =>
   val clusterNodeName = pod.getSpec.getNodeName
   logInfo(s"Executor pod $podName ready, launched at 
$clusterNodeName as IP $podIP.")
   executorPodsByIPs.put(podIP, pod)
 
-case Action.DELETED | Action.ERROR =>
+case Action.MODIFIED if (podPhase == "Init:Error" || podPhase == 
"Init:CrashLoopBackoff")
+  && pod.getMetadata.getDeletionTimestamp == null =>
   val executorId = getExecutorId(pod)
-  logDebug(s"Executor pod $podName at IP $podIP was at $action.")
-  if (podIP != null) {
-executorPodsByIPs.remove(podIP)
+  failedInitExecutors.add(executorId)
+  if (failedInitExecutors.size >= executorMaxInitErrors) {
+val errorMessage = s"Aborting Spark application because 
$executorMaxInitErrors" +
+  s" executors failed to start. The maximum number of allowed 
startup failures is" +
+  s" $executorMaxInitErrors. Please contact your cluster 
administrator or increase" +
+  s" your setting of 
${KUBERNETES_EXECUTOR_MAX_INIT_ERRORS.key}."
+logError(errorMessage)
+
KubernetesClusterSchedulerBackend.this.scheduler.sc.stopInNewThread()
--- End diff --

I'm unsure how elaborate we can/should get with this, but a basic test from 
the back-end side seems like "are there any registered executors?", which 
isn't perfect but might be a decent heuristic for whether it's worth trying 
to continue or to just fail the job entirely. If there are no currently 
registered executors and we hit a failure threshold, just failing out 
altogether seems like a sane response. If there are executors running 
(satisfying the configured minimum?), then continuing might be reasonable, 
since that is graceful degradation.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20697: [SPARK-23010][k8s] Initial checkin of k8s integration te...

2018-05-02 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20697
  
@ssuchter good point, they should. I thought they were set up to be invoked 
because kube has already been added to modules.py


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20697: [SPARK-23010][k8s] Initial checkin of k8s integration te...

2018-05-02 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20697
  
Barring further feedback, resolving two issues should have this ready to 
merge:
1. move cloud-backed testing to a new PR
2. remove the repository clone logic



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21092: [SPARK-23984][K8S] Initial Python Bindings for PySpark o...

2018-05-02 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21092
  
@holdenk I think your [comment 
above](https://github.com/apache/spark/pull/21092#issuecomment-383211329) gets 
at a use-case "ambiguity" that containerization causes. There are now at least 
two choices of channel for supplying dependencies: from the command line, or by 
a customized container (and here there are at least two sub-cases: manually 
created customizations, or via source-to-image tooling).

When specifying deps via the command line, particularly in cluster mode, we 
have backed out of staging local files via init-container; does pulling from 
a URI suffice?
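
I.e., in the URI-based flow, everything would be fetched rather than staged 
(URIs and file names purely illustrative):

```bash
bin/spark-submit \
  --master k8s://https://<apiserver-host>:<apiserver-port> \
  --deploy-mode cluster \
  --py-files https://example.com/deps/mylib.zip \
  https://example.com/apps/my_app.py
```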


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21092: [SPARK-23984][K8S][WIP] Initial Python Bindings f...

2018-04-20 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21092#discussion_r183161979
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/python/Dockerfile
 ---
@@ -0,0 +1,33 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+ARG base_img
+FROM $base_img
+WORKDIR /
+COPY python /opt/spark/python
+RUN apk add --no-cache python && \
+python -m ensurepip && \
+rm -r /usr/lib/python*/ensurepip && \
+pip install --upgrade pip setuptools && \
+rm -r /root/.cache
+ENV PYTHON_VERSION 2.7.13
--- End diff --

I think a canonical container should include both. My instinct is that a 
user should be able to "force" the use of one or the other. If someone is 
invoking spark-submit in cluster mode with a supplied Python file, some kind 
of CLI argument (--conf or otherwise) seems like the only totally foolproof way 
to identify that for the eventual pod construction, but maybe there is a better 
way?
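
E.g., something like this (the property name is purely illustrative, not a 
settled API):

```bash
bin/spark-submit \
  --master k8s://https://<apiserver-host>:<apiserver-port> \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<pyspark-image> \
  --conf spark.kubernetes.pyspark.pythonVersion=3 \
  local:///opt/spark/examples/src/main/python/pi.py
```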


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21092: [SPARK-23984][K8S][WIP] Initial Python Bindings f...

2018-04-18 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21092#discussion_r182564020
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
 ---
@@ -71,7 +77,7 @@ private[spark] class BasicDriverFeatureStep(
   ("cpu", new QuantityBuilder(false).withAmount(limitCores).build())
 }
 
-val driverContainer = new ContainerBuilder(pod.container)
+val withoutArgsDriverContainer: ContainerBuilder = new ContainerBuilder(pod.container)
--- End diff --

is there a corresponding driver container _with_ args?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21092: [SPARK-23984][K8S][WIP] Initial Python Bindings f...

2018-04-18 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/21092#discussion_r182563748
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -62,6 +69,14 @@ case "$SPARK_K8S_CMD" in
   "$@"
 )
 ;;
+  driver-py)
+CMD=(
+  "$SPARK_HOME/bin/spark-submit"
+  --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
+  --deploy-mode client
+  "$@" $PYSPARK_PRIMARY $PYSPARK_SECONDARY
--- End diff --

+1


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21092: [SPARK-23984][K8S][WIP] Initial Python Bindings for PySp...

2018-04-18 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/21092
  
Thanks @ifilonenko !
I'm interested in figuring out what it means for the container images to be 
"python 2/3 generic" - does that imply being able to run either, based on 
submit parameters?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...

2018-03-19 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20669
  
@mccheah workflow is to use `dev/merge_spark_pr.py`
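
I.e., from the root of a committer clone (the script prompts for the PR 
number and target branches interactively, as I recall):

```bash
./dev/merge_spark_pr.py
```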


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20669: [SPARK-22839][K8S] Remove the use of init-contain...

2018-03-19 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/20669#discussion_r175532438
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -53,14 +53,10 @@ fi
 case "$SPARK_K8S_CMD" in
   driver)
 CMD=(
-  ${JAVA_HOME}/bin/java
-  "${SPARK_JAVA_OPTS[@]}"
-  -cp "$SPARK_CLASSPATH"
-  -Xms$SPARK_DRIVER_MEMORY
-  -Xmx$SPARK_DRIVER_MEMORY
-  -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS
-  $SPARK_DRIVER_CLASS
-  $SPARK_DRIVER_ARGS
+  "$SPARK_HOME/bin/spark-submit"
+  --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
+  --deploy-mode client
+  "$@"
--- End diff --

as @mccheah mentioned, I included some logic in the current entrypoint.sh 
to allow Spark to work in cases such as an anonymous uid (another way to look 
at it is managing Spark's long-standing quirk of failing when it can't find a 
passwd entry).  Putting it in entrypoint.sh was a good way to make sure this 
happened regardless of how the actual CMD evolved.  A kind of defensive 
future-proofing, which is important for reference Dockerfiles.  It also 
provides execution via `tini` as "pid 1", which is considered good standard 
practice.

All of this is done in part with the expectation that _most_ users are 
likely to just want to customize their images by building the reference 
Dockerfiles and using those as base images for their own, without modifying the 
CMD or entrypoint.sh.

That said, I think that in terms of formally documenting a container API, 
entrypoint.sh may be a red herring. In theory, a user _should_ be able to build 
their own custom container from the ground up, up to and including a different 
entrypoint, a different default entrypoint, etc.

Part of the reason we went with an externalized CMD (instead of creating 
one in the backend code) was to allow maximum flexibility in how images are 
constructed.  The back-end provides certain information to the pod.  The "API" 
is a catalogue of this information, combined with any behaviors that the user's 
container _must_ implement.  The API doc shouldn't assume the existence of 
entrypoint.sh.
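
For reference, the logic in question is approximately this shape (paraphrased, 
not a verbatim copy of entrypoint.sh):

```bash
#!/bin/bash
myuid=$(id -u)
mygid=$(id -g)
uidentry=$(getent passwd "$myuid")

# if the container runs as an anonymous uid, synthesize a passwd entry so
# Spark doesn't fail on user-name lookups
if [ -z "$uidentry" ]; then
  if [ -w /etc/passwd ]; then
    echo "$myuid:x:$myuid:$mygid:anonymous uid:$SPARK_HOME:/bin/false" >> /etc/passwd
  else
    echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID" 1>&2
  fi
fi

# ... CMD is assembled above, based on SPARK_K8S_CMD ...
# hand off to tini as pid 1, for signal handling and zombie reaping
exec /sbin/tini -s -- "${CMD[@]}"
```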


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20822: [SPARK-23680] Fix entrypoint.sh to properly support Arbi...

2018-03-16 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20822
  
Merged to master


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20822: [SPARK-23680] Fix entrypoint.sh to properly support Arbi...

2018-03-14 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20822
  
LGTM,  pending tests


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20822: [SPARK-23680] Fix entrypoint.sh to properly suppo...

2018-03-14 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/20822#discussion_r174527459
  
--- Diff: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh ---
@@ -17,13 +17,15 @@
 #
 
 # echo commands to the terminal output
-set -ex
+set -x
 
 # Check whether there is a passwd entry for the container UID
 myuid=$(id -u)
 mygid=$(id -g)
 uidentry=$(getent passwd $myuid)
 
+set -e
--- End diff --

it is probably best to limit the "no -e" specifically to `getent`, such as:
```bash
set -ex
myuid=$(id -u)
mygid=$(id -g)
# turn off -e for getent because it will return an error code in the anonymous uid case
set +e
uidentry=$(getent passwd $myuid)
set -e
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20822: [SPARK-23680] Fix entrypoint.sh to properly support Arbi...

2018-03-14 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20822
  
jenkins please test this


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20822: [SPARK-23680] Fix entrypoint.sh to properly support Arbi...

2018-03-14 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20822
  
@rimolive thanks for this!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19775: [SPARK-22343][core] Add support for publishing Spark met...

2018-02-14 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/19775
  
Although this is not kube-specific, kubernetes deployment is a major 
prometheus use case. Has it been tested in a kube environment?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19775: [SPARK-22343][core] Add support for publishing Spark met...

2018-02-14 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/19775
  
I agree w/ @jerryshao that adding new deps to core isn't ideal.  (Also, 
having #11994 would be really nice.)
New deps on a sub-project seem more palatable, but I'm interested in what 
other Sparkers think.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20007: [SPARK-22777][Scheduler] Kubernetes mode dockerfile perm...

2017-12-18 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20007
  
If we wanted to lock down permissions a bit more, we might consider `750`, 
but I'd prefer to make permission changes with more time for testing, maybe for 
2.4.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20007: [SPARK-22777][Scheduler] Kubernetes mode dockerfile perm...

2017-12-18 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/20007
  
Yes, by default in OpenShift, containers run as an anonymous uid and as 
group id 0.  So there are a few things that need to be given access to gid 0.  
I asked some of our security people, and they considered these mods to be 
acceptable.
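
I.e., mods of this flavor in the image's setup commands (illustrative, not 
the exact diff):

```bash
# give the root group (gid 0) write access to the passwd file, so that an
# anonymous-uid container can register itself at startup
chgrp root /etc/passwd && chmod ug+rw /etc/passwd
```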


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19995: [SPARK-22807] [Scheduler] Remove config that says docker...

2017-12-15 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/19995
  
LGTM


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19995: [SPARK-22807] [Scheduler] Remove config that says docker...

2017-12-15 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/19995
  
I'm picking up a few stragglers from `git grep`

```

resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/steps/BaseDriverConfigurationStep.scala
33=private[spark] class BaseDriverConfigurationStep(
49:  private val driverDockerImage = submissionSparkConf
51:.getOrElse(throw new SparkException("Must specify the driver Docker 
image"))
113:  .withImage(driverDockerImage)
```

```
core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
95=class SparkSubmitSuite
399:  "--conf", "spark.kubernetes.driver.docker.image=bar",
415:conf.get("spark.kubernetes.driver.docker.image") should be ("bar")
```
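
For reference, output in that shape (line numbers, plus `=` marking the 
enclosing definition) comes from something like the following; the exact 
pattern and flags are assumed:

```bash
git grep -inW 'docker.image' -- '*.scala'
```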



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - D...

2017-12-15 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/19946#discussion_r157274013
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -0,0 +1,502 @@
+---
+layout: global
+title: Running Spark on Kubernetes
+---
+* This will become a table of contents (this text will be scraped).
+{:toc}
+
+Spark can run on clusters managed by [Kubernetes](https://kubernetes.io). 
This feature makes use of native
+Kubernetes scheduler that has been added to Spark.
+
+# Prerequisites
+
+* A runnable distribution of Spark 2.3 or above.
+* A running Kubernetes cluster at version >= 1.6 with access configured to 
it using
+[kubectl](https://kubernetes.io/docs/user-guide/prereqs/).  If you do not 
already have a working Kubernetes cluster,
+you may setup a test cluster on your local machine using
+[minikube](https://kubernetes.io/docs/getting-started-guides/minikube/).
+  * We recommend using the latest release of minikube with the DNS addon 
enabled.
+* You must have appropriate permissions to list, create, edit and delete
+[pods](https://kubernetes.io/docs/user-guide/pods/) in your cluster. You 
can verify that you can list these resources
+by running `kubectl auth can-i <list|create|edit|delete> pods`.
+  * The service account credentials used by the driver pods must be 
allowed to create pods, services and configmaps.
+* You must have [Kubernetes 
DNS](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/) 
configured in your cluster.
+
+# How it works
+
+<p style="text-align: center;">
+  <img src="img/k8s-cluster-mode.png" title="Spark cluster components" alt="Spark cluster components" />
+</p>
+
+spark-submit can be directly used to submit a Spark 
application to a Kubernetes cluster.
+The submission mechanism works as follows:
+
+* Spark creates a Spark driver running within a [Kubernetes 
pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/).
+* The driver creates executors which are also running within Kubernetes 
pods and connects to them, and executes application code.
+* When the application completes, the executor pods terminate and are 
cleaned up, but the driver pod persists
+logs and remains in "completed" state in the Kubernetes API until it's 
eventually garbage collected or manually cleaned up.
+
+Note that in the completed state, the driver pod does *not* use any 
computational or memory resources.
+
+The driver and executor pod scheduling is handled by Kubernetes. It will 
be possible to affect Kubernetes scheduling
+decisions for driver and executor pods using advanced primitives like
+[node 
selectors](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector)
+and [node/pod 
affinities](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity)
+in a future release.
+
+# Submitting Applications to Kubernetes
+
+## Docker Images
+
+Kubernetes requires users to supply images that can be deployed into 
containers within pods. The images are built to
+be run in a container runtime environment that Kubernetes supports. Docker 
is a container runtime environment that is
--- End diff --

I'd expect the container runtime to be transparent to the particular 
image, and very definitely transparent to Spark.  Using `container.image` makes 
sense from either standpoint.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - D...

2017-12-14 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/19946#discussion_r157076409
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -0,0 +1,502 @@
+---
+layout: global
+title: Running Spark on Kubernetes
+---
+* This will become a table of contents (this text will be scraped).
+{:toc}
+
+Spark can run on clusters managed by [Kubernetes](https://kubernetes.io). 
This feature makes use of native
+Kubernetes scheduler that has been added to Spark.
+
+# Prerequisites
+
+* A runnable distribution of Spark 2.3 or above.
+* A running Kubernetes cluster at version >= 1.6 with access configured to 
it using
+[kubectl](https://kubernetes.io/docs/user-guide/prereqs/).  If you do not 
already have a working Kubernetes cluster,
+you may setup a test cluster on your local machine using
+[minikube](https://kubernetes.io/docs/getting-started-guides/minikube/).
+  * We recommend using the latest release of minikube with the DNS addon 
enabled.
+* You must have appropriate permissions to list, create, edit and delete
+[pods](https://kubernetes.io/docs/user-guide/pods/) in your cluster. You 
can verify that you can list these resources
+by running `kubectl auth can-i <list|create|edit|delete> pods`.
+  * The service account credentials used by the driver pods must be 
allowed to create pods, services and configmaps.
+* You must have [Kubernetes 
DNS](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/) 
configured in your cluster.
+
+# How it works
+
+<p style="text-align: center;">
+  <img src="img/k8s-cluster-mode.png" title="Spark cluster components" alt="Spark cluster components" />
+</p>
+
+spark-submit can be directly used to submit a Spark 
application to a Kubernetes cluster.
+The submission mechanism works as follows:
+
+* Spark creates a Spark driver running within a [Kubernetes 
pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/).
+* The driver creates executors which are also running within Kubernetes 
pods and connects to them, and executes application code.
+* When the application completes, the executor pods terminate and are 
cleaned up, but the driver pod persists
+logs and remains in "completed" state in the Kubernetes API until it's 
eventually garbage collected or manually cleaned up.
+
+Note that in the completed state, the driver pod does *not* use any 
computational or memory resources.
+
+The driver and executor pod scheduling is handled by Kubernetes. It will 
be possible to affect Kubernetes scheduling
+decisions for driver and executor pods using advanced primitives like
+[node 
selectors](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector)
+and [node/pod 
affinities](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity)
+in a future release.
+
+# Submitting Applications to Kubernetes
+
+## Docker Images
+
+Kubernetes requires users to supply images that can be deployed into 
containers within pods. The images are built to
+be run in a container runtime environment that Kubernetes supports. Docker 
is a container runtime environment that is
--- End diff --

the images should be runnable by other container runtimes (cri-o, rkt, 
etc.), although I'm not aware of anybody testing that.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - D...

2017-12-14 Thread erikerlandson
Github user erikerlandson commented on a diff in the pull request:

https://github.com/apache/spark/pull/19946#discussion_r157075744
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -0,0 +1,502 @@
+---
+layout: global
+title: Running Spark on Kubernetes
+---
+* This will become a table of contents (this text will be scraped).
+{:toc}
+
+Spark can run on clusters managed by [Kubernetes](https://kubernetes.io). 
This feature makes use of native
+Kubernetes scheduler that has been added to Spark.
+
+# Prerequisites
+
+* A runnable distribution of Spark 2.3 or above.
+* A running Kubernetes cluster at version >= 1.6 with access configured to 
it using
+[kubectl](https://kubernetes.io/docs/user-guide/prereqs/).  If you do not 
already have a working Kubernetes cluster,
+you may setup a test cluster on your local machine using
+[minikube](https://kubernetes.io/docs/getting-started-guides/minikube/).
+  * We recommend using the latest release of minikube with the DNS addon 
enabled.
+* You must have appropriate permissions to list, create, edit and delete
+[pods](https://kubernetes.io/docs/user-guide/pods/) in your cluster. You 
can verify that you can list these resources
+by running `kubectl auth can-i <list|create|edit|delete> pods`.
+  * The service account credentials used by the driver pods must be 
allowed to create pods, services and configmaps.
+* You must have [Kubernetes 
DNS](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/) 
configured in your cluster.
+
+# How it works
+
+<p style="text-align: center;">
+  <img src="img/k8s-cluster-mode.png" title="Spark cluster components" alt="Spark cluster components" />
+</p>
+
+spark-submit can be directly used to submit a Spark 
application to a Kubernetes cluster.
+The submission mechanism works as follows:
+
+* Spark creates a Spark driver running within a [Kubernetes 
pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/).
+* The driver creates executors which are also running within Kubernetes 
pods and connects to them, and executes application code.
+* When the application completes, the executor pods terminate and are 
cleaned up, but the driver pod persists
+logs and remains in "completed" state in the Kubernetes API until it's 
eventually garbage collected or manually cleaned up.
+
+Note that in the completed state, the driver pod does *not* use any 
computational or memory resources.
+
+The driver and executor pod scheduling is handled by Kubernetes. It will 
be possible to affect Kubernetes scheduling
+decisions for driver and executor pods using advanced primitives like
+[node 
selectors](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector)
+and [node/pod 
affinities](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity)
+in a future release.
+
+# Submitting Applications to Kubernetes
+
+## Docker Images
+
+Kubernetes requires users to supply images that can be deployed into 
containers within pods. The images are built to
+be run in a container runtime environment that Kubernetes supports. Docker 
is a container runtime environment that is
+frequently used with Kubernetes. With Spark 2.3, there are Dockerfiles 
provided in the runnable distribution that can be customized
+and built for your usage.
+
+You may build these docker images from sources.
+There is a script, `sbin/build-push-docker-images.sh` that you can use to 
build and push
+customized Spark distribution images consisting of all the above 
components.
+
+Example usage is:
+
+./sbin/build-push-docker-images.sh -r <repo> -t my-tag build
+./sbin/build-push-docker-images.sh -r <repo> -t my-tag push
+
+Docker files are under the `dockerfiles/` and can be customized further 
before
+building using the supplied script, or manually.
+
+## Cluster Mode
+
+To launch Spark Pi in cluster mode,
+
+{% highlight bash %}
+$ bin/spark-submit \
+--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
+--deploy-mode cluster \
+--name spark-pi \
+--class org.apache.spark.examples.SparkPi \
+--conf spark.executor.instances=5 \
+--conf spark.kubernetes.driver.docker.image=<driver-image> \
+--conf spark.kubernetes.executor.docker.image=<executor-image> \
+local:///path/to/examples.jar
+{% endhighlight %}
+
+The Spark master, specified either via passing the `--master` command line 
argument to `spark-submit` or by setting
+`spark.master` in the application's configuration, must be a URL with the 
format `k8s://`. Prefixing the
+master string with `k8s://` will cause the Spark application to launch on 
the Kubernetes cluster, with the API server
+being contacted at `api_server_url`. If no HTTP protocol is specified in 
the URL, it defaults to `https`. For example,
+setting the master to `k8
