spark-website git commit: Update committer page
Repository: spark-website Updated Branches: refs/heads/asf-site 114925632 -> f524d4f53 Update committer page Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/f524d4f5 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/f524d4f5 Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/f524d4f5 Branch: refs/heads/asf-site Commit: f524d4f53dde007b6283eb7e7511620273b6262b Parents: 1149256 Author: hyukjinkwon Authored: Mon Apr 2 17:14:03 2018 +0800 Committer: hyukjinkwon Committed: Tue Apr 3 12:39:41 2018 +0800 -- committers.md| 2 +- site/committers.html | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark-website/blob/f524d4f5/committers.md -- diff --git a/committers.md b/committers.md index 299a160..3456a43 100644 --- a/committers.md +++ b/committers.md @@ -36,7 +36,7 @@ navigation: |Holden Karau|IBM| |Cody Koeninger|Nexstar Digital| |Andy Konwinski|Databricks| -|Hyukjin Kwon|Mobigen| +|Hyukjin Kwon|Hortonworks| |Ryan LeCompte|Quantifind| |Haoyuan Li|Alluxio, UC Berkeley| |Xiao Li|Databricks| http://git-wip-us.apache.org/repos/asf/spark-website/blob/f524d4f5/site/committers.html -- diff --git a/site/committers.html b/site/committers.html index 7996091..ffca33e 100644 --- a/site/committers.html +++ b/site/committers.html @@ -311,7 +311,7 @@ Hyukjin Kwon - Mobigen + Hortonworks Ryan LeCompte
spark-website git commit: add committer
Repository: spark-website Updated Branches: refs/heads/asf-site a1d84bcbf -> 114925632 add committer Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/11492563 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/11492563 Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/11492563 Branch: refs/heads/asf-site Commit: 114925632af194d6dd7f2ca253c547e79aeb9364 Parents: a1d84bc Author: Zhenhua Wang Authored: Mon Apr 2 23:10:31 2018 +0800 Committer: hyukjinkwon Committed: Tue Apr 3 12:34:10 2018 +0800 -- committers.md| 1 + site/committers.html | 4 2 files changed, 5 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark-website/blob/11492563/committers.md -- diff --git a/committers.md b/committers.md index 38fb3b0..299a160 100644 --- a/committers.md +++ b/committers.md @@ -64,6 +64,7 @@ navigation: |Takuya Ueshin|Databricks| |Marcelo Vanzin|Cloudera| |Shivaram Venkataraman|UC Berkeley| +|Zhenhua Wang|Huawei| |Patrick Wendell|Databricks| |Andrew Xia|Alibaba| |Reynold Xin|Databricks| http://git-wip-us.apache.org/repos/asf/spark-website/blob/11492563/site/committers.html -- diff --git a/site/committers.html b/site/committers.html index 044ad80..7996091 100644 --- a/site/committers.html +++ b/site/committers.html @@ -422,6 +422,10 @@ UC Berkeley + Zhenhua Wang + Huawei + + Patrick Wendell Databricks
spark git commit: [MINOR][DOC] Fix a few markdown typos
Repository: spark Updated Branches: refs/heads/master 441d0d076 -> 8020f66fc [MINOR][DOC] Fix a few markdown typos ## What changes were proposed in this pull request? Easy fix in the markdown. ## How was this patch tested? jekyll build test manually. Please review http://spark.apache.org/contributing.html before opening a pull request. Author: lemonjing <932191...@qq.com> Closes #20897 from Lemonjing/master. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8020f66f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8020f66f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8020f66f Branch: refs/heads/master Commit: 8020f66fc47140a1b5f843fb18c34ec80541d5ca Parents: 441d0d0 Author: lemonjing <932191...@qq.com> Authored: Tue Apr 3 09:36:44 2018 +0800 Committer: hyukjinkwon Committed: Tue Apr 3 09:36:44 2018 +0800 -- docs/ml-guide.md | 2 +- docs/mllib-feature-extraction.md | 4 ++-- docs/mllib-pmml-model-export.md | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/8020f66f/docs/ml-guide.md -- diff --git a/docs/ml-guide.md b/docs/ml-guide.md index 702bcf7..aea07be 100644 --- a/docs/ml-guide.md +++ b/docs/ml-guide.md @@ -111,7 +111,7 @@ and the migration guide below will explain all changes between releases. * The class and trait hierarchy for logistic regression model summaries was changed to be cleaner and better accommodate the addition of the multi-class summary. This is a breaking change for user code that casts a `LogisticRegressionTrainingSummary` to a -` BinaryLogisticRegressionTrainingSummary`. Users should instead use the `model.binarySummary` +`BinaryLogisticRegressionTrainingSummary`. Users should instead use the `model.binarySummary` method. See [SPARK-17139](https://issues.apache.org/jira/browse/SPARK-17139) for more detail (_note_ this is an `Experimental` API). This _does not_ affect the Python `summary` method, which will still work correctly for both multinomial and binary cases. http://git-wip-us.apache.org/repos/asf/spark/blob/8020f66f/docs/mllib-feature-extraction.md -- diff --git a/docs/mllib-feature-extraction.md b/docs/mllib-feature-extraction.md index 75aea70..8b89296 100644 --- a/docs/mllib-feature-extraction.md +++ b/docs/mllib-feature-extraction.md @@ -278,8 +278,8 @@ for details on the API. multiplication. In other words, it scales each column of the dataset by a scalar multiplier. This represents the [Hadamard product](https://en.wikipedia.org/wiki/Hadamard_product_%28matrices%29) between the input vector, `v` and transforming vector, `scalingVec`, to yield a result vector. -Qu8T948*1# -Denoting the `scalingVec` as "`w`," this transformation may be written as: + +Denoting the `scalingVec` as "`w`", this transformation may be written as: `\[ \begin{pmatrix} v_1 \\ http://git-wip-us.apache.org/repos/asf/spark/blob/8020f66f/docs/mllib-pmml-model-export.md -- diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md index d353090..f567565 100644 --- a/docs/mllib-pmml-model-export.md +++ b/docs/mllib-pmml-model-export.md @@ -7,7 +7,7 @@ displayTitle: PMML model export - RDD-based API * Table of contents {:toc} -## `spark.mllib` supported models +## spark.mllib supported models `spark.mllib` supports model export to Predictive Model Markup Language ([PMML](http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language)). 
@@ -15,7 +15,7 @@ The table below outlines the `spark.mllib` models that can be exported to PMML a -`spark.mllib` modelPMML model +spark.mllib modelPMML model
spark git commit: [MINOR][DOC] Fix a few markdown typos
Repository: spark Updated Branches: refs/heads/branch-2.3 6ca6483c1 -> ce1565115 [MINOR][DOC] Fix a few markdown typos ## What changes were proposed in this pull request? Easy fix in the markdown. ## How was this patch tested? jekyll build test manually. Please review http://spark.apache.org/contributing.html before opening a pull request. Author: lemonjing <932191...@qq.com> Closes #20897 from Lemonjing/master. (cherry picked from commit 8020f66fc47140a1b5f843fb18c34ec80541d5ca) Signed-off-by: hyukjinkwon Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ce156511 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ce156511 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ce156511 Branch: refs/heads/branch-2.3 Commit: ce1565115481343af9043ecc4080d6d97eee698c Parents: 6ca6483 Author: lemonjing <932191...@qq.com> Authored: Tue Apr 3 09:36:44 2018 +0800 Committer: hyukjinkwon Committed: Tue Apr 3 09:36:59 2018 +0800 -- docs/ml-guide.md | 2 +- docs/mllib-feature-extraction.md | 4 ++-- docs/mllib-pmml-model-export.md | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/ce156511/docs/ml-guide.md -- diff --git a/docs/ml-guide.md b/docs/ml-guide.md index 702bcf7..aea07be 100644 --- a/docs/ml-guide.md +++ b/docs/ml-guide.md @@ -111,7 +111,7 @@ and the migration guide below will explain all changes between releases. * The class and trait hierarchy for logistic regression model summaries was changed to be cleaner and better accommodate the addition of the multi-class summary. This is a breaking change for user code that casts a `LogisticRegressionTrainingSummary` to a -` BinaryLogisticRegressionTrainingSummary`. Users should instead use the `model.binarySummary` +`BinaryLogisticRegressionTrainingSummary`. Users should instead use the `model.binarySummary` method. See [SPARK-17139](https://issues.apache.org/jira/browse/SPARK-17139) for more detail (_note_ this is an `Experimental` API). This _does not_ affect the Python `summary` method, which will still work correctly for both multinomial and binary cases. http://git-wip-us.apache.org/repos/asf/spark/blob/ce156511/docs/mllib-feature-extraction.md -- diff --git a/docs/mllib-feature-extraction.md b/docs/mllib-feature-extraction.md index 75aea70..8b89296 100644 --- a/docs/mllib-feature-extraction.md +++ b/docs/mllib-feature-extraction.md @@ -278,8 +278,8 @@ for details on the API. multiplication. In other words, it scales each column of the dataset by a scalar multiplier. This represents the [Hadamard product](https://en.wikipedia.org/wiki/Hadamard_product_%28matrices%29) between the input vector, `v` and transforming vector, `scalingVec`, to yield a result vector. 
-Qu8T948*1# -Denoting the `scalingVec` as "`w`," this transformation may be written as: + +Denoting the `scalingVec` as "`w`", this transformation may be written as: `\[ \begin{pmatrix} v_1 \\ http://git-wip-us.apache.org/repos/asf/spark/blob/ce156511/docs/mllib-pmml-model-export.md -- diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md index d353090..f567565 100644 --- a/docs/mllib-pmml-model-export.md +++ b/docs/mllib-pmml-model-export.md @@ -7,7 +7,7 @@ displayTitle: PMML model export - RDD-based API * Table of contents {:toc} -## `spark.mllib` supported models +## spark.mllib supported models `spark.mllib` supports model export to Predictive Model Markup Language ([PMML](http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language)). @@ -15,7 +15,7 @@ The table below outlines the `spark.mllib` models that can be exported to PMML a -`spark.mllib` modelPMML model +spark.mllib modelPMML model
spark git commit: [SPARK-19964][CORE] Avoid reading from remote repos in SparkSubmitSuite.
Repository: spark Updated Branches: refs/heads/branch-2.3 f1f10da2b -> 6ca6483c1 [SPARK-19964][CORE] Avoid reading from remote repos in SparkSubmitSuite. These tests can fail with a timeout if the remote repos are not responding, or slow. The tests don't need anything from those repos, so use an empty ivy config file to avoid setting up the defaults. The tests are passing reliably for me locally now, and failing more often than not today without this change since http://dl.bintray.com/spark-packages/maven doesn't seem to be loading from my machine. Author: Marcelo Vanzin Closes #20916 from vanzin/SPARK-19964. (cherry picked from commit 441d0d0766e9a6ac4c6ff79680394999ff7191fd) Signed-off-by: hyukjinkwon Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6ca6483c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6ca6483c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6ca6483c Branch: refs/heads/branch-2.3 Commit: 6ca6483c122baa40d69c1781bb34a3cd9e1361c0 Parents: f1f10da Author: Marcelo Vanzin Authored: Tue Apr 3 09:31:47 2018 +0800 Committer: hyukjinkwon Committed: Tue Apr 3 09:32:03 2018 +0800 -- .../org/apache/spark/deploy/DependencyUtils.scala | 13 - .../scala/org/apache/spark/deploy/SparkSubmit.scala| 3 ++- .../org/apache/spark/deploy/SparkSubmitArguments.scala | 2 ++ .../org/apache/spark/deploy/worker/DriverWrapper.scala | 13 + .../org/apache/spark/deploy/SparkSubmitSuite.scala | 9 ++--- 5 files changed, 27 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/6ca6483c/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala b/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala index ab319c8..fac834a 100644 --- a/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala +++ b/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala @@ -33,7 +33,8 @@ private[deploy] object DependencyUtils { packagesExclusions: String, packages: String, repositories: String, - ivyRepoPath: String): String = { + ivyRepoPath: String, + ivySettingsPath: Option[String]): String = { val exclusions: Seq[String] = if (!StringUtils.isBlank(packagesExclusions)) { packagesExclusions.split(",") @@ -41,10 +42,12 @@ private[deploy] object DependencyUtils { Nil } // Create the IvySettings, either load from file or build defaults -val ivySettings = sys.props.get("spark.jars.ivySettings").map { ivySettingsFile => - SparkSubmitUtils.loadIvySettings(ivySettingsFile, Option(repositories), Option(ivyRepoPath)) -}.getOrElse { - SparkSubmitUtils.buildIvySettings(Option(repositories), Option(ivyRepoPath)) +val ivySettings = ivySettingsPath match { + case Some(path) => +SparkSubmitUtils.loadIvySettings(path, Option(repositories), Option(ivyRepoPath)) + + case None => +SparkSubmitUtils.buildIvySettings(Option(repositories), Option(ivyRepoPath)) } SparkSubmitUtils.resolveMavenCoordinates(packages, ivySettings, exclusions = exclusions) http://git-wip-us.apache.org/repos/asf/spark/blob/6ca6483c/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala index b44c880..deb52a4 100644 --- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala +++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala @@ -361,7 +361,8 @@ object SparkSubmit 
extends CommandLineUtils with Logging { // Resolve maven dependencies if there are any and add classpath to jars. Add them to py-files // too for packages that include Python code val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies( -args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath) +args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath, +args.ivySettingsPath) if (!StringUtils.isBlank(resolvedMavenCoordinates)) { args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates) http://git-wip-us.apache.org/repos/asf/spark/blob/6ca6483c/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit
spark git commit: [SPARK-19964][CORE] Avoid reading from remote repos in SparkSubmitSuite.
Repository: spark Updated Branches: refs/heads/master a1351828d -> 441d0d076 [SPARK-19964][CORE] Avoid reading from remote repos in SparkSubmitSuite. These tests can fail with a timeout if the remote repos are not responding, or slow. The tests don't need anything from those repos, so use an empty ivy config file to avoid setting up the defaults. The tests are passing reliably for me locally now, and failing more often than not today without this change since http://dl.bintray.com/spark-packages/maven doesn't seem to be loading from my machine. Author: Marcelo Vanzin Closes #20916 from vanzin/SPARK-19964. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/441d0d07 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/441d0d07 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/441d0d07 Branch: refs/heads/master Commit: 441d0d0766e9a6ac4c6ff79680394999ff7191fd Parents: a135182 Author: Marcelo Vanzin Authored: Tue Apr 3 09:31:47 2018 +0800 Committer: hyukjinkwon Committed: Tue Apr 3 09:31:47 2018 +0800 -- .../org/apache/spark/deploy/DependencyUtils.scala | 13 - .../scala/org/apache/spark/deploy/SparkSubmit.scala| 3 ++- .../org/apache/spark/deploy/SparkSubmitArguments.scala | 2 ++ .../org/apache/spark/deploy/worker/DriverWrapper.scala | 13 + .../org/apache/spark/deploy/SparkSubmitSuite.scala | 9 ++--- 5 files changed, 27 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/441d0d07/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala b/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala index ab319c8..fac834a 100644 --- a/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala +++ b/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala @@ -33,7 +33,8 @@ private[deploy] object DependencyUtils { packagesExclusions: String, packages: String, repositories: String, - ivyRepoPath: String): String = { + ivyRepoPath: String, + ivySettingsPath: Option[String]): String = { val exclusions: Seq[String] = if (!StringUtils.isBlank(packagesExclusions)) { packagesExclusions.split(",") @@ -41,10 +42,12 @@ private[deploy] object DependencyUtils { Nil } // Create the IvySettings, either load from file or build defaults -val ivySettings = sys.props.get("spark.jars.ivySettings").map { ivySettingsFile => - SparkSubmitUtils.loadIvySettings(ivySettingsFile, Option(repositories), Option(ivyRepoPath)) -}.getOrElse { - SparkSubmitUtils.buildIvySettings(Option(repositories), Option(ivyRepoPath)) +val ivySettings = ivySettingsPath match { + case Some(path) => +SparkSubmitUtils.loadIvySettings(path, Option(repositories), Option(ivyRepoPath)) + + case None => +SparkSubmitUtils.buildIvySettings(Option(repositories), Option(ivyRepoPath)) } SparkSubmitUtils.resolveMavenCoordinates(packages, ivySettings, exclusions = exclusions) http://git-wip-us.apache.org/repos/asf/spark/blob/441d0d07/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala index 3965f17..eddbede 100644 --- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala +++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala @@ -359,7 +359,8 @@ object SparkSubmit extends CommandLineUtils with Logging { // Resolve maven dependencies if there are any and add classpath to 
jars. Add them to py-files // too for packages that include Python code val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies( -args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath) +args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath, +args.ivySettingsPath) if (!StringUtils.isBlank(resolvedMavenCoordinates)) { args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates) http://git-wip-us.apache.org/repos/asf/spark/blob/441d0d07/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala index e7796d4..8e70705 100644 --- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmi
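Outside the suite, the same network isolation can be reproduced by handing spark-submit an Ivy settings file that defines no resolvers. Below is a minimal sketch; the file path, cache directory, and object name are illustrative, and only the `spark.jars.ivySettings` property comes from the code above.

```scala
import java.nio.file.{Files, Paths}

object EmptyIvySettings {
  def main(args: Array[String]): Unit = {
    // No resolvers are defined, so Ivy cannot reach any remote repository.
    val xml =
      """<ivysettings>
        |  <caches defaultCacheDir="/tmp/dummy-ivy-cache"/>
        |</ivysettings>""".stripMargin
    val path = Paths.get("/tmp/empty-ivysettings.xml")
    Files.write(path, xml.getBytes("UTF-8"))
    // spark-submit is then pointed at the file, e.g.:
    //   --conf spark.jars.ivySettings=/tmp/empty-ivysettings.xml
    println(s"wrote $path")
  }
}
```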
spark git commit: [SPARK-23690][ML] Add handleInvalid to VectorAssembler
Repository: spark Updated Branches: refs/heads/master 28ea4e314 -> a1351828d [SPARK-23690][ML] Add handleinvalid to VectorAssembler ## What changes were proposed in this pull request? Introduce `handleInvalid` parameter in `VectorAssembler` that can take in `"keep", "skip", "error"` options. "error" throws an error on seeing a row containing a `null`, "skip" filters out all such rows, and "keep" adds relevant number of NaN. "keep" figures out an example to find out what this number of NaN s should be added and throws an error when no such number could be found. ## How was this patch tested? Unit tests are added to check the behavior of `assemble` on specific rows and the transformer is called on `DataFrame`s of different configurations to test different corner cases. Author: Yogesh Garg Author: Bago Amirbekian Author: Yogesh Garg <1059168+yoge...@users.noreply.github.com> Closes #20829 from yogeshg/rformula_handleinvalid. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a1351828 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a1351828 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a1351828 Branch: refs/heads/master Commit: a1351828d376a01e5ee0959cf608f767d756dd86 Parents: 28ea4e3 Author: Yogesh Garg Authored: Mon Apr 2 16:41:26 2018 -0700 Committer: Joseph K. Bradley Committed: Mon Apr 2 16:41:26 2018 -0700 -- .../apache/spark/ml/feature/StringIndexer.scala | 2 +- .../spark/ml/feature/VectorAssembler.scala | 198 +++ .../spark/ml/feature/VectorAssemblerSuite.scala | 131 ++-- 3 files changed, 284 insertions(+), 47 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/a1351828/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala index 1cdcdfc..67cdb09 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala @@ -234,7 +234,7 @@ class StringIndexerModel ( val metadata = NominalAttribute.defaultAttr .withName($(outputCol)).withValues(filteredLabels).toMetadata() // If we are skipping invalid records, filter them out. 
-val (filteredDataset, keepInvalid) = getHandleInvalid match { +val (filteredDataset, keepInvalid) = $(handleInvalid) match { case StringIndexer.SKIP_INVALID => val filterer = udf { label: String => labelToIndex.contains(label) http://git-wip-us.apache.org/repos/asf/spark/blob/a1351828/mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala index b373ae9..6bf4aa3 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala @@ -17,14 +17,17 @@ package org.apache.spark.ml.feature -import scala.collection.mutable.ArrayBuilder +import java.util.NoSuchElementException + +import scala.collection.mutable +import scala.language.existentials import org.apache.spark.SparkException import org.apache.spark.annotation.Since import org.apache.spark.ml.Transformer import org.apache.spark.ml.attribute.{Attribute, AttributeGroup, NumericAttribute, UnresolvedAttribute} import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT} -import org.apache.spark.ml.param.ParamMap +import org.apache.spark.ml.param.{Param, ParamMap, ParamValidators} import org.apache.spark.ml.param.shared._ import org.apache.spark.ml.util._ import org.apache.spark.sql.{DataFrame, Dataset, Row} @@ -33,10 +36,14 @@ import org.apache.spark.sql.types._ /** * A feature transformer that merges multiple columns into a vector column. + * + * This requires one pass over the entire dataset. In case we need to infer column lengths from the + * data we require an additional call to the 'first' Dataset method, see 'handleInvalid' parameter. */ @Since("1.4.0") class VectorAssembler @Since("1.4.0") (@Since("1.4.0") override val uid: String) - extends Transformer with HasInputCols with HasOutputCol with DefaultParamsWritable { + extends Transformer with HasInputCols with HasOutputCol with HasHandleInvalid +with DefaultParamsWritable { @Since("1.4.0") def this() = this(Identifiable.randomUID("vecAssembler")) @@ -49,32 +56,63 @@ class VectorAssembler @Since("1.4.0") (@Since("1.4.0") override val uid:
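A minimal usage sketch of the new parameter, runnable in spark-shell; the column names and data are invented for illustration, and the option strings are the three listed in the description above.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// One clean row and one row whose first feature is null.
val df = Seq((0, Some(1.0), 2.0), (1, None, 3.0)).toDF("id", "a", "b")

val assembler = new VectorAssembler()
  .setInputCols(Array("a", "b"))
  .setOutputCol("features")
  .setHandleInvalid("keep") // "error" throws on nulls, "skip" drops such rows,
                            // "keep" fills the missing entries with NaN

assembler.transform(df).show(false)
```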
spark git commit: [SPARK-23834][TEST] Wait for connection before disconnect in LauncherServer test.
Repository: spark Updated Branches: refs/heads/master a7c19d9c2 -> 28ea4e314 [SPARK-23834][TEST] Wait for connection before disconnect in LauncherServer test. It was possible that the disconnect() was called on the handle before the server had received the handshake messages, so no connection was yet attached to the handle. The fix waits until we're sure the handle has been mapped to a client connection. Author: Marcelo Vanzin Closes #20950 from vanzin/SPARK-23834. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/28ea4e31 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/28ea4e31 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/28ea4e31 Branch: refs/heads/master Commit: 28ea4e3142b88eb396aa8dd5daf7b02b556204ba Parents: a7c19d9 Author: Marcelo Vanzin Authored: Mon Apr 2 14:35:07 2018 -0700 Committer: Marcelo Vanzin Committed: Mon Apr 2 14:35:07 2018 -0700 -- .../java/org/apache/spark/launcher/LauncherServerSuite.java | 8 1 file changed, 8 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/28ea4e31/launcher/src/test/java/org/apache/spark/launcher/LauncherServerSuite.java -- diff --git a/launcher/src/test/java/org/apache/spark/launcher/LauncherServerSuite.java b/launcher/src/test/java/org/apache/spark/launcher/LauncherServerSuite.java index 5413d3a..f8dc0ec 100644 --- a/launcher/src/test/java/org/apache/spark/launcher/LauncherServerSuite.java +++ b/launcher/src/test/java/org/apache/spark/launcher/LauncherServerSuite.java @@ -196,6 +196,14 @@ public class LauncherServerSuite extends BaseSuite { Socket s = new Socket(InetAddress.getLoopbackAddress(), server.getPort()); client = new TestClient(s); client.send(new Hello(secret, "1.4.0")); + client.send(new SetAppId("someId")); + + // Wait until we know the server has received the messages and matched the handle to the + // connection before disconnecting. + eventually(Duration.ofSeconds(1), Duration.ofMillis(10), () -> { +assertEquals("someId", handle.getAppId()); + }); + handle.disconnect(); waitForError(client, secret); } finally {
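The `eventually` call above comes from the launcher test utilities; as a rough sketch (not the actual BaseSuite helper), the polling pattern it implements looks like this in Scala:

```scala
import scala.concurrent.duration._

// Retry a check until it stops throwing, or rethrow once the timeout elapses.
def eventually(timeout: FiniteDuration, interval: FiniteDuration)(check: => Unit): Unit = {
  val deadline = timeout.fromNow
  while (true) {
    try {
      check
      return
    } catch {
      case e: AssertionError =>
        if (deadline.isOverdue()) throw e
        Thread.sleep(interval.toMillis)
    }
  }
}

// Mirrors the fix: poll until the server has mapped the handle to the
// client connection, after which it is safe to call disconnect().
// eventually(1.second, 10.millis) { assert(handle.getAppId == "someId") }
```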
spark git commit: [SPARK-23713][SQL] Cleanup UnsafeWriter and BufferHolder classes
Repository: spark Updated Branches: refs/heads/master fe2b7a456 -> a7c19d9c2 [SPARK-23713][SQL] Cleanup UnsafeWriter and BufferHolder classes ## What changes were proposed in this pull request? This PR implemented the following cleanups related to `UnsafeWriter` class: - Remove code duplication between `UnsafeRowWriter` and `UnsafeArrayWriter` - Make `BufferHolder` class internal by delegating its accessor methods to `UnsafeWriter` - Replace `UnsafeRow.setTotalSize(...)` with `UnsafeRowWriter.setTotalSize()` ## How was this patch tested? Tested by existing UTs Author: Kazuaki Ishizaki Closes #20850 from kiszk/SPARK-23713. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a7c19d9c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a7c19d9c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a7c19d9c Branch: refs/heads/master Commit: a7c19d9c21d59fd0109a7078c80b33d3da03fafd Parents: fe2b7a4 Author: Kazuaki Ishizaki Authored: Mon Apr 2 21:48:44 2018 +0200 Committer: Herman van Hovell Committed: Mon Apr 2 21:48:44 2018 +0200 -- .../sql/kafka010/KafkaContinuousReader.scala| 3 - .../KafkaRecordToUnsafeRowConverter.scala | 11 +- .../expressions/codegen/BufferHolder.java | 32 ++-- .../expressions/codegen/UnsafeArrayWriter.java | 133 +++-- .../expressions/codegen/UnsafeRowWriter.java| 189 +++ .../expressions/codegen/UnsafeWriter.java | 157 ++- .../InterpretedUnsafeProjection.scala | 90 - .../codegen/GenerateUnsafeProjection.scala | 124 ++-- .../expressions/RowBasedKeyValueBatchSuite.java | 28 +-- .../aggregate/RowBasedHashMapGenerator.scala| 12 +- .../columnar/GenerateColumnAccessor.scala | 9 +- .../datasources/text/TextFileFormat.scala | 11 +- 12 files changed, 391 insertions(+), 408 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/a7c19d9c/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousReader.scala -- diff --git a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousReader.scala b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousReader.scala index e7e2787..f26c134 100644 --- a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousReader.scala +++ b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousReader.scala @@ -27,13 +27,10 @@ import org.apache.spark.TaskContext import org.apache.spark.internal.Logging import org.apache.spark.sql.SparkSession import org.apache.spark.sql.catalyst.expressions.UnsafeRow -import org.apache.spark.sql.catalyst.expressions.codegen.{BufferHolder, UnsafeRowWriter} -import org.apache.spark.sql.catalyst.util.DateTimeUtils import org.apache.spark.sql.kafka010.KafkaSourceProvider.{INSTRUCTION_FOR_FAIL_ON_DATA_LOSS_FALSE, INSTRUCTION_FOR_FAIL_ON_DATA_LOSS_TRUE} import org.apache.spark.sql.sources.v2.reader._ import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousDataReader, ContinuousReader, Offset, PartitionOffset} import org.apache.spark.sql.types.StructType -import org.apache.spark.unsafe.types.UTF8String /** * A [[ContinuousReader]] for data from kafka. 
http://git-wip-us.apache.org/repos/asf/spark/blob/a7c19d9c/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRecordToUnsafeRowConverter.scala -- diff --git a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRecordToUnsafeRowConverter.scala b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRecordToUnsafeRowConverter.scala index 1acdd56..f35a143 100644 --- a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRecordToUnsafeRowConverter.scala +++ b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRecordToUnsafeRowConverter.scala @@ -20,18 +20,16 @@ package org.apache.spark.sql.kafka010 import org.apache.kafka.clients.consumer.ConsumerRecord import org.apache.spark.sql.catalyst.expressions.UnsafeRow -import org.apache.spark.sql.catalyst.expressions.codegen.{BufferHolder, UnsafeRowWriter} +import org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter import org.apache.spark.sql.catalyst.util.DateTimeUtils import org.apache.spark.unsafe.types.UTF8String /** A simple class for converting Kafka ConsumerRecord to UnsafeRow */ private[kafka010] class KafkaRecordToUnsafeRowConverter { - private val sharedRow = new UnsafeRow(7) - private val bufferHolder = new BufferHolder(sharedRow) - privat
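To make the new shape concrete, here is a hedged sketch of how a caller builds a row after this cleanup: `BufferHolder` is no longer constructed directly, and the writer tracks the buffer and total size itself. These are internal catalyst APIs, and the field count and values are illustrative.

```scala
import org.apache.spark.sql.catalyst.expressions.UnsafeRow
import org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter
import org.apache.spark.unsafe.types.UTF8String

val rowWriter = new UnsafeRowWriter(2)           // the writer now owns its BufferHolder
rowWriter.reset()                                // rewind the buffer for each record
rowWriter.write(0, UTF8String.fromString("key")) // variable-length field
rowWriter.write(1, 42L)                          // fixed-length field
val row: UnsafeRow = rowWriter.getRow            // total size is set internally now
```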
spark git commit: [SPARK-23285][K8S] Add a config property for specifying physical executor cores
Repository: spark Updated Branches: refs/heads/master 6151f29f9 -> fe2b7a456 [SPARK-23285][K8S] Add a config property for specifying physical executor cores ## What changes were proposed in this pull request? As mentioned in SPARK-23285, this PR introduces a new configuration property `spark.kubernetes.executor.cores` for specifying the physical CPU cores requested for each executor pod. This is to avoid changing the semantics of `spark.executor.cores` and `spark.task.cpus` and their role in task scheduling, task parallelism, dynamic resource allocation, etc. The new configuration property only determines the physical CPU cores available to an executor. An executor can still run multiple tasks simultaneously by using appropriate values for `spark.executor.cores` and `spark.task.cpus`. ## How was this patch tested? Unit tests. felixcheung srowen jiangxb1987 jerryshao mccheah foxish Author: Yinan Li Author: Yinan Li Closes #20553 from liyinan926/master. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fe2b7a45 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fe2b7a45 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fe2b7a45 Branch: refs/heads/master Commit: fe2b7a4568d65a62da6e6eb00fff05f248b4332c Parents: 6151f29 Author: Yinan Li Authored: Mon Apr 2 12:20:55 2018 -0700 Committer: Anirudh Ramanathan Committed: Mon Apr 2 12:20:55 2018 -0700 -- docs/running-on-kubernetes.md | 15 --- .../org/apache/spark/deploy/k8s/Config.scala| 6 + .../cluster/k8s/ExecutorPodFactory.scala| 12 ++--- .../cluster/k8s/ExecutorPodFactorySuite.scala | 27 4 files changed, 53 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/fe2b7a45/docs/running-on-kubernetes.md -- diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md index 975b28d..9c46449 100644 --- a/docs/running-on-kubernetes.md +++ b/docs/running-on-kubernetes.md @@ -549,14 +549,23 @@ specific to Spark on Kubernetes. spark.kubernetes.driver.limit.cores (none) -Specify the hard CPU [limit](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-requests-and-limits-of-pod-and-container) for the driver pod. +Specify a hard cpu [limit](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-requests-and-limits-of-pod-and-container) for the driver pod. + spark.kubernetes.executor.request.cores + (none) + +Specify the cpu request for each executor pod. Values conform to the Kubernetes [convention](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu). +Example values include 0.1, 500m, 1.5, 5, etc., with the definition of cpu units documented in [CPU units](https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#cpu-units). +This is distinct from spark.executor.cores: it is only used and takes precedence over spark.executor.cores for specifying the executor pod cpu request if set. Task +parallelism, e.g., number of tasks an executor can run concurrently is not affected by this. + + spark.kubernetes.executor.limit.cores (none) -Specify the hard CPU [limit](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-requests-and-limits-of-pod-and-container) for each executor pod launched for the Spark Application. 
+Specify a hard cpu [limit](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-requests-and-limits-of-pod-and-container) for each executor pod launched for the Spark Application. @@ -593,4 +602,4 @@ specific to Spark on Kubernetes. spark.kubernetes.executor.secrets.spark-secret=/etc/secrets. - \ No newline at end of file + http://git-wip-us.apache.org/repos/asf/spark/blob/fe2b7a45/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala -- diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala index da34a7e..405ea47 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala @@ -91,6 +91,12 @@ private[spark] object Config extends Logging { .stringConf .createOptional + val KUBERNETES_EXECUTOR_REQUEST_CORES = +ConfigB
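A short illustration of how the new property composes with the existing scheduling knobs; the values are examples only, based on the documentation added above.

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.executor.cores", "2")                       // task slots per executor
  .set("spark.task.cpus", "1")                            // slots consumed per task
  .set("spark.kubernetes.executor.request.cores", "500m") // physical cpu requested per pod
  .set("spark.kubernetes.executor.limit.cores", "1")      // hard cpu limit per pod
// Each executor still runs up to two tasks concurrently, while its pod
// requests only half a cpu from Kubernetes.
```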
svn commit: r26088 - in /dev/spark/2.4.0-SNAPSHOT-2018_04_02_12_01-6151f29-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Mon Apr 2 19:17:05 2018 New Revision: 26088 Log: Apache Spark 2.4.0-SNAPSHOT-2018_04_02_12_01-6151f29 docs [This commit notification would consist of 1452 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-23825][K8S] Requesting memory + memory overhead for pod memory
Repository: spark Updated Branches: refs/heads/master 44a9f8e6e -> 6151f29f9 [SPARK-23825][K8S] Requesting memory + memory overhead for pod memory ## What changes were proposed in this pull request? Kubernetes driver and executor pods should request `memory + memoryOverhead` as their resources instead of just `memory`, see https://issues.apache.org/jira/browse/SPARK-23825 ## How was this patch tested? Existing unit tests were adapted. Author: David Vogelbacher Closes #20943 from dvogelbacher/spark-23825. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6151f29f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6151f29f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6151f29f Branch: refs/heads/master Commit: 6151f29f9f589301159482044fc32717f430db6e Parents: 44a9f8e Author: David Vogelbacher Authored: Mon Apr 2 12:00:37 2018 -0700 Committer: mcheah Committed: Mon Apr 2 12:00:37 2018 -0700 -- .../deploy/k8s/submit/steps/BasicDriverConfigurationStep.scala | 5 + .../spark/scheduler/cluster/k8s/ExecutorPodFactory.scala | 5 + .../k8s/submit/steps/BasicDriverConfigurationStepSuite.scala | 2 +- .../spark/scheduler/cluster/k8s/ExecutorPodFactorySuite.scala | 6 -- 4 files changed, 7 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/6151f29f/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/steps/BasicDriverConfigurationStep.scala -- diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/steps/BasicDriverConfigurationStep.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/steps/BasicDriverConfigurationStep.scala index 347c4d2..b811db3 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/steps/BasicDriverConfigurationStep.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/steps/BasicDriverConfigurationStep.scala @@ -93,9 +93,6 @@ private[spark] class BasicDriverConfigurationStep( .withAmount(driverCpuCores) .build() val driverMemoryQuantity = new QuantityBuilder(false) - .withAmount(s"${driverMemoryMiB}Mi") - .build() -val driverMemoryLimitQuantity = new QuantityBuilder(false) .withAmount(s"${driverMemoryWithOverheadMiB}Mi") .build() val maybeCpuLimitQuantity = driverLimitCores.map { limitCores => @@ -117,7 +114,7 @@ private[spark] class BasicDriverConfigurationStep( .withNewResources() .addToRequests("cpu", driverCpuQuantity) .addToRequests("memory", driverMemoryQuantity) -.addToLimits("memory", driverMemoryLimitQuantity) +.addToLimits("memory", driverMemoryQuantity) .addToLimits(maybeCpuLimitQuantity.toMap.asJava) .endResources() .addToArgs("driver") http://git-wip-us.apache.org/repos/asf/spark/blob/6151f29f/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala -- diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala index 98cbd56..ac42385 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala @@ -108,9 +108,6 @@ private[spark] class ExecutorPodFactory( 
SPARK_ROLE_LABEL -> SPARK_POD_EXECUTOR_ROLE) ++ executorLabels val executorMemoryQuantity = new QuantityBuilder(false) - .withAmount(s"${executorMemoryMiB}Mi") - .build() -val executorMemoryLimitQuantity = new QuantityBuilder(false) .withAmount(s"${executorMemoryWithOverhead}Mi") .build() val executorCpuQuantity = new QuantityBuilder(false) @@ -167,7 +164,7 @@ private[spark] class ExecutorPodFactory( .withImagePullPolicy(imagePullPolicy) .withNewResources() .addToRequests("memory", executorMemoryQuantity) -.addToLimits("memory", executorMemoryLimitQuantity) +.addToLimits("memory", executorMemoryQuantity) .addToRequests("cpu", executorCpuQuantity) .endResources() .addAllToEnv(executorEnv.asJava) http://git-wip-us.apache.org/repos/asf/spark/blob/6151f29f/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/submit/steps/BasicDriverConfigurationStep
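As a back-of-envelope check of what a pod now asks for, the sketch below applies Spark's usual overhead rule; the 0.10 factor and 384 MiB floor are the commonly documented defaults and are assumed here, not read from this diff.

```scala
// spark.executor.memory = 4g
val executorMemoryMiB = 4096L
val overheadMiB = math.max((0.10 * executorMemoryMiB).toLong, 384L) // 409
val podMemoryMiB = executorMemoryMiB + overheadMiB                  // 4505
// After this patch, both the pod's memory request and its limit are
// s"${podMemoryMiB}Mi" instead of just the 4096 MiB heap.
println(s"request = limit = ${podMemoryMiB}Mi")
```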
spark git commit: [SPARK-15009][PYTHON][FOLLOWUP] Add default param checks for CountVectorizerModel
Repository: spark Updated Branches: refs/heads/master 529f84710 -> 44a9f8e6e [SPARK-15009][PYTHON][FOLLOWUP] Add default param checks for CountVectorizerModel ## What changes were proposed in this pull request? Adding test for default params for `CountVectorizerModel` constructed from vocabulary. This required that the param `maxDF` be added, which was done in SPARK-23615. ## How was this patch tested? Added an explicit test for CountVectorizerModel in DefaultValuesTests. Author: Bryan Cutler Closes #20942 from BryanCutler/pyspark-CountVectorizerModel-default-param-test-SPARK-15009. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/44a9f8e6 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/44a9f8e6 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/44a9f8e6 Branch: refs/heads/master Commit: 44a9f8e6e82c300dc61ca18515aee16f17f27501 Parents: 529f847 Author: Bryan Cutler Authored: Mon Apr 2 09:53:37 2018 -0700 Committer: Bryan Cutler Committed: Mon Apr 2 09:53:37 2018 -0700 -- python/pyspark/ml/tests.py | 5 + 1 file changed, 5 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/44a9f8e6/python/pyspark/ml/tests.py -- diff --git a/python/pyspark/ml/tests.py b/python/pyspark/ml/tests.py index 6b4376c..c2c4861 100755 --- a/python/pyspark/ml/tests.py +++ b/python/pyspark/ml/tests.py @@ -2096,6 +2096,11 @@ class DefaultValuesTests(PySparkTestCase): # NOTE: disable check_params_exist until there is parity with Scala API ParamTests.check_params(self, cls(), check_params_exist=False) +# Additional classes that need explicit construction +from pyspark.ml.feature import CountVectorizerModel +ParamTests.check_params(self, CountVectorizerModel.from_vocabulary(['a'], 'input'), +check_params_exist=False) + def _squared_distance(a, b): if isinstance(a, Vector):
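For reference, a rough Scala counterpart of the code path this Python test exercises: constructing a `CountVectorizerModel` directly from a vocabulary instead of fitting a `CountVectorizer`. The vocabulary and column names are illustrative.

```scala
import org.apache.spark.ml.feature.CountVectorizerModel

val model = new CountVectorizerModel(Array("a", "b", "c"))
  .setInputCol("input")
  .setOutputCol("features")
// Even without a fit() step, params such as minTF, binary and the newly
// added maxDF carry defaults on this model, which is what the new Python
// test asserts via check_params.
```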