[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18388
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79855/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18388
  
**[Test build #79855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79855/testReport)** for PR 18388 at commit [`4bfeabb`](https://github.com/apache/spark/commit/4bfeabb8755b71f161f086ef68f95f522b848f23).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18388
  
Merged build finished. Test FAILed.





[GitHub] spark pull request #18709: [SPARK-21504] [SQL] Add spark version info into t...

2017-07-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18709#discussion_r128890400
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -345,6 +347,7 @@ object CatalogTable {
   val VIEW_QUERY_OUTPUT_PREFIX = "view.query.out."
   val VIEW_QUERY_OUTPUT_NUM_COLUMNS = VIEW_QUERY_OUTPUT_PREFIX + "numCols"
   val VIEW_QUERY_OUTPUT_COLUMN_NAME_PREFIX = VIEW_QUERY_OUTPUT_PREFIX + "col."
+  val SCHEMA_SPARK_VERSION = "spark.sql.create.version"
--- End diff --

`CREATED_SPARK_VERSION`?





[GitHub] spark pull request #18709: [SPARK-21504] [SQL] Add spark version info into t...

2017-07-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18709#discussion_r128890390
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -304,6 +305,7 @@ case class CatalogTable(
 if (owner.nonEmpty) map.put("Owner", owner)
 map.put("Created", new Date(createTime).toString)
--- End diff --

Not related, but it seems `Created Time` would be better?





[GitHub] spark pull request #18709: [SPARK-21504] [SQL] Add spark version info into t...

2017-07-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18709#discussion_r128890385
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -304,6 +305,7 @@ case class CatalogTable(
 if (owner.nonEmpty) map.put("Owner", owner)
 map.put("Created", new Date(createTime).toString)
 map.put("Last Access", new Date(lastAccessTime).toString)
+map.put("Create Version", createVersion)
--- End diff --

Created Version?





[GitHub] spark pull request #18709: [SPARK-21504] [SQL] Add spark version info into t...

2017-07-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18709#discussion_r128890382
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -217,6 +217,7 @@ case class CatalogTable(
 owner: String = "",
 createTime: Long = System.currentTimeMillis,
 lastAccessTime: Long = -1,
+createVersion: String = org.apache.spark.SPARK_VERSION,
--- End diff --

add parameter doc?





[GitHub] spark issue #17006: [SPARK-17636] Parquet filter push down doesn't handle st...

2017-07-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17006
  
Does that one deal with nested filter access as well as nested column pruning?





[GitHub] spark issue #17006: [SPARK-17636] Parquet filter push down doesn't handle st...

2017-07-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17006
  
No, this is filter push down, whereas that one is column pruning.





[GitHub] spark pull request #18313: [SPARK-21087] [ML] CrossValidator, TrainValidatio...

2017-07-21 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18313#discussion_r12283
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala ---
@@ -113,15 +122,28 @@ class CrossValidator @Since("1.2.0") (@Since("1.4.0") override val uid: String)
   // multi-model training
   logDebug(s"Train split $splitIndex with multiple sets of parameters.")
   val models = est.fit(trainingDataset, epm).asInstanceOf[Seq[Model[_]]]
-  trainingDataset.unpersist()
+
   var i = 0
   while (i < numModels) {
 // TODO: duplicate evaluator to take extra params from input
 val metric = eval.evaluate(models(i).transform(validationDataset, epm(i)))
 logDebug(s"Got metric $metric for model trained with ${epm(i)}.")
+if (isDefined(modelPreservePath)) {
+  models(i) match {
+case w: MLWritable =>
+  // e.g. maxIter-5-regParam-0.001-split0-0.859
+  val fileName = epm(i).toSeq.map(p => p.param.name + "-" + p.value).sorted
+.mkString("-") + s"-split$splitIndex-${math.rint(metric * 1000) / 1000}"
+  w.save(new Path($(modelPreservePath), fileName).toString)
+case _ =>
+  // for third-party algorithms
+  logWarning(models(i).uid + " did not implement MLWritable. Serialization omitted.")
+  }
+}
 metrics(i) += metric
--- End diff --

Yes I think so.
In order to save time, I would like to take over this feature, if you don't 
mind.
ping @jkbradley 
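For reference, the file-name scheme in the diff above can be sketched in isolation. This is a hedged stand-in, not Spark's implementation: `modelFileName` and the plain `Map` of parameters are illustrative substitutes for the diff's `ParamMap`-based code; only the "sort name-value pairs, join with `-`, append split index and the metric rounded via `Math.rint`" behavior mirrors the diff.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ModelFileName {
    // Render each (param, value) pair as "name-value", sort, join with "-",
    // then append the split index and the metric rounded to three decimals.
    static String modelFileName(Map<String, Object> params, int splitIndex, double metric) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, Object> e : params.entrySet()) {
            parts.add(e.getKey() + "-" + e.getValue());
        }
        Collections.sort(parts);
        double rounded = Math.rint(metric * 1000) / 1000; // same rounding as the diff
        return String.join("-", parts) + "-split" + splitIndex + "-" + rounded;
    }

    public static void main(String[] args) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("maxIter", 5);
        params.put("regParam", 0.001);
        // Matches the example comment in the diff:
        System.out.println(modelFileName(params, 0, 0.8592)); // maxIter-5-regParam-0.001-split0-0.859
    }
}
```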





[GitHub] spark issue #17006: [SPARK-17636] Parquet filter push down doesn't handle st...

2017-07-21 Thread Gauravshah
Github user Gauravshah commented on the issue:

https://github.com/apache/spark/pull/17006
  
https://github.com/apache/spark/pull/16578 PR should solve this





[GitHub] spark issue #16992: [SPARK-19662][SCHEDULER][TEST] Add Fair Scheduler Unit T...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16992
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16992: [SPARK-19662][SCHEDULER][TEST] Add Fair Scheduler Unit T...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16992
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79854/
Test PASSed.





[GitHub] spark issue #16992: [SPARK-19662][SCHEDULER][TEST] Add Fair Scheduler Unit T...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16992
  
**[Test build #79854 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79854/testReport)** for PR 16992 at commit [`2e5afe4`](https://github.com/apache/spark/commit/2e5afe4852699aea7e33b0c889b78202b5fe184c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18388
  
**[Test build #79857 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79857/testReport)** for PR 18388 at commit [`4de417f`](https://github.com/apache/spark/commit/4de417f946430dd6d963768583d5fa1f22fe4622).





[GitHub] spark pull request #18313: [SPARK-21087] [ML] CrossValidator, TrainValidatio...

2017-07-21 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request:

https://github.com/apache/spark/pull/18313#discussion_r128886371
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala ---
@@ -113,15 +122,28 @@ class CrossValidator @Since("1.2.0") (@Since("1.4.0") override val uid: String)
   // multi-model training
   logDebug(s"Train split $splitIndex with multiple sets of parameters.")
   val models = est.fit(trainingDataset, epm).asInstanceOf[Seq[Model[_]]]
-  trainingDataset.unpersist()
+
   var i = 0
   while (i < numModels) {
 // TODO: duplicate evaluator to take extra params from input
 val metric = eval.evaluate(models(i).transform(validationDataset, epm(i)))
 logDebug(s"Got metric $metric for model trained with ${epm(i)}.")
+if (isDefined(modelPreservePath)) {
+  models(i) match {
+case w: MLWritable =>
+  // e.g. maxIter-5-regParam-0.001-split0-0.859
+  val fileName = epm(i).toSeq.map(p => p.param.name + "-" + p.value).sorted
+.mkString("-") + s"-split$splitIndex-${math.rint(metric * 1000) / 1000}"
+  w.save(new Path($(modelPreservePath), fileName).toString)
+case _ =>
+  // for third-party algorithms
+  logWarning(models(i).uid + " did not implement MLWritable. Serialization omitted.")
+  }
+}
 metrics(i) += metric
--- End diff --

so you want to keep all the trained models in CrossValidatorModel?





[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18388
  
**[Test build #79856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79856/testReport)** for PR 18388 at commit [`5f622c3`](https://github.com/apache/spark/commit/5f622c3da3b65b8d183e329ac641caa1c9aed9bb).





[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18388
  
**[Test build #79855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79855/testReport)** for PR 18388 at commit [`4bfeabb`](https://github.com/apache/spark/commit/4bfeabb8755b71f161f086ef68f95f522b848f23).





[GitHub] spark issue #18594: [SPARK-20904][core] Don't report task failures to driver...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18594
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79852/
Test PASSed.





[GitHub] spark issue #18594: [SPARK-20904][core] Don't report task failures to driver...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18594
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18594: [SPARK-20904][core] Don't report task failures to driver...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18594
  
**[Test build #79852 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79852/testReport)** for PR 18594 at commit [`a68c2f2`](https://github.com/apache/spark/commit/a68c2f2478f190ac56a491801c98ebda862605a6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18707: [SPARK-21503][UI]: Spark UI shows incorrect task status ...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18707
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79850/
Test PASSed.





[GitHub] spark issue #18707: [SPARK-21503][UI]: Spark UI shows incorrect task status ...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18707
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18707: [SPARK-21503][UI]: Spark UI shows incorrect task status ...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18707
  
**[Test build #79850 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79850/testReport)** for PR 18707 at commit [`172fc20`](https://github.com/apache/spark/commit/172fc20898896058b7288360eb5292ed9df9d79c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18709
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79853/
Test FAILed.





[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18709
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18645: [SPARK-14280][BUILD][WIP] Update change-version.sh and p...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18645
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18709
  
**[Test build #79853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79853/testReport)** for PR 18709 at commit [`ccbd3a9`](https://github.com/apache/spark/commit/ccbd3a96e5d9fe154f8adec172179fd0021eada2).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18645: [SPARK-14280][BUILD][WIP] Update change-version.sh and p...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18645
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79851/
Test FAILed.





[GitHub] spark issue #18645: [SPARK-14280][BUILD][WIP] Update change-version.sh and p...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18645
  
**[Test build #79851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79851/testReport)** for PR 18645 at commit [`fec76be`](https://github.com/apache/spark/commit/fec76beb6a3b63e698b57d93e61f8254b56d4b0d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18652: [SPARK-21497][SQL][WIP] Pull non-deterministic equi join...

2017-07-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18652
  
@viirya I have not started reading the comments and the code carefully yet. I just want to confirm: do the code changes in this PR follow what Hive does when we turn on the flag? If not, what is the behavior difference? Thanks!





[GitHub] spark issue #18710: [SPARK][Docs] Added note on meaning of position to subst...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18710
  
Can one of the admins verify this patch?





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128882147
  
--- Diff: common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java ---
@@ -130,11 +143,25 @@ private void processFetchRequest(final ChunkFetchRequest req) {
   return;
 }

-respond(new ChunkFetchSuccess(req.streamChunkId, buf));
+respond(new ChunkFetchSuccess(req.streamChunkId, buf)).addListener(future -> {
+  streamManager.chunkSent(req.streamChunkId.streamId);
+});
   }

   private void processStreamRequest(final StreamRequest req) {
+if (logger.isTraceEnabled()) {
+  logger.trace("Received req from {} to fetch stream {}", getRemoteAddress(channel),
+req.streamId);
+}
+
+long chunksBeingTransferred = streamManager.chunksBeingTransferred();
+if (chunksBeingTransferred > maxChunksBeingTransferred) {
+  logger.warn("The number of chunks being transferred {} is above {}, close the connection.",
+chunksBeingTransferred, maxChunksBeingTransferred);
+  channel.close();
+}
--- End diff --

To make the error handling simple, you can increase chunksBeingTransferred 
just before writing the chunk to the channel, and decrease it in the future 
returned by write.
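The pattern suggested above can be sketched with stdlib stand-ins: `ChunkCounter` and `CompletableFuture` here are assumed substitutes for Spark's stream-manager counter and Netty's write future. The point is the ordering: increment the in-flight count before issuing the write, and decrement it in the completion callback, which fires on both success and failure, so the count never under-reports and never leaks.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

public class ChunkCounter {
    private final AtomicLong inFlight = new AtomicLong();

    public long chunksBeingTransferred() {
        return inFlight.get();
    }

    /** Increment first, then tie the decrement to the write's completion. */
    public CompletableFuture<Void> writeChunk(CompletableFuture<Void> writeFuture) {
        inFlight.incrementAndGet();
        // whenComplete runs on success *and* failure, mirroring a Netty listener.
        return writeFuture.whenComplete((v, err) -> inFlight.decrementAndGet());
    }

    public static void main(String[] args) {
        ChunkCounter counter = new ChunkCounter();
        CompletableFuture<Void> write = new CompletableFuture<>(); // simulated pending write
        counter.writeChunk(write);
        System.out.println(counter.chunksBeingTransferred()); // 1 while the write is outstanding
        write.complete(null);                                 // the write finishes
        System.out.println(counter.chunksBeingTransferred()); // 0 after the future completes
    }
}
```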





[GitHub] spark issue #18705: [SPARK-21502][Mesos] fix --supervise for mesos in cluste...

2017-07-21 Thread susanxhuynh
Github user susanxhuynh commented on the issue:

https://github.com/apache/spark/pull/18705
  
@skonto LGTM





[GitHub] spark pull request #18710: [SPARK][Docs] Added note on meaning of position t...

2017-07-21 Thread maclockard
GitHub user maclockard opened a pull request:

https://github.com/apache/spark/pull/18710

[SPARK][Docs] Added note on meaning of position to substring function

## What changes were proposed in this pull request?

Enhanced some existing documentation

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maclockard/spark maclockard-patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18710.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18710


commit d503359235a06f92605755fb994272a03b4d2743
Author: Mac 
Date:   2017-07-22T00:01:24Z

Added note on meaning of position to substring function







[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18709
  
cc @cloud-fan 





[GitHub] spark pull request #18709: [SPARK-21504] [SQL] Add spark version info into t...

2017-07-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18709#discussion_r128881870
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/describe-table-after-alter-table.sql.out
 ---
@@ -25,6 +25,7 @@ Database  default
 Table  table_with_comment  
 Created [not included in comparison]
 Last Access [not included in comparison]
+Create Version [not included in comparison]
--- End diff --

I manually verified that all these version values are correct. To avoid unnecessary test-result updates, the values are hidden from the comparison.





[GitHub] spark pull request #18698: [SPARK-21434][Python][DOCS] Add pyspark pip docum...

2017-07-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18698





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128881079
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java
 ---
@@ -130,11 +143,25 @@ private void processFetchRequest(final 
ChunkFetchRequest req) {
   return;
 }
 
-respond(new ChunkFetchSuccess(req.streamChunkId, buf));
+respond(new ChunkFetchSuccess(req.streamChunkId, 
buf)).addListener(future -> {
+  streamManager.chunkSent(req.streamChunkId.streamId);
+});
   }
 
   private void processStreamRequest(final StreamRequest req) {
+if (logger.isTraceEnabled()) {
+  logger.trace("Received req from {} to fetch stream {}", 
getRemoteAddress(channel),
+req.streamId);
+}
+
+long chunksBeingTransferred = streamManager.chunksBeingTransferred();
+if (chunksBeingTransferred > maxChunksBeingTransferred) {
+  logger.warn("The number of chunks being transferred {} is above {}, 
close the connection.",
+chunksBeingTransferred, maxChunksBeingTransferred);
+  channel.close();
+}
--- End diff --

Also, please decrease `chunksBeingTransferred` when sending a `ChunkFetchFailure`.





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128879751
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java
 ---
@@ -130,11 +143,25 @@ private void processFetchRequest(final 
ChunkFetchRequest req) {
   return;
 }
 
-respond(new ChunkFetchSuccess(req.streamChunkId, buf));
+respond(new ChunkFetchSuccess(req.streamChunkId, 
buf)).addListener(future -> {
+  streamManager.chunkSent(req.streamChunkId.streamId);
+});
   }
 
   private void processStreamRequest(final StreamRequest req) {
+if (logger.isTraceEnabled()) {
+  logger.trace("Received req from {} to fetch stream {}", 
getRemoteAddress(channel),
+req.streamId);
+}
+
+long chunksBeingTransferred = streamManager.chunksBeingTransferred();
+if (chunksBeingTransferred > maxChunksBeingTransferred) {
+  logger.warn("The number of chunks being transferred {} is above {}, 
close the connection.",
+chunksBeingTransferred, maxChunksBeingTransferred);
+  channel.close();
+}
 ManagedBuffer buf;
+
--- End diff --

nit: extra empty line





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128879607
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java
 ---
@@ -96,18 +103,23 @@ public ManagedBuffer getChunk(long streamId, int 
chunkIndex) {
 
   @Override
   public ManagedBuffer openStream(String streamChunkId) {
-String[] array = streamChunkId.split("_");
-assert array.length == 2:
-  "Stream id and chunk index should be specified when open stream for 
fetching block.";
-long streamId = Long.valueOf(array[0]);
-int chunkIndex = Integer.valueOf(array[1]);
-return getChunk(streamId, chunkIndex);
+Pair<Long, Integer> streamChunkIdPair = parseStreamChunkId(streamChunkId);
+return getChunk(streamChunkIdPair.getLeft(), 
streamChunkIdPair.getRight());
   }
 
   public static String genStreamChunkId(long streamId, int chunkId) {
 return String.format("%d_%d", streamId, chunkId);
   }
 
+  public static Pair<Long, Integer> parseStreamChunkId(String streamChunkId) {
--- End diff --

nit: please document the meaning of the return value for this public method.
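A documented version of the parser might look like the sketch below. `Pair` from commons-lang3 is replaced with a stdlib `Map.Entry` purely so the snippet is self-contained, and the class name is hypothetical:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

final class StreamChunkIds {
    static String genStreamChunkId(long streamId, int chunkIndex) {
        return String.format("%d_%d", streamId, chunkIndex);
    }

    /**
     * Inverse of {@link #genStreamChunkId}.
     *
     * @return an entry whose key is the stream id and whose value is the
     *         chunk index encoded in {@code streamChunkId}.
     */
    static Map.Entry<Long, Integer> parseStreamChunkId(String streamChunkId) {
        String[] array = streamChunkId.split("_");
        assert array.length == 2 :
            "Stream id and chunk index should be specified.";
        return new SimpleImmutableEntry<>(
            Long.valueOf(array[0]), Integer.valueOf(array[1]));
    }
}
```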





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128879502
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java
 ---
@@ -122,6 +134,7 @@ public void connectionTerminated(Channel channel) {
 }
   }
 }
+
--- End diff --

nit: extra empty line





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128879315
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java
 ---
@@ -53,9 +56,13 @@
 // that the caller only requests each chunk one at a time, in order.
 int curChunk = 0;
 
+// Used to keep track of the number of chunks being transferred and 
not finished yet.
+AtomicLong chunksBeingTransferred;
--- End diff --

@jinxing64 `chunksBeingTransferred` is modified in the same thread. Not a big deal though.





[GitHub] spark pull request #18705: [SPARK-21502][Mesos] fix --supervise for mesos in...

2017-07-21 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18705#discussion_r128878694
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -369,7 +369,8 @@ private[spark] class MesosClusterScheduler(
   }
 
   private def getDriverFrameworkID(desc: MesosDriverDescription): String = 
{
-s"${frameworkId}-${desc.submissionId}"
+val retries = desc.retryState.map{d => s"-retry-${d.retries.toString}"}
--- End diff --

nit: add spaces around the braces





[GitHub] spark pull request #18705: [SPARK-21502][Mesos] fix --supervise for mesos in...

2017-07-21 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18705#discussion_r128878929
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
 ---
@@ -369,7 +369,8 @@ private[spark] class MesosClusterScheduler(
   }
 
   private def getDriverFrameworkID(desc: MesosDriverDescription): String = 
{
-s"${frameworkId}-${desc.submissionId}"
+val retries = desc.retryState.map{d => s"-retry-${d.retries.toString}"}
+s"${frameworkId}-${desc.submissionId}${retries.getOrElse("")}"
--- End diff --

nit: move the `getOrElse()` call out of the string for clarity?

val suffix = desc.retryState.map { }.getOrElse("")






[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128879045
  
--- Diff: docs/configuration.md ---
@@ -1809,6 +1809,14 @@ Apart from these, the following properties are also 
available, and may be useful
   
 
 
+  spark.shuffle.maxChunksBeingTransferred
+  Long.MAX_VALUE
+  
+The max number of chunks being transferred at the same time. This 
config helps avoid OOM on
--- End diff --

Please also move this to `Shuffle Behavior` section.





[GitHub] spark pull request #18705: [SPARK-21502][Mesos] fix --supervise for mesos in...

2017-07-21 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18705#discussion_r128878713
  
--- Diff: 
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
 ---
@@ -672,3 +682,9 @@ private class Slave(val hostname: String) {
   var taskFailures = 0
   var shuffleRegistered = false
 }
+
+object IdHelper {
+  // Use atomic values since Spark contexts can initialized in parallel
--- End diff --

"can be initialized"





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128878875
  
--- Diff: docs/configuration.md ---
@@ -1809,6 +1809,14 @@ Apart from these, the following properties are also 
available, and may be useful
   
 
 
+  spark.shuffle.maxChunksBeingTransferred
+  Long.MAX_VALUE
+  
+The max number of chunks being transferred at the same time. This 
config helps avoid OOM on
--- End diff --

nit: `The max number of chunks allowed to be transferred at the same time on shuffle service.`





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128878596
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java
 ---
@@ -130,11 +143,25 @@ private void processFetchRequest(final 
ChunkFetchRequest req) {
   return;
 }
 
-respond(new ChunkFetchSuccess(req.streamChunkId, buf));
+respond(new ChunkFetchSuccess(req.streamChunkId, 
buf)).addListener(future -> {
+  streamManager.chunkSent(req.streamChunkId.streamId);
+});
   }
 
   private void processStreamRequest(final StreamRequest req) {
+if (logger.isTraceEnabled()) {
+  logger.trace("Received req from {} to fetch stream {}", 
getRemoteAddress(channel),
+req.streamId);
+}
+
+long chunksBeingTransferred = streamManager.chunksBeingTransferred();
+if (chunksBeingTransferred > maxChunksBeingTransferred) {
+  logger.warn("The number of chunks being transferred {} is above {}, 
close the connection.",
+chunksBeingTransferred, maxChunksBeingTransferred);
+  channel.close();
+}
--- End diff --

missing `return`





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128878556
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java
 ---
@@ -118,6 +124,13 @@ private void processFetchRequest(final 
ChunkFetchRequest req) {
 req.streamChunkId);
 }
 
+long chunksBeingTransferred = streamManager.chunksBeingTransferred();
+if (chunksBeingTransferred > maxChunksBeingTransferred) {
+  logger.warn("The number of chunks being transferred {} is above {}, 
close the connection.",
+chunksBeingTransferred, maxChunksBeingTransferred);
+  channel.close();
+}
--- End diff --

missing `return`.
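The consequence of the missing `return` is that the handler falls through and keeps serving a request on a connection it has just closed. A minimal, hypothetical illustration of the guard with the early return (names are illustrative, not the actual Spark handler):

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class GuardExample {
    static boolean handleRequest(long chunksBeingTransferred,
                                 long maxChunksBeingTransferred,
                                 Runnable closeChannel) {
        if (chunksBeingTransferred > maxChunksBeingTransferred) {
            closeChannel.run();
            return false; // the 'return' the review asks for: stop processing
        }
        // ... would continue fetching the chunk here ...
        return true;
    }
}
```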





[GitHub] spark issue #16992: [SPARK-19662][SCHEDULER][TEST] Add Fair Scheduler Unit T...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16992
  
**[Test build #79854 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79854/testReport)**
 for PR 16992 at commit 
[`2e5afe4`](https://github.com/apache/spark/commit/2e5afe4852699aea7e33b0c889b78202b5fe184c).





[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18388#discussion_r128878335
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java
 ---
@@ -257,4 +257,7 @@ public Properties cryptoConf() {
 return CryptoUtils.toCryptoConf("spark.network.crypto.config.", 
conf.getAll());
   }
 
+  public long maxChunksBeingTransferred() {
+return conf.getLong("spark.network.shuffle.maxChunksBeingTransferred", 
Long.MAX_VALUE);
--- End diff --

This default value totally depends on the JVM heap size, so it seems hard to pick a reasonable one. If it's too small, then a user with a large heap size may see their shuffle service start to fail when they upgrade. If it's too large, it's just the same as MAX_VALUE.





[GitHub] spark pull request #18707: [SPARK-21503][UI]: Spark UI shows incorrect task ...

2017-07-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/18707#discussion_r128876674
  
--- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala 
---
@@ -140,6 +140,8 @@ class ExecutorsListener(storageStatusListener: 
StorageStatusListener, conf: Spar
   return
 case _: ExceptionFailure =>
   taskSummary.tasksFailed += 1
+case _: ExecutorLostFailure =>
--- End diff --

Looks like we can use `info.successful`?





[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18709
  
**[Test build #79853 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79853/testReport)**
 for PR 18709 at commit 
[`ccbd3a9`](https://github.com/apache/spark/commit/ccbd3a96e5d9fe154f8adec172179fd0021eada2).





[GitHub] spark pull request #18709: [SPARK-21504] [SQL] Add spark version info into t...

2017-07-21 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/18709

[SPARK-21504] [SQL] Add spark version info into table metadata

## What changes were proposed in this pull request?
This PR adds the Spark version info to the table metadata. The value is assigned when the table is created; it can help users find which version of Spark was used to create the table.

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark addVersion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18709.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18709


commit ccbd3a96e5d9fe154f8adec172179fd0021eada2
Author: gatorsmile 
Date:   2017-07-21T22:22:02Z

fix







[GitHub] spark issue #18594: [SPARK-20904][core] Don't report task failures to driver...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18594
  
**[Test build #79852 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79852/testReport)**
 for PR 18594 at commit 
[`a68c2f2`](https://github.com/apache/spark/commit/a68c2f2478f190ac56a491801c98ebda862605a6).





[GitHub] spark pull request #18594: [SPARK-20904][core] Don't report task failures to...

2017-07-21 Thread jsoltren
Github user jsoltren commented on a diff in the pull request:

https://github.com/apache/spark/pull/18594#discussion_r128873660
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -473,29 +473,36 @@ private[spark] class Executor(
   // the default uncaught exception handler, which will terminate 
the Executor.
   logError(s"Exception in $taskName (TID $taskId)", t)
 
-  // Collect latest accumulator values to report back to the driver
-  val accums: Seq[AccumulatorV2[_, _]] =
-if (task != null) {
-  task.metrics.setExecutorRunTime(System.currentTimeMillis() - 
taskStart)
-  task.metrics.setJvmGCTime(computeTotalGcTime() - startGCTime)
-  task.collectAccumulatorUpdates(taskFailed = true)
-} else {
-  Seq.empty
-}
+  // SPARK-20904: Do not report failure to driver if if happened 
during shut down. Because
+  // libraries may set up shutdown hooks that race with running 
tasks during shutdown,
+  // spurious failures may occur and can result in improper 
accounting in the driver (e.g.
+  // the task failure would not be ignored if the shutdown 
happened because of premption,
+  // instead of an app issue).
+  if (!ShutdownHookManager.inShutdown()) {
--- End diff --

Yeah, it isn't guaranteed. I'm thinking that if this happens often enough, maybe one executor will print the message, giving the user a clue. Also, it's a de-facto code comment. Yes, any daemon thread can terminate at any time during shutdown - even finishing this block isn't guaranteed. Thanks!





[GitHub] spark pull request #18645: [SPARK-14280][BUILD][WIP] Update change-version.s...

2017-07-21 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/18645#discussion_r128872194
  
--- Diff: core/src/main/scala/org/apache/spark/FutureAction.scala ---
@@ -89,6 +89,14 @@ trait FutureAction[T] extends Future[T] {
*/
   override def value: Option[Try[T]]
 
+  // These two methods must be implemented in Scala 2.12, but won't be 
used by Spark
+
+  def transform[S](f: (Try[T]) => Try[S])(implicit executor: 
ExecutionContext): Future[S] =
--- End diff --

I tried compiling a small app that calls `RDD.countAsync` (which returns a `FutureAction`) and even implements a custom `FutureAction`; I compiled it against 2.2.0, then ran it against this build, and it worked. I believe this may be legitimately excluded from MiMa.





[GitHub] spark issue #18645: [SPARK-14280][BUILD][WIP] Update change-version.sh and p...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18645
  
**[Test build #79851 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79851/testReport)**
 for PR 18645 at commit 
[`fec76be`](https://github.com/apache/spark/commit/fec76beb6a3b63e698b57d93e61f8254b56d4b0d).





[GitHub] spark pull request #18594: [SPARK-20904][core] Don't report task failures to...

2017-07-21 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18594#discussion_r128872073
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -473,29 +473,36 @@ private[spark] class Executor(
          // the default uncaught exception handler, which will terminate the Executor.
          logError(s"Exception in $taskName (TID $taskId)", t)

-          // Collect latest accumulator values to report back to the driver
-          val accums: Seq[AccumulatorV2[_, _]] =
-            if (task != null) {
-              task.metrics.setExecutorRunTime(System.currentTimeMillis() - taskStart)
-              task.metrics.setJvmGCTime(computeTotalGcTime() - startGCTime)
-              task.collectAccumulatorUpdates(taskFailed = true)
-            } else {
-              Seq.empty
-            }
+          // SPARK-20904: Do not report failure to driver if it happened during shutdown. Because
+          // libraries may set up shutdown hooks that race with running tasks during shutdown,
+          // spurious failures may occur and can result in improper accounting in the driver (e.g.
+          // the task failure would not be ignored if the shutdown happened because of preemption,
+          // instead of an app issue).
+          if (!ShutdownHookManager.inShutdown()) {
--- End diff --

Sure, I can add a log, but it's not guaranteed to be printed. During 
shutdown the JVM can die at any moment (only shutdown hooks run to completion, 
and this is not one of them)...





[GitHub] spark pull request #18594: [SPARK-20904][core] Don't report task failures to...

2017-07-21 Thread jsoltren
Github user jsoltren commented on a diff in the pull request:

https://github.com/apache/spark/pull/18594#discussion_r128871196
  
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -473,29 +473,36 @@ private[spark] class Executor(
          // the default uncaught exception handler, which will terminate the Executor.
          logError(s"Exception in $taskName (TID $taskId)", t)

-          // Collect latest accumulator values to report back to the driver
-          val accums: Seq[AccumulatorV2[_, _]] =
-            if (task != null) {
-              task.metrics.setExecutorRunTime(System.currentTimeMillis() - taskStart)
-              task.metrics.setJvmGCTime(computeTotalGcTime() - startGCTime)
-              task.collectAccumulatorUpdates(taskFailed = true)
-            } else {
-              Seq.empty
-            }
+          // SPARK-20904: Do not report failure to driver if it happened during shutdown. Because
+          // libraries may set up shutdown hooks that race with running tasks during shutdown,
+          // spurious failures may occur and can result in improper accounting in the driver (e.g.
+          // the task failure would not be ignored if the shutdown happened because of preemption,
+          // instead of an app issue).
+          if (!ShutdownHookManager.inShutdown()) {
--- End diff --

At this point I don't think we have any information on why we're in 
shutdown, whether it is an app issue, the Spark executor process being killed 
from the command line, etc.

Yes, a log message would be nice. Maybe, in the else clause of this if, 
something like logInfo(s"Not reporting failure as we are in the middle of a 
shutdown").





[GitHub] spark issue #18708: [SPARK-21339] [CORE] spark-shell --packages option does ...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18708
  
Can one of the admins verify this patch?





[GitHub] spark pull request #18708: [SPARK-21339] [CORE] spark-shell --packages optio...

2017-07-21 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/18708

[SPARK-21339] [CORE] spark-shell --packages option does not add jars to 
classpath on windows

## What changes were proposed in this pull request?
The jars pulled in by the --packages option are added to the classpath with a 
"file:///" scheme. On Unix this is not a problem, since the scheme contains the 
Unix path separator, which separates the jar name from its location on the 
classpath. On Windows, however, the jar file is not resolved from the classpath 
because of the scheme.

Windows : file:///C:/Users//.ivy2/jars/.jar
Unix : file:///home//.ivy2/jars/.jar

With this PR, the 'file://' scheme is no longer added to the packages jar files.

## How was this patch tested?
Verified manually in Windows and Unix environments; with the change, the jar is 
added to the classpath like below:

Windows : C:\Users\\.ivy2\jars\.jar
Unix : /home//.ivy2/jars/.jar
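The fix described above essentially amounts to turning a `file:` URI back into a plain local path before it goes on the classpath. A minimal sketch under that assumption (`toClasspathEntry` is a hypothetical helper, not the code in this PR):

```scala
import java.io.File
import java.net.URI

// Turn a "file:" URI into a plain local path suitable for a classpath entry;
// leave non-file URIs untouched. Hypothetical helper for illustration only.
def toClasspathEntry(jar: String): String = {
  val uri = new URI(jar)
  if (uri.getScheme == "file") new File(uri).getPath else jar
}

// On Unix this yields /home/user/.ivy2/jars/dep.jar; on Windows,
// file:///C:/Users/user/.ivy2/jars/dep.jar would become a backslash path.
val entry = toClasspathEntry("file:///home/user/.ivy2/jars/dep.jar")
```

`new File(uri).getPath` renders the path with the platform's separator, which is exactly the difference between the Unix and Windows behavior described above.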


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-21339

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18708.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18708


commit 3242532d4815fa9595dd9ca2d4d0b86c6d206ddb
Author: Devaraj K 
Date:   2017-07-21T21:51:46Z

[SPARK-21339] [CORE] spark-shell --packages option does not add jars to
classpath on windows







[GitHub] spark issue #18707: [SPARK-21503][UI]: Spark UI shows incorrect task status ...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18707
  
**[Test build #79850 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79850/testReport)**
 for PR 18707 at commit 
[`172fc20`](https://github.com/apache/spark/commit/172fc20898896058b7288360eb5292ed9df9d79c).





[GitHub] spark issue #18707: [SPARK-21503][UI]: Fixed the issue

2017-07-21 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/18707
  
ok to test





[GitHub] spark issue #18707: [SPARK-21503][UI]: Fixed the issue

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18707
  
Can one of the admins verify this patch?





[GitHub] spark pull request #18707: [SPARK-21503][UI]: Fixed the issue

2017-07-21 Thread pgandhi999
GitHub user pgandhi999 opened a pull request:

https://github.com/apache/spark/pull/18707

[SPARK-21503][UI]: Fixed the issue

Added the ExecutorLostFailure case, which was previously missing; without it, 
the default case was executed and the task was marked as completed.

## What changes were proposed in this pull request?
  Added the ExecutorLostFailure case in the ExecutorsTab.scala class, which 
covers all the cases where the executor's connection to the Spark driver was 
lost, e.g. the executor process being killed, the network connection dropping, etc.
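The shape of the fix — adding the missing case so the catch-all no longer swallows it — can be sketched with simplified stand-ins for Spark's `TaskEndReason` hierarchy (hypothetical types, not the real ExecutorsTab code):

```scala
// Simplified stand-ins for Spark's TaskEndReason hierarchy.
sealed trait TaskEndReason
case object Success extends TaskEndReason
case class ExecutorLostFailure(execId: String) extends TaskEndReason
case class UnknownReason(msg: String) extends TaskEndReason

// Before the fix, ExecutorLostFailure fell through to the default case
// below and the task was counted as completed instead of failed.
def taskStatus(reason: TaskEndReason): String = reason match {
  case ExecutorLostFailure(_) => "failed"    // the newly added case
  case Success                => "completed"
  case _                      => "completed" // default case
}
```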



## How was this patch tested?
Manually tested the fix by observing the UI change before and after.
Before:
https://user-images.githubusercontent.com/8190/28482929-571c9cea-6e30-11e7-93dd-728de5cdea95.png
After:
https://user-images.githubusercontent.com/8190/28482964-8649f5ee-6e30-11e7-91bd-2eb2089c61cc.png


Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pgandhi999/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18707.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18707


commit 172fc20898896058b7288360eb5292ed9df9d79c
Author: pgandhi 
Date:   2017-07-21T21:00:22Z

[SPARK-21503]: Fixed the issue

Added the case ExecutorLostFailure which was previously not there, thus, 
the default case would be executed in which case, task would be marked as 
completed.







[GitHub] spark issue #18706: [SPARK-21494][network] Use correct app id when authentic...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18706
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18706: [SPARK-21494][network] Use correct app id when authentic...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18706
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79846/
Test PASSed.





[GitHub] spark issue #18706: [SPARK-21494][network] Use correct app id when authentic...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18706
  
**[Test build #79846 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79846/testReport)**
 for PR 18706 at commit 
[`4e6cc53`](https://github.com/apache/spark/commit/4e6cc532009efd2325be97c262f37e154ac17370).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18519: [SPARK-16742] Mesos Kerberos Support

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18519
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79849/
Test FAILed.





[GitHub] spark issue #18519: [SPARK-16742] Mesos Kerberos Support

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18519
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18519: [SPARK-16742] Mesos Kerberos Support

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18519
  
**[Test build #79849 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79849/testReport)**
 for PR 18519 at commit 
[`e6a7357`](https://github.com/apache/spark/commit/e6a73573c9276d9bd68ae23f38eef15f9897ffef).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18705: [SPARK-21502][Mesos] fix --supervise for mesos in cluste...

2017-07-21 Thread ArtRand
Github user ArtRand commented on the issue:

https://github.com/apache/spark/pull/18705
  
LGTM





[GitHub] spark issue #18705: [SPARK-21502][Mesos] fix --supervise for mesos in cluste...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18705
  
Build finished. Test PASSed.





[GitHub] spark issue #18705: [SPARK-21502][Mesos] fix --supervise for mesos in cluste...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18705
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79844/
Test PASSed.





[GitHub] spark issue #18705: [SPARK-21502][Mesos] fix --supervise for mesos in cluste...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18705
  
**[Test build #79844 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79844/testReport)**
 for PR 18705 at commit 
[`b987c4b`](https://github.com/apache/spark/commit/b987c4b28c3aa96f39e78dcc74da570226c6bdba).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark issue #18696: [SPARK-21490][core] Make sure SparkLauncher redirects ne...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18696
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18696: [SPARK-21490][core] Make sure SparkLauncher redirects ne...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18696
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79848/
Test FAILed.





[GitHub] spark issue #18696: [SPARK-21490][core] Make sure SparkLauncher redirects ne...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18696
  
**[Test build #79848 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79848/testReport)**
 for PR 18696 at commit 
[`7d3db5f`](https://github.com/apache/spark/commit/7d3db5fa1d7b0b6d1a9e247d5d6a223e4ef774df).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18691: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-21 Thread dhruve
Github user dhruve commented on the issue:

https://github.com/apache/spark/pull/18691
  
Thanks @tgravescs. Closing the PR.





[GitHub] spark issue #18691: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-21 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/18691
  
merged





[GitHub] spark issue #18704: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18704
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18704: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18704
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79843/
Test PASSed.





[GitHub] spark issue #18704: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18704
  
**[Test build #79843 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79843/testReport)**
 for PR 18704 at commit 
[`bd0c334`](https://github.com/apache/spark/commit/bd0c3340c06f25522cb76a95239f679cb01a04ac).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18691: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-21 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/18691
  
+1





[GitHub] spark pull request #18651: [SPARK-21383][Core] Fix the YarnAllocator allocat...

2017-07-21 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/18651#discussion_r128838209
  
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -525,9 +534,11 @@ private[yarn] class YarnAllocator(
          } catch {
            case NonFatal(e) =>
              logError(s"Failed to launch executor $executorId on container $containerId", e)
-             // Assigned container should be released immediately to avoid unnecessary resource
-             // occupation.
+             // Assigned container should be released immediately
+             // to avoid unnecessary resource occupation.
              amClient.releaseAssignedContainer(containerId)
+         } finally {
+           numExecutorsStarting.decrementAndGet()
--- End diff --

Yep, the fact that it's still marked as starting even though it failed will fix 
itself on the next loop through. It's no different than if we didn't know it 
failed and it was still in the ExecutorRunnable.run code.
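The pattern in the diff — release the "starting" count in a finally block so that a failed launch self-corrects — can be sketched as follows. This is a hypothetical simplification of the YarnAllocator code; `launchExecutor` and its `shouldFail` flag are invented for illustration:

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.util.control.NonFatal

val numExecutorsStarting = new AtomicInteger(0)

// The counter is bumped before the launch attempt and always released in
// finally, whether the launch succeeds or fails.
def launchExecutor(shouldFail: Boolean): Unit = {
  numExecutorsStarting.incrementAndGet()
  try {
    if (shouldFail) throw new RuntimeException("container launch failed")
    // ... start the executor (ExecutorRunnable.run in the real code) ...
  } catch {
    case NonFatal(e) =>
      // log and release the assigned container, as in the diff
      println(s"Failed to launch executor: ${e.getMessage}")
  } finally {
    numExecutorsStarting.decrementAndGet()
  }
}

launchExecutor(shouldFail = true)
launchExecutor(shouldFail = false)
// Either way the counter returns to 0, so the next allocation loop
// requests the right number of executors.
```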





[GitHub] spark pull request #18630: [SPARK-12559][SPARK SUBMIT] fix --packages for st...

2017-07-21 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/18630#discussion_r128835800
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -43,7 +52,7 @@ object DriverWrapper {
     rpcEnv.setupEndpoint("workerWatcher", new WorkerWatcher(rpcEnv, workerUrl))

     val currentLoader = Thread.currentThread.getContextClassLoader
-    val userJarUrl = new File(userJar).toURI().toURL()
+    val userJarUrl = new File(userJar).toURI.toURL
--- End diff --

It's more the Scala style that I prefer, but anyway...





[GitHub] spark pull request #18630: [SPARK-12559][SPARK SUBMIT] fix --packages for st...

2017-07-21 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/18630#discussion_r128835610
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -66,4 +77,68 @@ object DriverWrapper {
         System.exit(-1)
     }
   }
+
+  // R or Python are not supported in cluster mode so download the jars to the driver side
+  private def setupDependencies(loader: MutableURLClassLoader, userJar: String): Unit = {
+    val packagesExclusions = sys.props.get("spark.jars.excludes").orNull
+    val packages = sys.props.get("spark.jars.packages").orNull
+    val repositories = sys.props.get("spark.jars.repositories").orNull
+    val hadoopConf = new HadoopConfiguration()
+    val childClasspath = new ArrayBuffer[String]()
+    val ivyRepoPath = sys.props.get("spark.jars.ivy").orNull
+    var jars = sys.props.get("spark.jars").orNull
+
+    val exclusions: Seq[String] =
+      if (!StringUtils.isBlank(packagesExclusions)) {
+        packagesExclusions.split(",")
+      } else {
+        Nil
+      }
+
+    // Create the IvySettings, either load from file or build defaults
+    val ivySettings = sys.props.get("spark.jars.ivySettings").map { ivySettingsFile =>
+      SparkSubmitUtils.loadIvySettings(ivySettingsFile, Option(repositories), Option(ivyRepoPath))
+    }.getOrElse {
+      SparkSubmitUtils.buildIvySettings(Option(repositories), Option(ivyRepoPath))
+    }
+
+    val resolvedMavenCoordinates = SparkSubmitUtils.resolveMavenCoordinates(packages,
+      ivySettings, exclusions = exclusions)
+
+    if (!StringUtils.isBlank(resolvedMavenCoordinates)) {
+      jars = SparkSubmit.mergeFileLists(jars, resolvedMavenCoordinates)
+    }
+
+    val targetDir = Files.createTempDirectory("tmp").toFile
+    // scalastyle:off runtimeaddshutdownhook
+    Runtime.getRuntime.addShutdownHook(new Thread() {
+      override def run(): Unit = {
+        FileUtils.deleteQuietly(targetDir)
+      }
+    })
+    // scalastyle:on runtimeaddshutdownhook
+
+    val sparkProperties = new mutable.HashMap[String, String]()
+    val securityProperties = List("spark.ssl.fs.trustStore", "spark.ssl.trustStore",
+      "spark.ssl.fs.trustStorePassword", "spark.ssl.trustStorePassword",
+      "spark.ssl.fs.protocol", "spark.ssl.protocol")
+
+    securityProperties
+      .map { pName => sys.props.get(pName)
+        .map { pValue => sparkProperties.put(pName, pValue) } }
+
+    jars = Option(jars).map(SparkSubmit.resolveGlobPaths(_, hadoopConf)).orNull
+
+    // Filter out the user jar
+    jars = jars.split(",").filterNot(_.contains(userJar.split("/").last)).mkString(",")
+    jars = Option(jars)
+      .map(SparkSubmit.downloadFileList(_, targetDir, sparkProperties, hadoopConf)).orNull
+
+    if (jars != null) { childClasspath ++= jars.split(",") }
+
+    for (jar <- childClasspath) {
+      SparkSubmit.addJarToClasspath(jar, loader)
--- End diff --

Ok I will do so...
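As an aside on the diff above: the `securityProperties` copy uses `.map` purely for its side effect; `foreach` expresses that intent more directly. A self-contained sketch (with a stand-in `props` map in place of `sys.props`):

```scala
import scala.collection.mutable

// Stand-in for sys.props in this sketch; only some properties are set.
val props = Map(
  "spark.ssl.trustStore" -> "/path/to/truststore",
  "spark.ssl.protocol"   -> "TLSv1.2")

val sparkProperties = new mutable.HashMap[String, String]()
val securityProperties = List(
  "spark.ssl.fs.trustStore", "spark.ssl.trustStore",
  "spark.ssl.fs.trustStorePassword", "spark.ssl.trustStorePassword",
  "spark.ssl.fs.protocol", "spark.ssl.protocol")

// foreach instead of map: we only want the side effect of copying
// whichever of the listed properties are actually set.
securityProperties.foreach { pName =>
  props.get(pName).foreach(pValue => sparkProperties.put(pName, pValue))
}
```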





[GitHub] spark pull request #18630: [SPARK-12559][SPARK SUBMIT] fix --packages for st...

2017-07-21 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/18630#discussion_r128835523
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -66,4 +77,68 @@ object DriverWrapper {
         System.exit(-1)
     }
   }
+
+  // R or Python are not supported in cluster mode so download the jars to the driver side
+  private def setupDependencies(loader: MutableURLClassLoader, userJar: String): Unit = {
+    val packagesExclusions = sys.props.get("spark.jars.excludes").orNull
+    val packages = sys.props.get("spark.jars.packages").orNull
+    val repositories = sys.props.get("spark.jars.repositories").orNull
+    val hadoopConf = new HadoopConfiguration()
+    val childClasspath = new ArrayBuffer[String]()
+    val ivyRepoPath = sys.props.get("spark.jars.ivy").orNull
+    var jars = sys.props.get("spark.jars").orNull
+
+    val exclusions: Seq[String] =
+      if (!StringUtils.isBlank(packagesExclusions)) {
+        packagesExclusions.split(",")
+      } else {
+        Nil
+      }
+
+    // Create the IvySettings, either load from file or build defaults
+    val ivySettings = sys.props.get("spark.jars.ivySettings").map { ivySettingsFile =>
+      SparkSubmitUtils.loadIvySettings(ivySettingsFile, Option(repositories), Option(ivyRepoPath))
+    }.getOrElse {
+      SparkSubmitUtils.buildIvySettings(Option(repositories), Option(ivyRepoPath))
+    }
+
+    val resolvedMavenCoordinates = SparkSubmitUtils.resolveMavenCoordinates(packages,
+      ivySettings, exclusions = exclusions)
+
+    if (!StringUtils.isBlank(resolvedMavenCoordinates)) {
+      jars = SparkSubmit.mergeFileLists(jars, resolvedMavenCoordinates)
+    }
+
+    val targetDir = Files.createTempDirectory("tmp").toFile
+    // scalastyle:off runtimeaddshutdownhook
+    Runtime.getRuntime.addShutdownHook(new Thread() {
+      override def run(): Unit = {
+        FileUtils.deleteQuietly(targetDir)
+      }
+    })
+    // scalastyle:on runtimeaddshutdownhook
+
+    val sparkProperties = new mutable.HashMap[String, String]()
+    val securityProperties = List("spark.ssl.fs.trustStore", "spark.ssl.trustStore",
+      "spark.ssl.fs.trustStorePassword", "spark.ssl.trustStorePassword",
+      "spark.ssl.fs.protocol", "spark.ssl.protocol")
+
+    securityProperties
+      .map {pName => sys.props.get(pName)
+        .map{pValue => sparkProperties.put(pName, pValue)}}
+
+    jars = Option(jars).map(SparkSubmit.resolveGlobPaths(_, hadoopConf)).orNull
+
+    // Filter out the user jar
+    jars = jars.split(",").filterNot(_.contains(userJar.split("/").last)).mkString(",")
+    jars = Option(jars)
+      .map(SparkSubmit.downloadFileList(_, targetDir, sparkProperties, hadoopConf)).orNull
--- End diff --

I cannot omit the jars here... I skipped them earlier in SparkSubmit, so I have to do this somewhere. I am just moving the place where this action takes place...
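The user-jar filtering step being discussed can be sketched in isolation (the helper name `filterUserJar` is illustrative, not Spark API). One caveat worth noting: the diff's `contains` check on the file name can also drop unrelated jars whose path merely contains that substring, so an exact base-name comparison is a safer variant:

```scala
object JarFilter {
  // Variant of the diff's filtering step: drop the user jar from the
  // comma-separated jars list by comparing base file names exactly,
  // rather than `_.contains(...)`, which can over-match.
  def filterUserJar(jars: String, userJar: String): String = {
    val userJarName = userJar.split("/").last
    jars.split(",").filterNot(_.split("/").last == userJarName).mkString(",")
  }

  def main(args: Array[String]): Unit = {
    println(filterUserJar("/tmp/dep.jar,/tmp/app.jar", "hdfs:///jobs/app.jar"))
    // -> /tmp/dep.jar
  }
}
```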





[GitHub] spark pull request #18630: [SPARK-12559][SPARK SUBMIT] fix --packages for st...

2017-07-21 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/18630#discussion_r128834993
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -66,4 +77,68 @@ object DriverWrapper {
         System.exit(-1)
     }
   }
+
+  // R or Python are not supported in cluster mode so download the jars to the driver side
+  private def setupDependencies(loader: MutableURLClassLoader, userJar: String): Unit = {
+    val packagesExclusions = sys.props.get("spark.jars.excludes").orNull
--- End diff --

It's not exactly the same as the code in SparkSubmit, but I can give it a shot. I thought about that too.
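One way to reduce the duplication being discussed would be to pull the shared pieces into a small helper object. This is only a sketch of the idea; the object `DependencyUtils` and its method names here are hypothetical, not existing Spark API:

```scala
// Hypothetical helper extracting logic duplicated between SparkSubmit and
// DriverWrapper; all names here are illustrative only.
object DependencyUtils {
  // Parse a comma-separated exclusions string, treating None/blank as empty.
  def parseExclusions(raw: Option[String]): Seq[String] =
    raw.map(_.trim).filter(_.nonEmpty).map(_.split(",").toSeq).getOrElse(Nil)

  // Merge optional comma-separated file lists, skipping absent/empty entries.
  def mergeFileLists(lists: Option[String]*): String =
    lists.flatten.filter(_.nonEmpty).mkString(",")

  def main(args: Array[String]): Unit = {
    println(parseExclusions(Some("a:b,c:d")).mkString("|")) // a:b|c:d
    println(mergeFileLists(Some("x.jar"), None, Some("y.jar"))) // x.jar,y.jar
  }
}
```

Both call sites could then delegate to the helper, keeping the null-tolerant behavior of the `sys.props.get(...).orNull` pattern in one place.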





[GitHub] spark pull request #18651: [SPARK-21383][Core] Fix the YarnAllocator allocat...

2017-07-21 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18651#discussion_r128834971
  
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -525,9 +534,11 @@ private[yarn] class YarnAllocator(
           } catch {
             case NonFatal(e) =>
               logError(s"Failed to launch executor $executorId on container $containerId", e)
-              // Assigned container should be released immediately to avoid unnecessary resource
-              // occupation.
+              // Assigned container should be released immediately
+              // to avoid unnecessary resource occupation.
               amClient.releaseAssignedContainer(containerId)
+          } finally {
+            numExecutorsStarting.decrementAndGet()
--- End diff --

Ok, I see. The current code can double-count the same executor as both starting and running, while the previous code could count it as starting even though it failed to start (for a really small window); but that is a self-healing situation, while the other can have some adverse effects.





[GitHub] spark pull request #18630: [SPARK-12559][SPARK SUBMIT] fix --packages for st...

2017-07-21 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/18630#discussion_r128834674
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala ---
@@ -66,4 +77,68 @@ object DriverWrapper {
         System.exit(-1)
     }
   }
+
+  // R or Python are not supported in cluster mode so download the jars to the driver side
--- End diff --

Just saying we don't cover them.





[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18388
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79840/
Test PASSed.





[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18388
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18388
  
**[Test build #79840 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79840/testReport)** for PR 18388 at commit [`98123ee`](https://github.com/apache/spark/commit/98123ee6e4bbe685f75db6cd55a1d7e9c87ee9d2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #18651: [SPARK-21383][Core] Fix the YarnAllocator allocat...

2017-07-21 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/18651#discussion_r128831747
  
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -525,9 +534,11 @@ private[yarn] class YarnAllocator(
           } catch {
             case NonFatal(e) =>
               logError(s"Failed to launch executor $executorId on container $containerId", e)
-              // Assigned container should be released immediately to avoid unnecessary resource
-              // occupation.
+              // Assigned container should be released immediately
+              // to avoid unnecessary resource occupation.
               amClient.releaseAssignedContainer(containerId)
+          } finally {
+            numExecutorsStarting.decrementAndGet()
--- End diff --

Yes, but it's a bug right now, as the numbers can be wrong. Are you looking at the synchronization?

Right now everything is called synchronized up to the point where the launcher pool runs the ExecutorRunnable. At this point running is not incremented, pending is decremented, and we now increment starting. That is fine.

But when the ExecutorRunnable finishes, the only place it's called synchronized is in updateInternalState. This right now increments running but does not decrement starting. If updateResourceRequests gets called (which is synchronized) right after updateInternalState (which leaves the synchronized block) but before the finally block executes and decrements starting, the total number can be more than it really is. That executor is counted as both running and starting.
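The race described above can be modeled in isolation (the counter names `numExecutorsStarting` and `numExecutorsRunning` follow the diff; everything else is illustrative, not YarnAllocator code). The suggestion amounts to decrementing `starting` inside the same synchronized section that increments `running`, so no reader holding the lock can observe the executor counted twice:

```scala
import java.util.concurrent.atomic.AtomicInteger

// Minimal model of the executor accounting discussed above.
class AllocatorModel {
  private val numExecutorsStarting = new AtomicInteger(0)
  private var numExecutorsRunning = 0

  def launchRequested(): Unit = numExecutorsStarting.incrementAndGet()

  // Decrementing `starting` in the same synchronized block that increments
  // `running` keeps the total consistent; deferring the decrement to a later
  // `finally` block opens a window where the executor is counted twice.
  def updateInternalState(): Unit = synchronized {
    numExecutorsRunning += 1
    numExecutorsStarting.decrementAndGet()
  }

  def totalExpected: Int = synchronized {
    numExecutorsRunning + numExecutorsStarting.get()
  }
}

object AllocatorModelDemo {
  def main(args: Array[String]): Unit = {
    val a = new AllocatorModel
    a.launchRequested()      // starting = 1, running = 0
    println(a.totalExpected) // 1
    a.updateInternalState()  // starting = 0, running = 1, under one lock
    println(a.totalExpected) // 1: never observed as 2
  }
}
```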





[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18540
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

2017-07-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18540
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79841/
Test PASSed.





[GitHub] spark issue #18540: [SPARK-19451][SQL] rangeBetween method should accept Lon...

2017-07-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18540
  
**[Test build #79841 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79841/testReport)** for PR 18540 at commit [`43b2399`](https://github.com/apache/spark/commit/43b23993af1b7abf20088ad0b63893afebe9a00d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.




