[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-17 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request:

https://github.com/apache/spark/pull/21700#discussion_r202933053
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala
 ---
@@ -270,11 +273,42 @@ private[state] class HDFSBackedStateStoreProvider 
extends StateStoreProvider wit
 } else Iterator.empty
   }
 
+  /** This method is intended to be only used for unit test(s). DO NOT TOUCH ELEMENTS IN MAP! */
+  private[state] def getClonedLoadedMaps(): util.SortedMap[Long, MapType] = synchronized {
+    // shallow copy as a minimal guard
+    loadedMaps.clone().asInstanceOf[util.SortedMap[Long, MapType]]
+  }
+
+  private def putStateIntoStateCache(newVersion: Long, map: MapType): Unit = synchronized {
+    if (numberOfVersionsToRetainInMemory <= 0) {
+      if (loadedMaps.size() > 0) loadedMaps.clear()
+      return
+    }
+
+    while (loadedMaps.size() > numberOfVersionsToRetainInMemory) {
+      loadedMaps.remove(loadedMaps.lastKey())
+    }
+
+    val size = loadedMaps.size()
+    if (size == numberOfVersionsToRetainInMemory) {
+      val versionIdForLastKey = loadedMaps.lastKey()
+      if (versionIdForLastKey > newVersion) {
+        // this is the only case which put doesn't need
--- End diff --

Will update the comment to clarify this a bit more. We only skip the put when 
the new element would be added at the last position and would have to be 
evicted right away.
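
The eviction rule described above can be sketched in isolation. This is a minimal illustration, not the provider's actual code: `VersionCache`, `maxVersions` and `versions` are hypothetical names, and the sketch assumes the cache is sorted newest-first so that `lastKey` is the oldest retained version (the diff does not show the map's ordering).

```scala
import scala.collection.mutable

object VersionCacheSketch {
  // Hypothetical stand-in for the in-memory state cache.
  final class VersionCache[V](maxVersions: Int) {
    // Assumption: versions are kept newest-first, so lastKey is the oldest one.
    private val loaded = mutable.TreeMap.empty[Long, V](Ordering[Long].reverse)

    def put(version: Long, value: V): Unit = synchronized {
      if (maxVersions <= 0) {
        // Caching is disabled: keep nothing in memory.
        loaded.clear()
      } else {
        // Shrink first in case the cache is over its limit.
        while (loaded.size > maxVersions) loaded.remove(loaded.lastKey)

        val full = loaded.size == maxVersions
        // The new version would land in the last slot and be evicted right away.
        val skip = full && loaded.lastKey > version
        if (!skip) {
          // Make room by dropping the oldest version when a newer one arrives.
          if (full && loaded.lastKey < version) loaded.remove(loaded.lastKey)
          loaded.put(version, value)
        }
      }
    }

    // Retained versions, newest first.
    def versions: List[Long] = synchronized { loaded.keys.toList }
  }
}
```

With `maxVersions = 2`, putting versions 1, 2 and 3 leaves 3 and 2 cached; a later put of version 1 is skipped because it would be the first entry to be evicted.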


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21785
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21785
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1047/
Test PASSed.


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21785
  
**[Test build #93161 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93161/testReport)**
 for PR 21785 at commit 
[`9d87160`](https://github.com/apache/spark/commit/9d87160bc2c01321280d43f655c256c30d9fbc14).


---




[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-07-17 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/21514
  
@tooptoop4 the only thing I can see is that you are using `'` instead of 
`"` for strings. Anyway, you can run the Scala linter on your local machine too 
and check which part of the code you added to the MasterSuite file 
causes the failure.


---




[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2018-07-17 Thread rvesse
Github user rvesse commented on the issue:

https://github.com/apache/spark/pull/13599
  
@holdenk What we're doing in some of our products currently is that we 
require that users create their Python environments up front and that they be 
stored on a file system that is accessible to all physical nodes.  This is 
partly for performance and partly because our compute nodes don't have external 
network connectivity.

Then when we spin up containers we volume mount the appropriate file system 
into our containers and have logic in our entry point scripts that activates 
the relevant environment prior to starting Spark, Dask Distributed or whatever 
Python job we're actually launching.

We're doing this with Spark standalone clusters currently but I expect much 
the same approach would work for Kubernetes and other resource managers.


---




[GitHub] spark pull request #21352: [SPARK-24305][SQL][FOLLOWUP] Avoid serialization ...

2018-07-17 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21352#discussion_r202944800
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -3226,7 +3218,7 @@ case class ArrayDistinct(child: Expression)
 
   override def dataType: DataType = child.dataType
 
-  @transient lazy val elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
+  private def elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
--- End diff --

yes, 
https://github.com/apache/spark/pull/21352/files#diff-9853dcf5ce3d2ac1e94d473197ff5768R183
 for instance


---




[GitHub] spark issue #21788: [SPARK-24609][ML][DOC] PySpark/SparkR doc doesn't explai...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21788
  
**[Test build #93162 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93162/testReport)**
 for PR 21788 at commit 
[`b53d14f`](https://github.com/apache/spark/commit/b53d14f129bdce0f7a4b6495edb86e515c18a162).


---




[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...

2018-07-17 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21442
  
@dbtsai can you open a new PR? thanks!


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1041/
Test PASSed.


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21785
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20838
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93150/
Test FAILed.


---




[GitHub] spark issue #21784: [SPARK-24182][YARN][FOLLOW-UP] Turn off noisy log output

2018-07-17 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/21784
  
cc @vanzin, @jerryshao 


---




[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20838
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21537
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93151/
Test FAILed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21102
  
**[Test build #93155 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93155/testReport)**
 for PR 21102 at commit 
[`28e0c45`](https://github.com/apache/spark/commit/28e0c45441348c89c627770059d03f7228d0f94b).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21720: [SPARK-24163][SPARK-24164][SQL] Support column list as t...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21720
  
**[Test build #93152 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93152/testReport)**
 for PR 21720 at commit 
[`b27245e`](https://github.com/apache/spark/commit/b27245e3e2ca021815e6b353036925e57f665e7a).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20636
  
**[Test build #93153 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93153/testReport)**
 for PR 20636 at commit 
[`a134091`](https://github.com/apache/spark/commit/a134091aad0c3f8e3674f6cd751c2b8d5d83e39e).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93156/
Test FAILed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21785
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21720: [SPARK-24163][SPARK-24164][SQL] Support column list as t...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21720
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93152/
Test FAILed.


---




[GitHub] spark issue #21720: [SPARK-24163][SPARK-24164][SQL] Support column list as t...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21720
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20636
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93153/
Test FAILed.


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20636
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21537
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21102
  
retest this please


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/19449
  
retest this please


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20636
  
retest this please


---




[GitHub] spark pull request #21352: [SPARK-24305][SQL][FOLLOWUP] Avoid serialization ...

2018-07-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21352#discussion_r202902862
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -3226,7 +3218,7 @@ case class ArrayDistinct(child: Expression)
 
   override def dataType: DataType = child.dataType
 
-  @transient lazy val elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
+  private def elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
--- End diff --

+1. If it's used in `eval`, let's use lazy val, as it will be called for 
every input. Are there more places like this?
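
To illustrate the trade-off raised here, the following is a small sketch of how a `def` and a `lazy val` differ when accessed once per input row. `ElementTypeHolder` and its counters are hypothetical names used only for this illustration; the string result stands in for the real `DataType` lookup.

```scala
object LazyValVsDefSketch {
  // Hypothetical holder that counts how often each member body runs.
  final class ElementTypeHolder {
    var defEvaluations = 0
    var lazyEvaluations = 0

    // A def re-runs its body on every access.
    def viaDef: String = { defEvaluations += 1; "IntegerType" }

    // A lazy val runs its body on first access and caches the result.
    lazy val viaLazyVal: String = { lazyEvaluations += 1; "IntegerType" }
  }
}
```

Accessing both members three times runs the `def` body three times and the `lazy val` body once, which is why a member read inside a per-row method such as `eval` benefits from being a `lazy val`.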


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/21785
  
retest this please


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1042/
Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21785
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1040/
Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21102
  
**[Test build #93155 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93155/testReport)**
 for PR 21102 at commit 
[`28e0c45`](https://github.com/apache/spark/commit/28e0c45441348c89c627770059d03f7228d0f94b).


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21785
  
**[Test build #93154 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93154/testReport)**
 for PR 21785 at commit 
[`9d87160`](https://github.com/apache/spark/commit/9d87160bc2c01321280d43f655c256c30d9fbc14).


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19449
  
**[Test build #93156 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93156/testReport)**
 for PR 19449 at commit 
[`8680026`](https://github.com/apache/spark/commit/86800261afa8c451f9d0bf43903026a14ee971ae).


---




[GitHub] spark pull request #21638: [SPARK-22357][CORE] SparkContext.binaryFiles igno...

2018-07-17 Thread bomeng
Github user bomeng commented on a diff in the pull request:

https://github.com/apache/spark/pull/21638#discussion_r202907829
  
--- Diff: 
core/src/main/scala/org/apache/spark/input/PortableDataStream.scala ---
@@ -47,7 +47,7 @@ private[spark] abstract class StreamFileInputFormat[T]
   def setMinPartitions(sc: SparkContext, context: JobContext, minPartitions: Int) {
     val defaultMaxSplitBytes = sc.getConf.get(config.FILES_MAX_PARTITION_BYTES)
     val openCostInBytes = sc.getConf.get(config.FILES_OPEN_COST_IN_BYTES)
-    val defaultParallelism = sc.defaultParallelism
+    val defaultParallelism = Math.max(sc.defaultParallelism, minPartitions)
--- End diff --

You need to pass in minPartitions to use this method. What do you mean by 
minPartitions not being set?
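
For context, here is a hedged sketch of why taking the maximum of the default parallelism and `minPartitions` matters. The diff only shows the changed line, so the split-size formula below is an assumption made for illustration, and `SplitSizeSketch.maxSplitSize` is a hypothetical helper rather than a Spark API.

```scala
object SplitSizeSketch {
  // Assumed formula: a higher effective parallelism lowers bytes per core,
  // which lowers the maximum split size and so yields more partitions.
  def maxSplitSize(
      totalBytes: Long,
      openCostInBytes: Long,
      defaultMaxSplitBytes: Long,
      defaultParallelism: Int,
      minPartitions: Int): Long = {
    // Mirrors the changed line; the extra max(1, ...) only guards the division.
    val parallelism = math.max(1, math.max(defaultParallelism, minPartitions))
    val bytesPerCore = totalBytes / parallelism
    math.min(defaultMaxSplitBytes, math.max(openCostInBytes, bytesPerCore))
  }
}
```

Under this assumption, 1000 MB of input with a default parallelism of 2 is capped at the 128 MB default split size, while requesting 100 minimum partitions brings the split size down to 10 MB.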


---




[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20838
  
**[Test build #93150 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93150/testReport)**
 for PR 20838 at commit 
[`2c4f15c`](https://github.com/apache/spark/commit/2c4f15c13efa8b181c8c53bd9a90f4f578a40169).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21785
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93154/
Test FAILed.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19449
  
**[Test build #93156 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93156/testReport)**
 for PR 19449 at commit 
[`8680026`](https://github.com/apache/spark/commit/86800261afa8c451f9d0bf43903026a14ee971ae).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93155/
Test FAILed.


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21785
  
**[Test build #93154 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93154/testReport)**
 for PR 21785 at commit 
[`9d87160`](https://github.com/apache/spark/commit/9d87160bc2c01321280d43f655c256c30d9fbc14).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21537
  
**[Test build #93151 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93151/testReport)**
 for PR 21537 at commit 
[`807d8d4`](https://github.com/apache/spark/commit/807d8d44f950b8a588065b15bb7fa6a5db753075).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20636
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1045/
Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19449
  
**[Test build #93159 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93159/testReport)**
 for PR 19449 at commit 
[`8680026`](https://github.com/apache/spark/commit/86800261afa8c451f9d0bf43903026a14ee971ae).


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20636
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1044/
Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1043/
Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21102
  
**[Test build #93157 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93157/testReport)**
 for PR 21102 at commit 
[`28e0c45`](https://github.com/apache/spark/commit/28e0c45441348c89c627770059d03f7228d0f94b).


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20636
  
**[Test build #93158 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93158/testReport)**
 for PR 20636 at commit 
[`a134091`](https://github.com/apache/spark/commit/a134091aad0c3f8e3674f6cd751c2b8d5d83e39e).


---




[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21764#discussion_r202903084
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -46,7 +47,23 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
 
   protected def fixedPoint = FixedPoint(SQLConf.get.optimizerMaxIterations)
 
-  def batches: Seq[Batch] = {
+  protected def postAnalysisBatches: Seq[Batch] = {
+    Batch("Eliminate Distinct", Once, EliminateDistinct) ::
+    // Technically some of the rules in Finish Analysis are not optimizer rules and belong more
+    // in the analyzer, because they are needed for correctness (e.g. ComputeCurrentTime).
+    // However, because we also use the analyzer to canonicalized queries (for view definition),
+    // we do not eliminate subqueries or compute current time in the analyzer.
+    Batch("Finish Analysis", Once,
+      EliminateSubqueryAliases,
+      EliminateView,
+      ReplaceExpressions,
+      ComputeCurrentTime,
+      GetCurrentDatabase(sessionCatalog),
+      RewriteDistinctAggregates,
+      ReplaceDeduplicateWithAggregate) :: Nil
+  }
+
+  protected def optimizationBatches: Seq[Batch] = {
--- End diff --

Yes. We need to exclude `Batch("Eliminate Distinct")`, `Batch("Finish Analysis")`, `Batch("Replace Operators")`, `Batch("Pullup Correlated Expressions")`, and `Batch("RewriteSubquery")`.
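
A minimal sketch of the idea, reading the comment as meaning that the listed batches must stay outside the rule-exclusion mechanism. The types and names here (`SimpleBatch`, `nonExcludableBatches`, `applyExclusions`) are hypothetical stand-ins, not the PR's code, and rules are represented only by their names:

```scala
// Hypothetical, simplified stand-in for Catalyst's Batch; rules are just names here.
final case class SimpleBatch(name: String, rules: Seq[String])

object RuleExclusion {
  // Batches that must never lose rules (the list from the review comment).
  val nonExcludableBatches: Set[String] = Set(
    "Eliminate Distinct",
    "Finish Analysis",
    "Replace Operators",
    "Pullup Correlated Expressions",
    "RewriteSubquery")

  // Drop excluded rules from excludable batches; remove batches left empty.
  def applyExclusions(batches: Seq[SimpleBatch], excluded: Set[String]): Seq[SimpleBatch] =
    batches.flatMap { batch =>
      if (nonExcludableBatches.contains(batch.name)) {
        Some(batch)
      } else {
        val kept = batch.rules.filterNot(excluded.contains)
        if (kept.isEmpty) None else Some(batch.copy(rules = kept))
      }
    }
}
```

With this shape, asking to exclude a rule that lives in a protected batch is silently ignored; whether that should instead log a warning is a separate design choice.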


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/19449
  
retest this please


---




[GitHub] spark issue #21766: [SPARK-24803][SQL] add support for numeric

2018-07-17 Thread wangtao605
Github user wangtao605 commented on the issue:

https://github.com/apache/spark/pull/21766
  
@dmateusp Actually, I think "Numeric" has no essential difference from "Decimal". 
It may be better to just have it as an alias; I will add some tests if 
you agree.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21102
  
retest this please


---




[GitHub] spark issue #21766: [SPARK-24803][SQL] add support for numeric

2018-07-17 Thread wangtao605
Github user wangtao605 commented on the issue:

https://github.com/apache/spark/pull/21766
  
@rxin To support SQL syntax better and align with the SQL standard, I 
think it is worth adding a numeric type as an alias of decimal.
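
As a rough illustration of what "alias" means here, the sketch below resolves `numeric` through exactly the same cases as `decimal`. It is a hypothetical miniature resolver, not Spark's parser; the type names and the default precision and scale are assumptions made for the example:

```scala
// Hypothetical miniature of a SQL type-name resolver; not Spark's parser.
sealed trait SqlType
final case class DecimalType(precision: Int, scale: Int) extends SqlType
case object IntegerType extends SqlType

object TypeNames {
  // Assumed defaults for a bare DECIMAL / NUMERIC with no arguments.
  val DefaultPrecision = 10
  val DefaultScale = 0

  def resolve(name: String, args: Seq[Int] = Seq.empty): Option[SqlType] =
    (name.toLowerCase(java.util.Locale.ROOT), args) match {
      case ("int" | "integer", Seq()) => Some(IntegerType)
      // "numeric" shares every case with "decimal", which makes it a pure alias.
      case ("decimal" | "numeric", Seq()) => Some(DecimalType(DefaultPrecision, DefaultScale))
      case ("decimal" | "numeric", Seq(p)) => Some(DecimalType(p, DefaultScale))
      case ("decimal" | "numeric", Seq(p, s)) => Some(DecimalType(p, s))
      case _ => None
    }
}
```

Because both spellings produce the same value, nothing downstream needs to know which one the user wrote.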


---




[GitHub] spark issue #21785: [SPARK-24529][BUILD][test-maven][FOLLOW-UP] Set spotbugs...

2018-07-17 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21785
  
Now I am checking `make-distribution.sh` in my environment. If my memory 
is correct, the error in `spark-sql_2.11` or `spark-catalyst_2.11` was the 
motivation to stop forking spotbugs.


---




[GitHub] spark issue #21447: [SPARK-24339][SQL]Add project for transform/map/reduce s...

2018-07-17 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21447
  
Sorry, the fix does not look good to me. We should let the optimizer add 
the project automatically. 


---




[GitHub] spark issue #21792: [SPARK-23231][ML][DOC] Add doc for string indexer orderi...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21792
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21788: [SPARK-24609][ML][DOC] PySpark/SparkR doc doesn't explai...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21788
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21788: [SPARK-24609][ML][DOC] PySpark/SparkR doc doesn't explai...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21788
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1051/
Test PASSed.


---




[GitHub] spark issue #21792: [SPARK-23231][ML][DOC] Add doc for string indexer orderi...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21792
  
**[Test build #93169 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93169/testReport)**
 for PR 21792 at commit 
[`febd66f`](https://github.com/apache/spark/commit/febd66fb6bf5bff5ce377bd5f2899d10d7da6ccc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21792: [SPARK-23231][ML][DOC] Add doc for string indexer orderi...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21792
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93169/
Test PASSed.


---




[GitHub] spark pull request #21352: [SPARK-24305][SQL][FOLLOWUP] Avoid serialization ...

2018-07-17 Thread mn-mikke
Github user mn-mikke commented on a diff in the pull request:

https://github.com/apache/spark/pull/21352#discussion_r202983734
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -737,21 +733,22 @@ case class MapConcat(children: Seq[Expression]) extends ComplexTypeMergingExpres
   since = "2.4.0")
 case class MapFromEntries(child: Expression) extends UnaryExpression {
 
-  @transient
-  private lazy val dataTypeDetails: Option[(MapType, Boolean, Boolean)] = child.dataType match {
-case ArrayType(
-  StructType(Array(
-StructField(_, keyType, keyNullable, _),
-StructField(_, valueType, valueNullable, _))),
-  containsNull) => Some((MapType(keyType, valueType, valueNullable), keyNullable, containsNull))
-case _ => None
+  @transient private lazy val dataTypeDetails: Option[(MapType, Boolean, Boolean)] = {
--- End diff --

Here I wanted the formatting to be consistent, with `@transient` on the 
same line as `private lazy val dataTypeDetails`. After that change, two 
lines exceeded 100 characters.


---




[GitHub] spark pull request #21352: [SPARK-24305][SQL][FOLLOWUP] Avoid serialization ...

2018-07-17 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21352#discussion_r202984326
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -737,21 +733,22 @@ case class MapConcat(children: Seq[Expression]) extends ComplexTypeMergingExpres
   since = "2.4.0")
 case class MapFromEntries(child: Expression) extends UnaryExpression {
 
-  @transient
-  private lazy val dataTypeDetails: Option[(MapType, Boolean, Boolean)] = child.dataType match {
-case ArrayType(
-  StructType(Array(
-StructField(_, keyType, keyNullable, _),
-StructField(_, valueType, valueNullable, _))),
-  containsNull) => Some((MapType(keyType, valueType, valueNullable), keyNullable, containsNull))
-case _ => None
+  @transient private lazy val dataTypeDetails: Option[(MapType, Boolean, Boolean)] = {
--- End diff --

I see, but this seems an unneeded change to me, and I think there are other 
places where we use this syntax, so I see no reason to change it.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19449
  
**[Test build #93159 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93159/testReport)**
 for PR 19449 at commit 
[`8680026`](https://github.com/apache/spark/commit/86800261afa8c451f9d0bf43903026a14ee971ae).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20864: [SPARK-23745][SQL]Remove the directories of the "...

2018-07-17 Thread zuotingbing
Github user zuotingbing closed the pull request at:

https://github.com/apache/spark/pull/20864


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19449
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93159/
Test PASSed.


---




[GitHub] spark pull request #18812: [SPARK-21606][SQL]HiveThriftServer2 catches OOMs ...

2018-07-17 Thread zuotingbing
Github user zuotingbing closed the pull request at:

https://github.com/apache/spark/pull/18812


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20636
  
**[Test build #93158 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93158/testReport)**
 for PR 20636 at commit 
[`a134091`](https://github.com/apache/spark/commit/a134091aad0c3f8e3674f6cd751c2b8d5d83e39e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21787: [SPARK-24568] Code refactoring for DataType equalsXXX me...

2018-07-17 Thread swapnilushinde
Github user swapnilushinde commented on the issue:

https://github.com/apache/spark/pull/21787
  
@HyukjinKwon I removed the old commented code in the new comment.
The old code repeated the same recursive data type checks again and again in 
every equals* function. This commit abstracts those recursive equality checks 
and makes it easy to add any combination of equals* methods in the future.
With this commit, each equals* function is just one line.

Please let me know if I need to change anything to make the code changes more 
readable.
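
To make the shape of that refactoring concrete, here is a small sketch under stated assumptions: a toy data type hierarchy (`DType`, `ArrayT`, `StructT`, all hypothetical) stands in for Spark's `DataType`, and a single recursive walk is parameterized by flags so that each equals* variant reduces to one line. It illustrates the approach described above and is not the PR's actual code:

```scala
// Toy data-type ADT standing in for Spark's DataType hierarchy (hypothetical).
sealed trait DType
case object IntT extends DType
case object StringT extends DType
final case class ArrayT(element: DType, containsNull: Boolean) extends DType
final case class Field(name: String, dtype: DType, nullable: Boolean)
final case class StructT(fields: Seq[Field]) extends DType

object DTypeEquality {
  // One recursive walk; the two flags select which details are compared.
  def equalsWith(ignoreNullability: Boolean, ignoreCase: Boolean)(a: DType, b: DType): Boolean = {
    def sameNull(x: Boolean, y: Boolean): Boolean = ignoreNullability || x == y
    def sameName(x: String, y: String): Boolean =
      if (ignoreCase) x.equalsIgnoreCase(y) else x == y
    def loop(l: DType, r: DType): Boolean = (l, r) match {
      case (ArrayT(le, ln), ArrayT(re, rn)) => sameNull(ln, rn) && loop(le, re)
      case (StructT(lf), StructT(rf)) =>
        lf.length == rf.length && lf.zip(rf).forall { case (x, y) =>
          sameName(x.name, y.name) && sameNull(x.nullable, y.nullable) && loop(x.dtype, y.dtype)
        }
      case _ => l == r // leaf types, or mismatched shapes
    }
    loop(a, b)
  }

  // Each public variant becomes a one-liner over the shared walk.
  def equalsIgnoreNullability(a: DType, b: DType): Boolean =
    equalsWith(ignoreNullability = true, ignoreCase = false)(a, b)
  def equalsIgnoreCaseAndNullability(a: DType, b: DType): Boolean =
    equalsWith(ignoreNullability = true, ignoreCase = true)(a, b)
}
```

Adding another variant then means adding a flag (or a predicate) and one more one-line method, rather than another copy of the recursion.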


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20636
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93158/
Test PASSed.


---




[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20636
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21792: [SPARK-23231][ML][DOC] Add doc for string indexer orderi...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21792
  
**[Test build #93170 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93170/testReport)**
 for PR 21792 at commit 
[`2003324`](https://github.com/apache/spark/commit/20033249492c0115e7135c0900959b20a9ad2552).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19789: [SPARK-22562][Streaming] CachedKafkaConsumer unsafe evic...

2018-07-17 Thread daroo
Github user daroo commented on the issue:

https://github.com/apache/spark/pull/19789
  
sure :-)


---




[GitHub] spark issue #21792: [SPARK-23231][ML][DOC] Add doc for string indexer orderi...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21792
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93170/
Test PASSed.


---




[GitHub] spark issue #21792: [SPARK-23231][ML][DOC] Add doc for string indexer orderi...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21792
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-07-17 Thread tooptoop4
Github user tooptoop4 commented on the issue:

https://github.com/apache/spark/pull/21514
  
ok to test


---




[GitHub] spark pull request #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide...

2018-07-17 Thread tooptoop4
Github user tooptoop4 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21514#discussion_r202990383
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
 ---
@@ -99,8 +99,10 @@ private[spark] class StandaloneSchedulerBackend(
 // Start executors with a few necessary configs for registering with the scheduler
 val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
 val javaOpts = sparkJavaOpts ++ extraJavaOpts
-val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
-  args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
+val javaOptsFiltered = javaOpts.filterNot { opt =>
+opt.startsWith("-Dspark.ssl.keyStorePassword") || opt.startsWith("-Dspark.ssl.keyPassword")}
--- End diff --

done


---




[GitHub] spark pull request #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide...

2018-07-17 Thread tooptoop4
Github user tooptoop4 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21514#discussion_r202990353
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala 
---
@@ -130,7 +130,9 @@ private[deploy] class Master(
 logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
 webUi = new MasterWebUI(this, webUiPort)
 webUi.bind()
-masterWebUiUrl = "http://" + masterPublicAddress + ":" + webUi.boundPort
+val SSL_ENABLED = conf.getBoolean("spark.ssl.enabled", false)
--- End diff --

done


---




[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21514
  
**[Test build #93172 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93172/testReport)**
 for PR 21514 at commit 
[`d4de123`](https://github.com/apache/spark/commit/d4de123e1faee856ea6047229134c2f23458869b).


---




[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21514
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93172/
Test FAILed.


---




[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21514
  
**[Test build #93172 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93172/testReport)**
 for PR 21514 at commit 
[`d4de123`](https://github.com/apache/spark/commit/d4de123e1faee856ea6047229134c2f23458869b).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21514
  
Merged build finished. Test FAILed.


---




[GitHub] spark pull request #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide...

2018-07-17 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21514#discussion_r202991417
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
 ---
@@ -99,8 +99,11 @@ private[spark] class StandaloneSchedulerBackend(
 // Start executors with a few necessary configs for registering with the scheduler
 val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
 val javaOpts = sparkJavaOpts ++ extraJavaOpts
-val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
-  args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
+val javaOptsFiltered = javaOpts.filterNot { opt =>
+opt.startsWith("-Dspark.ssl.keyStorePassword") || opt.startsWith("-Dspark.ssl.keyPassword")
--- End diff --

wrong indentation: missing 2 spaces.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93157/
Test PASSed.


---




[GitHub] spark pull request #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide...

2018-07-17 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21514#discussion_r202991834
  
--- Diff: 
core/src/test/scala/org/apache/spark/deploy/master/MasterSuite.scala ---
@@ -282,6 +282,40 @@ class MasterSuite extends SparkFunSuite
 }
   }
 
+  test("SPARK-24621: https urls when ssl enabled") {
+implicit val formats = org.json4s.DefaultFormats
+val conf = new SparkConf()
+conf.set("spark.ssl.enabled", "true")
+val localCluster = new LocalSparkCluster(2, 2, 512, conf)
+localCluster.start()
+try {
+  eventually(timeout(5 seconds), interval(100 milliseconds)) {
+val json = Source.fromURL(s"https://localhost:${localCluster.masterWebUIPort}/json")
+  .getLines().mkString("\n")
+assert(json.contains('http://localhost:${localCluster.masterWebUIPort}/json;)
+  .getLines().mkString("\n")
+

[GitHub] spark pull request #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide...

2018-07-17 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21514#discussion_r202991814
  
--- Diff: 
core/src/test/scala/org/apache/spark/deploy/master/MasterSuite.scala ---
@@ -282,6 +282,40 @@ class MasterSuite extends SparkFunSuite
 }
   }
 
+  test("SPARK-24621: https urls when ssl enabled") {
+implicit val formats = org.json4s.DefaultFormats
+val conf = new SparkConf()
+conf.set("spark.ssl.enabled", "true")
+val localCluster = new LocalSparkCluster(2, 2, 512, conf)
+localCluster.start()
+try {
+  eventually(timeout(5 seconds), interval(100 milliseconds)) {
+val json = Source.fromURL(s"https://localhost:${localCluster.masterWebUIPort}/json")
+  .getLines().mkString("\n")
+assert(json.contains('

[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21102
  
**[Test build #93157 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93157/testReport)**
 for PR 21102 at commit 
[`28e0c45`](https://github.com/apache/spark/commit/28e0c45441348c89c627770059d03f7228d0f94b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide...

2018-07-17 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21514#discussion_r202991729
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
 ---
@@ -99,8 +99,11 @@ private[spark] class StandaloneSchedulerBackend(
 // Start executors with a few necessary configs for registering with the scheduler
 val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
 val javaOpts = sparkJavaOpts ++ extraJavaOpts
-val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
-  args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
+val javaOptsFiltered = javaOpts.filterNot { opt =>
+opt.startsWith("-Dspark.ssl.keyStorePassword") || opt.startsWith("-Dspark.ssl.keyPassword")
+}
+val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend", args,
+sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOptsFiltered)
--- End diff --

Wrong indentation here too: missing 2 spaces. Moreover, in such cases we 
usually put one argument per line, so:

```
val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
  args,
  sc.executorEnvs,
  ...)
```


---




[GitHub] spark issue #21782: [SPARK-24816][SQL] SQL interface support repartitionByRa...

2018-07-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21782
  
**[Test build #93160 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93160/testReport)**
 for PR 21782 at commit 
[`5f2951d`](https://github.com/apache/spark/commit/5f2951db5a5b7a5030d9ac373a234ed1a007cce7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21791: [SPARK-24832][SQL] Improve inputMetrics's bytesRe...

2018-07-17 Thread yucai
GitHub user yucai opened a pull request:

https://github.com/apache/spark/pull/21791

[SPARK-24832][SQL] Improve inputMetrics's bytesRead update for ColumnarBatch

## What changes were proposed in this pull request?

Currently, ColumnarBatch's bytesRead is updated only every 4096 * 1000 
rows, which leaves the metric out of date. This PR updates it for each 
batch.
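
The difference between the two update policies can be sketched as follows. This is an illustrative model only: `RowBatch`, `BytesReadTracker`, and the `currentBytes` callback are hypothetical names, and the 1000-row interval in the usage is an assumption chosen for the example rather than a value taken from the patch:

```scala
// Hypothetical stand-in for a columnar batch: only the row count matters here.
final case class RowBatch(numRows: Int)

// `currentBytes` plays the role of a callback that reports total bytes read so far.
final class BytesReadTracker(currentBytes: () => Long, rowInterval: Long) {
  var recordsRead: Long = 0L // rows seen so far
  var bytesRead: Long = 0L   // the published metric

  private def refresh(): Unit = {
    bytesRead = currentBytes()
  }

  // Row path: refresh only when the running row count hits a multiple of the interval.
  def onRow(): Unit = {
    recordsRead += 1
    if (recordsRead % rowInterval == 0) refresh()
  }

  // Batch path: one refresh per batch, regardless of how many rows it holds.
  def onBatch(batch: RowBatch): Unit = {
    recordsRead += batch.numRows
    refresh()
  }
}
```

If batches were pushed through the modulo check instead, a 4096-row batch would rarely land on a multiple of the interval, which is the staleness the description refers to.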

## How was this patch tested?

Existing UTs.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yucai/spark SPARK-24832

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21791.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21791


commit b9320c8d8a3735d4569709b51a0d66a7121e23cb
Author: yucai 
Date:   2018-07-17T10:20:18Z

[SPARK-24832][SQL] Improve inputMetrics's bytesRead update for ColumnarBatch




---




[GitHub] spark issue #21791: [SPARK-24832][SQL] Improve inputMetrics's bytesRead upda...

2018-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21791
  
Can one of the admins verify this patch?


---





