[GitHub] spark pull request #20944: [SPARK-23831][SQL] Add org.apache.derby to Isolat...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20944#discussion_r179935823
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala ---
@@ -188,6 +188,9 @@ private[hive] class IsolatedClientLoader(
 (name.startsWith("com.google") && !name.startsWith("com.google.cloud")) ||
 name.startsWith("java.lang.") ||
 name.startsWith("java.net") ||
+name.startsWith("com.sun.") ||
+name.startsWith("sun.reflect.") ||
--- End diff --

Do not add these unless we have to.
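
For context, the predicate touched by this diff looks like a prefix check on class names, so each new prefix widens the set of classes that match. A minimal standalone sketch of that shape, using only the prefixes visible in the diff context above and leaving out the two `+` lines under review (the object and method names are hypothetical, not taken from `IsolatedClientLoader`):

```scala
object SharedPrefixSketch {
  // Prefixes copied from the unchanged context lines of the diff.
  private val prefixes: Seq[String] = Seq("java.lang.", "java.net")

  /** True if the class name matches one of the prefix rules. */
  def matches(name: String): Boolean =
    (name.startsWith("com.google") && !name.startsWith("com.google.cloud")) ||
      prefixes.exists(p => name.startsWith(p))
}
```

In a check like this, adding `com.sun.` would match every class under that package, which is presumably why the reviewer asks for the narrowest change that fixes the problem.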


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20944
  
@wangyum What is the root cause?


---




[GitHub] spark issue #20260: [SPARK-23039][SQL] Finish TODO work in alter table set l...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20260
  
cc @gengliangwang 


---




[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20987
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89019/
Test PASSed.


---




[GitHub] spark pull request #20963: [SPARK-23849][SQL] Tests for the samplingRatio op...

2018-04-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20963


---




[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20987
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20987
  
**[Test build #89019 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89019/testReport)**
 for PR 20987 at commit 
[`b1997a7`](https://github.com/apache/spark/commit/b1997a7e9df56d48c28f825cfb30c02fe61de21d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20963: [SPARK-23849][SQL] Tests for the samplingRatio option of...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20963
  
LGTM. Thanks! Merged to master.


---




[GitHub] spark pull request #20963: [SPARK-23849][SQL] Tests for the samplingRatio op...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20963#discussion_r179934861
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala ---
@@ -2127,4 +2127,39 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData {
   assert(df.schema === expectedSchema)
 }
   }
+
+  test("SPARK-23849: schema inferring touches less data if samplingRation < 1.0") {
+val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
+  57, 62, 68, 72)
--- End diff --

No need to have so many elements in this set. Please combine the tests in 
your CSV PR.

Instead of calling `json()`, we can use `format("json")`. Then you can 
combine the test cases for both CSV and JSON.
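
One way to read this suggestion: if the input rows are generated per format, a single test body can cover both sources, with the read going through `format(...)` instead of `json()` or `csv()`. Below is a minimal sketch of the fixture half only, in plain Scala with no Spark dependency. `SamplingRatioFixture`, `line`, `lines`, and the `intRows` parameter are hypothetical names introduced for illustration, and the JSON field name `f1` is taken from the JsonSuite diff in this thread.

```scala
object SamplingRatioFixture {
  /** Renders row `i` for the given format; rows in `intRows` are integers, all others doubles. */
  def line(format: String, i: Int, intRows: Set[Int]): String = {
    val value = if (intRows.contains(i)) i.toString else (i.toDouble + 0.1).toString
    format match {
      case "json" => s"""{"f1":$value}"""
      case "csv"  => value
      case other  => throw new IllegalArgumentException(s"unsupported format: $other")
    }
  }

  /** All 100 rows of the fixture file, in order. */
  def lines(format: String, intRows: Set[Int]): Seq[String] =
    (0 until 100).map(i => line(format, i, intRows))
}
```

A shared test could then loop over `Seq("csv", "json")`, write `lines(format, sample)` to a temporary file, and read it back with the format name passed to the reader.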


---




[GitHub] spark pull request #20963: [SPARK-23849][SQL] Tests for the samplingRatio op...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20963#discussion_r179934815
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala ---
@@ -2127,4 +2127,39 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData {
   assert(df.schema === expectedSchema)
 }
   }
+
+  test("SPARK-23849: schema inferring touches less data if samplingRation < 1.0") {
+val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
+  57, 62, 68, 72)
+withTempPath { path =>
+  val writer = Files.newBufferedWriter(Paths.get(path.getAbsolutePath),
+StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW)
+  for (i <- 0 until 100) {
+if (predefinedSample.contains(i)) {
+  writer.write(s"""{"f1":${i.toString}}""" + "\n")
+} else {
+  writer.write(s"""{"f1":${(i.toDouble + 0.1).toString}}""" + "\n")
+}
+  }
+  writer.close()
+
+  val ds = spark.read.option("samplingRatio", 0.1).json(path.getCanonicalPath)
--- End diff --

Yes. The seed is also given. 
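
The point about the seed is that sampling with a fixed seed is reproducible, so a test can hard-code which rows land in the sample. A minimal sketch of that idea follows, using `scala.util.Random` as a stand-in. Spark's own sampler and seed value are not shown in this thread, so the indices produced here will not match `predefinedSample`; the object and method names are hypothetical.

```scala
import scala.util.Random

object SeededSampleSketch {
  /** Keeps each index in [0, n) with probability `ratio`, driven by a seeded generator. */
  def sampleIndices(n: Int, ratio: Double, seed: Long): Set[Int] = {
    val rng = new Random(seed)
    // One draw per index, in order, so the same seed always selects the same indices.
    (0 until n).filter(_ => rng.nextDouble() < ratio).toSet
  }
}
```

Calling `sampleIndices(100, 0.1, seed)` twice with the same seed returns the same set, which is the property a hard-coded sample relies on.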


---




[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20959
  
@MaxGekk Thanks for working on this!


---




[GitHub] spark pull request #20959: [SPARK-23846][SQL] The samplingRatio option for C...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20959#discussion_r179934772
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -528,6 +529,7 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
* `header` (default `false`): uses the first line as names of columns.
* `inferSchema` (default `false`): infers the input schema automatically from data. It
* requires one extra pass over the data.
+   * `samplingRatio` (default 1.0): the sample ratio of rows used for schema inferring.
--- End diff --

We also need to update the PySpark API.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #20959: [SPARK-23846][SQL] The samplingRatio option for C...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20959#discussion_r179934767
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -1279,4 +1279,45 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
   Row("0,2013-111-11 12:13:14") :: Row(null) :: Nil
 )
   }
+
+  test("SPARK-23846: schema inferring touches less data if samplingRation < 1.0") {
+val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
+  57, 62, 68, 72)
+withTempPath { path =>
+  val writer = Files.newBufferedWriter(Paths.get(path.getAbsolutePath),
+StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW)
+  for (i <- 0 until 100) {
+if (predefinedSample.contains(i)) {
+  writer.write(i.toString + "\n")
+} else {
+  writer.write((i.toDouble + 0.1).toString + "\n")
+}
+  }
+  writer.close()
+
+  val ds = spark.read
+.option("inferSchema", true)
+.option("samplingRatio", 0.1)
+.csv(path.getCanonicalPath)
+  assert(ds.schema == new StructType().add("_c0", IntegerType))
+}
+  }
+
+  test("SPARK-23846: usage of samplingRation while parsing of dataset of strings") {
+val dstr = spark.sparkContext.parallelize(0 until 100, 1).map { i =>
+  val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
+57, 62, 68, 72)
+  if (predefinedSample.contains(i)) {
+i.toString + "\n"
+  } else {
+(i.toDouble + 0.1) + "\n"
+  }
+}.toDS()
+val ds = spark.read
+  .option("inferSchema", true)
+  .option("samplingRatio", 0.1)
--- End diff --

Add some negative cases.
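
The thread does not say which negative cases are meant. One plausible candidate is an invalid ratio, such as zero or a negative number. The sketch below only illustrates what such a check and its test could look like, under the assumption that non-positive ratios should be rejected; `SamplingRatioCheck.validated` is a hypothetical helper, not Spark's actual validation.

```scala
object SamplingRatioCheck {
  /** Returns the ratio unchanged, or throws IllegalArgumentException if it is not positive. */
  def validated(ratio: Double): Double = {
    // Assumption for illustration: only ratios greater than zero are meaningful.
    require(ratio > 0.0, s"samplingRatio ($ratio) should be greater than 0")
    ratio
  }
}
```

A negative test would then assert that `validated(0.0)` and `validated(-1.0)` throw, alongside the positive case already covered in the diff.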


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2071/
Test PASSed.


---




[GitHub] spark pull request #20959: [SPARK-23846][SQL] The samplingRatio option for C...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20959#discussion_r179934764
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -1279,4 +1279,45 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
   Row("0,2013-111-11 12:13:14") :: Row(null) :: Nil
 )
   }
+
+  test("SPARK-23846: schema inferring touches less data if samplingRation < 1.0") {
+val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
+  57, 62, 68, 72)
--- End diff --

`val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46)` is 
enough. 


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21002
  
**[Test build #89024 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89024/testReport)**
 for PR 21002 at commit 
[`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21002
  
retest this please


---




[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20971
  
Thanks! Merged to 2.3


---




[GitHub] spark pull request #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, as...

2018-04-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20962


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89020/
Test FAILed.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21002
  
**[Test build #89020 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89020/testReport)**
 for PR 21002 at commit 
[`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20962
  
Merged to master.


---




[GitHub] spark issue #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP PARTITIO...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19691
  
@dongjoon-hyun @maropu @mgaido91 Could you review this PR? I think this 
command is pretty useful to end users.


---




[GitHub] spark issue #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP PARTITIO...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19691
  
**[Test build #89023 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89023/testReport)**
 for PR 19691 at commit 
[`9832ec5`](https://github.com/apache/spark/commit/9832ec55191deb995fe975d01d7899cb049207e5).


---




[GitHub] spark pull request #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP P...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19691#discussion_r179934313
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
@@ -282,6 +282,27 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
 parts.toMap
   }
 
+  /**
+   * Create a partition filter specification.
+   */
+  def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) {
+val parts = ctx.expression.asScala.map { pVal =>
+  expression(pVal) match {
+case EqualNullSafe(_, _) =>
+  throw new ParseException("'<=>' operator is not allowed in partition specification.", ctx)
+case cmp @ BinaryComparison(UnresolvedAttribute(name :: Nil), constant: Literal) =>
--- End diff --

Still the same question here. Does the constant have to be on the right side?
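
If the constant were allowed on either side, one common approach is to normalize the comparison so the column always ends up on the left, flipping the operator when the operands are swapped. The sketch below shows that idea with a tiny standalone model. The types `Column`, `Constant`, and `Comparison` are hypothetical and stand in for Catalyst's expression classes, which are not used here.

```scala
object PartitionFilterSketch {
  sealed trait Operand
  final case class Column(name: String) extends Operand
  final case class Constant(value: Int) extends Operand

  final case class Comparison(left: Operand, op: String, right: Operand)

  // Mirror an operator so that `a op b` is equivalent to `b flip(op) a`.
  private def flip(op: String): String = op match {
    case "<"        => ">"
    case "<="       => ">="
    case ">"        => "<"
    case ">="       => "<="
    case "=" | "!=" => op
    case other      => throw new IllegalArgumentException(s"unknown operator: $other")
  }

  /** Rewrites `constant op column` as `column flip(op) constant`; other shapes pass through. */
  def normalize(c: Comparison): Comparison = c match {
    case Comparison(k: Constant, op, col: Column) => Comparison(col, flip(op), k)
    case other                                    => other
  }
}
```

With this, `2 > partCol1` normalizes to `partCol1 < 2`, so later code only has to handle the column-on-the-left shape that the pattern in the diff matches.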


---




[GitHub] spark issue #20999: [WIP][SPARK-23866][SQL] Support partition filters in ALT...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20999
  
Let us start reviewing that PR. 


---




[GitHub] spark issue #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP PARTITIO...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19691
  
ok to test


---




[GitHub] spark pull request #20999: [WIP][SPARK-23866][SQL] Support partition filters...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20999#discussion_r179934268
  
--- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -261,6 +261,14 @@ partitionVal
 : identifier (EQ constant)?
 ;
 
+dropPartitionSpec
+: PARTITION '(' dropPartitionVal (',' dropPartitionVal)* ')'
+;
+
+dropPartitionVal
+: identifier (comparisonOperator constant)?
--- End diff --

Does it have to be in this format, `partCol1 > 2`? How about `2 > partCol1`?


---




[GitHub] spark issue #20816: [SPARK-21479][SQL] Outer join filter pushdown in null su...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20816
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20816: [SPARK-21479][SQL] Outer join filter pushdown in null su...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20816
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2070/
Test PASSed.


---




[GitHub] spark issue #20816: [SPARK-21479][SQL] Outer join filter pushdown in null su...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20816
  
**[Test build #89022 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89022/testReport)**
 for PR 20816 at commit 
[`7fe9329`](https://github.com/apache/spark/commit/7fe93295df5627f2fc4e712b71aa9ce75383d410).


---




[GitHub] spark issue #20816: [SPARK-21479][SQL] Outer join filter pushdown in null su...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20816
  
retest this please


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20992
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2069/
Test PASSed.


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20992
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20945: [SPARK-23790][Mesos] fix metastore connection issue

2018-04-07 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/20945
  
@susanxhuynh Unfortunately I cannot unify the APIs even for DC/OS: 1.10.x 
is different from 1.11.x 
(https://docs.mesosphere.com/services/spark/2.3.0-2.2.1-2/security/), and the 
code depends on this (I played a bit with the DC/OS secret store API), not to 
mention the other APIs out there. This would require a generic secrets API at 
the pure Mesos level (like in k8s), so I don't see a viable solution for now, 
unless I manage to restrict access to the TGT in client mode and essentially 
make it safe.


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20992
  
**[Test build #89021 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89021/testReport)**
 for PR 20992 at commit 
[`0e1e0a0`](https://github.com/apache/spark/commit/0e1e0a0234d07ae9b0af2da31c58f5367911e54c).


---




[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20858
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20858
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89018/
Test PASSed.


---




[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20858
  
**[Test build #89018 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89018/testReport)**
 for PR 20858 at commit 
[`367ee22`](https://github.com/apache/spark/commit/367ee2241901225e7451d7280611cecf23be82f1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...

2018-04-07 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/20944
  
cc @jerryshao 


---




[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...

2018-04-07 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/20987
  
I filed https://issues.apache.org/jira/browse/SPARK-23894 for the test 
failure -- appears to be a flaky test


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2068/
Test PASSed.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21002
  
**[Test build #89020 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89020/testReport)**
 for PR 21002 at commit 
[`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21002
  
retest this please


---




[GitHub] spark pull request #20987: [SPARK-23816][CORE] Killed tasks should ignore Fe...

2018-04-07 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20987#discussion_r179931481
  
--- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala ---
@@ -257,19 +281,32 @@ class ExecutorSuite extends SparkFunSuite with LocalSparkContext with MockitoSug
   }
 
   private def runTaskAndGetFailReason(taskDescription: TaskDescription): TaskFailedReason = {
-runTaskGetFailReasonAndExceptionHandler(taskDescription)._1
+runTaskGetFailReasonAndExceptionHandler(taskDescription, false)._1
   }
 
   private def runTaskGetFailReasonAndExceptionHandler(
-  taskDescription: TaskDescription): (TaskFailedReason, UncaughtExceptionHandler) = {
+  taskDescription: TaskDescription,
+  killTask: Boolean): (TaskFailedReason, UncaughtExceptionHandler) = {
 val mockBackend = mock[ExecutorBackend]
 val mockUncaughtExceptionHandler = mock[UncaughtExceptionHandler]
 var executor: Executor = null
+var killingThread: Thread = null
--- End diff --

Yeah, good point. I was originally thinking of that, but I don't think it 
is needed. However, I did get rid of the indefinite awaits.
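
Replacing an indefinite await with a bounded one usually means passing a timeout and failing loudly when it expires, so a stuck test fails instead of hanging the build. A minimal sketch of that pattern follows, using `java.util.concurrent.CountDownLatch`. The helper name `awaitOrFail` and the choice of a latch are assumptions for illustration; the thread does not show which synchronization primitive the test actually uses.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit, TimeoutException}

object BoundedAwaitSketch {
  /** Waits for the latch for at most `timeoutMs` milliseconds, then throws instead of blocking forever. */
  def awaitOrFail(latch: CountDownLatch, timeoutMs: Long, what: String): Unit = {
    // `await(timeout, unit)` returns false if the count did not reach zero in time.
    if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
      throw new TimeoutException(s"Timed out after $timeoutMs ms waiting for $what")
    }
  }
}
```

A test would call something like `awaitOrFail(taskStarted, 10000, "task start")` where it previously called `await()` with no arguments.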


---




[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20987
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2067/
Test PASSed.


---




[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20987
  
**[Test build #89019 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89019/testReport)**
 for PR 20987 at commit 
[`b1997a7`](https://github.com/apache/spark/commit/b1997a7e9df56d48c28f825cfb30c02fe61de21d).


---




[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20987
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20858
  
**[Test build #89018 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89018/testReport)**
 for PR 20858 at commit 
[`367ee22`](https://github.com/apache/spark/commit/367ee2241901225e7451d7280611cecf23be82f1).


---




[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20874
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89014/
Test PASSed.


---




[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20874
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20992
  
**[Test build #89016 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89016/testReport)**
 for PR 20992 at commit 
[`3d25617`](https://github.com/apache/spark/commit/3d256179fbb833f2b49f3b8578d9de68e66429f0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20992
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89016/
Test PASSed.


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20992
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20874
  
**[Test build #89014 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89014/testReport)** for PR 20874 at commit [`9ef19df`](https://github.com/apache/spark/commit/9ef19dfcde9dc84f494bff5f03a56db840741496).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-07 Thread mn-mikke
Github user mn-mikke commented on the issue:

https://github.com/apache/spark/pull/20858
  
@maropu I've modified the solution according to your comments:
- Removed UnresolvedConcat and merged string and array concatenation into one expression class.
- Implemented type coercion for concatenation of arrays and added tests for it.
- Added codegen examples to the description.

Please take a look.


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21001
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89017/
Test FAILed.


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21001
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21001
  
**[Test build #89017 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89017/testReport)** for PR 21001 at commit [`3d1c909`](https://github.com/apache/spark/commit/3d1c90960f88042c51012f1b4df8eaffb73994c8).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89012/
Test FAILed.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21002
  
**[Test build #89012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89012/testReport)** for PR 21002 at commit [`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21001
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21001
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2066/
Test PASSed.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21003
  
**[Test build #89015 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89015/testReport)** for PR 21003 at commit [`63959c9`](https://github.com/apache/spark/commit/63959c90e712f4d8ff8ae660b22cf61dc91e3874).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21003
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89015/
Test PASSed.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21003
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21001
  
**[Test build #89017 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89017/testReport)** for PR 21001 at commit [`3d1c909`](https://github.com/apache/spark/commit/3d1c90960f88042c51012f1b4df8eaffb73994c8).


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/21001
  
retest this please.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21003
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21003
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2065/
Test PASSed.


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20992
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2064/
Test PASSed.


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20992
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21003
  
**[Test build #89015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89015/testReport)** for PR 21003 at commit [`63959c9`](https://github.com/apache/spark/commit/63959c90e712f4d8ff8ae660b22cf61dc91e3874).


---




[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20992
  
**[Test build #89016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89016/testReport)** for PR 20992 at commit [`3d25617`](https://github.com/apache/spark/commit/3d256179fbb833f2b49f3b8578d9de68e66429f0).


---




[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20874
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2063/
Test PASSed.


---




[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20874
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20858
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89010/
Test PASSed.


---




[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20858
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20858
  
**[Test build #89010 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89010/testReport)** for PR 20858 at commit [`8abd1a8`](https://github.com/apache/spark/commit/8abd1a8b92eee5b83c13a1969dcbfca7e6cb6a06).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20874
  
**[Test build #89014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89014/testReport)** for PR 20874 at commit [`9ef19df`](https://github.com/apache/spark/commit/9ef19dfcde9dc84f494bff5f03a56db840741496).


---




[GitHub] spark pull request #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for Arr...

2018-04-07 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20984#discussion_r179920210
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayData.scala ---
@@ -164,3 +167,32 @@ abstract class ArrayData extends SpecializedGetters with Serializable {
 }
   }
 }
+
+class ArrayDataIndexedSeq[T](arrayData: ArrayData, dataType: DataType) extends IndexedSeq[T] {
+
+  private lazy val accessor: (Int) => Any = dataType match {
+    case BooleanType => (idx: Int) => arrayData.getBoolean(idx)
+    case ByteType => (idx: Int) => arrayData.getByte(idx)
+    case ShortType => (idx: Int) => arrayData.getShort(idx)
+    case IntegerType => (idx: Int) => arrayData.getInt(idx)
+    case LongType => (idx: Int) => arrayData.getLong(idx)
+    case FloatType => (idx: Int) => arrayData.getFloat(idx)
+    case DoubleType => (idx: Int) => arrayData.getDouble(idx)
+    case d: DecimalType => (idx: Int) => arrayData.getDecimal(idx, d.precision, d.scale)
+    case CalendarIntervalType => (idx: Int) => arrayData.getInterval(idx)
+    case StringType => (idx: Int) => arrayData.getUTF8String(idx)
+    case BinaryType => (idx: Int) => arrayData.getBinary(idx)
+    case s: StructType => (idx: Int) => arrayData.getStruct(idx, s.length)
+    case _: ArrayType => (idx: Int) => arrayData.getArray(idx)
+    case _: MapType => (idx: Int) => arrayData.getMap(idx)
+    case _ => (idx: Int) => arrayData.get(idx, dataType)
+  }
+
+  override def apply(idx: Int): T = if (idx < arrayData.numElements()) {
--- End diff --

Do we also need a check for `0 <= idx`? If so, it would be good to update the
message in the exception accordingly.
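
To make the question concrete, here is a minimal sketch of a two-sided bounds check. It is written in Java rather than the PR's Scala, and `BoundsCheckSketch`, `checkedGet`, and the exception message are hypothetical names chosen for illustration; a plain `int[]` stands in for the backing `ArrayData`. It is not the code in this PR.

```java
final class BoundsCheckSketch {
    // Hypothetical helper: rejects negative indices as well as indices past the end,
    // and reports both the offending index and the valid range in the message.
    static int checkedGet(int[] data, int idx) {
        if (idx < 0 || idx >= data.length) {
            throw new IndexOutOfBoundsException(
                "Index " + idx + " must be between 0 (inclusive) and "
                    + data.length + " (exclusive)");
        }
        return data[idx];
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30};
        System.out.println(checkedGet(values, 1));
        try {
            checkedGet(values, -1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With only the upper-bound test, a negative index would reach the accessor unchecked, so the lower bound and the message wording would change together.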


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21003
  
**[Test build #89013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89013/testReport)** for PR 21003 at commit [`2b588ef`](https://github.com/apache/spark/commit/2b588ef02131521653dd48433d2d7296eacaf30d).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class VectorAssembler(JavaTransformer, HasInputCols, HasOutputCol, HasHandleInvalid, JavaMLReadable,`


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21003
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89013/
Test FAILed.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21003
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21003
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2062/
Test PASSed.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21003
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21003
  
**[Test build #89013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89013/testReport)** for PR 21003 at commit [`2b588ef`](https://github.com/apache/spark/commit/2b588ef02131521653dd48433d2d7296eacaf30d).


---




[GitHub] spark pull request #21003: [SPARK-23871][ML][PYTHON]add python api for Vecto...

2018-04-07 Thread huaxingao
GitHub user huaxingao opened a pull request:

https://github.com/apache/spark/pull/21003

[SPARK-23871][ML][PYTHON]add python api for VectorAssembler handleInvalid

## What changes were proposed in this pull request?

Add a Python API for the VectorAssembler `handleInvalid` parameter.

## How was this patch tested?

Add doctest


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/huaxingao/spark spark-23871

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21003.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21003


commit 2b588ef02131521653dd48433d2d7296eacaf30d
Author: Huaxin Gao 
Date:   2018-04-07T15:24:16Z

[SPARK-23871][ML][PYTHON]add python api for VectorAssembler handleInvalid




---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21001
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21001
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89011/
Test FAILed.


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21001
  
**[Test build #89011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89011/testReport)** for PR 21001 at commit [`3d1c909`](https://github.com/apache/spark/commit/3d1c90960f88042c51012f1b4df8eaffb73994c8).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2061/
Test PASSed.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21002
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21002
  
**[Test build #89012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89012/testReport)** for PR 21002 at commit [`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).


---




[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...

2018-04-07 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21002
  
cc @gatorsmile @hvanhovell


---




[GitHub] spark pull request #21002: initial commit

2018-04-07 Thread kiszk
GitHub user kiszk opened a pull request:

https://github.com/apache/spark/pull/21002

initial commit

## What changes were proposed in this pull request?

This PR avoids a possible overflow in operations of the form `long = (long)(int * int)`.
Multiplying two large positive `int` values may set the most significant bit (MSB) to one,
because the product is computed in 32 bits before it is widened. The result is a negative
`long` where a positive value was expected (e.g. `0111___ * ___0010`).

This PR casts an operand to `long` before the multiplication to avoid this
situation.
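
A minimal sketch of the difference, assuming nothing about the actual call sites touched by this PR: the class and method names (`OverflowSketch`, `multiplyThenWiden`, `widenThenMultiply`) and the sample operands are illustrative only.

```java
final class OverflowSketch {
    // The product is computed in 32-bit int arithmetic and may wrap around;
    // the cast to long only widens the already-wrapped result.
    static long multiplyThenWiden(int a, int b) {
        return (long) (a * b);
    }

    // Casting one operand first promotes the other as well,
    // so the multiplication itself is carried out in 64 bits.
    static long widenThenMultiply(int a, int b) {
        return (long) a * b;
    }

    public static void main(String[] args) {
        int count = 1 << 30; // 1073741824
        int width = 2;
        System.out.println(multiplyThenWiden(count, width)); // -2147483648
        System.out.println(widenThenMultiply(count, width)); // 2147483648
    }
}
```

The only change between the two methods is where the cast sits relative to the parentheses, which is why this kind of bug is easy to miss in review.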

## How was this patch tested?

Existing UTs

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kiszk/spark SPARK-23893

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21002.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21002


commit ac24549d190a7c203d0a5a2e8f589b0ba797b0ba
Author: Kazuaki Ishizaki 
Date:   2018-04-07T14:16:18Z

initial commit




---



