[GitHub] spark issue #20625: [SPARK-23446][PYTHON] Explicitly check supported types i...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20625
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20626: [SPARK-23447][SQL] Cleanup codegen template for Literal

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20626
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20511#discussion_r168691737
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala ---
@@ -160,6 +160,15 @@ abstract class OrcSuite extends OrcTest with BeforeAndAfterAll {
   }
 }
   }
+
+  test("SPARK-23340 Empty float/double array columns raise EOFException") {
+Seq(Seq(Array.empty[Float]).toDF(), Seq(Array.empty[Double]).toDF()).foreach { df =>
+  withTempPath { path =>
--- End diff --

? I already added the test case at [HiveOrcQuerySuite.scala](https://github.com/apache/spark/pull/20511/files#diff-1569b2874975978ed62a01aab108d093R212), too.


---




[GitHub] spark pull request #20626: [SPARK-23447][SQL] Cleanup codegen template for L...

2018-02-15 Thread rednaxelafx
GitHub user rednaxelafx opened a pull request:

https://github.com/apache/spark/pull/20626

[SPARK-23447][SQL] Cleanup codegen template for Literal

## What changes were proposed in this pull request?

Cleaned up the codegen templates for `Literal`s, to make sure that the 
`ExprCode` returned from `Literal.doGenCode()` has:
1. an empty `code` field;
2. an `isNull` field of either literal `true` or `false`;
3. a `value` field that is just a simple literal/constant.

Before this PR, a couple of paths returned a non-trivial `code`, and all of 
them were unnecessary. The `NaN` and `Infinity` constants for `double` and 
`float` are directly available as constants, so there is no need to add a 
reference for them.
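
As a rough illustration of the idea, written in Python rather than Spark's Scala: special floating-point values can be rendered as Java constant expressions, so the three fields listed above can be filled in without any helper code. `java_double_literal` and `inline_non_null` are hypothetical names chosen for this sketch, not Spark APIs, and the exact constants Spark emits are an assumption here.

```python
import math


def java_double_literal(value: float) -> str:
    """Return Java source text for `value` that needs no setup code or object reference."""
    if math.isnan(value):
        return "Double.NaN"
    if math.isinf(value):
        return "Double.POSITIVE_INFINITY" if value > 0 else "Double.NEGATIVE_INFINITY"
    # Ordinary values become a plain literal with Java's `D` suffix.
    return f"{value!r}D"


def inline_non_null(java_value: str) -> dict:
    """Hypothetical stand-in for the three fields: empty code, literal isNull, literal value."""
    return {"code": "", "isNull": "false", "value": java_value}


print(inline_non_null(java_double_literal(float("nan"))))
```

The point of the sketch is only that the `value` field can always be a self-contained expression, which is what lets `code` stay empty.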

Also took the opportunity to add a new util method for ease of creating 
`ExprCode` for inline-able non-null values.

## How was this patch tested?

Existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rednaxelafx/apache-spark codegen-literal

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20626.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20626


commit 68edf0f3463daed3bb7042becb333788b22b23b0
Author: Kris Mok 
Date:   2018-02-16T07:44:43Z

Cleanup codegen templates for Literals: make sure the `code` field is empty 
and the `value` field is a simple literal.




---




[GitHub] spark issue #20625: [SPARK-23446][PYTHON] Explicitly check supported types i...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20625
  
**[Test build #87502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87502/testReport)** for PR 20625 at commit [`c79c6df`](https://github.com/apache/spark/commit/c79c6df7284b9717fe4e4c26090dcb51bf7712da).


---




[GitHub] spark issue #20567: [SPARK-23380][PYTHON] Make toPandas fallback to non-Arro...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20567
  
I just opened https://github.com/apache/spark/pull/20625. I believe this is 
the smallest and simplest change.

I will repurpose this PR to add a configuration later.


---




[GitHub] spark pull request #20625: [SPARK-23446][PYTHON] Explicitly check supported ...

2018-02-15 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/20625

[SPARK-23446][PYTHON] Explicitly check supported types in toPandas

## What changes were proposed in this pull request?

This PR explicitly specifies the types we support in `toPandas`. This was 
a hole. For example, we haven't finished binary type support on the Python side 
yet, but it is currently allowed, as shown below:

```python
spark.conf.set("spark.sql.execution.arrow.enabled", "false")
df = spark.createDataFrame([[bytearray(b"a")]])
df.toPandas()
spark.conf.set("spark.sql.execution.arrow.enabled", "true")
df.toPandas()
```

```
 _1
0  [97]
  _1
0  a
```

This should be disallowed. I think the same applies to nested timestamps.

I also added a nicer message about `spark.sql.execution.arrow.enabled` 
to the error message.
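
A minimal sketch of what an explicit check could look like, written without PySpark so that it runs on its own. The allow-list contents, the `(column name, type name)` schema shape, and the name `check_supported` are assumptions made for illustration; they are not the code in this PR.

```python
# Assumed allow-list for illustration only; the real list is defined by the PR.
SUPPORTED_TYPE_NAMES = frozenset([
    "boolean", "byte", "short", "integer", "long",
    "float", "double", "string", "date", "timestamp",
])


def check_supported(schema):
    """Raise TypeError for the first column whose type is not in the allow-list.

    `schema` is a list of (column name, type name) pairs, e.g. ("_1", "binary").
    """
    for name, type_name in schema:
        if type_name not in SUPPORTED_TYPE_NAMES:
            raise TypeError(
                "Unsupported type in conversion to pandas: column %r has type %r"
                % (name, type_name)
            )


check_supported([("id", "long"), ("ts", "timestamp")])  # passes silently
try:
    check_supported([("_1", "binary")])
except TypeError as error:
    print(error)
```

Checking against an allow-list rather than a deny-list means a type that nobody has thought about yet, such as a nested timestamp, is rejected by default instead of slipping through.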

## How was this patch tested?

Manually tested and tests added in `python/pyspark/sql/tests.py`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark pandas_convertion_supported_type

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20625.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20625


commit c79c6df7284b9717fe4e4c26090dcb51bf7712da
Author: hyukjinkwon 
Date:   2018-02-16T07:45:52Z

Explicitly specify supported types in toPandas




---




[GitHub] spark issue #20567: [SPARK-23380][PYTHON] Make toPandas fallback to non-Arro...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20567
  
Yup, I will. Sorry for the delay. I was trying to make the fix as small as 
possible. Let me just open it in the simplest way.


---




[GitHub] spark issue #20567: [SPARK-23380][PYTHON] Make toPandas fallback to non-Arro...

2018-02-15 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20567
  
@HyukjinKwon Will you submit a fix for the binary type today? We are very 
close to RC4. This is kind of urgent if we still want to block it in the Spark 
2.3.0 release. 


---




[GitHub] spark pull request #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20511#discussion_r168686683
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala ---
@@ -160,6 +160,15 @@ abstract class OrcSuite extends OrcTest with BeforeAndAfterAll {
   }
 }
   }
+
+  test("SPARK-23340 Empty float/double array columns raise EOFException") {
+Seq(Seq(Array.empty[Float]).toDF(), Seq(Array.empty[Double]).toDF()).foreach { df =>
+  withTempPath { path =>
--- End diff --

Please also test both readers as we discussed above. 


---




[GitHub] spark issue #20295: [SPARK-23011] Support alternative function form with gro...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20295
  
Don't worry, I am keeping an eye on this, and I believe @ueshin is too.


---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20568
  
Jenkins, retest this please.


---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20568
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20568
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87501/
Test FAILed.


---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20568
  
**[Test build #87501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87501/testReport)** for PR 20568 at commit [`c20cd97`](https://github.com/apache/spark/commit/c20cd97d7ce5690993b4490bb7cca955e7703d90).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20621: [SPARK-23436][SQL] Infer partition as Date only i...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20621#discussion_r168680871
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ---
@@ -407,6 +407,29 @@ object PartitioningUtils {
   Literal(bigDecimal)
 }
 
+val dateTry = Try {
+  // try and parse the date, if no exception occurs this is a candidate to be resolved as
+  // DateType
+  DateTimeUtils.getThreadLocalDateFormat.parse(raw)
+  // SPARK-23436: Casting the string to date may still return null if a bad Date is provided.
+  // We need to check that we can cast the raw string since we later can use Cast to get
+  // the partition values with the right DataType (see
+  // org.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex.inferPartitioning)
+  val dateOption = Option(Cast(Literal(raw), DateType).eval())
--- End diff --

Can we add `require(dateOption.isDefined)` with some comments explicitly?


---




[GitHub] spark pull request #20621: [SPARK-23436][SQL] Infer partition as Date only i...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20621#discussion_r168680397
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ---
@@ -407,6 +407,29 @@ object PartitioningUtils {
   Literal(bigDecimal)
 }
 
+val dateTry = Try {
+  // try and parse the date, if no exception occurs this is a candidate to be resolved as
+  // DateType
+  DateTimeUtils.getThreadLocalDateFormat.parse(raw)
--- End diff --

Ah, so the root cause is more specific to `SimpleDateFormat` because it 
allows invalid dates like `2018-01-01-04` to be parsed fine ..
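
The interaction between a lenient parse and a strict cast can be imitated in Python. This is an analogy only: `lenient_parse`, `strict_cast`, and `infer_as_date` are made-up names, and because Python's `strptime` is itself strict, the lenient step is emulated with a prefix match that ignores trailing text.

```python
import re
from datetime import date, datetime
from typing import Optional

_DATE_PREFIX = re.compile(r"\d{4}-\d{1,2}-\d{1,2}")


def lenient_parse(raw: str) -> date:
    """Parse a leading yyyy-MM-dd and silently ignore whatever follows it."""
    match = _DATE_PREFIX.match(raw)
    if match is None:
        raise ValueError("not a date: %r" % raw)
    return datetime.strptime(match.group(0), "%Y-%m-%d").date()


def strict_cast(raw: str) -> Optional[date]:
    """Return a date only if the whole string is a date; otherwise None."""
    try:
        return datetime.strptime(raw, "%Y-%m-%d").date()
    except ValueError:
        return None


def infer_as_date(raw: str) -> bool:
    """Treat `raw` as a date candidate only if both steps succeed."""
    try:
        lenient_parse(raw)
        if strict_cast(raw) is None:
            # Plays the role of a `require(...)` inside the `Try` block.
            raise ValueError("cast returned no value for %r" % raw)
        return True
    except ValueError:
        return False


for text in ("2018-01-01", "2018-01-01-04"):
    print(text, lenient_parse(text), infer_as_date(text))
```

In this sketch the lenient step accepts both strings, and only the extra requirement on the strict cast rejects `2018-01-01-04`.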


---




[GitHub] spark issue #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20624
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20624
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87500/
Test PASSed.


---




[GitHub] spark issue #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20624
  
**[Test build #87500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87500/testReport)** for PR 20624 at commit [`cf36020`](https://github.com/apache/spark/commit/cf3602075dcee35494c72975e361b739939079b4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class CatalogColumnStat(`


---




[GitHub] spark issue #20501: [SPARK-22430][Docs] Unknown tag warnings when building R...

2018-02-15 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20501
  
@rekhajoshm feel free to follow up after we are through with 2.3.0, thanks!


---




[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...

2018-02-15 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20464
  
Sorry, I'm a bit occupied with testing 2.3 RC, will get back to this after.


---




[GitHub] spark pull request #20618: [SPARK-23329][SQL] Fix documentation of trigonome...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20618#discussion_r168670009
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala ---
@@ -196,7 +208,13 @@ case class Asin(child: Expression) extends UnaryMathExpression(math.asin, "ASIN"
 
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(expr) - Returns the inverse tangent (a.k.a. arctangent).",
+  usage = "_FUNC_(expr) - Returns the inverse tangent (a.k.a. arc tangent) of `expr`, " +
+"as if computed by `java.lang.Math._FUNC_`.",
+  arguments =
--- End diff --

Could we just save one line and stick to the same indentation?

```scala
  arguments = """
    Arguments:
      * expr - number whose arc tangent is to be returned.
  """,
  examples = """
  ...
```


---




[GitHub] spark pull request #20618: [SPARK-23329][SQL] Fix documentation of trigonome...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20618#discussion_r168670792
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala ---
@@ -521,7 +554,13 @@ case class Signum(child: Expression) extends UnaryMathExpression(math.signum, "S
 case class Sin(child: Expression) extends UnaryMathExpression(math.sin, "SIN")
 
 @ExpressionDescription(
-  usage = "_FUNC_(expr) - Returns the hyperbolic sine of `expr`.",
+  usage = "_FUNC_(expr) - Returns hyperbolic sine of `expr`, " +
--- End diff --

I think we can just do as below:

```scala
...
  usage = """
    _FUNC_(expr) - Returns hyperbolic sine of `expr`, as if computed by
      `java.lang.Math._FUNC_`.
  """,
...
```


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20622
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87499/
Test PASSed.


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20622
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20622
  
**[Test build #87499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87499/testReport)** for PR 20622 at commit [`35f5b4a`](https://github.com/apache/spark/commit/35f5b4a495517d4f11998d6b7fb463851304712d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20618: [SPARK-23329][SQL] Fix documentation of trigonome...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20618#discussion_r168669517
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -1313,131 +1313,178 @@ object functions {
   
//
 
   /**
-   * Computes the cosine inverse of the given value; the returned angle is in the range
-   * 0.0 through pi.
+   * @param e the value whose arc cosine is to be returned
+   * @return  cosine inverse of the given value in the range of 0.0 through pi,
+   *          as if computed by [[java.lang.Math#acos]]
    *
    * @group math_funcs
    * @since 1.4.0
    */
   def acos(e: Column): Column = withExpr { Acos(e.expr) }
 
   /**
-   * Computes the cosine inverse of the given column; the returned angle is in the range
-   * 0.0 through pi.
+   * @param colName the value whose arc cosine is to be returned
+   * @return        cosine inverse of the given value in the range of 0.0 through pi,
+   *                as if computed by [[java.lang.Math#acos]]
    *
    * @group math_funcs
    * @since 1.4.0
    */
-  def acos(columnName: String): Column = acos(Column(columnName))
+  def acos(colName: String): Column = acos(Column(colName))
--- End diff --

I don't think we should change the name for that reason ..


---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20568
  
**[Test build #87501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87501/testReport)** for PR 20568 at commit [`c20cd97`](https://github.com/apache/spark/commit/c20cd97d7ce5690993b4490bb7cca955e7703d90).


---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/20568
  
Jenkins, retest this please.


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20554
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20554
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87498/
Test PASSed.


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20554
  
**[Test build #87498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87498/testReport)** for PR 20554 at commit [`2dea08a`](https://github.com/apache/spark/commit/2dea08a4c5f85991e4ad4c7da886c2e0bf456bb8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20567: [SPARK-23380][PYTHON] Make toPandas fallback to n...

2018-02-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20567#discussion_r168665914
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1941,12 +1941,24 @@ def toPandas(self):
 timezone = None
 
 if self.sql_ctx.getConf("spark.sql.execution.arrow.enabled", "false").lower() == "true":
+should_fallback = False
 try:
-from pyspark.sql.types import _check_dataframe_convert_date, \
-_check_dataframe_localize_timestamps
+from pyspark.sql.types import to_arrow_schema
 from pyspark.sql.utils import require_minimum_pyarrow_version
-import pyarrow
 require_minimum_pyarrow_version()
+# Check if its schema is convertible in Arrow format.
+to_arrow_schema(self.schema)
+except Exception as e:
--- End diff --

Hm, it might depend on which message we want to show. Will open another PR 
as discussed above.
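
For readers following the thread, the general try-then-fall-back shape under discussion can be sketched in a few lines. This is a stand-alone illustration, not the PySpark code: `convert_with_fallback` and both converter arguments are hypothetical, and the wording of the warning is exactly the kind of detail the comment above says is still open.

```python
import warnings


def convert_with_fallback(rows, arrow_enabled, arrow_convert, plain_convert):
    """Try the Arrow-style converter first; on any failure, warn and use the plain one."""
    if arrow_enabled:
        try:
            return arrow_convert(rows)
        except Exception as error:  # broad on purpose: any failure triggers the fallback
            warnings.warn("Arrow path failed (%s); using the plain path instead" % error)
    return plain_convert(rows)


def failing_arrow(rows):
    """Stand-in for a converter that rejects the input schema."""
    raise TypeError("unsupported type: binary")


print(convert_with_fallback([1, 2], True, failing_arrow, list))
```

Catching a broad exception keeps the fallback simple, at the cost of hiding which failure occurred unless the message is surfaced, which is why the choice of message matters.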


---




[GitHub] spark issue #20619: [SPARK-23390][SQL] Register task completion listerners f...

2018-02-15 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20619
  
It looks good to me that we move the registrations to the new (earlier) 
places.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20554
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20554
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87497/
Test PASSed.


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20554
  
**[Test build #87497 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87497/testReport)** for PR 20554 at commit [`7f5df22`](https://github.com/apache/spark/commit/7f5df222da2e6cf59ed632b1c05165f1035202f3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20604: [SPARK-23365][CORE] Do not adjust num executors when kil...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20604
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20604: [SPARK-23365][CORE] Do not adjust num executors when kil...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20604
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87496/
Test PASSed.


---




[GitHub] spark issue #20604: [SPARK-23365][CORE] Do not adjust num executors when kil...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20604
  
**[Test build #87496 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87496/testReport)** for PR 20604 at commit [`4d0b52e`](https://github.com/apache/spark/commit/4d0b52edc89bf98e3dccf4e6b044712bc09547ef).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20624
  
**[Test build #87500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87500/testReport)** for PR 20624 at commit [`cf36020`](https://github.com/apache/spark/commit/cf3602075dcee35494c72975e361b739939079b4).


---




[GitHub] spark issue #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20624
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20624
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/929/
Test PASSed.


---




[GitHub] spark issue #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue:

https://github.com/apache/spark/pull/20624
  
cc @gatorsmile @cloud-fan @marmbrus 


---




[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request:

https://github.com/apache/spark/pull/20624

[SPARK-23445] ColumnStat refactoring

## What changes were proposed in this pull request?

Refactor ColumnStat to be more flexible.

* Split `ColumnStat` and `CatalogColumnStat` just like `CatalogStatistics` 
is split from `Statistics`. This detaches how the statistics are stored from 
how they are processed in the query plan. `CatalogColumnStat` keeps `min` and 
`max` as `String`, making it not depend on dataType information.
* For `CatalogColumnStat`, parse column names from property names in the 
metastore (`KEY_VERSION` property), not from metastore schema. This means that 
`CatalogColumnStat`s can be created for columns even if the schema itself is 
not stored in the metastore.
* Make all fields optional. `min`, `max` and `histogram` for columns were 
optional already. Having them all optional is more consistent, and gives 
flexibility to e.g. drop some of the fields through transformations if they are 
difficult / impossible to calculate.

The added flexibility will make it possible to have alternative 
implementations for stats, and separates stats collection from stats and 
estimation processing in plans.
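
A small Python sketch of the split described above, assuming a stored form that keeps `min` and `max` as strings and a plan-side form that holds typed values. The class names, the `distinct_count` and `null_count` fields, and `to_plan_stat` are illustrative assumptions rather than the actual Spark classes; `histogram` is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CatalogColumnStatSketch:
    """Storage-side statistics: every field optional, min/max kept as strings."""
    distinct_count: Optional[int] = None
    min: Optional[str] = None
    max: Optional[str] = None
    null_count: Optional[int] = None


@dataclass(frozen=True)
class ColumnStatSketch:
    """Plan-side statistics: same shape, but min/max hold typed values."""
    distinct_count: Optional[int] = None
    min: Optional[object] = None
    max: Optional[object] = None
    null_count: Optional[int] = None


def to_plan_stat(stored: CatalogColumnStatSketch,
                 parse: Callable[[str], object]) -> ColumnStatSketch:
    """Convert stored strings to typed values; the data type is needed only here."""
    def convert(text: Optional[str]) -> Optional[object]:
        return None if text is None else parse(text)

    return ColumnStatSketch(
        distinct_count=stored.distinct_count,
        min=convert(stored.min),
        max=convert(stored.max),
        null_count=stored.null_count,
    )


stored = CatalogColumnStatSketch(min="1", max="10")
print(to_plan_stat(stored, int))
```

Because every field defaults to `None`, a producer that cannot compute one of the values can simply omit it, which is the flexibility the last bullet describes.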

## How was this patch tested?

Refactored existing tests to work with refactored `ColumnStat` and 
`CatalogColumnStat`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/juliuszsompolski/apache-spark SPARK-23445

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20624.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20624


commit cf3602075dcee35494c72975e361b739939079b4
Author: Juliusz Sompolski 
Date:   2018-01-19T13:57:46Z

column stat refactoring




---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20568
  
Retest this please


---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20568
  
@mrkm4ntr Do not worry about these failures. We know that some tests are 
unstable, and the community is working on fixing them. For now, we have to 
re-trigger the tests.


---




[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread mrkm4ntr
Github user mrkm4ntr commented on the issue:

https://github.com/apache/spark/pull/20568
  
I cannot reproduce this failure of the test in my environment.
It seems to me that this is not related to this change...


---




[GitHub] spark pull request #20568: [SPARK-23381][CORE] Murmur3 hash generates a diff...

2018-02-15 Thread mrkm4ntr
Github user mrkm4ntr commented on a diff in the pull request:

https://github.com/apache/spark/pull/20568#discussion_r168659153
  
--- Diff: 
common/sketch/src/main/java/org/apache/spark/util/sketch/Murmur3_x86_32.java ---
@@ -71,6 +73,20 @@ public static int hashUnsafeBytes(Object base, long 
offset, int lengthInBytes, i
 return fmix(h1, lengthInBytes);
   }
 
+  public static int hashUnsafeBytes2(Object base, long offset, int 
lengthInBytes, int seed) {
+// This is compatible with original and another implementations.
+// Use this method after 2.3.0.
--- End diff --

Thanks, fixed it.
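
For readers following the thread, one way two Murmur3_x86_32 variants can disagree is in how they treat the trailing bytes that do not fill a 4-byte block. The sketch below is an assumption about the kind of difference under discussion, not code from this patch. The class and method names are hypothetical, and it hashes a plain `byte[]` instead of going through unsafe memory access.

```java
// Hypothetical sketch: two tail-handling strategies for Murmur3 x86 32-bit.
class Murmur3TailSketch {
    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    static int mixK1(int k1) {
        k1 *= C1;
        k1 = Integer.rotateLeft(k1, 15);
        k1 *= C2;
        return k1;
    }

    static int mixH1(int h1, int k1) {
        h1 ^= k1;
        h1 = Integer.rotateLeft(h1, 13);
        return h1 * 5 + 0xe6546b64;
    }

    static int fmix(int h1, int length) {
        h1 ^= length;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    // Mix all complete 4-byte little-endian blocks; shared by both variants.
    static int hashBlocks(byte[] data, int alignedLength, int seed) {
        int h1 = seed;
        for (int i = 0; i < alignedLength; i += 4) {
            int word = (data[i] & 0xFF) | ((data[i + 1] & 0xFF) << 8)
                     | ((data[i + 2] & 0xFF) << 16) | ((data[i + 3] & 0xFF) << 24);
            h1 = mixH1(h1, mixK1(word));
        }
        return h1;
    }

    // Reference-style tail: pack the 1-3 leftover bytes into one word, mix once.
    static int hashCanonical(byte[] data, int seed) {
        int aligned = data.length - data.length % 4;
        int h1 = hashBlocks(data, aligned, seed);
        int k1 = 0;
        for (int i = aligned, shift = 0; i < data.length; i++, shift += 8) {
            k1 ^= (data[i] & 0xFF) << shift;
        }
        h1 ^= mixK1(k1);
        return fmix(h1, data.length);
    }

    // Alternative tail (assumed here for contrast): treat every leftover byte,
    // sign-extended, as if it were a full block.
    static int hashBytewiseTail(byte[] data, int seed) {
        int aligned = data.length - data.length % 4;
        int h1 = hashBlocks(data, aligned, seed);
        for (int i = aligned; i < data.length; i++) {
            h1 = mixH1(h1, mixK1(data[i]));
        }
        return fmix(h1, data.length);
    }

    public static void main(String[] args) {
        byte[] input = {1, 2, 3, 4, 5};
        System.out.println(Integer.toHexString(hashCanonical(input, 42)));
        System.out.println(Integer.toHexString(hashBytewiseTail(input, 42)));
    }
}
```

The two methods share the block loop, so they agree whenever the input length is a multiple of 4 and can differ only when a tail exists.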


---




[GitHub] spark issue #20382: [SPARK-23097][SQL][SS] Migrate text socket source to V2

2018-02-15 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/20382
  
Hi @tdas, I'm on vacation this week, will update the code when I have time. 
Sorry for the delay.


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20622
  
**[Test build #87499 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87499/testReport)**
 for PR 20622 at commit 
[`35f5b4a`](https://github.com/apache/spark/commit/35f5b4a495517d4f11998d6b7fb463851304712d).


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread zsxwing
Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/20554
  
LGTM again


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20622
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20622
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87495/
Test FAILed.


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20622
  
**[Test build #87495 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87495/testReport)**
 for PR 20622 at commit 
[`3ad7b3f`](https://github.com/apache/spark/commit/3ad7b3f547dac787022262a2f55bc7a7a6c30cd7).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-02-15 Thread bersprockets
Github user bersprockets commented on the issue:

https://github.com/apache/spark/pull/20424
  
@squito We made a few adjustments since your "lgtm". Do you want to take a 
quick look? @HyukjinKwon also gave his "lgtm" after the adjustments.


---




[GitHub] spark pull request #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20511#discussion_r168643357
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
 ---
@@ -160,6 +160,15 @@ abstract class OrcSuite extends OrcTest with 
BeforeAndAfterAll {
   }
 }
   }
+
+  test("SPARK-23340 Empty float/double array columns raise EOFException") {
+Seq(Seq(Array.empty[Float]).toDF(), 
Seq(Array.empty[Double]).toDF()).foreach { df =>
+  withTempPath { path =>
--- End diff --

Sure. This suite is in `sql/core` and inherits `OrcTest.scala`'s `val 
orcImp: String = "native"`.


---




[GitHub] spark issue #20623: [SPARK-23413][UI] Fix sorting tasks by Host / Executor I...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20623
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20623: [SPARK-23413][UI] Fix sorting tasks by Host / Executor I...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20623
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87492/
Test PASSed.


---




[GitHub] spark issue #20623: [SPARK-23413][UI] Fix sorting tasks by Host / Executor I...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20623
  
**[Test build #87492 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87492/testReport)**
 for PR 20623 at commit 
[`f7a2282`](https://github.com/apache/spark/commit/f7a22827694a3aa92e8a7dd20195e2895e86880a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #8745: [SPARK-10589] [WEBUI] Add defense against external site f...

2018-02-15 Thread alexmnyc
Github user alexmnyc commented on the issue:

https://github.com/apache/spark/pull/8745
  
Now I am not able to embed it on my grafana dashboard... That should be a 
configuration parameter


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread jose-torres
Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/20554
  
LGTM


---




[GitHub] spark issue #20295: [SPARK-23011] Support alternative function form with gro...

2018-02-15 Thread icexelloss
Github user icexelloss commented on the issue:

https://github.com/apache/spark/pull/20295
  
Resolved conflict and addressed @ueshin's comment. (Btw, I am fine with 
merging after Spark 2.3 RC passes, as that seems to be the priority now, just 
want to make sure this PR doesn't sit forever...)


---




[GitHub] spark pull request #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20511#discussion_r168638420
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
 ---
@@ -160,6 +160,15 @@ abstract class OrcSuite extends OrcTest with 
BeforeAndAfterAll {
   }
 }
   }
+
+  test("SPARK-23340 Empty float/double array columns raise EOFException") {
+Seq(Seq(Array.empty[Float]).toDF(), 
Seq(Array.empty[Double]).toDF()).foreach { df =>
+  withTempPath { path =>
--- End diff --

Are we testing the native readers?


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20554
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20554
  
**[Test build #87498 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87498/testReport)**
 for PR 20554 at commit 
[`2dea08a`](https://github.com/apache/spark/commit/2dea08a4c5f85991e4ad4c7da886c2e0bf456bb8).


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20554
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/928/
Test PASSed.


---




[GitHub] spark issue #20382: [SPARK-23097][SQL][SS] Migrate text socket source to V2

2018-02-15 Thread tdas
Github user tdas commented on the issue:

https://github.com/apache/spark/pull/20382
  
@jerryshao any updates?


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20554
  
**[Test build #87497 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87497/testReport)**
 for PR 20554 at commit 
[`7f5df22`](https://github.com/apache/spark/commit/7f5df222da2e6cf59ed632b1c05165f1035202f3).


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20554
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source to v2

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20554
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/927/
Test PASSed.


---




[GitHub] spark issue #20604: [SPARK-23365][CORE] Do not adjust num executors when kil...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20604
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20604: [SPARK-23365][CORE] Do not adjust num executors when kil...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20604
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/926/
Test PASSed.


---




[GitHub] spark issue #20604: [WIP][SPARK-23365][CORE] Do not adjust num executors whe...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20604
  
**[Test build #87496 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87496/testReport)**
 for PR 20604 at commit 
[`4d0b52e`](https://github.com/apache/spark/commit/4d0b52edc89bf98e3dccf4e6b044712bc09547ef).


---




[GitHub] spark issue #20295: [SPARK-23011] Support alternative function form with gro...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20295
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87490/
Test PASSed.


---




[GitHub] spark issue #20295: [SPARK-23011] Support alternative function form with gro...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20295
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20295: [SPARK-23011] Support alternative function form with gro...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20295
  
**[Test build #87490 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87490/testReport)**
 for PR 20295 at commit 
[`9ed3779`](https://github.com/apache/spark/commit/9ed3779b665c90e5bb25bc6636997a4b080c3d34).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20057
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20057
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87493/
Test FAILed.


---




[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20057
  
**[Test build #87493 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87493/testReport)**
 for PR 20057 at commit 
[`6c0d3df`](https://github.com/apache/spark/commit/6c0d3dfd415e5630dbb02ce65c6adf3db419bdec).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source...

2018-02-15 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/20554#discussion_r168626487
  
--- Diff: 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala
 ---
@@ -112,14 +112,18 @@ abstract class KafkaSourceTest extends StreamTest 
with SharedSQLContext {
 query.nonEmpty,
 "Cannot add data when there is no query for finding the active 
kafka source")
 
-  val sources = query.get.logicalPlan.collect {
-case StreamingExecutionRelation(source: KafkaSource, _) => source
-  } ++ (query.get.lastExecution match {
-case null => Seq()
-case e => e.logical.collect {
-  case DataSourceV2Relation(_, reader: KafkaContinuousReader) => 
reader
-}
-  })
+  val sources = {
+query.get.logicalPlan.collect {
+  case StreamingExecutionRelation(source: KafkaSource, _) => source
+  case StreamingExecutionRelation(source: KafkaMicroBatchReader, 
_) => source
+} ++ (query.get.lastExecution match {
+  case null => Seq()
+  case e => e.logical.collect {
+case DataSourceV2Relation(_, reader: KafkaContinuousReader) => 
reader
+  }
+})
+  }.distinct
--- End diff --

yes.


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20622
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87491/
Test FAILed.


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20622
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20622
  
**[Test build #87491 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87491/testReport)**
 for PR 20622 at commit 
[`3d8acd2`](https://github.com/apache/spark/commit/3d8acd2974d11a790ab9cd9338673bba18d683ac).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source...

2018-02-15 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/20554#discussion_r168625863
  
--- Diff: 
external/kafka-0-10-sql/src/test/resources/kafka-source-initial-offset-future-version.bin
 ---
@@ -0,0 +1,2 @@
+0v9
+{"kafka-initial-offset-future-version":{"2":2,"1":1,"0":0}}
--- End diff --

done


---




[GitHub] spark pull request #20554: [SPARK-23362][SS] Migrate Kafka Microbatch source...

2018-02-15 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/20554#discussion_r168625742
  
--- Diff: 
external/kafka-0-10-sql/src/test/resources/kafka-source-initial-offset-version-2.1.0.bin
 ---
@@ -1 +1 @@
-2{"kafka-initial-offset-2-1-0":{"2":0,"1":0,"0":0}}
\ No newline at end of file
+2{"kafka-initial-offset-2-1-0":{"2":2,"1":1,"0":0}}
--- End diff --

I modified the file to make the test "deserialization of initial offset written 
by Spark 2.1.0" stronger. See the updated test. The query now starts from the 
earliest offset while these initial offsets are NOT at offset 0, and we check 
that the query reads its first offset from the initial offsets rather than the 
earliest offset available in the topic. Hence I am changing the file a little: 
the values, not the format.
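
The check described above can be pictured with a small sketch. It is an illustration only: `InitialOffsetSketch` and `resolveStartOffsets` are hypothetical names, and the real source resolves offsets through its own classes. The assumption shown is simply that previously stored initial offsets take precedence over the "earliest" setting.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch of the precedence rule the test exercises.
class InitialOffsetSketch {
    // Stored initial offsets, when present, win over the earliest offsets in the topic.
    static Map<Integer, Long> resolveStartOffsets(
            Optional<Map<Integer, Long>> storedInitialOffsets,
            Map<Integer, Long> earliestInTopic) {
        return storedInitialOffsets.orElse(earliestInTopic);
    }

    public static void main(String[] args) {
        Map<Integer, Long> earliest = new TreeMap<>();
        Map<Integer, Long> stored = new TreeMap<>();
        for (int partition = 0; partition < 3; partition++) {
            earliest.put(partition, 0L);
            // Mirrors the updated file: {"2":2,"1":1,"0":0}
            stored.put(partition, (long) partition);
        }
        System.out.println(resolveStartOffsets(Optional.of(stored), earliest));
        System.out.println(resolveStartOffsets(Optional.empty(), earliest));
    }
}
```

With all stored offsets at 0 the two branches would be indistinguishable, which is why non-zero values make the test stronger.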


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread jose-torres
Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/20622
  
@zsxwing pointed out that the original behavior was more subtly wrong than 
I expected.

What we want to do is cancel the Spark job, and then cleanly restart it 
from the last checkpoint. But in fact, this was not working, since cancelling a 
Spark job throws an opaque SparkException which we didn't anticipate.

The reason things seemed to work was that the interrupt() call would almost 
always (but was not guaranteed to) interrupt the job cancellation, thus 
preventing the SparkException. So I've updated the PR to anticipate that 
SparkException, and filed SPARK-23444 to ask for a better handle for job 
cancellations.

Note that the continuous processing reconfiguration tests will always 
deterministically fail if they don't properly catch this exception, so the 
checking logic isn't really fragile despite being weird.
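
A rough sketch of the control flow described above, under stated assumptions: `ReconfigurationSketch`, `JobCancelledException`, and `runWithRestarts` are hypothetical names, and a plain `Runnable` stands in for the Spark job. It only illustrates treating a cancellation exception as expected while a reconfiguration is pending, and rethrowing it otherwise.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: restart after an anticipated cancellation, fail otherwise.
class ReconfigurationSketch {
    // Stand-in for the opaque exception thrown when a job is cancelled.
    static class JobCancelledException extends RuntimeException {
        JobCancelledException(String message) {
            super(message);
        }
    }

    // Runs the job until it completes; returns how many times it was restarted.
    static int runWithRestarts(Runnable job, AtomicBoolean reconfiguring) {
        int restarts = 0;
        while (true) {
            try {
                job.run();
                return restarts;
            } catch (JobCancelledException e) {
                // Only a cancellation we asked for is safe to swallow.
                if (!reconfiguring.compareAndSet(true, false)) {
                    throw e;
                }
                restarts++;  // a real system would resume from the last checkpoint here
            }
        }
    }

    public static void main(String[] args) {
        AtomicBoolean reconfiguring = new AtomicBoolean(false);
        int[] calls = {0};
        int restarts = runWithRestarts(() -> {
            if (calls[0]++ == 0) {
                reconfiguring.set(true);
                throw new JobCancelledException("cancelled for reconfiguration");
            }
        }, reconfiguring);
        System.out.println("restarts: " + restarts);
    }
}
```

The flag makes the decision explicit instead of depending on whether an interrupt happens to arrive before the cancellation surfaces.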


---




[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20622
  
**[Test build #87495 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87495/testReport)**
 for PR 20622 at commit 
[`3ad7b3f`](https://github.com/apache/spark/commit/3ad7b3f547dac787022262a2f55bc7a7a6c30cd7).


---




[GitHub] spark issue #20621: [SPARK-23436][SQL] Infer partition as Date only if it ca...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20621
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87488/
Test PASSed.


---




[GitHub] spark issue #20621: [SPARK-23436][SQL] Infer partition as Date only if it ca...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20621
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20621: [SPARK-23436][SQL] Infer partition as Date only if it ca...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20621
  
**[Test build #87488 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87488/testReport)**
 for PR 20621 at commit 
[`6b56408`](https://github.com/apache/spark/commit/6b5640833a2d45986a0cf6074d7211a8ba9d2b3e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20620: [SPARK-23438][DSTREAMS] Fix DStreams data loss with WAL ...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20620
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87494/
Test PASSed.


---




[GitHub] spark issue #20620: [SPARK-23438][DSTREAMS] Fix DStreams data loss with WAL ...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20620
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20620: [SPARK-23438][DSTREAMS] Fix DStreams data loss with WAL ...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20620
  
**[Test build #87494 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87494/testReport)**
 for PR 20620 at commit 
[`bd46d1c`](https://github.com/apache/spark/commit/bd46d1cb63e7a04e0236f7b1bf70b46fb55f3ea4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/20511
  
Thank you, All.
Now, it's ready for review again for Apache Spark 2.4.


---




[GitHub] spark issue #20601: [SPARK-23413][UI] Fix sorting tasks by Host / Executor I...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20601
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20511
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87483/
Test PASSed.


---




[GitHub] spark issue #20601: [SPARK-23413][UI] Fix sorting tasks by Host / Executor I...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20601
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87487/
Test FAILed.


---




[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20511
  
Merged build finished. Test PASSed.


---



