[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23040
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23040
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98989/
Test PASSed.


---




[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23040
  
**[Test build #98989 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98989/testReport)**
 for PR 23040 at commit 
[`3c6d349`](https://github.com/apache/spark/commit/3c6d349b26e54ead7c345e11ffacf14edcd072c1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23025
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5132/
Test PASSed.


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23025
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23025
  
**[Test build #98992 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98992/testReport)**
 for PR 23025 at commit 
[`7ca4821`](https://github.com/apache/spark/commit/7ca48214cda312d78c22ad4305d2e490c46535f5).


---




[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-18 Thread JulienPeloton
Github user JulienPeloton commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r234512932
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -732,6 +732,11 @@ def repartitionByRange(self, numPartitions, *cols):
 At least one partition-by expression must be specified.
 When no explicit sort order is specified, "ascending nulls first" 
is assumed.
 
+Note that due to performance reasons this method uses sampling to 
estimate the ranges.
--- End diff --

Oh right, I missed it! Pushed.


---




[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-18 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r234511357
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -732,6 +732,11 @@ def repartitionByRange(self, numPartitions, *cols):
 At least one partition-by expression must be specified.
 When no explicit sort order is specified, "ascending nulls first" 
is assumed.
 
+Note that due to performance reasons this method uses sampling to 
estimate the ranges.
--- End diff --

Besides Python, we also have the `repartitionByRange` API in R. Can you 
update it as well?


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23025
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5131/
Test PASSed.


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23025
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-18 Thread JulienPeloton
Github user JulienPeloton commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r234509708
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2789,6 +2789,12 @@ class Dataset[T] private[sql](
* When no explicit sort order is specified, "ascending nulls first" is 
assumed.
* Note, the rows are not sorted in each partition of the resulting 
Dataset.
*
+   *
+   * Note that due to performance reasons this method uses sampling to 
estimate the ranges.
+   * Hence, the output may not be consistent, since sampling can return 
different values.
+   * The sample size can be controlled by setting the value of the 
parameter
+   * `spark.sql.execution.rangeExchange.sampleSizePerPartition`.
--- End diff --

@cloud-fan the sentence has been changed according to your suggestion (in 
both Spark & PySpark).
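
To illustrate why sampling can make the estimated ranges vary between runs, here is a minimal plain-Scala sketch with no Spark dependency. The names `RangeBoundsSketch`, `estimateBounds`, and `partitionOf` are hypothetical and exist only for this example; the logic is a simplified stand-in and not Spark's actual range partitioner.

```scala
import scala.util.Random

// Hypothetical illustration only: a simplified model of range-bound estimation.
object RangeBoundsSketch {

  /** Estimates numPartitions - 1 split points from a random sample of the data. */
  def estimateBounds(data: Seq[Int], numPartitions: Int, sampleSize: Int, seed: Long): Seq[Int] = {
    require(data.nonEmpty, "data must not be empty")
    require(numPartitions >= 1 && sampleSize >= numPartitions, "sample too small")
    // Only a sample is sorted, which is cheaper than sorting all the data.
    val sample = new Random(seed).shuffle(data).take(sampleSize).sorted
    // Evenly spaced elements of the sorted sample become the partition boundaries.
    (1 until numPartitions).map(i => sample(i * sample.length / numPartitions))
  }

  /** Index of the range partition that a key falls into. */
  def partitionOf(key: Int, bounds: Seq[Int]): Int = bounds.count(_ < key)
}
```

With a fixed seed the bounds are reproducible; with a different seed (a different sample) the bounds, and therefore the partition assignment of some rows, may change. A larger `sampleSize` makes the estimate more stable at a higher cost, which is the trade-off the documented configuration parameter exposes.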


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23025
  
**[Test build #98991 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98991/testReport)**
 for PR 23025 at commit 
[`f829dfe`](https://github.com/apache/spark/commit/f829dfe0ce5c4d6be68c1247102d58a99b21ad56).


---




[GitHub] spark pull request #23079: [SPARK-26107][SQL] Extend ReplaceNullWithFalseInP...

2018-11-18 Thread rednaxelafx
Github user rednaxelafx commented on a diff in the pull request:

https://github.com/apache/spark/pull/23079#discussion_r234508866
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicateSuite.scala
 ---
@@ -298,6 +299,45 @@ class ReplaceNullWithFalseSuite extends PlanTest {
 testProjection(originalExpr = column, expectedExpr = column)
   }
 
+  test("replace nulls in lambda function of ArrayFilter") {
+val cond = GreaterThan(UnresolvedAttribute("e"), Literal(0))
--- End diff --

Actually, I intentionally made all three lambdas the same (the `MapFilter` 
one differs only in the lambda parameter). I can encapsulate this lambda 
function in a test utility function. Let me update the PR and see what you 
think.


---




[GitHub] spark pull request #23079: [SPARK-26107][SQL] Extend ReplaceNullWithFalseInP...

2018-11-18 Thread rednaxelafx
Github user rednaxelafx commented on a diff in the pull request:

https://github.com/apache/spark/pull/23079#discussion_r234508561
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -767,6 +767,15 @@ object ReplaceNullWithFalse extends Rule[LogicalPlan] {
   replaceNullWithFalse(cond) -> value
 }
 cw.copy(branches = newBranches)
+  case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
--- End diff --

I'm not sure if that's useful or not. First of all, the 
`replaceNullWithFalse` handling doesn't apply to all higher-order functions. In 
fact it only applies to a very narrow set, ones where a lambda function returns 
`BooleanType` and is immediately used as a predicate. So having a generic 
utility can certainly help make this PR slightly simpler, but I don't know how 
useful it is for other cases.
I'd prefer waiting for more such transformation cases to introduce a new 
utility for the pattern. WDYT?
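
The narrow applicability described above can be shown with a toy model in plain Scala. This is a sketch under stated assumptions, not Catalyst code: `NullAsFalseSketch`, `Tri`, `filter`, and `nullToFalse` are hypothetical names, and SQL NULL is modelled as `None`.

```scala
// Hypothetical illustration only: a toy three-valued-logic model, not Catalyst.
object NullAsFalseSketch {
  // Predicate result: Some(true), Some(false), or None standing in for SQL NULL.
  type Tri = Option[Boolean]

  /** A filter keeps an element only when the predicate is definitely true. */
  def filter[A](xs: Seq[A])(p: A => Tri): Seq[A] =
    xs.filter(x => p(x).contains(true))

  /** Rewrites a predicate so that a NULL result becomes false. */
  def nullToFalse[A](p: A => Tri): A => Tri =
    x => Some(p(x).getOrElse(false))
}
```

Because `filter` treats NULL and false identically, applying `nullToFalse` to its predicate does not change the result. The same rewrite would not be safe for a lambda whose Boolean result is returned to the user rather than consumed directly as a predicate, which is why the rule covers only a small set of higher-order functions.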


---




[GitHub] spark issue #23082: [SPARK-26112][SQL] Update since versions of new built-in...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23082
  
**[Test build #98990 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98990/testReport)**
 for PR 23082 at commit 
[`f26db66`](https://github.com/apache/spark/commit/f26db66986a12049e14d1b234840b66f0b96767f).


---




[GitHub] spark issue #23082: [SPARK-26112][SQL] Update since versions of new built-in...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23082
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23082: [SPARK-26112][SQL] Update since versions of new built-in...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23082
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5130/
Test PASSed.


---




[GitHub] spark issue #23082: [SPARK-26112][SQL] Update since versions of new built-in...

2018-11-18 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/23082
  
cc @cloud-fan @gatorsmile @dongjoon-hyun 


---




[GitHub] spark pull request #23082: [SPARK-26112][SQL] Update since versions of new b...

2018-11-18 Thread ueshin
GitHub user ueshin opened a pull request:

https://github.com/apache/spark/pull/23082

[SPARK-26112][SQL] Update since versions of new built-in functions.

## What changes were proposed in this pull request?

The following 5 functions were removed from branch-2.4:

- map_entries
- map_filter
- transform_values
- transform_keys
- map_zip_with

We should update the since version to 3.0.0.

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ueshin/apache-spark issues/SPARK-26112/since

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23082.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23082


commit f26db66986a12049e14d1b234840b66f0b96767f
Author: Takuya UESHIN 
Date:   2018-11-19T06:36:38Z

Update since version to 3.0.0.




---




[GitHub] spark issue #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-18 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/23045
  
LGTM.


---




[GitHub] spark pull request #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-18 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/23045#discussion_r234502854
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -521,13 +521,18 @@ case class MapEntries(child: Expression) extends 
UnaryExpression with ExpectsInp
 case class MapConcat(children: Seq[Expression]) extends 
ComplexTypeMergingExpression {
 
   override def checkInputDataTypes(): TypeCheckResult = {
-var funcName = s"function $prettyName"
+val funcName = s"function $prettyName"
 if (children.exists(!_.dataType.isInstanceOf[MapType])) {
   TypeCheckResult.TypeCheckFailure(
 s"input to $funcName should all be of type map, but it's " +
   children.map(_.dataType.catalogString).mkString("[", ", ", "]"))
 } else {
-  TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
funcName)
+  val sameTypeCheck = 
TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), funcName)
+  if (sameTypeCheck.isFailure) {
+sameTypeCheck
+  } else {
+TypeUtils.checkForMapKeyType(dataType.keyType)
--- End diff --

oh, I see. thanks!


---




[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of non-struct type unde...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of non-struct type unde...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98988/
Test PASSed.


---




[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of non-struct type unde...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23054
  
**[Test build #98988 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98988/testReport)**
 for PR 23054 at commit 
[`b5cfda4`](https://github.com/apache/spark/commit/b5cfda40cf0939e03900e571b1642285fea9a528).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-18 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/23043
  
Do we need to consider `GenerateSafeProjection`, too? In other words, if 
the generated code or runtime does not use data in `Unsafe`, this `+0.0/-0.0` 
problem may still exist.  
Am I correct?
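
For context on the `+0.0/-0.0` problem, the following plain-Scala sketch shows why the two zeros compare equal as numbers yet differ when compared by their binary representation. `MinusZeroSketch`, `bits`, and `normalize` are hypothetical names for this example; `normalize` is one possible way to canonicalise the value and is not taken from the PR.

```scala
// Hypothetical illustration only: IEEE 754 signed zero on the JVM.
object MinusZeroSketch {
  /** Raw 64-bit pattern of a double, as a byte-wise comparison would see it. */
  def bits(d: Double): Long = java.lang.Double.doubleToLongBits(d)

  /** Maps -0.0 to +0.0 and leaves every other value (including NaN) unchanged. */
  def normalize(d: Double): Double = if (d == 0.0d) 0.0d else d
}
```

Numeric comparison says `0.0d == -0.0d`, but `bits` returns different values for the two, so any code path that groups or compares by raw bytes treats them as distinct keys unless the value is normalized first.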


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23025
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98987/
Test PASSed.


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23025
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23025
  
**[Test build #98987 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98987/testReport)**
 for PR 23025 at commit 
[`654fed9`](https://github.com/apache/spark/commit/654fed90997140715d2d52578ca6e4f0661d4e69).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-18 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/23076
  
I can see both sides of the need. Dumping the full plan into a file is a 
good feature for debugging a specific issue, but retaining full plans in order 
to render them on the UI page has been a headache: three related issues 
([SPARK-23904](https://issues.apache.org/jira/browse/SPARK-23904), 
[SPARK-25380](https://issues.apache.org/jira/browse/SPARK-25380), 
[SPARK-26103](https://issues.apache.org/jira/browse/SPARK-26103)) have been 
filed within 3 months, so this doesn't look like something where we can tell 
end users to apply a workaround.

One thing to be aware of is that huge plans are generated not only by 
nested joins but also by large numbers of columns, as in SPARK-23904. For 
SPARK-25380, we don't know which parts generate the huge plan. So it might be 
easier and more flexible to simply truncate to a specific size rather than 
applying conditions.
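
A size-based truncation of the kind suggested here could look roughly like the plain-Scala sketch below. `PlanTruncationSketch`, `truncate`, and the wording of the suffix are assumptions made for illustration and do not reflect what the PR implements.

```scala
// Hypothetical illustration only: cap a plan string at a fixed number of characters.
object PlanTruncationSketch {
  /** Returns the plan unchanged if it fits, otherwise its prefix plus a note on what was cut. */
  def truncate(plan: String, maxLength: Int): String = {
    require(maxLength >= 0, "maxLength must be non-negative")
    if (plan.length <= maxLength) plan
    else {
      val omitted = plan.length - maxLength
      plan.take(maxLength) + s"... ($omitted more characters)"
    }
  }
}
```

The appeal of this approach is that it does not need to know whether the size comes from nested joins, wide schemas, or something else; the cost is that the cut may land in the middle of a plan node.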


---




[GitHub] spark pull request #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23045#discussion_r234494542
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -521,13 +521,18 @@ case class MapEntries(child: Expression) extends 
UnaryExpression with ExpectsInp
 case class MapConcat(children: Seq[Expression]) extends 
ComplexTypeMergingExpression {
 
   override def checkInputDataTypes(): TypeCheckResult = {
-var funcName = s"function $prettyName"
+val funcName = s"function $prettyName"
 if (children.exists(!_.dataType.isInstanceOf[MapType])) {
   TypeCheckResult.TypeCheckFailure(
 s"input to $funcName should all be of type map, but it's " +
   children.map(_.dataType.catalogString).mkString("[", ", ", "]"))
 } else {
-  TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
funcName)
+  val sameTypeCheck = 
TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), funcName)
+  if (sameTypeCheck.isFailure) {
+sameTypeCheck
+  } else {
+TypeUtils.checkForMapKeyType(dataType.keyType)
--- End diff --

see 
https://github.com/apache/spark/pull/23045/files#diff-3f19ec3d15dcd8cd42bb25dde1c5c1a9R20
 . The child may be read from parquet files, so map of map is still possible.
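
The order of checks in the diff above (same input types first, then the key type) can be modelled with a small self-contained Scala sketch. The `DataType` hierarchy, `containsMap`, `checkMapKey`, and `checkMapConcat` below are simplified, hypothetical stand-ins for illustration and are not the Catalyst classes or `TypeUtils` methods.

```scala
// Hypothetical illustration only: a toy type model, not Spark's DataType classes.
object MapKeyCheckSketch {
  sealed trait DataType
  case object IntType extends DataType
  case object StringType extends DataType
  final case class ArrayType(element: DataType) extends DataType
  final case class MapType(key: DataType, value: DataType) extends DataType

  /** True if the type is a map or (in this toy model) an array nesting one. */
  def containsMap(dt: DataType): Boolean = dt match {
    case MapType(_, _) => true
    case ArrayType(e)  => containsMap(e)
    case _             => false
  }

  /** Rejects key types that are, or contain, a map. */
  def checkMapKey(key: DataType): Either[String, Unit] =
    if (containsMap(key)) Left("map key must not be or contain a map") else Right(())

  /** Mirrors the order in the diff: all maps, all the same type, then the key check. */
  def checkMapConcat(inputs: Seq[DataType]): Either[String, Unit] =
    if (inputs.exists(!_.isInstanceOf[MapType])) Left("inputs should all be of type map")
    else if (inputs.distinct.size > 1) Left("inputs should all be the same type")
    else inputs.headOption match {
      case Some(MapType(k, _)) => checkMapKey(k)
      case _                   => Right(())
    }
}
```

The key check is still needed at this point because an input such as `MapType(MapType(IntType, IntType), StringType)` passes the first two checks; in the real system such a value could come from data read from files rather than from an expression that was already validated.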


---




[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-18 Thread LinhongLiu
Github user LinhongLiu commented on the issue:

https://github.com/apache/spark/pull/23040
  
cc @cloud-fan @srowen 
The review comments have been addressed.


---




[GitHub] spark issue #23040: [SPARK-26068][Core]ChunkedByteBufferInputStream should h...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23040
  
**[Test build #98989 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98989/testReport)**
 for PR 23040 at commit 
[`3c6d349`](https://github.com/apache/spark/commit/3c6d349b26e54ead7c345e11ffacf14edcd072c1).


---




[GitHub] spark issue #23058: [SPARK-25905][CORE] When getting a remote block, avoid f...

2018-11-18 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/23058
  
@attilapiros can you review this please?


---




[GitHub] spark pull request #23027: [SPARK-26049][SQL][TEST] FilterPushdownBenchmark ...

2018-11-18 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/23027#discussion_r234482766
  
--- Diff: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ---
@@ -2,669 +2,809 @@
 Pushdown for many distinct value case
 

 
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_191-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
 Select 0 string row (value IS NULL):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
 
-Parquet Vectorized                        11405 / 11485        1.4        725.1      1.0X
-Parquet Vectorized (Pushdown)                 675 / 690       23.3         42.9     16.9X
-Native ORC Vectorized                       7127 / 7170        2.2        453.1      1.6X
-Native ORC Vectorized (Pushdown)              519 / 541       30.3         33.0     22.0X
+Parquet Vectorized                          7823 / 7996        2.0        497.4      1.0X
+Parquet Vectorized (Pushdown)                 460 / 468       34.2         29.2     17.0X
+Native ORC Vectorized                       5412 / 5550        2.9        344.1      1.4X
+Native ORC Vectorized (Pushdown)              551 / 563       28.6         35.0     14.2X
+InMemoryTable Vectorized                          6 / 6     2859.1          0.3   1422.0X
+InMemoryTable Vectorized (Pushdown)               5 / 6     3023.0          0.3   1503.6X
 
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_191-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
 Select 0 string row ('7864320' < value < '7864320'):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
 
-Parquet Vectorized                        11457 / 11473        1.4        728.4      1.0X
-Parquet Vectorized (Pushdown)                 656 / 686       24.0         41.7     17.5X
-Native ORC Vectorized                       7328 / 7342        2.1        465.9      1.6X
-Native ORC Vectorized (Pushdown)              539 / 565       29.2         34.2     21.3X
+Parquet Vectorized                         8322 / 11160        1.9        529.1      1.0X
+Parquet Vectorized (Pushdown)                 463 / 472       34.0         29.4     18.0X
+Native ORC Vectorized                       5622 / 5635        2.8        357.4      1.5X
+Native ORC Vectorized (Pushdown)              563 / 595       27.9         35.8     14.8X
+InMemoryTable Vectorized                    4831 / 4881        3.3        307.2      1.7X
+InMemoryTable Vectorized (Pushdown)         1980 / 2027        7.9        125.9      4.2X
 
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_191-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
 Select 1 string row (value = '7864320'):  Best/Avg Time(ms)  Rate(M/s)  Per Row(ns)  Relative
 
-Parquet Vectorized                        11878 / 11888        1.3        755.2      1.0X
-Parquet Vectorized (Pushdown)                 630 / 654       25.0         40.1     18.9X
-Native ORC Vectorized                       7342 / 7362        2.1        466.8      1.6X
-Native ORC Vectorized (Pushdown)              519 / 537       30.3         33.0     22.9X
+Parquet Vectorized                          8322 / 8386        1.9        529.1      1.0X
+Parquet Vectorized (Pushdown)                 434 / 441       36.2         27.6     19.2X
+Native ORC Vectorized                       5659 / 5944        2.8        359.8      1.5X
+Native ORC Vectorized (Pushdown)              535 / 567       29.4         34.0     15.6X
+InMemoryTable Vectorized                    4784 / 4879        3.3        304.1      1.7X
+InMemoryTable Vectorized (Pushdown)         1950 / 1985        8.1        124.0      4.3X
 
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
+Java HotSpot(TM) 64-Bit Server VM 

[GitHub] spark pull request #23045: [SPARK-26071][SQL] disallow map as map key

2018-11-18 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/23045#discussion_r234481249
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -521,13 +521,18 @@ case class MapEntries(child: Expression) extends 
UnaryExpression with ExpectsInp
 case class MapConcat(children: Seq[Expression]) extends 
ComplexTypeMergingExpression {
 
   override def checkInputDataTypes(): TypeCheckResult = {
-var funcName = s"function $prettyName"
+val funcName = s"function $prettyName"
 if (children.exists(!_.dataType.isInstanceOf[MapType])) {
   TypeCheckResult.TypeCheckFailure(
 s"input to $funcName should all be of type map, but it's " +
   children.map(_.dataType.catalogString).mkString("[", ", ", "]"))
 } else {
-  TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
funcName)
+  val sameTypeCheck = 
TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), funcName)
+  if (sameTypeCheck.isFailure) {
+sameTypeCheck
+  } else {
+TypeUtils.checkForMapKeyType(dataType.keyType)
--- End diff --

I don't think we need this. The children already should not have map type 
keys?


---




[GitHub] spark issue #23081: [SPARK-26109][WebUI]Duration in the task summary metrics...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23081
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98984/
Test FAILed.


---




[GitHub] spark issue #23081: [SPARK-26109][WebUI]Duration in the task summary metrics...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23081
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #23081: [SPARK-26109][WebUI]Duration in the task summary metrics...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23081
  
**[Test build #98984 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98984/testReport)**
 for PR 23081 at commit 
[`131164c`](https://github.com/apache/spark/commit/131164c2104a119468e782fb1d484f2d15274e33).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21363: [SPARK-19228][SQL] Migrate on Java 8 time from FastDateF...

2018-11-18 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/21363
  
@MaxGekk Sorry for the delay; something else came up in my schedule. I plan 
to start on this PR this weekend. If that's too late, please just take it 
over. Sorry again for the delay.


---




[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-18 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/23043
  
Is it better to update this PR title now?


---




[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of non-struct ty...

2018-11-18 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23054#discussion_r234477289
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -459,7 +460,11 @@ class KeyValueGroupedDataset[K, V] private[sql](
   columns.map(_.withInputType(vExprEnc, dataAttributes).named)
 val keyColumn = if (!kExprEnc.isSerializedAsStruct) {
   assert(groupingAttributes.length == 1)
-  groupingAttributes.head
+  if (SQLConf.get.aliasNonStructGroupingKey) {
--- End diff --

Hmm, don't we want to have the "key" attribute, and only keep the old 
"value" attribute when the legacy config is turned on?


---




[GitHub] spark issue #23043: [SPARK-26021][SQL] replace minus zero with zero in Unsaf...

2018-11-18 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/23043
  
@srowen #21794 is what I thought.


---




[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23080
  
Ah, also, `CsvParser.beginParsing` takes an additional `Charset` argument, 
so it should be fairly easy to support encoding in `multiLine`. @MaxGekk, 
would you be able to find some time to work on it? If that change makes the 
current PR easier, we can merge that one first.


---




[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of non-struct ty...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23054#discussion_r234476607
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -459,7 +460,11 @@ class KeyValueGroupedDataset[K, V] private[sql](
   columns.map(_.withInputType(vExprEnc, dataAttributes).named)
 val keyColumn = if (!kExprEnc.isSerializedAsStruct) {
   assert(groupingAttributes.length == 1)
-  groupingAttributes.head
+  if (SQLConf.get.aliasNonStructGroupingKey) {
--- End diff --

we should do the alias when config is true...


---




[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23080#discussion_r234476318
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala 
---
@@ -192,6 +192,20 @@ class CSVOptions(
*/
   val emptyValueInWrite = emptyValue.getOrElse("\"\"")
 
+  /**
+   * A string between two consecutive JSON records.
+   */
+  val lineSeparator: Option[String] = parameters.get("lineSep").map { sep 
=>
+require(sep.nonEmpty, "'lineSep' cannot be an empty string.")
+require(sep.length <= 2, "'lineSep' can contain 1 or 2 characters.")
+sep
+  }
+
+  val lineSeparatorInRead: Option[Array[Byte]] = lineSeparator.map { 
lineSep =>
+lineSep.getBytes("UTF-8")
--- End diff --

@MaxGekk, CSV's multiline does not support encoding but I think normal mode 
supports `encoding`. It should be okay to get bytes from it. We can just throw 
an exception when multiline is enabled.


---




[GitHub] spark pull request #23043: [SPARK-26021][SQL] replace minus zero with zero i...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23043#discussion_r234476361
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -723,4 +723,32 @@ class DataFrameAggregateSuite extends QueryTest with 
SharedSQLContext {
   "grouping expressions: [current_date(None)], value: [key: int, 
value: string], " +
 "type: GroupBy]"))
   }
+
+  test("SPARK-26021: Double and Float 0.0/-0.0 should be equal when 
grouping") {
+val colName = "i"
+def groupByCollect(df: DataFrame): Array[Row] = {
+  df.groupBy(colName).count().collect()
+}
+def assertResult[T](result: Array[Row], zero: T)(implicit ordering: 
Ordering[T]): Unit = {
+  assert(result.length == 1)
+  // using compare since 0.0 == -0.0 is true
+  assert(ordering.compare(result(0).getAs[T](0), zero) == 0)
--- End diff --

Instead of checking the result, I prefer the code snippet in the JIRA 
ticket, which makes it more obvious where the problem is.

Let's run a group-by query with both 0.0 and -0.0 in the input. Then we 
check the number of result rows: ideally 0.0 and -0.0 are the same, so we should 
only have one group (one result row).
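
To make the "one group" expectation concrete outside Spark, here is a minimal plain-Java sketch. It is only an analogy: the class and method names are hypothetical, and a `HashMap` keyed by boxed `Double` stands in for a group-by that compares keys by their bits. It shows 0.0 and -0.0 landing in two groups unless the key is normalized first.

```java
import java.util.HashMap;
import java.util.Map;

public class ZeroGrouping {
    // Counts occurrences per key. Boxed Double keys are compared via their
    // bit patterns (Double.equals), so 0.0 and -0.0 are distinct keys.
    static Map<Double, Integer> countGroups(double[] values, boolean normalizeZero) {
        Map<Double, Integer> counts = new HashMap<>();
        for (double v : values) {
            // v == 0.0d is true for both 0.0 and -0.0, so both map to +0.0.
            double key = (normalizeZero && v == 0.0d) ? 0.0d : v;
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] input = {0.0d, -0.0d};
        System.out.println(countGroups(input, false).size()); // groups without normalization
        System.out.println(countGroups(input, true).size());  // groups with normalization
    }
}
```

A test in this spirit only needs to assert on the number of groups, which is what the suggestion above asks for.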


---




[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of non-struct type unde...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23054
  
**[Test build #98988 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98988/testReport)**
 for PR 23054 at commit 
[`b5cfda4`](https://github.com/apache/spark/commit/b5cfda40cf0939e03900e571b1642285fea9a528).


---




[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of non-struct type unde...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of non-struct type unde...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5129/
Test PASSed.


---




[GitHub] spark issue #21888: [SPARK-24253][SQL][WIP] Implement DeleteFrom for v2 tabl...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21888
  
**[Test build #98986 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98986/testReport)**
 for PR 21888 at commit 
[`f8b178d`](https://github.com/apache/spark/commit/f8b178d34b870e779ec061175f01ba63a5adc076).
 * This patch **fails to build**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
  * `case class UnresolvedRelation(table: CatalogTableIdentifier) extends 
LeafNode with NamedRelation `
  * `sealed trait IdentifierWithOptionalDatabaseAndCatalog `
  * `case class CatalogTableIdentifier(table: String, database: 
Option[String], catalog: Option[String])`
  * `class TableIdentifier(name: String, db: Option[String])`
  * `  implicit class CatalogHelper(catalog: CatalogProvider) `
  * `case class ResolveCatalogV2Relations(sparkSession: SparkSession) 
extends Rule[LogicalPlan] `
  * `case class DeleteFromV2Exec(rel: TableV2Relation, expr: Expression)`


---




[GitHub] spark issue #21888: [SPARK-24253][SQL][WIP] Implement DeleteFrom for v2 tabl...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21888
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98986/
Test FAILed.


---




[GitHub] spark issue #21888: [SPARK-24253][SQL][WIP] Implement DeleteFrom for v2 tabl...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21888
  
Build finished. Test FAILed.


---




[GitHub] spark pull request #23043: [SPARK-26021][SQL] replace minus zero with zero i...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23043#discussion_r234475978
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
@@ -157,4 +159,15 @@ public void heapMemoryReuse() {
 Assert.assertEquals(onheap4.size(), 1024 * 1024 + 7);
 Assert.assertEquals(obj3, onheap4.getBaseObject());
   }
+
+  @Test
+  // SPARK-26021
+  public void writeMinusZeroIsReplacedWithZero() {
+byte[] doubleBytes = new byte[Double.BYTES];
+byte[] floatBytes = new byte[Float.BYTES];
+Platform.putDouble(doubleBytes, Platform.BYTE_ARRAY_OFFSET, -0.0d);
+Platform.putFloat(floatBytes, Platform.BYTE_ARRAY_OFFSET, -0.0f);
+Assert.assertEquals(0, Double.compare(0.0d, 
ByteBuffer.wrap(doubleBytes).getDouble()));
--- End diff --

are you sure this test fails before the fix? IIUC `0.0 == -0.0` is true, 
but they have different binary representations.
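
For reference, a small standalone Java sketch of the three comparisons in play here (the class and method names are made up for illustration). It relies only on documented JDK behaviour: primitive `==` treats the two zeros as equal, while `Double.compare` and the raw bit patterns distinguish them.

```java
public class SignedZeroDemo {
    // Primitive comparison follows IEEE 754: 0.0 and -0.0 are equal.
    static boolean primitiveEquals(double a, double b) {
        return a == b;
    }

    // Double.compare imposes a total order in which 0.0 is greater than -0.0.
    static int totalOrderCompare(double a, double b) {
        return Double.compare(a, b);
    }

    // The raw bit patterns differ only in the sign bit.
    static boolean sameBits(double a, double b) {
        return Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(b);
    }

    public static void main(String[] args) {
        System.out.println(primitiveEquals(0.0d, -0.0d));   // true
        System.out.println(totalOrderCompare(0.0d, -0.0d)); // positive
        System.out.println(sameBits(0.0d, -0.0d));          // false
    }
}
```

Since the assertion in the diff goes through `Double.compare` rather than `==`, it would be expected to see a non-zero result when a `-0.0` is read back unchanged.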


---




[GitHub] spark pull request #23043: [SPARK-26021][SQL] replace minus zero with zero i...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23043#discussion_r234476055
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
@@ -157,4 +159,15 @@ public void heapMemoryReuse() {
 Assert.assertEquals(onheap4.size(), 1024 * 1024 + 7);
 Assert.assertEquals(obj3, onheap4.getBaseObject());
   }
+
+  @Test
+  // SPARK-26021
+  public void writeMinusZeroIsReplacedWithZero() {
+byte[] doubleBytes = new byte[Double.BYTES];
+byte[] floatBytes = new byte[Float.BYTES];
+Platform.putDouble(doubleBytes, Platform.BYTE_ARRAY_OFFSET, -0.0d);
+Platform.putFloat(floatBytes, Platform.BYTE_ARRAY_OFFSET, -0.0f);
+Assert.assertEquals(0, Double.compare(0.0d, 
ByteBuffer.wrap(doubleBytes).getDouble()));
--- End diff --

BTW thanks for adding the unit test! It's a good complement to the 
end-to-end test.


---




[GitHub] spark pull request #23043: [SPARK-26021][SQL] replace minus zero with zero i...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23043#discussion_r234475858
  
--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java 
---
@@ -120,6 +120,9 @@ public static float getFloat(Object object, long 
offset) {
   }
 
   public static void putFloat(Object object, long offset, float value) {
+if(value == -0.0f) {
--- End diff --

I'm fine with putting this trick here. Shall we move the IsNaN logic here 
as well?
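
As a rough illustration of what doing both normalizations in one place could look like, here is a sketch that writes through a `ByteBuffer` instead of `Platform`. The class and helper names are hypothetical, and it assumes the "IsNaN logic" means collapsing every NaN bit pattern to the canonical `Float.NaN`.

```java
import java.nio.ByteBuffer;

public class NormalizingWriter {
    // Maps -0.0f to 0.0f and any NaN to the canonical NaN before writing.
    static float normalize(float value) {
        if (value == -0.0f) {
            return 0.0f;       // the comparison is true for both 0.0f and -0.0f
        }
        if (Float.isNaN(value)) {
            return Float.NaN;  // canonical NaN bit pattern
        }
        return value;
    }

    // Writes the normalized value into a fresh 4-byte array.
    static byte[] putFloat(float value) {
        return ByteBuffer.allocate(Float.BYTES).putFloat(normalize(value)).array();
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.equals(putFloat(0.0f), putFloat(-0.0f)));
    }
}
```

With this shape, every value written has a single binary representation per logical value, which is what byte-wise comparison of the stored data needs.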


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23025
  
**[Test build #98987 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98987/testReport)**
 for PR 23025 at commit 
[`654fed9`](https://github.com/apache/spark/commit/654fed90997140715d2d52578ca6e4f0661d4e69).


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23025
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5128/
Test PASSed.


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23025
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23080#discussion_r234475595
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala 
---
@@ -192,6 +192,20 @@ class CSVOptions(
*/
   val emptyValueInWrite = emptyValue.getOrElse("\"\"")
 
+  /**
+   * A string between two consecutive JSON records.
+   */
+  val lineSeparator: Option[String] = parameters.get("lineSep").map { sep 
=>
+require(sep.nonEmpty, "'lineSep' cannot be an empty string.")
+require(sep.length <= 2, "'lineSep' can contain 1 or 2 characters.")
--- End diff --

We could say the line separator should be 1 or 2 bytes (UTF-8) in read path 
specifically.


---




[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r234475550
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2789,6 +2789,12 @@ class Dataset[T] private[sql](
* When no explicit sort order is specified, "ascending nulls first" is 
assumed.
* Note, the rows are not sorted in each partition of the resulting 
Dataset.
*
+   *
+   * Note that due to performance reasons this method uses sampling to 
estimate the ranges.
+   * Hence, the output may not be consistent, since sampling can return 
different values.
+   * The sample size can be controlled by setting the value of the 
parameter
+   * `spark.sql.execution.rangeExchange.sampleSizePerPartition`.
--- End diff --

It's not a parameter but a config. So I'd like to propose
```
The sample size can be controlled by the config `xxx`
```
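
To show why sampling makes the ranges vary, here is a toy Java sketch. It is not Spark's implementation: the class, the method, and the sampling-with-replacement scheme are assumptions chosen for brevity. It estimates range boundaries from a random sample, so different seeds can produce different boundaries for the same data.

```java
import java.util.Arrays;
import java.util.Random;

public class SampledRangeBounds {
    // Estimates numPartitions - 1 split points from a random sample of data.
    static int[] estimateBounds(int[] data, int sampleSize, int numPartitions, long seed) {
        Random rnd = new Random(seed);
        int[] sample = new int[sampleSize];
        for (int i = 0; i < sampleSize; i++) {
            sample[i] = data[rnd.nextInt(data.length)]; // sample with replacement
        }
        Arrays.sort(sample);
        int[] bounds = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            // Take evenly spaced quantiles of the sorted sample as boundaries.
            bounds[i - 1] = sample[i * sampleSize / numPartitions];
        }
        return bounds;
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Two runs over the same data with different seeds.
        System.out.println(Arrays.toString(estimateBounds(data, 100, 4, 1L)));
        System.out.println(Arrays.toString(estimateBounds(data, 100, 4, 2L)));
    }
}
```

A larger sample generally brings the estimated boundaries closer to the true quantiles, which is the trade-off the sample-size config exposes.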


---




[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of primitive typ...

2018-11-18 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23054#discussion_r234475488
  
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -17,6 +17,9 @@ displayTitle: Spark SQL Upgrading Guide
 
   - The `ADD JAR` command previously returned a result set with the single 
value 0. It now returns an empty result set.
 
+  - In Spark version 2.4 and earlier, `Dataset.groupByKey` results to a 
grouped dataset with key attribute wrongly named as "value", if the key is 
atomic type, e.g. int, string, etc. This is counterintuitive and makes the 
schema of aggregation queries weird. For example, the schema of 
`ds.groupByKey(...).count()` is `(value, count)`. Since Spark 3.0, we name the 
grouping attribute to "key". The old behaviour is preserved under a newly added 
configuration `spark.sql.legacy.atomicKeyAttributeGroupByKey` with a default 
value of `false`.
--- End diff --

Ok. More accurate.


---




[GitHub] spark issue #23025: [SPARK-26024][SQL]: Update documentation for repartition...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23025
  
ok to test


---




[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of primitive typ...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23054#discussion_r234475321
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1594,6 +1594,15 @@ object SQLConf {
 "WHERE, which does not follow SQL standard.")
   .booleanConf
   .createWithDefault(false)
+
+  val LEGACY_ATOMIC_KEY_ATTRIBUTE_GROUP_BY_KEY =
+buildConf("spark.sql.legacy.atomicKeyAttributeGroupByKey")
--- End diff --

`spark.sql.legacy.dataset.aliasNonStructGroupingKey`?


---




[GitHub] spark pull request #23080: [SPARK-26108][SQL] Support custom lineSep in CSV ...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23080#discussion_r234475228
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala 
---
@@ -192,6 +192,20 @@ class CSVOptions(
*/
   val emptyValueInWrite = emptyValue.getOrElse("\"\"")
 
+  /**
+   * A string between two consecutive JSON records.
+   */
+  val lineSeparator: Option[String] = parameters.get("lineSep").map { sep 
=>
+require(sep.nonEmpty, "'lineSep' cannot be an empty string.")
+require(sep.length <= 2, "'lineSep' can contain 1 or 2 characters.")
--- End diff --

@MaxGekk, might not be a super big deal but I believe this should be 
counted after converting it into `UTF-8`.
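
A short Java sketch of the difference between counting characters and counting UTF-8 bytes. The class and method names are hypothetical, and the error messages merely mirror the ones in the diff.

```java
import java.nio.charset.StandardCharsets;

public class LineSepCheck {
    // Validates the separator by its UTF-8 byte length rather than its
    // character count, then returns the encoded bytes.
    static byte[] validatedLineSepBytes(String sep) {
        if (sep.isEmpty()) {
            throw new IllegalArgumentException("'lineSep' cannot be an empty string.");
        }
        byte[] bytes = sep.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > 2) {
            throw new IllegalArgumentException("'lineSep' can contain 1 or 2 bytes in UTF-8.");
        }
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println(validatedLineSepBytes("\r\n").length);   // two ASCII chars, two bytes
        System.out.println("\u20ac".length());                      // one char ...
        System.out.println("\u20ac".getBytes(StandardCharsets.UTF_8).length); // ... but three bytes
    }
}
```

A separator such as `"\u20ac"` passes a `sep.length <= 2` check yet encodes to three bytes, which is the case the comment above is concerned with.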


---




[GitHub] spark pull request #23054: [SPARK-26085][SQL] Key attribute of primitive typ...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23054#discussion_r234475156
  
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -17,6 +17,9 @@ displayTitle: Spark SQL Upgrading Guide
 
   - The `ADD JAR` command previously returned a result set with the single 
value 0. It now returns an empty result set.
 
+  - In Spark version 2.4 and earlier, `Dataset.groupByKey` results to a 
grouped dataset with key attribute wrongly named as "value", if the key is 
atomic type, e.g. int, string, etc. This is counterintuitive and makes the 
schema of aggregation queries weird. For example, the schema of 
`ds.groupByKey(...).count()` is `(value, count)`. Since Spark 3.0, we name the 
grouping attribute to "key". The old behaviour is preserved under a newly added 
configuration `spark.sql.legacy.atomicKeyAttributeGroupByKey` with a default 
value of `false`.
--- End diff --

I realized that, only struct type key has the `key` alias. So here we 
should say: `if the key is non-struct type, e.g. int, string, array, etc.`


---




[GitHub] spark issue #21888: [SPARK-24253][SQL][WIP] Implement DeleteFrom for v2 tabl...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21888
  
**[Test build #98986 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98986/testReport)**
 for PR 21888 at commit 
[`f8b178d`](https://github.com/apache/spark/commit/f8b178d34b870e779ec061175f01ba63a5adc076).


---




[GitHub] spark pull request #23079: [SPARK-26107][SQL] Extend ReplaceNullWithFalseInP...

2018-11-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23079#discussion_r234474562
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -767,6 +767,15 @@ object ReplaceNullWithFalse extends Rule[LogicalPlan] {
   replaceNullWithFalse(cond) -> value
 }
 cw.copy(branches = newBranches)
+  case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
--- End diff --

shall we add a `withNewFunctions` method in `HigherOrderFunction`? Then we 
can simplify this rule to
```
case f: HigherOrderFunction => 
f.withNewFunctions(f.functions.map(replaceNullWithFalse))
```


---




[GitHub] spark pull request #23077: [SPARK-26105][PYTHON] Clean unittest2 imports up ...

2018-11-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23077


---




[GitHub] spark issue #23077: [SPARK-26105][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23077
  
Merged to master.

Thanks for reviewing this, @BryanCutler and @srowen.


---




[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23077
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98985/
Test PASSed.


---




[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23077
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23077
  
**[Test build #98985 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98985/testReport)**
 for PR 23077 at commit 
[`a188076`](https://github.com/apache/spark/commit/a1880767041b325e4343bd6a1737cdccfe614792).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-18 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23054
  
For non-primitive types there is a struct named "key".




---




[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23077
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23077
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5127/
Test PASSed.


---




[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23077
  
**[Test build #98985 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98985/testReport)**
 for PR 23077 at commit 
[`a188076`](https://github.com/apache/spark/commit/a1880767041b325e4343bd6a1737cdccfe614792).


---




[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/23077
  
Oh, I think the PR title should be SPARK-26105 too


---




[GitHub] spark issue #23081: [SPARK-26109][WebUI]Duration in the task summary metrics...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23081
  
**[Test build #98984 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98984/testReport)**
 for PR 23081 at commit 
[`131164c`](https://github.com/apache/spark/commit/131164c2104a119468e782fb1d484f2d15274e33).


---




[GitHub] spark issue #23081: [SPARK-26109][WebUI]Duration in the task summary metrics...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23081
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/23077
  
>BTW, Bryan, do you have some time to work on the has_numpy stuff 

Yup, I can do that


---




[GitHub] spark issue #23077: [SPARK-25344][PYTHON] Clean unittest2 imports up that we...

2018-11-18 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/23077
  
Oops, actually I think there is one more here 
https://github.com/apache/spark/blob/master/python/pyspark/testing/mllibutils.py#L20

Other than that, looks good


---




[GitHub] spark pull request #23081: [SPARK-26109][WebUI]Duration in the task summary ...

2018-11-18 Thread shahidki31
GitHub user shahidki31 opened a pull request:

https://github.com/apache/spark/pull/23081

[SPARK-26109][WebUI]Duration in the task summary metrics table and the task 
table are different

## What changes were proposed in this pull request?
Task summary displays the summary of the task table in the stage page. 
However, the duration metrics of the task summary and the task table do not match. 
The reason is that the task summary displays executorRunTime as the 
duration, while the task table displays the actual duration.
Apart from the duration metrics, all other metrics are displayed correctly in the 
task summary.

In Spark 2.2, we used to show executorRunTime as the duration in the task table, 
which is why the summary metrics also show executorRunTime as the duration. 
In Spark 2.3, the task table changed to show the actual duration of the task, so the 
summary metrics should change accordingly.

## How was this patch tested?
Before patch:

![screenshot from 2018-11-19 
04-32-06](https://user-images.githubusercontent.com/23054875/48679263-1e4fff80-ebb4-11e8-9ed5-16d892039e01.png)

After patch:
![screenshot from 2018-11-19 
04-37-39](https://user-images.githubusercontent.com/23054875/48679343-e39a9700-ebb4-11e8-8df9-9dc3a28d4bce.png)




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shahidki31/spark duratinSummary

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23081.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23081


commit 131164c2104a119468e782fb1d484f2d15274e33
Author: Shahid 
Date:   2018-11-18T22:38:21Z

taskMetrics duration




---




[GitHub] spark pull request #23079: [SPARK-26107][SQL] Extend ReplaceNullWithFalseInP...

2018-11-18 Thread aokolnychyi
Github user aokolnychyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/23079#discussion_r234467085
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicateSuite.scala
 ---
@@ -298,6 +299,45 @@ class ReplaceNullWithFalseSuite extends PlanTest {
 testProjection(originalExpr = column, expectedExpr = column)
   }
 
+  test("replace nulls in lambda function of ArrayFilter") {
+val cond = GreaterThan(UnresolvedAttribute("e"), Literal(0))
--- End diff --

Test cases for `ArrayFilter` and `ArrayExists` seem to be identical. As we 
have those tests anyway, would it make sense to cover different lambda 
functions?


---




[GitHub] spark issue #23079: [SPARK-26107][SQL] Extend ReplaceNullWithFalseInPredicat...

2018-11-18 Thread aokolnychyi
Github user aokolnychyi commented on the issue:

https://github.com/apache/spark/pull/23079
  
@rednaxelafx I am glad the rule gets more adoption. Renaming also makes 
sense to me.

Shall we extend `ReplaceNullWithFalseEndToEndSuite` as well?


---




[GitHub] spark issue #23065: [SPARK-26090][CORE][SQL][ML] Resolve most miscellaneous ...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23065
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98983/
Test PASSed.


---




[GitHub] spark issue #23065: [SPARK-26090][CORE][SQL][ML] Resolve most miscellaneous ...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23065
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23065: [SPARK-26090][CORE][SQL][ML] Resolve most miscellaneous ...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23065
  
**[Test build #98983 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98983/testReport)**
 for PR 23065 at commit 
[`0cfcd90`](https://github.com/apache/spark/commit/0cfcd9056f4d93dfdeb447110e5e26030ad4ad3a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-18 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/23054
  
BTW what do the non-primitive types look like? Do they get flattened, or 
is there a struct?





[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98981/
Test FAILed.





[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23054
  
Merged build finished. Test FAILed.





[GitHub] spark issue #23054: [SPARK-26085][SQL] Key attribute of primitive type under...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23054
  
**[Test build #98981 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98981/testReport)**
 for PR 23054 at commit 
[`6e3c37a`](https://github.com/apache/spark/commit/6e3c37ae454b83075707040d85813587cc92cccb).
 * This patch **fails from timeout after a configured wait of `400m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23080
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98982/
Test PASSed.





[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23080
  
Merged build finished. Test PASSed.





[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23080
  
**[Test build #98982 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98982/testReport)**
 for PR 23080 at commit 
[`12022ad`](https://github.com/apache/spark/commit/12022ad1a0194a4bab9007d66145071562e066a4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #23075: [SPARK-26084][SQL] Fixes unresolved AggregateExpression....

2018-11-18 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/23075
  
ok to test





[GitHub] spark issue #23075: [SPARK-26084][SQL] Fixes unresolved AggregateExpression....

2018-11-18 Thread ssimeonov
Github user ssimeonov commented on the issue:

https://github.com/apache/spark/pull/23075
  
@MaxGekk done





[GitHub] spark pull request #23073: [SPARK-26104] [Hydrogen] expose pci info to task ...

2018-11-18 Thread chenqin
Github user chenqin commented on a diff in the pull request:

https://github.com/apache/spark/pull/23073#discussion_r234453776
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -27,12 +27,14 @@ import org.apache.spark.rpc.{RpcAddress, RpcEndpointRef}
  * @param executorHost The hostname that this executor is running on
  * @param freeCores  The current number of cores available for work on the 
executor
  * @param totalCores The total number of cores available to the executor
+ * @param pcis The external devices avaliable to the executor
--- End diff --

fixed





[GitHub] spark issue #23065: [SPARK-26090][CORE][SQL][ML] Resolve most miscellaneous ...

2018-11-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23065
  
**[Test build #98983 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98983/testReport)**
 for PR 23065 at commit 
[`0cfcd90`](https://github.com/apache/spark/commit/0cfcd9056f4d93dfdeb447110e5e26030ad4ad3a).





[GitHub] spark issue #23065: [SPARK-26090][CORE][SQL][ML] Resolve most miscellaneous ...

2018-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23065
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5126/
Test PASSed.




