[GitHub] spark issue #21760: [SPARK-24776][SQL]Avro unit test: use SQLTestUtils and r...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21760
  
**[Test build #92971 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92971/testReport)**
 for PR 21760 at commit 
[`26b88ca`](https://github.com/apache/spark/commit/26b88ca201a70283528f289cdd2e1e216fce6e7a).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21760: [SPARK-24776][SQL]Avro unit test: use SQLTestUtils and r...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21760
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/924/
Test PASSed.


---




[GitHub] spark issue #21760: [SPARK-24776][SQL]Avro unit test: use SQLTestUtils and r...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21760
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21760: [SPARK-24776][SQL]Avro unit test: use SQLTestUtil...

2018-07-13 Thread gengliangwang
GitHub user gengliangwang opened a pull request:

https://github.com/apache/spark/pull/21760

[SPARK-24776][SQL]Avro unit test: use SQLTestUtils and replace deprecated 
methods

## What changes were proposed in this pull request?
Improve Avro unit test:
1. use QueryTest/SharedSQLContext/SQLTestUtils, instead of the duplicated 
test utils.
2. replace deprecated methods

## How was this patch tested?

Unit test


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gengliangwang/spark improve_avro_test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21760.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21760


commit 26b88ca201a70283528f289cdd2e1e216fce6e7a
Author: Gengliang Wang 
Date:   2018-07-13T11:41:56Z

improve AvroSuite




---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21745
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92964/
Test PASSed.


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21745
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21745
  
**[Test build #92964 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92964/testReport)**
 for PR 21745 at commit 
[`9e00db9`](https://github.com/apache/spark/commit/9e00db938ddc6293899170e19b41530b22fb525a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21759: sfas

2018-07-13 Thread marymwu
Github user marymwu closed the pull request at:

https://github.com/apache/spark/pull/21759


---




[GitHub] spark pull request #21759: sfas

2018-07-13 Thread marymwu
GitHub user marymwu opened a pull request:

https://github.com/apache/spark/pull/21759

sfas

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/marymwu/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21759.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21759


commit dcf36ad54598118408c1425e81aa6552f42328c8
Author: Dongjoon Hyun 
Date:   2016-05-03T13:02:04Z

[SPARK-15057][GRAPHX] Remove stale TODO comment for making `enum` in 
GraphGenerators

This PR removes a stale TODO comment in `GraphGenerators.scala`

Just comment removed.

Author: Dongjoon Hyun 

Closes #12839 from dongjoon-hyun/SPARK-15057.

(cherry picked from commit 46965cd014fd4ba68bdec15156ec9bcc27d9b217)
Signed-off-by: Reynold Xin 

commit 1dc30f189ac30f070068ca5f60b7b4c85f2adc9e
Author: Bryan Cutler 
Date:   2016-05-19T02:48:36Z

[DOC][MINOR] ml.feature Scala and Python API sync

I reviewed Scala and Python APIs for ml.feature and corrected discrepancies.

Built docs locally, ran style checks

Author: Bryan Cutler 

Closes #13159 from BryanCutler/ml.feature-api-sync.

(cherry picked from commit b1bc5ebdd52ed12aea3fdc7b8f2fa2d00ea09c6b)
Signed-off-by: Reynold Xin 

commit 642f00980f1de13a0f6d1dc8bc7ed5b0547f3a9d
Author: Zheng RuiFeng 
Date:   2016-05-15T14:59:49Z

[MINOR] Fix Typos

1,Rename matrix args in BreezeUtil to upper to match the doc
2,Fix several typos in ML and SQL

manual tests

Author: Zheng RuiFeng 

Closes #13078 from zhengruifeng/fix_ann.

(cherry picked from commit c7efc56c7b6fc99c005b35c335716ff676856c6c)
Signed-off-by: Reynold Xin 

commit 2126fb0c2b2bb8ac4c5338df15182fcf8713fb2f
Author: Sandeep Singh 
Date:   2016-05-19T09:44:26Z

[CORE][MINOR] Remove redundant set master in 
OutputCommitCoordinatorIntegrationSuite

Remove redundant set master in OutputCommitCoordinatorIntegrationSuite, as 
we are already setting it in SparkContext below on line 43.

existing tests

Author: Sandeep Singh 

Closes #13168 from techaddict/minor-1.

(cherry picked from commit 3facca5152e685d9c7da96bff5102169740a4a06)
Signed-off-by: Reynold Xin 

commit 1fc0f95eb8abbb9cc8ede2139670e493e6939317
Author: Andrew Or 
Date:   2016-05-20T05:40:03Z

[HOTFIX] Test compilation error from 52b967f

commit dd0c7fb39cac44e8f0d73f9884fd1582c25e9cf4
Author: Reynold Xin 
Date:   2016-05-20T05:46:08Z

Revert "[HOTFIX] Test compilation error from 52b967f"

This reverts commit 1fc0f95eb8abbb9cc8ede2139670e493e6939317.

commit f8d0177c31d43eab59a7535945f3dfa24e906273
Author: Davies Liu 
Date:   2016-05-18T23:02:52Z

Revert "[SPARK-15392][SQL] fix default value of size estimation of logical 
plan"

This reverts commit fc29b896dae08b957ed15fa681b46162600a4050.

(cherry picked from commit 84b23453ddb0a97e3d81306de0a5dcb64f88bdd0)
Signed-off-by: Reynold Xin 

commit 2ef645724a7f229309a87c5053b0fbdf45d06f52
Author: Takuya UESHIN 
Date:   2016-05-20T05:55:44Z

[SPARK-15313][SQL] EmbedSerializerInFilter rule should keep exprIds of 
output of surrounded SerializeFromObject.

## What changes were proposed in this pull request?

The following code:

```
val ds = Seq(("a", 1), ("b", 2), ("c", 3)).toDS()
ds.filter(_._1 == "b").select(expr("_1").as[String]).foreach(println(_))
```

throws an Exception:

```
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute, tree: _1#420
 at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:50)
 at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:88)
 at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:87)

...
 Cause: java.lang.RuntimeException: Couldn't find _1#420 in [_1#416,_2#417]
 at scala.sys.package$.error(package.scala:27)
 at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1$$anonfun$applyOrElse$1.apply(BoundAttribute.scala:94)
 at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1$$

[GitHub] spark pull request #21603: [SPARK-17091][SQL] Add rule to convert IN predica...

2018-07-13 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/21603#discussion_r202302865
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala ---
@@ -222,6 +225,14 @@ private[parquet] class ParquetFilters(pushDownDate: Boolean, pushDownStartWith:
 // See SPARK-20364.
 def canMakeFilterOn(name: String): Boolean = nameToType.contains(name) && !name.contains(".")
 
+// All DataTypes that support `makeEq` can provide better performance.
+def shouldConvertInPredicate(name: String): Boolean = nameToType(name) match {
--- End diff --

@HyukjinKwon How about removing this?
`Timestamp` and `Decimal` types will be supported soon.


---




[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...

2018-07-13 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/spark/pull/20611
  
@srowen Thanks for the review. All comments have been addressed on my 
side. Let me know if anything needs clarification.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/923/
Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21102
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21102
  
**[Test build #92970 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92970/testReport)**
 for PR 21102 at commit 
[`fce9eb0`](https://github.com/apache/spark/commit/fce9eb09bf0666711dbb5584c56b2534e495dffc).


---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21505
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21505
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92969/
Test FAILed.


---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21505
  
**[Test build #92969 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92969/testReport)**
 for PR 21505 at commit 
[`c940381`](https://github.com/apache/spark/commit/c940381a0be36fd227e8f63caf32d3be86c5aa69).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21505
  
**[Test build #92969 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92969/testReport)**
 for PR 21505 at commit 
[`c940381`](https://github.com/apache/spark/commit/c940381a0be36fd227e8f63caf32d3be86c5aa69).


---




[GitHub] spark issue #21505: [SPARK-24457][SQL] Improving performance of stringToTime...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21505
  
ok to test


---




[GitHub] spark issue #19789: [SPARK-22562][Streaming] CachedKafkaConsumer unsafe evic...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19789
  
@daroo, mind reopening this if you have some time to update?


---




[GitHub] spark issue #19789: [SPARK-22562][Streaming] CachedKafkaConsumer unsafe evic...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19789
  
ok to test


---




[GitHub] spark issue #18113: [SPARK-20890][SQL] Added min and max typed aggregation f...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18113
  
@setjet, mind updating this please?


---




[GitHub] spark pull request #21741: [SPARK-24718][SQL] Timestamp support pushdown to ...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21741#discussion_r202287558
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -378,6 +378,15 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val PARQUET_FILTER_PUSHDOWN_TIMESTAMP_ENABLED =
+buildConf("spark.sql.parquet.filterPushdown.timestamp")
+  .doc("If true, enables Parquet filter push-down optimization for Timestamp. " +
+"This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is " +
+"enabled and Timestamp stored as TIMESTAMP_MICROS or TIMESTAMP_MILLIS type.")
--- End diff --

... I don't think users will understand any of them .. 


---




[GitHub] spark pull request #21603: [SPARK-17091][SQL] Add rule to convert IN predica...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21603#discussion_r202286983
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -386,6 +386,17 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val PARQUET_FILTER_PUSHDOWN_INFILTERTHRESHOLD =
+buildConf("spark.sql.parquet.pushdown.inFilterThreshold")
+  .doc("The maximum number of values to filter push-down optimization for IN predicate. " +
+"Large threshold won't necessarily provide much better performance. " +
+"The experiment argued that 300 is the limit threshold. " +
+"This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is enabled.")
+  .internal()
+  .intConf
+  .checkValue(threshold => threshold > 0, "The threshold must be greater than 0.")
--- End diff --

Yup.


---




[GitHub] spark pull request #21603: [SPARK-17091][SQL] Add rule to convert IN predica...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21603#discussion_r202286636
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -386,6 +386,17 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val PARQUET_FILTER_PUSHDOWN_INFILTERTHRESHOLD =
+buildConf("spark.sql.parquet.pushdown.inFilterThreshold")
+  .doc("The maximum number of values to filter push-down optimization for IN predicate. " +
+"Large threshold won't necessarily provide much better performance. " +
+"The experiment argued that 300 is the limit threshold. " +
+"This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is enabled.")
+  .internal()
+  .intConf
+  .checkValue(threshold => threshold > 0, "The threshold must be greater than 0.")
--- End diff --

Let's use `-1`. Seems that's more consistent in the configurations.


---




[GitHub] spark issue #20915: [SPARK-23803][SQL] Support bucket pruning

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20915
  
@cloud-fan, how does this relate to SPARK-23803, SPARK-12850 and SPARK-23507? 
I was about to take action on the JIRAs but wanted to make sure first.

SPARK-12850 was merged in 2.0.0 but reverted by SPARK-14535 in 2.0.0, so 
that one is no problem, but SPARK-23803 duplicates SPARK-12850.
Also, do you plan to migrate file-based sources to datasource v2, which 
would include refactoring this feature?


---




[GitHub] spark pull request #21603: [SPARK-17091][SQL] Add rule to convert IN predica...

2018-07-13 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/21603#discussion_r202283085
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -386,6 +386,17 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val PARQUET_FILTER_PUSHDOWN_INFILTERTHRESHOLD =
+buildConf("spark.sql.parquet.pushdown.inFilterThreshold")
+  .doc("The maximum number of values to filter push-down optimization for IN predicate. " +
+"Large threshold won't necessarily provide much better performance. " +
+"The experiment argued that 300 is the limit threshold. " +
+"This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is enabled.")
+  .internal()
+  .intConf
+  .checkValue(threshold => threshold > 0, "The threshold must be greater than 0.")
--- End diff --

```scala
case sources.In(name, values) if canMakeFilterOn(name) && 
shouldConvertInPredicate(name) 
  && values.distinct.length <= pushDownInFilterThreshold =>
```
How about `0`? `values.distinct.length` will never be less than `0`.
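
The effect of the guard for different threshold values can be sketched in isolation. This is a minimal illustration, not the code under review: `shouldPushDownIn` and the enclosing object are hypothetical names, and only the comparison `values.distinct.length <= threshold` is taken from the snippet above.

```scala
// Hypothetical stand-in for the IN-predicate guard; only the comparison
// `values.distinct.length <= threshold` comes from the snippet under review.
object InFilterThresholdSketch {
  /** True when an IN predicate over `values` passes the threshold guard. */
  def shouldPushDownIn(values: Seq[Any], threshold: Int): Boolean =
    values.distinct.length <= threshold

  def main(args: Array[String]): Unit = {
    // Duplicates are collapsed first, so four values with one repeat count as three.
    println(shouldPushDownIn(Seq(1, 2, 2, 3), threshold = 3))
    // Four distinct values exceed a threshold of three.
    println(shouldPushDownIn(Seq(1, 2, 3, 4), threshold = 3))
    // With a threshold of 0, any non-empty value list fails the guard.
    println(shouldPushDownIn(Seq(1), threshold = 0))
  }
}
```

Under this assumption a threshold of `0` rejects every non-empty IN list, so it would act as an "off" value, provided the `checkValue` condition is relaxed to accept `0`.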


---




[GitHub] spark issue #21386: [SPARK-23928][SQL][WIP] Add shuffle collection function.

2018-07-13 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/21386
  
@pkuwm Hi, any updates on this? If you have any questions, please let us 
know. Thanks!


---




[GitHub] spark issue #21704: [SPARK-24734][SQL] Fix type coercions and nullabilities ...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21704
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/922/
Test PASSed.


---




[GitHub] spark issue #21704: [SPARK-24734][SQL] Fix type coercions and nullabilities ...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21704
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21704: [SPARK-24734][SQL] Fix type coercions and nullabilities ...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21704
  
**[Test build #92967 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92967/testReport)**
 for PR 21704 at commit 
[`5115961`](https://github.com/apache/spark/commit/5115961fb0503cabbdbdead7c29c1521ab4f76cb).


---




[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20611
  
**[Test build #92968 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92968/testReport)**
 for PR 20611 at commit 
[`bee161f`](https://github.com/apache/spark/commit/bee161f07ae4f76a0f090f64ac84c39f752652ce).


---




[GitHub] spark pull request #21704: [SPARK-24734][SQL] Fix type coercions and nullabi...

2018-07-13 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21704#discussion_r202278265
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -259,8 +270,22 @@ object TypeCoercion {
 }
   }
 
-  private def haveSameType(exprs: Seq[Expression]): Boolean =
-exprs.map(_.dataType).distinct.length == 1
+  private def haveSameType(exprs: Seq[Expression]): Boolean = {
--- End diff --

Since we have `CreateMap`, we can't make all such expressions 
`ComplexTypeMergingExpression`.
I'd apply approach 2).


---




[GitHub] spark pull request #21741: [SPARK-24718][SQL] Timestamp support pushdown to ...

2018-07-13 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/21741#discussion_r202277812
  
--- Diff: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ---
@@ -578,3 +578,127 @@ Native ORC Vectorized   11622 / 12196  1.4 7
 Native ORC Vectorized (Pushdown)   11377 / 11654   1.4   723.3   1.0X
 
 
+
+Pushdown benchmark for Timestamp
+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+Select 1 timestamp stored as INT96 row (value = CAST(7864320 AS timestamp)):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--- End diff --

OK. I'll send a follow-up PR.


---




[GitHub] spark pull request #21741: [SPARK-24718][SQL] Timestamp support pushdown to ...

2018-07-13 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/21741#discussion_r202277658
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala ---
@@ -517,7 +585,6 @@ class ParquetFilterSuite extends QueryTest with ParquetTest with SharedSQLContex
 }
   }
 
-
--- End diff --

OK


---




[GitHub] spark pull request #21741: [SPARK-24718][SQL] Timestamp support pushdown to ...

2018-07-13 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/21741#discussion_r202277483
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -378,6 +378,15 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val PARQUET_FILTER_PUSHDOWN_TIMESTAMP_ENABLED =
+buildConf("spark.sql.parquet.filterPushdown.timestamp")
+  .doc("If true, enables Parquet filter push-down optimization for Timestamp. " +
+"This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is " +
+"enabled and Timestamp stored as TIMESTAMP_MICROS or TIMESTAMP_MILLIS type.")
--- End diff --

I think end users have a better understanding of `TIMESTAMP_MICROS` and 
`TIMESTAMP_MILLIS`.


---




[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-13 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/21698
  
@jiangxb1987 The data loss happens because a re-execution of zip might generate a 
key whose corresponding reducer has already finished.
Re-executing the stage therefore does not cause that reducer partition in the 
child stage to be re-executed, which results in data loss.
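
A toy model of this failure mode, as a minimal sketch rather than Spark's actual shuffle code: `assign` is a hypothetical helper that derives each record's reducer from its position in the input, standing in for any key computation that depends on input order.

```scala
// Toy model only: `assign` is a hypothetical helper, not a Spark API.
// It mimics an order-sensitive key by using each record's input position.
object RoundRobinRetrySketch {
  /** Assigns each record to a reducer by its position in the input, round-robin. */
  def assign(records: Seq[String], numReducers: Int): Map[String, Int] =
    records.zipWithIndex.map { case (r, i) => r -> (i % numReducers) }.toMap

  def main(args: Array[String]): Unit = {
    val firstAttempt = assign(Seq("a", "b", "c", "d"), numReducers = 2)
    // A retried task may see the same records in a different order.
    val retry = assign(Seq("b", "a", "c", "d"), numReducers = 2)
    // Records whose reducer changed between the two attempts.
    val moved = firstAttempt.keys.filter(k => firstAttempt(k) != retry(k)).toSeq.sorted
    println(s"records routed to a different reducer on retry: $moved")
  }
}
```

In this model, if reducer 0 had already finished when the retry ran, a record that moves to reducer 0 on the retry is never read, and a record that moves away from it can be read twice.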


---




[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-07-13 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/21698
  
@cloud-fan That depends on what the computeKey is doing - which is user 
defined. It can have different values, or it need not (again, depends on user 
data and closure being applied).




---




[GitHub] spark pull request #21537: [SPARK-24505][SQL] Convert strings in codegen to ...

2018-07-13 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/21537#discussion_r202271473
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -579,6 +579,18 @@ class CodegenContext {
 s"${fullName}_$id"
   }
 
+  /**
+   * Creates an `ExprValue` representing a local java variable of required data type.
+   */
+  def freshVariable(name: String, dt: DataType): VariableValue =
+JavaCode.variable(freshName(name), dt)
+
+  /**
+   * Creates an `ExprValue` representing a local java variable of required data type.
--- End diff --

nit: `data type` -> `Java class`


---




[GitHub] spark pull request #21537: [SPARK-24505][SQL] Convert strings in codegen to ...

2018-07-13 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/21537#discussion_r202269577
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -720,31 +719,36 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
   private def writeMapToStringBuilder(
   kt: DataType,
   vt: DataType,
-  map: String,
-  buffer: String,
-  ctx: CodegenContext): String = {
+  map: ExprValue,
+  buffer: ExprValue,
+  ctx: CodegenContext): Block = {
 
 def dataToStringFunc(func: String, dataType: DataType) = {
   val funcName = ctx.freshName(func)
   val dataToStringCode = castToStringCode(dataType, ctx)
+  val data = JavaCode.variable("data", dataType)
+  val dataStr = JavaCode.variable("dataStr", StringType)
   ctx.addNewFunction(funcName,
--- End diff --

Since this method `dataToStringFunc()` is not used in other files, it would 
be good to address it in this PR. WDYT?


---




[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92965/
Test FAILed.


---




[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21758
  
**[Test build #92965 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92965/testReport)**
 for PR 21758 at commit 
[`c8d67e4`](https://github.com/apache/spark/commit/c8d67e434426d6f7c0b6ff4a9899096e40355325).
 * This patch **fails to generate documentation**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20611
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20611
  
**[Test build #92966 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92966/testReport)**
 for PR 20611 at commit 
[`7900aaf`](https://github.com/apache/spark/commit/7900aaf7913a1b95527568ce54ff40f8a0c69148).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20611
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92966/
Test FAILed.


---




[GitHub] spark pull request #21741: [SPARK-24718][SQL] Timestamp support pushdown to ...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21741#discussion_r202261810
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala ---
@@ -517,7 +585,6 @@ class ParquetFilterSuite extends QueryTest with ParquetTest with SharedSQLContex
 }
   }
 
-
--- End diff --

nit: I would revert this change if you are going to push more changes.


---




[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21758
  
**[Test build #92965 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92965/testReport)**
 for PR 21758 at commit 
[`c8d67e4`](https://github.com/apache/spark/commit/c8d67e434426d6f7c0b6ff4a9899096e40355325).


---




[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20611
  
**[Test build #92966 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92966/testReport)**
 for PR 20611 at commit 
[`7900aaf`](https://github.com/apache/spark/commit/7900aaf7913a1b95527568ce54ff40f8a0c69148).


---




[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/921/
Test PASSed.


---




[GitHub] spark pull request #21741: [SPARK-24718][SQL] Timestamp support pushdown to ...

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21741#discussion_r202261386
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -378,6 +378,15 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val PARQUET_FILTER_PUSHDOWN_TIMESTAMP_ENABLED =
+buildConf("spark.sql.parquet.filterPushdown.timestamp")
+  .doc("If true, enables Parquet filter push-down optimization for Timestamp. " +
+"This configuration only has an effect when 'spark.sql.parquet.filterPushdown' is " +
+"enabled and Timestamp stored as TIMESTAMP_MICROS or TIMESTAMP_MILLIS type.")
--- End diff --

Shall we note `INT64` here?


---




[GitHub] spark issue #21565: [SPARK-24558][Core]wrong Idle Timeout value is used in c...

2018-07-13 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21565
  
thanks, merging to master!


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21745
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20100: [SPARK-22913][SQL] Improved Hive Partition Pruning

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20100
  
@ameent BTW, we can't close this directly. I'd appreciate it if you could 
close it manually.


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21745
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/920/
Test PASSed.


---




[GitHub] spark issue #20100: [SPARK-22913][SQL] Improved Hive Partition Pruning

2018-07-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20100
  
Sorry for the late response. I am now checking the PRs queued in my list.
I agree with @cloud-fan for now, and I think we had better leave this 
closed.


---




[GitHub] spark issue #21741: [SPARK-24718][SQL] Timestamp support pushdown to parquet...

2018-07-13 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21741
  
LGTM


---




[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20057
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92963/
Test FAILed.


---




[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20057
  
**[Test build #92963 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92963/testReport)**
 for PR 20057 at commit 
[`bc75051`](https://github.com/apache/spark/commit/bc75051f5f4a47ef045c93bd933b2f95635100ad).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20057
  
Merged build finished. Test FAILed.


---




[GitHub] spark pull request #21741: [SPARK-24718][SQL] Timestamp support pushdown to ...

2018-07-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21741#discussion_r202260518
  
--- Diff: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ---
@@ -578,3 +578,127 @@ Native ORC Vectorized   11622 / 12196  1.4 7
 Native ORC Vectorized (Pushdown)   11377 / 11654  1.4 723.3   1.0X
 
 

+
+Pushdown benchmark for Timestamp

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+Select 1 timestamp stored as INT96 row (value = CAST(7864320 AS timestamp)): Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
--- End diff --

Shall we add a new line after the benchmark name? E.g.
```
Select 1 timestamp stored as INT96 row (value = CAST(7864320 AS timestamp)):
Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
...
```

We can send a follow-up PR to fix this entire file.


---




[GitHub] spark pull request #18544: [SPARK-21318][SQL]Improve exception message throw...

2018-07-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18544#discussion_r202259987
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -129,14 +129,14 @@ private[sql] class HiveSessionCatalog(
 Try(super.lookupFunction(funcName, children)) match {
   case Success(expr) => expr
   case Failure(error) =>
-if (functionRegistry.functionExists(funcName)) {
-  // If the function actually exists in functionRegistry, it means that there is an
-  // error when we create the Expression using the given children.
+if (super.functionExists(name)) {
+  // If the function actually exists in functionRegistry or externalCatalog,
+  // it means that there is an error when we create the Expression using the given children.
   // We need to throw the original exception.
   throw error
 } else {
-  // This function is not in functionRegistry, let's try to load it as a Hive's
-  // built-in function.
+  // This function is not in functionRegistry or externalCatalog,
+  // let's try to load it as a Hive's built-in function.
   // Hive is case insensitive.
   val functionName = funcName.unquotedString.toLowerCase(Locale.ROOT)
   if (!hiveFunctions.contains(functionName)) {
--- End diff --

We do not need to change the other parts. We just need to throw the 
exception in `failFunctionLookup(funcName)`, right?


---




[GitHub] spark pull request #18544: [SPARK-21318][SQL]Improve exception message throw...

2018-07-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18544#discussion_r202257183
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1155,7 +1155,8 @@ class Analyzer(
 override def apply(plan: LogicalPlan): LogicalPlan = plan.transformAllExpressions {
   case f: UnresolvedFunction if !catalog.functionExists(f.name) =>
 withPosition(f) {
-  throw new NoSuchFunctionException(f.name.database.getOrElse("default"), f.name.funcName)
+  val db = f.name.database.getOrElse(catalog.getCurrentDatabase)
+  throw new NoSuchFunctionException(db, f.name.funcName)
--- End diff --

The issue has been resolved. Can you revert the changes?


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21745
  
**[Test build #92964 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92964/testReport)**
 for PR 21745 at commit 
[`9e00db9`](https://github.com/apache/spark/commit/9e00db938ddc6293899170e19b41530b22fb525a).


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21745
  
retest this please.


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21745
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92961/
Test FAILed.


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21745
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21745: [SPARK-24781][SQL] Using a reference from Dataset in Fil...

2018-07-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21745
  
**[Test build #92961 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92961/testReport)**
 for PR 21745 at commit 
[`9e00db9`](https://github.com/apache/spark/commit/9e00db938ddc6293899170e19b41530b22fb525a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---



