[GitHub] spark issue #16116: [SPARK-18685][TESTS] Fix URI and release resources after...

2016-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16116
  
Build started: [Tests] `org.apache.spark.repl.ExecutorClassLoaderSuite` 
[![PR-16116](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=D9720E89-7333-4FBE-8F62-F387A249B5D8=true)](https://ci.appveyor.com/project/spark-test/spark/branch/D9720E89-7333-4FBE-8F62-F387A249B5D8)
Diff: https://github.com/apache/spark/compare/master...spark-test:D9720E89-7333-4FBE-8F62-F387A249B5D8


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16020: [SPARK-18596][ML] add checking and caching to bis...

2016-12-01 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/16020#discussion_r90599078
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala ---
@@ -334,10 +334,10 @@ class KMeans @Since("1.5.0") (
 val summary = new KMeansSummary(
   model.transform(dataset), $(predictionCol), $(featuresCol), $(k))
 model.setSummary(Some(summary))
-instr.logSuccess(model)
 if (handlePersistence) {
   instances.unpersist()
 }
+instr.logSuccess(model)
--- End diff --

The `handlePersistence` check in `KMeans` at L309 should also be updated to 
use `dataset.storageLevel`. Since we're touching KMeans here anyway we may as 
well do it now.
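
A minimal sketch of the pattern under discussion: cache the input only when the caller has not already cached it, and undo that caching before reporting success. `FakeDataset`, `Level`, and `fit` are hypothetical stand-ins so the example runs without Spark; they are not Spark's `Dataset` or `StorageLevel` API, and this is not the code in the PR.

```scala
object HandlePersistenceSketch {
  // Hypothetical stand-ins for Spark's StorageLevel values (assumption).
  sealed trait Level
  case object NoLevel extends Level
  case object MemoryAndDisk extends Level

  // Hypothetical stand-in for a Dataset that tracks its own storage level.
  final class FakeDataset(var storageLevel: Level) {
    def persist(): Unit = { storageLevel = MemoryAndDisk }
    def unpersist(): Unit = { storageLevel = NoLevel }
  }

  // Persist only if the caller has not cached the data, and release it
  // again once training is done, even if training throws.
  def fit(dataset: FakeDataset)(train: FakeDataset => Int): Int = {
    val handlePersistence = dataset.storageLevel == NoLevel
    if (handlePersistence) dataset.persist()
    try {
      train(dataset)
    } finally {
      if (handlePersistence) dataset.unpersist()
    }
  }
}
```

With this shape, a dataset the caller cached beforehand keeps its storage level, while an uncached one is returned to its original state.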





[GitHub] spark issue #16116: [SPARK-18685][TESTS] Fix URI and release resources after...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16116
  
**[Test build #69552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69552/consoleFull)** for PR 16116 at commit [`8d40848`](https://github.com/apache/spark/commit/8d40848fd5559a46561fd9bd3aefac4151d6321d).





[GitHub] spark pull request #16116: [SPARK-18685][TESTS] Fix URI and release resource...

2016-12-01 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/16116

[SPARK-18685][TESTS] Fix URI and release resources after opening in tests at ExecutorClassLoaderSuite

## What changes were proposed in this pull request?

This PR fixes the two problems below:

- Close the `BufferedSource` after `Source.fromInputStream(...)` to release the resource and make the tests in `ExecutorClassLoaderSuite` pass on Windows.

  ```
  [info] Exception encountered when attempting to run a suite with class name: org.apache.spark.repl.ExecutorClassLoaderSuite *** ABORTED *** (7 seconds, 333 milliseconds)
  [info]   java.io.IOException: Failed to delete: C:\projects\spark\target\tmp\spark-77b2f37b-6405-47c4-af1c-4a6a206511f2
  [info]   at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1010)
  [info]   at org.apache.spark.repl.ExecutorClassLoaderSuite.afterAll(ExecutorClassLoaderSuite.scala:76)
  [info]   at org.scalatest.BeforeAndAfterAll$class.afterAll(BeforeAndAfterAll.scala:213)
  ...
  ```


- Fix the URI construction so that the related tests pass on Windows.

  ```
  [info] - child first *** FAILED *** (78 milliseconds)
  [info]   java.net.URISyntaxException: Illegal character in authority at index 7: file://C:\projects\spark\target\tmp\spark-00b66070-0548-463c-b6f3-8965d173da9b
  [info]   at java.net.URI$Parser.fail(URI.java:2848)
  [info]   at java.net.URI$Parser.parseAuthority(URI.java:3186)
  ...
  [info] - parent first *** FAILED *** (15 milliseconds)
  [info]   java.net.URISyntaxException: Illegal character in authority at index 7: file://C:\projects\spark\target\tmp\spark-00b66070-0548-463c-b6f3-8965d173da9b
  [info]   at java.net.URI$Parser.fail(URI.java:2848)
  [info]   at java.net.URI$Parser.parseAuthority(URI.java:3186)
  ...
  [info] - child first can fall back *** FAILED *** (0 milliseconds)
  [info]   java.net.URISyntaxException: Illegal character in authority at index 7: file://C:\projects\spark\target\tmp\spark-00b66070-0548-463c-b6f3-8965d173da9b
  [info]   at java.net.URI$Parser.fail(URI.java:2848)
  [info]   at java.net.URI$Parser.parseAuthority(URI.java:3186)
  ...
  [info] - child first can fail *** FAILED *** (0 milliseconds)
  [info]   java.net.URISyntaxException: Illegal character in authority at index 7: file://C:\projects\spark\target\tmp\spark-00b66070-0548-463c-b6f3-8965d173da9b
  [info]   at java.net.URI$Parser.fail(URI.java:2848)
  [info]   at java.net.URI$Parser.parseAuthority(URI.java:3186)
  ...
  [info] - resource from parent *** FAILED *** (0 milliseconds)
  [info]   java.net.URISyntaxException: Illegal character in authority at index 7: file://C:\projects\spark\target\tmp\spark-00b66070-0548-463c-b6f3-8965d173da9b
  [info]   at java.net.URI$Parser.fail(URI.java:2848)
  [info]   at java.net.URI$Parser.parseAuthority(URI.java:3186)
  ...
  [info] - resources from parent *** FAILED *** (0 milliseconds)
  [info]   java.net.URISyntaxException: Illegal character in authority at index 7: file://C:\projects\spark\target\tmp\spark-00b66070-0548-463c-b6f3-8965d173da9b
  [info]   at java.net.URI$Parser.fail(URI.java:2848)
  [info]   at java.net.URI$Parser.parseAuthority(URI.java:3186)
  ```
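
The two ideas in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: `ResourceAndUriSketch`, `readAll`, and `dirUri` are hypothetical names, and the sketch only assumes that the failures come from an unclosed `BufferedSource` and from building a `file://` URI by string concatenation with a Windows path.

```scala
import java.io.{File, InputStream}
import java.net.URI
import scala.io.Source

object ResourceAndUriSketch {
  // Read a stream fully and always close the BufferedSource. Closing it
  // also closes the underlying stream, so no file handle is left open
  // (open handles can prevent a temp directory from being deleted on Windows).
  def readAll(in: InputStream): String = {
    val source = Source.fromInputStream(in, "UTF-8")
    try {
      source.mkString
    } finally {
      source.close()
    }
  }

  // Build the URI from a File instead of concatenating "file://" with a
  // path string; File.toURI produces a well-formed file URI on any platform.
  def dirUri(dir: File): URI = dir.toURI
}
```

Letting `java.io.File` build the URI avoids having to hand-escape drive letters and backslashes.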

## How was this patch tested?

Manually tested via AppVeyor.

**Before**

https://ci.appveyor.com/project/spark-test/spark/build/102-rpel-ExecutorClassLoaderSuite

**After**

https://ci.appveyor.com/project/spark-test/spark/build/108-rpel-ExecutorClassLoaderSuite

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark close-after-open

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16116


commit 8d40848fd5559a46561fd9bd3aefac4151d6321d
Author: hyukjinkwon 
Date:   2016-12-02T07:38:01Z

Fix URI and release resources after opening







[GitHub] spark issue #16108: [SPARK-18670][SS]Limit the number of StreamingQueryListe...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16108
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69545/
Test PASSed.





[GitHub] spark issue #16108: [SPARK-18670][SS]Limit the number of StreamingQueryListe...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16108
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16108: [SPARK-18670][SS]Limit the number of StreamingQueryListe...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16108
  
**[Test build #69545 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69545/consoleFull)** for PR 16108 at commit [`be3737f`](https://github.com/apache/spark/commit/be3737f3a49325d20401402e65288b5c39be3bed).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16088
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69543/
Test PASSed.





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16088
  
**[Test build #69543 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69543/consoleFull)** for PR 16088 at commit [`a9f9710`](https://github.com/apache/spark/commit/a9f9710b7e0618d2e90d846be791ab80ea76883f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-12-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14638
  
I didn't read the Hive code for this; I referred to Spark's CSV code instead.
But I'll look for the corresponding code in the Hive master branch tomorrow, 
since you requested it.





[GitHub] spark issue #15863: [SPARK-18419][SQL] `JDBCRelation.insert` should not remo...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15863
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69544/
Test PASSed.





[GitHub] spark issue #15863: [SPARK-18419][SQL] `JDBCRelation.insert` should not remo...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15863
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15863: [SPARK-18419][SQL] `JDBCRelation.insert` should not remo...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15863
  
**[Test build #69544 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69544/consoleFull)** for PR 15863 at commit [`2375f7f`](https://github.com/apache/spark/commit/2375f7fc8d6e3ad879c129dc007c37eeeca3990e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16056: [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16056
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69542/
Test PASSed.





[GitHub] spark issue #16056: [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16056
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16056: [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16056
  
**[Test build #69542 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69542/consoleFull)** for PR 16056 at commit [`c83919e`](https://github.com/apache/spark/commit/c83919e6ca75b3ea803a9227dc327cc34ec8e728).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16068: [SPARK-18637][SQL]Stateful UDF should be considered as n...

2016-12-01 Thread zhzhan
Github user zhzhan commented on the issue:

https://github.com/apache/spark/pull/16068
  
@hvanhovell Would you like to take a look and let me know if you have any 
concerns?





[GitHub] spark issue #16069: [WIP][SPARK-18638][BUILD] Upgrade sbt, Zinc, and Maven p...

2016-12-01 Thread weiqingy
Github user weiqingy commented on the issue:

https://github.com/apache/spark/pull/16069
  
Hi @srowen, thanks for the review. I have updated the PR to fix the deprecation 
warnings (about `previousArtifact`, `/`, `stringToReference`, the `<+=` 
operator, `t3ToTable3`, and the `<<=` operator) based on the following references:
- http://www.scala-sbt.org/0.13/docs/Migrating-from-sbt-012x.html
- http://www.scala-sbt.org/0.13/sxr/sbt/Reference.scala.html
- https://github.com/sbt/sbt/issues/971  
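
For readers unfamiliar with the deprecated operators named above, the following `build.sbt` fragment sketches the general migration that the sbt 0.13 documentation describes: `<<=` and `<+=` are replaced by `:=` and `+=` together with `.value`. The specific settings shown are illustrative assumptions and are not taken from Spark's build.

```scala
// build.sbt fragment (sbt 0.13 syntax); settings chosen for illustration only.

// Deprecated form: `<<=` defines one setting in terms of another.
// name <<= baseDirectory(_.getName)
name := baseDirectory.value.getName

// Deprecated form: `<+=` appends a value computed from another setting.
// libraryDependencies <+= scalaVersion("org.scala-lang" % "scala-compiler" % _)
libraryDependencies += "org.scala-lang" % "scala-compiler" % scalaVersion.value
```

The `.value` form lets sbt track the dependency between settings without the special operators.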

The current fixes failed the MiMa tests. I will look into it.

As for the sbt plugins, @JoshRosen suggested updating them in a separate 
PR, so I think it’s better to confirm with him again. 





[GitHub] spark issue #16086: [SPARK-18653][SQL] Fix incorrect space padding for unico...

2016-12-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16086
  
I agree - don't think this is worth the complexity.






[GitHub] spark issue #16069: [WIP][SPARK-18638][BUILD] Upgrade sbt, Zinc, and Maven p...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16069
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69551/
Test FAILed.





[GitHub] spark issue #16069: [WIP][SPARK-18638][BUILD] Upgrade sbt, Zinc, and Maven p...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16069
  
**[Test build #69551 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69551/consoleFull)** for PR 16069 at commit [`0c2d20b`](https://github.com/apache/spark/commit/0c2d20b8d919ec5854fe7ed222f1617948b6554a).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16069: [WIP][SPARK-18638][BUILD] Upgrade sbt, Zinc, and Maven p...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16069
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15998: [SPARK-18572][SQL] Add a method `listPartitionNames` to ...

2016-12-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15998
  
What is the expected output? In `ExternalCatalogSuite`, it looks like we 
do not raise any error for a call like this:
```
catalog.listPartitionNames("db2", "tbl2", Some(Map("unknown" -> "unknown")))
```

In `SessionCatalogSuite`, we should also add test cases like those for 
[`listPartitions`](https://github.com/apache/spark/blob/e64a2047eaf02d65dcf98b6e0710e10196aa74b1/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala#L872-L889),
as well as negative cases like the one I mentioned above. 






[GitHub] spark pull request #15620: [SPARK-18091] [SQL] Deep if expressions cause Gen...

2016-12-01 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/15620#discussion_r90592789
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala ---
@@ -97,6 +97,38 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper {
 assert(actual(0) == cases)
   }
 
+  test("SPARK-18091: split large if expressions into blocks due to JVM code size limit") {
+val row = create_row("afafFAFFsqcategory2dadDADcategory8sasasadscategory24", 0)
--- End diff --

> Since the fix in this pull request is only in If expression, a working 
testcase with the fix will be one which consists (recursively) of If 
expressions.

I can't see why this is necessary. You just need to construct a condition, a 
true expression, and a false expression whose generated code is large enough.





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69537/
Test PASSed.





[GitHub] spark issue #16099: [SPARK-18665][SQL] set statement state to "ERROR" after ...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16099
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16099: [SPARK-18665][SQL] set statement state to "ERROR" after ...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16099
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69547/
Test PASSed.





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16088
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16099: [SPARK-18665][SQL] set statement state to "ERROR" after ...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16099
  
**[Test build #69547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69547/consoleFull)** for PR 16099 at commit [`196ab66`](https://github.com/apache/spark/commit/196ab66af1e73454b8b926386654e8498f2d5ce9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16088
  
**[Test build #69537 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69537/consoleFull)** for PR 16088 at commit [`a9f9710`](https://github.com/apache/spark/commit/a9f9710b7e0618d2e90d846be791ab80ea76883f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16069: [WIP][SPARK-18638][BUILD] Upgrade sbt, Zinc, and Maven p...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16069
  
**[Test build #69551 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69551/consoleFull)**
 for PR 16069 at commit 
[`0c2d20b`](https://github.com/apache/spark/commit/0c2d20b8d919ec5854fe7ed222f1617948b6554a).





[GitHub] spark issue #16114: [SPARK-18620][Streaming][Kinesis] Flatten input rates in...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16114
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16114: [SPARK-18620][Streaming][Kinesis] Flatten input rates in...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16114
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69546/
Test PASSed.





[GitHub] spark issue #16114: [SPARK-18620][Streaming][Kinesis] Flatten input rates in...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16114
  
**[Test build #69546 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69546/consoleFull)**
 for PR 16114 at commit 
[`4f17a32`](https://github.com/apache/spark/commit/4f17a322aa74aeb4308223121ea04a3754b3135d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15620: [SPARK-18091] [SQL] Deep if expressions cause Generated ...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15620
  
**[Test build #69550 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69550/consoleFull)**
 for PR 15620 at commit 
[`260c8e8`](https://github.com/apache/spark/commit/260c8e85c613eaa458ff28caf0e987419dc8c895).





[GitHub] spark issue #16068: [SPARK-18637][SQL]Stateful UDF should be considered as n...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16068
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69541/
Test PASSed.





[GitHub] spark issue #16068: [SPARK-18637][SQL]Stateful UDF should be considered as n...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16068
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16068: [SPARK-18637][SQL]Stateful UDF should be considered as n...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16068
  
**[Test build #69541 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69541/consoleFull)**
 for PR 16068 at commit 
[`a6292a9`](https://github.com/apache/spark/commit/a6292a93d9e28db1bdc82bba73cb1d50c54083a0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #16090: [SPARK-18661] [SQL] Creating a partitioned dataso...

2016-12-01 Thread ericl
Github user ericl commented on a diff in the pull request:

https://github.com/apache/spark/pull/16090#discussion_r90591997
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala ---
@@ -58,13 +58,21 @@ case class CreateDataSourceTableCommand(table: CatalogTable, ignoreIfExists: Boo
     // Create the relation to validate the arguments before writing the metadata to the metastore,
     // and infer the table schema and partition if users didn't specify schema in CREATE TABLE.
     val pathOption = table.storage.locationUri.map("path" -> _)
+    // Fill in some default table options from the session conf
+    val uncreatedTable = table.copy(
+      identifier = table.identifier.copy(
+        database = Some(
+          table.identifier.database.getOrElse(sessionState.catalog.getCurrentDatabase))),
+      tracksPartitionsInCatalog = sparkSession.sessionState.conf.manageFilesourcePartitions)
     val dataSource: BaseRelation =
       DataSource(
         sparkSession = sparkSession,
         userSpecifiedSchema = if (table.schema.isEmpty) None else Some(table.schema),
+        partitionColumns = table.partitionColumnNames,
--- End diff --

You do need to pass it in though.

```
val fileCatalog = if (sparkSession.sqlContext.conf.manageFilesourcePartitions &&
    catalogTable.isDefined && catalogTable.get.tracksPartitionsInCatalog) {
  new CatalogFileIndex(
    sparkSession,
    catalogTable.get,
    catalogTable.get.stats.map(_.sizeInBytes.toLong).getOrElse(0L))
} else {
  new InMemoryFileIndex(sparkSession, globbedPaths, options, Some(partitionSchema))
}
```

Otherwise, this code will perform a full filesystem scan, independent of 
the other change to prevent getOrInferFileFormatSchema from performing a scan 
as well.
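
The branch quoted above can be read as a simple decision rule. The Python sketch below only mirrors that condition: `CatalogTable`, `FileIndex`, and `choose_file_index` are hypothetical stand-ins rather than Spark APIs, and the `files_listed` counter is an assumption used to make the cost of an eager listing visible.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CatalogTable:
    # Hypothetical stand-in for the table metadata held by the catalog.
    tracks_partitions_in_catalog: bool


@dataclass
class FileIndex:
    kind: str          # "catalog" or "in_memory"
    files_listed: int  # paths listed eagerly when the index is built


def choose_file_index(manage_partitions: bool,
                      catalog_table: Optional[CatalogTable],
                      paths: List[str]) -> FileIndex:
    """Mirror the quoted condition: use the catalog only when all three checks hold."""
    if (manage_partitions
            and catalog_table is not None
            and catalog_table.tracks_partitions_in_catalog):
        # Assumption: a catalog-backed index defers listing until partitions are requested.
        return FileIndex("catalog", 0)
    # Fallback: every path is listed up front, i.e. a full filesystem scan.
    return FileIndex("in_memory", len(paths))
```

If the table metadata is not passed in, the second check fails and the fallback is taken even when partition management is enabled, which is the situation the comment warns about.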





[GitHub] spark issue #16093: [SPARK-18663][SQL] Simplify CountMinSketch aggregate imp...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16093
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16093: [SPARK-18663][SQL] Simplify CountMinSketch aggregate imp...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16093
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69529/
Test FAILed.





[GitHub] spark issue #16093: [SPARK-18663][SQL] Simplify CountMinSketch aggregate imp...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16093
  
**[Test build #69529 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69529/consoleFull)**
 for PR 16093 at commit 
[`2a30118`](https://github.com/apache/spark/commit/2a301188287726f4f87fffe33f57cd3a2ae36c30).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15927: [SPARK-18500][SQL] Make GenericStrategy be able to prune...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15927
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69533/
Test PASSed.





[GitHub] spark issue #15927: [SPARK-18500][SQL] Make GenericStrategy be able to prune...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15927
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16112: [SPARK-18679] [SQL] Fix regression in file listing perfo...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16112
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16112: [SPARK-18679] [SQL] Fix regression in file listing perfo...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16112
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69531/
Test PASSed.





[GitHub] spark issue #16115: [SPARK-18667][PySpark][SQL] Change the way to group row ...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16115
  
**[Test build #69549 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69549/consoleFull)**
 for PR 16115 at commit 
[`7cd606b`](https://github.com/apache/spark/commit/7cd606b6605ac75f311dca2cff988f20ba0ad7a0).





[GitHub] spark issue #15927: [SPARK-18500][SQL] Make GenericStrategy be able to prune...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15927
  
**[Test build #69533 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69533/consoleFull)**
 for PR 15927 at commit 
[`75d4012`](https://github.com/apache/spark/commit/75d4012b255ce0c59e5a826b4f8ea8dc87a38d2c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16112: [SPARK-18679] [SQL] Fix regression in file listing perfo...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16112
  
**[Test build #69531 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69531/consoleFull)**
 for PR 16112 at commit 
[`db66439`](https://github.com/apache/spark/commit/db664396b3892de45507d9c82eed7d070bdd82dc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16115: [SPARK-18667][PySpark][SQL] Change the way to group row ...

2016-12-01 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16115
  
cc @davies 





[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-12-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14638
  
@dongjoon-hyun can you show a link to how Hive implements this? I'm just 
surprised that this is a general feature rather than something that is format 
specific.






[GitHub] spark issue #16110: [SPARK-18674][SQL][Follow-Up] improve the error message ...

2016-12-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16110
  
LGTM pending tests.






[GitHub] spark pull request #16115: [SPARK-18667][PySpark][SQL] Change the way to gro...

2016-12-01 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/16115

[SPARK-18667][PySpark][SQL] Change the way to group row in 
BatchEvalPythonExec so input_file_name function can work with UDF in pyspark

## What changes were proposed in this pull request?

`input_file_name` doesn't return the filename when used with a UDF in PySpark. 
The following example shows the problem:

    from pyspark.sql.functions import *
    from pyspark.sql.types import *

    def filename(path):
        return path

    sourceFile = udf(filename, StringType())
    spark.read.json("tmp.json").select(sourceFile(input_file_name())).show()

    +---------------------------+
    |filename(input_file_name())|
    +---------------------------+
    |                           |
    +---------------------------+

The cause of this issue is that we group rows in `BatchEvalPythonExec` for 
batch processing of PythonUDF. Currently we group rows first and then 
evaluate expressions on the rows. If there are fewer rows than the number 
required for a group, the iterator is consumed to the end before the 
evaluation. However, once the iterator reaches the end, the input filename 
is unset, so the `input_file_name` expression can't return the correct filename.

This patch changes how rows are batched: we evaluate the expression first 
and then group the evaluated results into batches.
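
The ordering problem can be reproduced outside Spark. The following is a minimal pure-Python sketch, not the actual `BatchEvalPythonExec` code: `FileScan`, `batches`, `group_then_evaluate`, and `evaluate_then_group` are hypothetical names, and the scan is assumed to clear its current filename once its iterator is exhausted, as described above.

```python
from itertools import islice


class FileScan:
    """Hypothetical stand-in for a scan that tracks the file currently being read."""

    def __init__(self, path, rows):
        self.path = path
        self.rows = rows
        self.current_file = ""

    def __iter__(self):
        self.current_file = self.path
        for row in self.rows:
            yield row
        # Assumption taken from the description: the filename is unset at the end.
        self.current_file = ""


def batches(iterator, size):
    """Yield lists of up to `size` items, consuming the iterator eagerly per batch."""
    iterator = iter(iterator)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def group_then_evaluate(scan, size):
    # Old order: the batch is materialized first, so a short input is already
    # exhausted (and the filename cleared) by the time the expression runs.
    return [[(row, scan.current_file) for row in batch]
            for batch in batches(scan, size)]


def evaluate_then_group(scan, size):
    # New order: the expression runs as each row is produced, then results are batched.
    evaluated = ((row, scan.current_file) for row in scan)
    return list(batches(evaluated, size))
```

With three rows and a batch size of 100, `group_then_evaluate` pairs every row with an empty filename, while `evaluate_then_group` pairs every row with `"tmp.json"`.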

## How was this patch tested?

Added unit test to PySpark.

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 fix-py-udf-input-filename

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16115.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16115


commit 7cd606b6605ac75f311dca2cff988f20ba0ad7a0
Author: Liang-Chi Hsieh 
Date:   2016-12-02T05:50:47Z

Change the way to group row in BatchEvalPythonExec so udf works with 
input_file_name in pyspark.







[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14638
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69539/
Test PASSed.





[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14638
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16110: [SPARK-18674][SQL][Follow-Up] improve the error message ...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16110
  
**[Test build #69548 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69548/consoleFull)**
 for PR 16110 at commit 
[`4d43bca`](https://github.com/apache/spark/commit/4d43bca86c08ac5ee1ba5960ee448db93445d8a9).





[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14638
  
**[Test build #69539 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69539/consoleFull)**
 for PR 14638 at commit 
[`6f602ba`](https://github.com/apache/spark/commit/6f602baaae820a558dc4a08e41516f1fc4fd1749).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16090
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69536/
Test PASSed.





[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16090
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #16106: [SPARK-17213][SQL] Disable Parquet filter push-do...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16106





[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16090
  
**[Test build #69536 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69536/consoleFull)**
 for PR 16090 at commit 
[`b405635`](https://github.com/apache/spark/commit/b405635bdc4f052a070a319c93ee0b777acd92c6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16106: [SPARK-17213][SQL] Disable Parquet filter push-down for ...

2016-12-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16106
  
Merging in master/branch-2.1. Thanks.






[GitHub] spark issue #16114: [SPARK-18620][Streaming][Kinesis] Flatten input rates in...

2016-12-01 Thread dav009
Github user dav009 commented on the issue:

https://github.com/apache/spark/pull/16114
  
:+1: just had a play with it, it solves my original issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16090: [SPARK-18661] [SQL] Creating a partitioned dataso...

2016-12-01 Thread ericl
Github user ericl commented on a diff in the pull request:

https://github.com/apache/spark/pull/16090#discussion_r90590350
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala ---
@@ -58,13 +58,21 @@ case class CreateDataSourceTableCommand(table: CatalogTable, ignoreIfExists: Boo
     // Create the relation to validate the arguments before writing the metadata to the metastore,
     // and infer the table schema and partition if users didn't specify schema in CREATE TABLE.
     val pathOption = table.storage.locationUri.map("path" -> _)
+    // Fill in some default table options from the session conf
+    val uncreatedTable = table.copy(
+      identifier = table.identifier.copy(
+        database = Some(
+          table.identifier.database.getOrElse(sessionState.catalog.getCurrentDatabase))),
+      tracksPartitionsInCatalog = sparkSession.sessionState.conf.manageFilesourcePartitions)
     val dataSource: BaseRelation =
       DataSource(
         sparkSession = sparkSession,
         userSpecifiedSchema = if (table.schema.isEmpty) None else Some(table.schema),
+        partitionColumns = table.partitionColumnNames,
--- End diff --

You don't want to do that though. Resolve relation also does not always 
scan the filesystem if you pass in a user defined schema.





[GitHub] spark issue #16099: [SPARK-18665][SQL] set statement state to "ERROR" after ...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16099
  
**[Test build #69547 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69547/consoleFull)**
 for PR 16099 at commit 
[`196ab66`](https://github.com/apache/spark/commit/196ab66af1e73454b8b926386654e8498f2d5ce9).





[GitHub] spark pull request #15620: [SPARK-18091] [SQL] Deep if expressions cause Gen...

2016-12-01 Thread kapilsingh5050
Github user kapilsingh5050 commented on a diff in the pull request:

https://github.com/apache/spark/pull/15620#discussion_r90590152
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
 ---
@@ -97,6 +97,38 @@ class CodeGenerationSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 assert(actual(0) == cases)
   }
 
+  test("SPARK-18091: split large if expressions into blocks due to JVM 
code size limit") {
+val row = 
create_row("afafFAFFsqcategory2dadDADcategory8sasasadscategory24", 0)
--- End diff --

I think I didn't make this clear. The fix I've made in the If expression is 
actually required in all expressions, so that every expression breaks its 
generated code into methods whenever it grows beyond an appropriate 
threshold. Since the fix in this pull request is only in the If expression, a 
working test case for the fix is one that consists (recursively) of If 
expressions. Otherwise the test would need to be tuned so that the size of the 
If expression's code is greater than the threshold while the code of its 
predicate, trueValue and falseValue doesn't grow beyond the threshold.
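
To illustrate the idea of breaking generated code into methods once it passes a threshold, here is a minimal sketch in plain Scala. It is a toy model only: `SplitSketch`, `splitIntoMethods` and the character-count threshold are hypothetical names and choices for illustration, not Spark's codegen API or the code in this pull request.

```scala
// Toy model only: hypothetical helper, not Spark's CodegenContext.
object SplitSketch {
  /** Groups statements into chunks whose total length stays within
    * `threshold` characters (a single over-long statement gets a chunk
    * of its own), then wraps each chunk in its own method. */
  def splitIntoMethods(statements: Seq[String], threshold: Int): Seq[String] = {
    val chunks = statements.foldLeft(Vector.empty[Vector[String]]) { (acc, stmt) =>
      acc.lastOption match {
        // The statement still fits into the current chunk.
        case Some(last) if last.map(_.length).sum + stmt.length <= threshold =>
          acc.init :+ (last :+ stmt)
        // Otherwise start a new chunk.
        case _ =>
          acc :+ Vector(stmt)
      }
    }
    chunks.zipWithIndex.map { case (chunk, i) =>
      val body = chunk.map("  " + _).mkString("\n")
      s"private void apply_$i(InternalRow i) {\n$body\n}"
    }
  }
}
```

With three 6-character statements and a threshold of 12, the first two statements share one method and the third gets its own. A test built only from nested If expressions exercises this kind of splitting without depending on how other expressions size their code.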





[GitHub] spark pull request #16089: [SPARK-18658][SQL] Write text records directly to...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16089





[GitHub] spark issue #16110: [SPARK-18674][SQL][Follow-Up] improve the error message ...

2016-12-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16110
  
That SGTM





[GitHub] spark pull request #16093: [SPARK-18663][SQL] Simplify CountMinSketch aggreg...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16093





[GitHub] spark issue #16089: [SPARK-18658][SQL] Write text records directly to a File...

2016-12-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16089
  
Merging in master. Thanks.






[GitHub] spark issue #16093: [SPARK-18663][SQL] Simplify CountMinSketch aggregate imp...

2016-12-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16093
  
Merging in master.






[GitHub] spark issue #16114: [SPARK-18620][Streaming][Kinesis] Flatten input rates in...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16114
  
**[Test build #69546 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69546/consoleFull)**
 for PR 16114 at commit 
[`4f17a32`](https://github.com/apache/spark/commit/4f17a322aa74aeb4308223121ea04a3754b3135d).





[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16090
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16110: [SPARK-18674][SQL][Follow-Up] improve the error message ...

2016-12-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16110
  
Just tried MySQL. The message is not good to me:
```
mysql> select * from t1 inner join t2 using (col2);
ERROR 1054 (42S22): Unknown column 'col2' in 'from clause'
```

Will try to improve it by:
```
Using column `col1.field1` cannot be resolved on the left side of the join. 
The left-side columns: [col1]
```
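
A rough sketch of how such a message could be assembled, assuming a hypothetical helper (`UsingJoinMessage.unresolvedUsingColumn` is an illustrative name, not an existing Spark function):

```scala
// Hypothetical helper; the name and signature are illustrative only.
object UsingJoinMessage {
  /** Builds the error text for a USING column that is missing on one
    * side ("left" or "right") of the join. */
  def unresolvedUsingColumn(column: String, side: String, available: Seq[String]): String = {
    val cols = available.mkString(", ")
    s"Using column `$column` cannot be resolved on the $side side of the join. " +
      s"The $side-side columns: [$cols]"
  }
}
```

Listing the columns that are actually available on the failing side is what distinguishes this from the MySQL message, which only names the missing column.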





[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16090
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69534/
Test FAILed.





[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-01 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/16030
  
Thanks for your review, @brkyvz! I'm checking your comments now.





[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16090
  
**[Test build #69534 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69534/consoleFull)**
 for PR 16090 at commit 
[`2940d55`](https://github.com/apache/spark/commit/2940d55a427e559aab9338b233893b761c01b6e7).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #16114: [SPARK-18620][Streaming][Kinesis] Flatten input r...

2016-12-01 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/16114

[SPARK-18620][Streaming][Kinesis] Flatten input rates in timeline for 
streaming + kinesis

## What changes were proposed in this pull request?
This PR flattens the input rates shown in the timeline for Spark Streaming + 
Kinesis.
Since Kinesis workers fetch records and push them into block generators in 
bulk, the timeline in the web UI shows many spikes when `maxRates` is applied 
(see Figure 1 below). This fix splits the fetched input records across multiple 
`addRecords` calls.

Figure 1: `maxRates=500` applied in vanilla Spark
https://cloud.githubusercontent.com/assets/692303/20823861/4602f300-b89b-11e6-95f3-164a37061305.png

Figure 2: `maxRates=500` applied in Spark with my patch
https://cloud.githubusercontent.com/assets/692303/20823882/6c46352c-b89b-11e6-81ab-afd8abfe0cfe.png

## How was this patch tested?
Added tests that check input records are split into multiple `addRecords` calls.
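
The splitting idea can be sketched in plain Scala as below. This is an assumption-laden illustration: `RecordSplitter`, `addInBatches` and `maxBatchSize` are hypothetical names, and the `addRecords` callback here takes only the batch, which is a simplification of the receiver's real method.

```scala
// Sketch only: simplified stand-in for the receiver-side logic.
object RecordSplitter {
  /** Calls `addRecords` once per batch of at most `maxBatchSize` records
    * and returns the number of calls made. */
  def addInBatches[T](records: Seq[T], maxBatchSize: Int)(addRecords: Seq[T] => Unit): Int = {
    require(maxBatchSize > 0, "maxBatchSize must be positive")
    // `grouped` yields consecutive slices; the last one may be shorter.
    val batches = records.grouped(maxBatchSize).toVector
    batches.foreach(addRecords)
    batches.length
  }
}
```

Pushing ten records with a batch size of four results in three calls with 4, 4 and 2 records, instead of one bulk call carrying all ten.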

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-18620

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16114.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16114









[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-01 Thread brkyvz
Github user brkyvz commented on the issue:

https://github.com/apache/spark/pull/16030
  
It would be great to get this into 2.1!





[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-01 Thread brkyvz
Github user brkyvz commented on the issue:

https://github.com/apache/spark/pull/16030
  
@maropu I would still keep the changes I proposed below L180, as I 
commented before. We don't need to use the inferred data type as the partition 
type.





[GitHub] spark pull request #16030: [SPARK-18108][SQL] Fix a bug to fail partition sc...

2016-12-01 Thread brkyvz
Github user brkyvz commented on a diff in the pull request:

https://github.com/apache/spark/pull/16030#discussion_r90588207
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ---
@@ -174,22 +185,18 @@ case class DataSource(
 StructType(partitionFields)
   }
 }
+
 if (justPartitioning) {
-  return (null, partitionSchema)
-}
-val dataSchema = userSpecifiedSchema.map { schema =>
--- End diff --

I would keep the code here like I mentioned above, but keep your changes. 





[GitHub] spark pull request #16030: [SPARK-18108][SQL] Fix a bug to fail partition sc...

2016-12-01 Thread brkyvz
Github user brkyvz commented on a diff in the pull request:

https://github.com/apache/spark/pull/16030#discussion_r90588223
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ---
@@ -174,22 +185,18 @@ case class DataSource(
 StructType(partitionFields)
   }
 }
+
 if (justPartitioning) {
-  return (null, partitionSchema)
-}
-val dataSchema = userSpecifiedSchema.map { schema =>
-  val equality = sparkSession.sessionState.conf.resolver
-  StructType(schema.filterNot(f => partitionSchema.exists(p => 
equality(p.name, f.name
-}.orElse {
-  format.inferSchema(
-sparkSession,
-caseInsensitiveOptions,
-tempFileIndex.allFiles())
-}.getOrElse {
-  throw new AnalysisException(
-s"Unable to infer schema for $format. It must be specified 
manually.")
+  (null, partitionSchema)
+} else if (userSpecifiedSchema.isDefined) {
--- End diff --

no need for this `else if`





[GitHub] spark pull request #16030: [SPARK-18108][SQL] Fix a bug to fail partition sc...

2016-12-01 Thread brkyvz
Github user brkyvz commented on a diff in the pull request:

https://github.com/apache/spark/pull/16030#discussion_r90588156
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ---
@@ -132,13 +132,24 @@ case class DataSource(
   }.toArray
   new InMemoryFileIndex(sparkSession, globbedPaths, options, None)
 }
+
+val dataSchema = userSpecifiedSchema.orElse {
--- End diff --

I would still keep this below the `if (justPartitioning)` area, because 
otherwise every time someone performs a 
`df.write.mode("append").saveAsTable()` we will perform schema inference.
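
A small sketch of why the ordering matters, in plain Scala. Everything here is a stand-in: `LazyInference`, `resolve` and the counter are hypothetical, schemas are modelled as `Seq[String]`, and `Option` replaces the `null` used in the diff.

```scala
// Sketch only: models the control flow, not DataSource itself.
object LazyInference {
  // Counts how often the (expensive) inference stand-in runs.
  var inferenceRuns = 0

  // Stand-in for schema inference that would list and read files.
  private def inferSchema(): Seq[String] = { inferenceRuns += 1; Seq("a", "b") }

  /** Returns (dataSchema, partitionSchema); the data schema is None when
    * only the partitioning was requested. */
  def resolve(
      justPartitioning: Boolean,
      userSchema: Option[Seq[String]],
      partitionSchema: Seq[String]): (Option[Seq[String]], Seq[String]) = {
    if (justPartitioning) {
      // Early exit: inference is never reached on this path.
      (None, partitionSchema)
    } else {
      // getOrElse takes its argument by name, so inference runs only
      // when no user-specified schema is present.
      val dataSchema = userSchema.getOrElse(inferSchema())
      (Some(dataSchema), partitionSchema)
    }
  }
}
```

If the inference call sat above the `justPartitioning` check, the counter would increase on every call, including the ones that only need the partition schema.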





[GitHub] spark pull request #16030: [SPARK-18108][SQL] Fix a bug to fail partition sc...

2016-12-01 Thread brkyvz
Github user brkyvz commented on a diff in the pull request:

https://github.com/apache/spark/pull/16030#discussion_r90588172
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ---
@@ -132,13 +132,24 @@ case class DataSource(
   }.toArray
   new InMemoryFileIndex(sparkSession, globbedPaths, options, None)
 }
+
+val dataSchema = userSpecifiedSchema.orElse {
+  format.inferSchema(
+sparkSession,
+caseInsensitiveOptions,
+tempFileIndex.allFiles())
+}.getOrElse {
+  throw new AnalysisException(
+s"Unable to infer schema for $format. It must be specified 
manually.")
+}
+
 val partitionSchema = if (partitionColumns.isEmpty && 
catalogTable.isEmpty) {
   // Try to infer partitioning, because no DataSource in the read path 
provides the partitioning
   // columns properly unless it is a Hive DataSource
   val resolved = tempFileIndex.partitionSchema.map { partitionField =>
 val equality = sparkSession.sessionState.conf.resolver
-// SPARK-18510: try to get schema from userSpecifiedSchema, 
otherwise fallback to inferred
-userSpecifiedSchema.flatMap(_.find(f => equality(f.name, 
partitionField.name))).getOrElse(
+// SPARK-18510: try to get partition schema from data schema, 
otherwise fallback to inferred
+dataSchema.find(f => equality(f.name, 
partitionField.name)).getOrElse(
--- End diff --

this change is not necessary





[GitHub] spark issue #16108: [SPARK-18670][SS]Limit the number of StreamingQueryListe...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16108
  
**[Test build #69545 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69545/consoleFull)**
 for PR 16108 at commit 
[`be3737f`](https://github.com/apache/spark/commit/be3737f3a49325d20401402e65288b5c39be3bed).





[GitHub] spark issue #16112: [SPARK-18679] [SQL] Fix regression in file listing perfo...

2016-12-01 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/16112
  
Yep

On Thu, Dec 1, 2016, 8:03 PM Wenchen Fan wrote:

> LGTM, @ericl have you run some local benchmark
> to make sure the performance regression is fixed?





[GitHub] spark issue #15863: [SPARK-18419][SQL] `JDBCRelation.insert` should not remo...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15863
  
**[Test build #69544 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69544/consoleFull)**
 for PR 15863 at commit 
[`2375f7f`](https://github.com/apache/spark/commit/2375f7fc8d6e3ad879c129dc007c37eeeca3990e).





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16088
  
**[Test build #69543 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69543/consoleFull)**
 for PR 16088 at commit 
[`a9f9710`](https://github.com/apache/spark/commit/a9f9710b7e0618d2e90d846be791ab80ea76883f).





[GitHub] spark issue #16080: [SPARK-18647][SQL] do not put provider in table properti...

2016-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16080
  
thanks for the review, merging to master/2.1!





[GitHub] spark pull request #16080: [SPARK-18647][SQL] do not put provider in table p...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16080





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16088
  
retest this please





[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13909#discussion_r90586738
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -42,15 +42,48 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks {
 
InternalRow.fromSeq(values.map(CatalystTypeConverters.convertToCatalyst))
   }
 
+  protected def convertToCatalystUnsafe(a: Any): Any = a match {
+case arr: Array[Byte] => arr
+case arr: Array[Boolean] => UnsafeArrayData.fromPrimitiveArray(arr)
+case arr: Array[Short] => UnsafeArrayData.fromPrimitiveArray(arr)
+case arr: Array[Int] => UnsafeArrayData.fromPrimitiveArray(arr)
+case arr: Array[Long] => UnsafeArrayData.fromPrimitiveArray(arr)
+case arr: Array[Float] => UnsafeArrayData.fromPrimitiveArray(arr)
+case arr: Array[Double] => UnsafeArrayData.fromPrimitiveArray(arr)
+case other => CatalystTypeConverters.convertToCatalyst(other)
+  }
+
   protected def checkEvaluation(
   expression: => Expression, expected: Any, inputRow: InternalRow = 
EmptyRow): Unit = {
 val serializer = new JavaSerializer(new SparkConf()).newInstance
 val expr: Expression = 
serializer.deserialize(serializer.serialize(expression))
+// No codegen version expects GenericArrayData
 val catalystValue = CatalystTypeConverters.convertToCatalyst(expected)
+// Codegen version expects UnsafeArrayData for array expect 
Array(Binarytype)
+val catalystValueForCodegen = convertToCatalystUnsafe(expected)
--- End diff --

> Codegen version expects UnsafeArrayData

This is wrong, e.g. even the codegen version of `Cast` will produce an array 
in the safe format.

My proposal is, we improve the checkResult to do type-aware comparison on 
complex types.
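
As a rough illustration of what a type-aware comparison could look like, here is a sketch in plain Scala. `TypeAwareCompare.sameValue` is a hypothetical name and this is not the existing `checkResult`; it only shows the general shape: compare arrays and sequences element by element regardless of the container, and treat NaN as equal to NaN.

```scala
// Sketch only: hypothetical comparison helper, not Spark's checkResult.
object TypeAwareCompare {
  def sameValue(actual: Any, expected: Any): Boolean = (actual, expected) match {
    // Normalise arrays to sequences so the container type does not matter.
    case (a: Array[_], e) => sameValue(a.toSeq, e)
    case (a, e: Array[_]) => sameValue(a, e.toSeq)
    // Compare sequences element by element, recursing into nested values.
    case (a: Seq[_], e: Seq[_]) =>
      a.length == e.length && a.zip(e).forall { case (x, y) => sameValue(x, y) }
    // NaN never equals itself under ==, so handle it explicitly.
    case (a: Double, e: Double) => (a.isNaN && e.isNaN) || a == e
    case _ => actual == expected
  }
}
```

With a comparison along these lines the test would not need to know whether an expression returns the safe or the unsafe array format, only that the elements match.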





[GitHub] spark issue #16093: [SPARK-18663][SQL] Simplify CountMinSketch aggregate imp...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16093
  
**[Test build #3454 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3454/consoleFull)**
 for PR 16093 at commit 
[`b2985c4`](https://github.com/apache/spark/commit/b2985c4d817b416e434342e952fabf0ee37b9879).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16088
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16056: [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke...

2016-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16056
  
**[Test build #69542 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69542/consoleFull)**
 for PR 16056 at commit 
[`c83919e`](https://github.com/apache/spark/commit/c83919e6ca75b3ea803a9227dc327cc34ec8e728).





[GitHub] spark issue #16088: [SPARK-18659] [SQL] Incorrect behaviors in overwrite tab...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16088
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69532/
Test FAILed.





[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-01 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/13909
  
ping @cloud-fan @hvanhovell





[GitHub] spark issue #16113: [SPARK-18657][SPARK-18668] Make StreamingQuery.id persis...

2016-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16113
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69540/





[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13909#discussion_r90586610
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala ---
@@ -42,15 +42,48 @@ trait ExpressionEvalHelper extends GeneratorDrivenPropertyChecks {
     InternalRow.fromSeq(values.map(CatalystTypeConverters.convertToCatalyst))
   }
 
+  protected def convertToCatalystUnsafe(a: Any): Any = a match {
+    case arr: Array[Byte] => arr
+    case arr: Array[Boolean] => UnsafeArrayData.fromPrimitiveArray(arr)
+    case arr: Array[Short] => UnsafeArrayData.fromPrimitiveArray(arr)
+    case arr: Array[Int] => UnsafeArrayData.fromPrimitiveArray(arr)
+    case arr: Array[Long] => UnsafeArrayData.fromPrimitiveArray(arr)
+    case arr: Array[Float] => UnsafeArrayData.fromPrimitiveArray(arr)
+    case arr: Array[Double] => UnsafeArrayData.fromPrimitiveArray(arr)
+    case other => CatalystTypeConverters.convertToCatalyst(other)
+  }
+
   protected def checkEvaluation(
       expression: => Expression, expected: Any, inputRow: InternalRow = EmptyRow): Unit = {
     val serializer = new JavaSerializer(new SparkConf()).newInstance
     val expr: Expression = serializer.deserialize(serializer.serialize(expression))
+    // No codegen version expects GenericArrayData
     val catalystValue = CatalystTypeConverters.convertToCatalyst(expected)
--- End diff --

If we use an encoder here to convert the value to Catalyst format, we don't
need to change `CatalystTypeConverters`, right?




