[GitHub] spark pull request #13988: [SPARK-16101][SQL] Refactoring CSV data source to...

2016-07-17 Thread deanchen
Github user deanchen commented on a diff in the pull request: https://github.com/apache/spark/pull/13988#discussion_r71097167 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityGenerator.scala --- @@ -0,0 +1,83 @@ +/* + * Licensed to

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14235 The following are updated. - Adds a heading line to the golden SQL files. - Adds more documentation and replaces `HiveQL` with `SQL`. The only remaining `hive` occurrences are in the test input SQL.

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71096995 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1774,6 +1775,51 @@ class Analyzer( }

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14235 **[Test build #62446 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62446/consoleFull)** for PR 14235 at commit [`e2a7ac4`](https://github.com/apache/spark/commit/e

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71096802 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import org.apache.spark.util.U

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71096761 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import org.apache.spark.util.U

[GitHub] spark issue #14222: [SPARK-16391][SQL] KeyValueGroupedDataset.reduceGroups s...

2016-07-17 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/14222 there is a usefulness to this `ReduceAggregator` beyond `.reduceGroups`. basically you can take any Aggregator without a zero and turn it into a valid Aggregator, with the caveat being that the
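
The idea in the comment above (turning a reduce function that has no zero element into a well-formed aggregator) can be sketched without Spark. This is a minimal illustration in plain Python, not the `ReduceAggregator` from the pull request: the class name `ReduceAggregatorSketch`, its method names, and the `aggregate` helper are all hypothetical, and the caveat that the truncated comment goes on to describe is not reconstructed here.

```python
from functools import reduce
from typing import Callable, Generic, Iterable, List, Optional, TypeVar

T = TypeVar("T")


class ReduceAggregatorSketch(Generic[T]):
    """Hypothetical stand-in: wraps a zero-less reduce function as an aggregator."""

    def __init__(self, func: Callable[[T, T], T]) -> None:
        self.func = func

    def zero(self) -> Optional[T]:
        # None stands in for the missing zero element. This assumes None is
        # never a legitimate input value.
        return None

    def reduce(self, buffer: Optional[T], value: T) -> Optional[T]:
        # The first value seen becomes the buffer; later values are combined.
        return value if buffer is None else self.func(buffer, value)

    def merge(self, left: Optional[T], right: Optional[T]) -> Optional[T]:
        # An empty buffer on either side leaves the other side unchanged.
        if left is None:
            return right
        if right is None:
            return left
        return self.func(left, right)

    def finish(self, buffer: Optional[T]) -> T:
        # With no zero element there is no sensible result for empty input.
        if buffer is None:
            raise ValueError("no input values to reduce")
        return buffer


def aggregate(agg: ReduceAggregatorSketch, partitions: Iterable[List]) -> object:
    """Reduce each partition separately, then merge the partial buffers."""
    buffers = [reduce(agg.reduce, part, agg.zero()) for part in partitions]
    return agg.finish(reduce(agg.merge, buffers, agg.zero()))
```

For example, `aggregate(ReduceAggregatorSketch(max), [[3, 1], [], [7, 2]])` handles the empty middle partition through the `None` buffer and returns the overall maximum.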

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71096584 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import org.apache.spark.util.U

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71096571 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import org.apache.spark.util.U

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71096401 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import org.apache.spark.util.U

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71096388 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import org.apache.spark.util.U

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71096347 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala --- @@ -35,184 +34,306 @@ import org.apache.spark.util.U

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14235 When the optimized plans are different, we need to check whether the optimized plan of the generated SQL performs worse than the original one. If so, we need to improve the SQL generation logic.

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71096037 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71096013 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71095897 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extend

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14102#discussion_r71095725 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JSONOptions.scala --- @@ -51,7 +53,8 @@ private[sql] class JSONOptions(

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71095681 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extend

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71095641 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extend

[GitHub] spark pull request #14175: [SPARK-16522][MESOS] Spark application throws exc...

2016-07-17 Thread sun-rui
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/14175#discussion_r71095619 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -552,7 +552,9 @@ private[spark] class

[GitHub] spark issue #14028: [SPARK-16351][SQL] Avoid per-record type dispatch in JSO...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14028 **[Test build #62445 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62445/consoleFull)** for PR 14028 at commit [`6570a98`](https://github.com/apache/spark/commit/6

[GitHub] spark pull request #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14132#discussion_r71095511 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -87,6 +87,7 @@ class Analyzer( EliminateUni

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71095404 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extend

[GitHub] spark issue #14028: [SPARK-16351][SQL] Avoid per-record type dispatch in JSO...

2016-07-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14028 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if

[GitHub] spark issue #14028: [SPARK-16351][SQL] Avoid per-record type dispatch in JSO...

2016-07-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14028 LGTM pending jenkins.

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14235 For the self unittest, sure. I did it manually, but it would be better if we had it, too.

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14235 Hmm, I didn't remember correctly, but the results were different. For the Hint test cases, I used SQLConf to set up the test suite because I knew that the optimizer would work differently.

[GitHub] spark pull request #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession in...

2016-07-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14177

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71095116 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -102,7 +140,7 @@ class LogicalPlanToSQLSuite extend

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71095137 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extend

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14243 cc @sun-rui

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71095089 --- Diff: sql/hive/src/test/resources/sqlgen/agg1.sql --- @@ -0,0 +1,3 @@ +SELECT COUNT(value) FROM parquet_t1 GROUP BY key HAVING MAX(key) > 0

[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14177 LGTM. Merging into master

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71095072 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14235 @dongjoon-hyun what problem did you run into when you tried comparing optimized plans?

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14235 @dongjoon-hyun please also implement unit tests for the checking functionality. In particular, we want to make sure the test actually fails if the generated SQL does not match the golden file.
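
The request above, that the check must really fail when the generated SQL differs from the golden file, can be illustrated with a small self-test. This is a sketch in plain Python under stated assumptions, not the code in `LogicalPlanToSQLSuite`: the function names, the `--` prefix used to recognise the heading line, and the sample file contents are all hypothetical.

```python
import tempfile
from pathlib import Path

# Assumption: the heading line in a golden file starts with "--".
# The real golden-file format in the pull request may differ.
HEADER_PREFIX = "--"


def read_golden_sql(path: Path) -> str:
    """Return the golden SQL with heading/comment lines removed."""
    lines = path.read_text().splitlines()
    # Note: this also drops any "--" comment line inside the SQL body.
    body = [line for line in lines if not line.startswith(HEADER_PREFIX)]
    return "\n".join(body).strip()


def check_against_golden(generated_sql: str, path: Path) -> None:
    """Raise AssertionError when the generated SQL differs from the golden file."""
    expected = read_golden_sql(path)
    actual = generated_sql.strip()
    if actual != expected:
        raise AssertionError(
            f"generated SQL does not match {path.name}: "
            f"expected {expected!r}, got {actual!r}"
        )


def checker_detects_mismatch() -> bool:
    """Self-test: a matching query passes and a different query is rejected."""
    with tempfile.TemporaryDirectory() as tmp:
        golden = Path(tmp) / "sample.sql"
        golden.write_text("-- hypothetical heading line\nSELECT 1\n")
        check_against_golden("SELECT 1", golden)  # must not raise
        try:
            check_against_golden("SELECT 2", golden)
        except AssertionError:
            return True
        return False
```

Testing the checker itself guards against a comparison that silently accepts everything, which is the failure mode the comment is worried about.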

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094999 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094958 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -102,7 +140,7 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094995 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094954 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094937 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094916 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extend

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094901 --- Diff: sql/hive/src/test/resources/sqlgen/agg1.sql --- @@ -0,0 +1,3 @@ +SELECT COUNT(value) FROM parquet_t1 GROUP BY key HAVING MAX(key) > 0 --- End d

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094899 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extend

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14243 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62443/ Test PASSed. ---

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094871 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -17,12 +17,21 @@ package org.apache.spar

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14243 Merged build finished. Test PASSed.

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094865 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094847 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,22 +85,51 @@ class LogicalPlanToSQLSuite extends SQLBuil

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14243 **[Test build #62443 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62443/consoleFull)** for PR 14243 at commit [`4206104`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094825 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -17,12 +17,21 @@ package org.apache.spark.sql.cat

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71094791 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -17,12 +17,21 @@ package org.apache.spark.sql.cat

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14235 Merged build finished. Test PASSed.

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14235 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62444/ Test PASSed. ---

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14235 **[Test build #62444 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62444/consoleFull)** for PR 14235 at commit [`6df0c18`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14086 Thanks. I'll try to build some!

[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14176 Merged build finished. Test PASSed.

[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14176 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62440/ Test PASSed. ---

[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14176 **[Test build #62440 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62440/consoleFull)** for PR 14176 at commit [`ecff4ff`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14116 Thank you for spending so much time reviewing my PR!

[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14116 Will do it tonight.

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14086 You can try using Docker. You can find most RDBMSs there, including Hive, Oracle, and DB2 on LUW. The only exception is DB2 on z/OS, which runs on the mainframe.

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14086 Thank you for satisfying my curiosity! I'm collecting the default behavior of TRUNCATE to ensure robustness.

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14086 Nope... I lost the IDs to access various DBMS after joining STC. Previously, I had. : )

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14086 By the way, I'm just wondering. Does IBM STC have some test environments for various DBMS? I guess you should have that. :)

[GitHub] spark pull request #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level t...

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14054#discussion_r71093530 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -284,9 +286,17 @@ object JdbcUtils extends Loggi

[GitHub] spark issue #14177: [SPARK-16027][SPARKR] Fix R tests SparkSession init/stop

2016-07-17 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/14177 Sounds good to me! Thanks!

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14086 To identify further work, I'll rephrase your comments. You mean the case where `SaveMode.Overwrite` should fail because current Spark uses `statement.executeUpdate(s"DROP TABLE $table"

[GitHub] spark pull request #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level t...

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14054#discussion_r71093448 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -284,9 +286,17 @@ object JdbcUtils extends Loggi

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14235 @dongjoon-hyun i'm not against merging this first. Just want to see if we can improve it further.

[GitHub] spark issue #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level to avoid...

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14054 JDBC data sources could be any source that might have different behaviors. Thus, the above feedback is only for RDBMS. I think we do not need to add any restriction on the existing isolation leve

[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14116 Hi, @gatorsmile . Could you review this PR too?

[GitHub] spark issue #14229: [SPARK-16447][ML][SparkR] LDA wrapper in SparkR

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14229 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62442/ Test PASSed. ---

[GitHub] spark issue #14229: [SPARK-16447][ML][SparkR] LDA wrapper in SparkR

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14229 Merged build finished. Test PASSed.

[GitHub] spark issue #14229: [SPARK-16447][ML][SparkR] LDA wrapper in SparkR

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14229 **[Test build #62442 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62442/consoleFull)** for PR 14229 at commit [`90dad9d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14086 I always appreciate your intensive reviews. Those always make my PRs meaningful and stronger. Thank you, @srowen and @gatorsmile .

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14086 `Lower` does not mean `inclusion`. Those are different privileges of course. But, I agree with you here. > I just share what I learned here. Developing a general solution for different JDB

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14086 I think the above discussions are important when users really use the option `truncate`. I am OK to let users decide whether `TRUNCATE` is good or not for their scenario.

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14235 **[Test build #62444 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62444/consoleFull)** for PR 14235 at commit [`6df0c18`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #14086: [SPARK-16463][SQL] Support `truncate` option in Overwrit...

2016-07-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14086 > `TRUNCATE` is the lower privilege than DROP/CREATE. This is not true. For example, in DB2 z, `DROP` and `TRUNCATE` require different privileges. https://www.ibm.com/support/knowledgecen

[GitHub] spark issue #14241: [SPARK-16596] [SQL] Refactor DataSourceScanExec to do pa...

2016-07-17 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14241 You should be able to add those filter constraints in FileDataSourceStrategy. I don't think it matters too much whether that code is located within buildScan(), or in the operator itself.

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14235 It's definitely a solution. In fact, I tried that first before making this PR; not completely, of course, but as a feasibility test. It has its own pros and cons, too. For example, i

[GitHub] spark pull request #14225: [WIP][SPARK-16334] Maintain single dictionary per...

2016-07-17 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/14225#discussion_r71092643 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java --- @@ -146,10 +153,8 @@ void readBatch(int to

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14243 **[Test build #62443 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62443/consoleFull)** for PR 14243 at commit [`4206104`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14243 Merged build finished. Test FAILed.

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14243 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62441/ Test FAILed.

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14243 **[Test build #62441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62441/consoleFull)** for PR 14243 at commit [`64f95a4`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14229: [SPARK-16447][ML][SparkR] LDA wrapper in SparkR

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14229 **[Test build #62442 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62442/consoleFull)** for PR 14229 at commit [`90dad9d`](https://github.com/apache/spark/commit/9

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71092452 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,7 +85,34 @@ class LogicalPlanToSQLSuite extends

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14243 **[Test build #62441 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62441/consoleFull)** for PR 14243 at commit [`64f95a4`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include j...

2016-07-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14243 cc @felixcheung

[GitHub] spark pull request #14243: [SPARK-10683][SPARK-16510][SPARKR] Move SparkR in...

2016-07-17 Thread shivaram
GitHub user shivaram opened a pull request: https://github.com/apache/spark/pull/14243 [SPARK-10683][SPARK-16510][SPARKR] Move SparkR include jar test to SparkSubmitSuite ## What changes were proposed in this pull request? This change moves the include jar test from R to Sp

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71092375 --- Diff: sql/hive/src/test/resources/sqlgen/agg1.sql --- @@ -0,0 +1,3 @@ +SELECT COUNT(value) FROM parquet_t1 GROUP BY key HAVING MAX(key) > 0

[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14176 **[Test build #62440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62440/consoleFull)** for PR 14176 at commit [`ecff4ff`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQLSuite t...

2016-07-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14235 @dongjoon-hyun can you take a look at @gatorsmile's suggestion to check optimized logical plan and see if it is feasible?

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71092270 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala --- @@ -76,7 +85,34 @@ class LogicalPlanToSQLSuite extends SQLBuild

[GitHub] spark pull request #14235: [SPARK-16590][SQL][TEST] Improve LogicalPlanToSQL...

2016-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14235#discussion_r71092240 --- Diff: sql/hive/src/test/resources/sqlgen/agg1.sql --- @@ -0,0 +1,3 @@ +SELECT COUNT(value) FROM parquet_t1 GROUP BY key HAVING MAX(key) > 0 +---

[GitHub] spark issue #14241: [SPARK-16596] [SQL] Refactor DataSourceScanExec to do pa...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14241 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62438/ Test FAILed.

[GitHub] spark issue #14241: [SPARK-16596] [SQL] Refactor DataSourceScanExec to do pa...

2016-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14241 Merged build finished. Test FAILed.

[GitHub] spark issue #14241: [SPARK-16596] [SQL] Refactor DataSourceScanExec to do pa...

2016-07-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14241 **[Test build #62438 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62438/consoleFull)** for PR 14241 at commit [`0d4642a`](https://github.com/apache/spark/commit/
