[GitHub] spark pull request: Minor fix: made EXPLAIN output to play well ...

2014-06-16 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1097#issuecomment-46251700 Thanks. I'm merging this one. The test that failed was a flume test that is sometimes flaky. --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request: Follow up of PR #1071 for Java API

2014-06-16 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1085#issuecomment-46252146 FYI This didn't get merged into branch-1.0. I did a manual cherry pick.

[GitHub] spark pull request: SPARK-1063 Add .sortBy(f) method on RDD

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/369#issuecomment-46343296 I will test this today.

[GitHub] spark pull request: SPARK-1063 Add .sortBy(f) method on RDD

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/369#issuecomment-46345169 This looks good to me. I will merge it.

[GitHub] spark pull request: SPARK-1293 [SQL] [WIP] Parquet support for nes...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/360#discussion_r13876519 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetConverter.scala --- @@ -0,0 +1,667 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1063 Add .sortBy(f) method on RDD

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/369#issuecomment-46348842 There was a conflict that I had to merge manually. Take a look at master to make sure everything is ok. I compiled it and ran a couple of things.

[GitHub] spark pull request: spark-submit: add exec at the end of the scrip...

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/858#issuecomment-46353884 Done.

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/999#issuecomment-46363656 Jenkins, retest this please.

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13891473 --- Diff: docs/sql-programming-guide.md --- @@ -91,14 +91,33 @@ of its decedents. To create a basic SQLContext, all you need is a SparkContext

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13891482 --- Diff: docs/sql-programming-guide.md --- @@ -91,14 +91,33 @@ of its decedents. To create a basic SQLContext, all you need is a SparkContext

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13891733 --- Diff: docs/sql-programming-guide.md --- @@ -297,50 +328,152 @@ JavaSchemaRDD teenagers = sqlCtx.sql(SELECT name FROM parquetFile WHERE age = div data

[GitHub] spark pull request: SPARK-1293 [SQL] [WIP] Parquet support for nes...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/360#discussion_r13892570 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetConverter.scala --- @@ -0,0 +1,667 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13892635 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -123,4 +125,53 @@ abstract class QueryPlan[PlanType <: TreeNode

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13892709 --- Diff: sql/core/pom.xml --- @@ -54,6 +61,11 @@ <version>${parquet.version}</version> </dependency> <dependency>

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13892874 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala --- @@ -99,6 +97,37 @@ class SQLContext(@transient val sparkContext: SparkContext

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13892881 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala --- @@ -99,6 +97,35 @@ class SQLContext(@transient val sparkContext: SparkContext

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13893161 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala --- @@ -342,13 +344,34 @@ class SchemaRDD( def toJavaSchemaRDD: JavaSchemaRDD = new

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13893257 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/json/JsonRDD.scala --- @@ -0,0 +1,399 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/999#discussion_r13893273 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/json/JsonRDD.scala --- @@ -0,0 +1,399 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [Spark 2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/999#issuecomment-46380653 This looks good to me overall. Only a few nitpicks. I think we should merge it after you have addressed the couple of comments I had.

[GitHub] spark pull request: SPARK-2170: Fix for global name 'PIPE' is not ...

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1109#issuecomment-46383100 Thanks. Merging this in master and branch-1.0.

[GitHub] spark pull request: SPARK-2170: Fix for global name 'PIPE' is not ...

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1109#issuecomment-46383668 Actually the merge script failed for this pull request. @pwendell any idea? ``` ./merge_spark_pr.py Which pull request would you like to merge? (e.g. 34): 1109

[GitHub] spark pull request: [SPARK-2060][SQL] Querying JSON Datasets with ...

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/999#issuecomment-46389105 Thanks. I'm merging this in master and branch-1.0.

[GitHub] spark pull request: SPARK-2170: Fix for global name 'PIPE' is not ...

2014-06-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1109#issuecomment-46398768 @gregakespret since this has been fixed already in master, do you mind closing this pr?

[GitHub] spark pull request: SPARK-2170: Fix for global name 'PIPE' is not ...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1109#issuecomment-46399205 Yup looks like a race condition (in a good way). Thanks a lot for catching this!

[GitHub] spark pull request: Compression should be a setting for individual...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1091#issuecomment-46405754 Thanks for working on this, @ScrapCodes. I talked with Matei and while we both agree compression would be better set on a per-RDD basis, adding another boolean flag

[GitHub] spark pull request: Minor fix

2014-06-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1105#discussion_r13903361 --- Diff: core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala --- @@ -91,8 +91,13 @@ private[spark] object MetadataCleaner { conf.set

[GitHub] spark pull request: [SPARK-2162] Double check in doGetLocal to avo...

2014-06-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1103#discussion_r13903517 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -363,6 +363,12 @@ private[spark] class BlockManager( val info

[GitHub] spark pull request: [SPARK-2162] Double check in doGetLocal to avo...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1103#issuecomment-46406863 This LGTM actually. Makes sense to do another check within the synchronized block in case a block is being removed by another thread.
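The double check mentioned in this comment can be illustrated with a minimal sketch. It assumes a hypothetical `BlockRegistry` and `BlockInfo`, not Spark's actual `BlockManager` code: the first lookup happens without a lock, so the entry is looked up again once the lock is held, because another thread may have removed the block in between.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical stand-ins for illustration only; not Spark's BlockManager API.
final class BlockInfo(val data: String)

final class BlockRegistry {
  private val blocks = TrieMap.empty[String, BlockInfo]

  def put(id: String, data: String): Unit =
    blocks.put(id, new BlockInfo(data))

  // Removal takes the same per-block lock that readers use.
  def remove(id: String): Unit =
    blocks.get(id).foreach { info =>
      info.synchronized { blocks.remove(id) }
    }

  def getLocal(id: String): Option[String] =
    blocks.get(id).flatMap { info =>        // first check, no lock held
      info.synchronized {
        // Second check: the block may have been removed between the
        // first lookup and the moment this thread acquired the lock.
        if (blocks.get(id).exists(_ eq info)) Some(info.data) else None
      }
    }
}
```

Without the second check, a reader that obtained `info` just before a concurrent `remove` could go on to use a block that no longer exists.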

[GitHub] spark pull request: Fix for Spark-2151

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1095#issuecomment-46407129 Jenkins, test this please.

[GitHub] spark pull request: Fix for Spark-2151

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1095#issuecomment-46407125 Do you mind updating the pull request title to say something like "[SPARK-2151] Recognize memory format for spark-submit"?

[GitHub] spark pull request: SPARK-2038: rename conf parameters in the sa...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1087#issuecomment-46408239 Just leaving a note that this pr has been reverted because changing the parameter name in Scala could break source compatibility for the function ...
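A small sketch of why a parameter rename can break source compatibility in Scala. The names `SaveApi`, `save` and `conf` here are hypothetical and are not the functions touched by this PR: because Scala allows named arguments, a parameter name is part of the source-level API.

```scala
object SaveApi {
  // Hypothetical method; callers may refer to `conf` by name.
  def save(path: String, conf: Map[String, String] = Map.empty): Int =
    conf.size
}

object Caller {
  // This call compiles only while the parameter is called `conf`.
  // If the library renamed it, this line would fail to compile,
  // even though the JVM method signature would be unchanged.
  val n: Int = SaveApi.save("/tmp/out", conf = Map("k" -> "v"))
}
```

Already compiled callers would keep working after such a rename, since parameter names do not appear in the bytecode signature; only recompilation of code that uses named arguments is affected.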

[GitHub] spark pull request: [SPARK-2176][SQL] Extra unnecessary exchange o...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1116#issuecomment-46469732 That's not a bad idea. Also we should add more documentation. While Spark SQL code in general is extremely concise, it can be hard to understand (especially the optimizer

[GitHub] spark pull request: [SPARK-2176][SQL] Extra unnecessary exchange o...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1116#issuecomment-46469830 Thanks. Merging this in master and branch-1.0.

[GitHub] spark pull request: [SPARK-2162] Double check in doGetLocal to avo...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1103#issuecomment-46470613 Thanks. I'm merging this in master.

[GitHub] spark pull request: Updated the comment for SPARK-2162.

2014-06-18 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1117 Updated the comment for SPARK-2162. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark SPARK-2162 Alternatively you can review and apply

[GitHub] spark pull request: SPARK-2038: rename conf parameters in the sa...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1087#issuecomment-46477205 That's a very good idea. We should probably have an API-breaking label on JIRA.

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r13936033 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/hiveOperators.scala --- @@ -445,7 +445,19 @@ case class NativeCommand

[GitHub] spark pull request: [SPARK-2151] Recognize memory format for spark...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1095#issuecomment-46484800 Jenkins, retest this please.

[GitHub] spark pull request: Updated the comment for SPARK-2162.

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1117#issuecomment-46484824 Merged in master.

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r13937999 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/hiveOperators.scala --- @@ -445,7 +445,19 @@ case class NativeCommand

[GitHub] spark pull request: Remove unicode operator from RDD.scala

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1119#issuecomment-46492402 @ash211 ...

[GitHub] spark pull request: Remove unicode operator from RDD.scala

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1119#issuecomment-46500211 Thanks. I've merged this.

[GitHub] spark pull request: SPARK-2038: rename conf parameters in the sa...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1087#issuecomment-46500185 Yup I added the api-breaking label to the ticket.

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-46500308 That test is flaky and being fixed right now.

[GitHub] spark pull request: [SPARK-2184][SQL] AddExchange isn't idempotent

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1122#issuecomment-46512587 I'm merging this in master and branch-1.0. Thanks.

[GitHub] spark pull request: [SPARK-2187] Explain should not run the optimi...

2014-06-18 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1123 [SPARK-2187] Explain should not run the optimizer twice. @yhuai @marmbrus You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark explain

[GitHub] spark pull request: [SPARK-2187] Explain should not run the optimi...

2014-06-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1123#discussion_r13949881 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala --- @@ -71,16 +72,24 @@ case class SetCommand

[GitHub] spark pull request: [SPARK-2187] Explain should not run the optimi...

2014-06-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1123#issuecomment-46525610 Ok I am merging this in master and branch-1.0.

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r13954580 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -17,7 +17,7 @@ package org.apache.spark.sql.hive

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r13954638 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/commands.scala --- @@ -60,3 +60,16 @@ case class ExplainCommand(plan

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r13954626 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala --- @@ -257,6 +250,88 @@ class HiveQuerySuite extends

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r13954661 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -81,6 +81,20 @@ private[hive] trait HiveStrategies { def apply

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r13954704 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -362,13 +367,19 @@ private[hive] object HiveQl

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r13954716 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -362,13 +367,19 @@ private[hive] object HiveQl

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1118#issuecomment-46534190 hmmm a lot of tests are failing because the output doesn't exactly match Hive's ...

[GitHub] spark pull request: SPARK-1293 [SQL] [WIP] Parquet support for nes...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/360#issuecomment-46593227 That test has been flaky. We are fixing it.

[GitHub] spark pull request: SPARK-1293 [SQL] [WIP] Parquet support for nes...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/360#issuecomment-46593240 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2196] [SQL] Fix nullability of CaseWhen...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1133#issuecomment-46611219 @concretevitamin

[GitHub] spark pull request: SPARK-1293 [SQL] [WIP] Parquet support for nes...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/360#issuecomment-46612027 @AndreSchumacher do you mind removing the [WIP] tag from the pull request? Unfortunately due to the avro version bump, we can't include this in 1.0.1.

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46612533 Any idea why the having test from Hive is not runnable?

[GitHub] spark pull request: A few minor Spark SQL Scaladoc fixes.

2014-06-19 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1139 A few minor Spark SQL Scaladoc fixes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark sparksqldoc Alternatively you can review

[GitHub] spark pull request: [SPARK-2191][SQL] Make sure InsertIntoHiveTabl...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1129#issuecomment-46617536 I've merged this in master and branch-1.0.

[GitHub] spark pull request: A few minor Spark SQL Scaladoc fixes.

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1139#issuecomment-46619076 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r14001780 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/commands.scala --- @@ -60,3 +60,23 @@ case class ExplainCommand(plan

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1118#discussion_r14002250 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveComparisonTest.scala --- @@ -144,6 +144,10 @@ abstract class HiveComparisonTest

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46634535 Thanks, @willb. There is at least one problem I found. - I think you'd need to add a cast to the having expression. Otherwise try running the following: ```select key, count

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46635173 To be more specific, I think you can always add a cast that casts the having expression to boolean, and then we have SimplifyCasts in the optimizer that would remove
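The idea of always adding a boolean cast and letting the optimizer drop the redundant ones can be sketched on a toy expression tree. The types below are hypothetical simplifications and are not Catalyst's actual classes or its `SimplifyCasts` rule: a cast is removed whenever its child already has the target type.

```scala
// Toy model for illustration only; not Catalyst's real expression classes.
sealed trait DataType
case object BooleanType extends DataType
case object IntegerType extends DataType

sealed trait Expr { def dataType: DataType }
case class Column(name: String, dataType: DataType) extends Expr
case class Cast(child: Expr, dataType: DataType) extends Expr

object SimplifyCastsSketch {
  // Bottom-up: simplify the child first, then drop this cast
  // if the child already produces the target type.
  def apply(e: Expr): Expr = e match {
    case Cast(child, target) =>
      val simplified = apply(child)
      if (simplified.dataType == target) simplified
      else Cast(simplified, target)
    case other => other
  }
}
```

Under this sketch, wrapping an already boolean predicate in a cast costs nothing after optimization, while a non-boolean having expression keeps exactly one cast.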

[GitHub] spark pull request: A few minor Spark SQL Scaladoc fixes.

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1139#issuecomment-46636241 Ok merging this in master and branch-1.0.

[GitHub] spark pull request: More minor scaladoc cleanup for Spark SQL.

2014-06-19 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1142 More minor scaladoc cleanup for Spark SQL. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark sqlclean Alternatively you can review

[GitHub] spark pull request: [SPARK-2209][SQL] Cast shouldn't do null check...

2014-06-19 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1143 [SPARK-2209][SQL] Cast shouldn't do null check twice. Also took the chance to clean up cast a little bit. Too many arrows on each line before! You can merge this pull request into a Git repository

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46642243 That's definitely a bug - I will take a look at it later.

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46644761 I found the issue and fixed it. Will push out a pull request soon. If you can just add the boolean cast (always add it - no need to check if the type is already

[GitHub] spark pull request: SPARK-1293 [SQL] [WIP] Parquet support for nes...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/360#issuecomment-46644798 That sounds good. If you can just comment that test out for now, that'd be great.

[GitHub] spark pull request: More minor scaladoc cleanup for Spark SQL.

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1142#issuecomment-46646143 Ok merging this in master and branch-1.0.

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46646244 Here's the patch: https://github.com/apache/spark/pull/1144

[GitHub] spark pull request: [SPARK-2210] boolean cast on boolean value sho...

2014-06-19 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1144 [SPARK-2210] boolean cast on boolean value should be removed. ``` explain select cast(cast(key=0 as boolean) as boolean) aaa from src ``` should be ``` [Physical execution plan

[GitHub] spark pull request: [SPARK-2209][SQL] Cast shouldn't do null check...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1143#discussion_r14007468 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -104,85 +121,118 @@ case class Cast(child: Expression

[GitHub] spark pull request: [SPARK-2209][SQL] Cast shouldn't do null check...

2014-06-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1143#discussion_r14007471 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -24,72 +24,89 @@ import org.apache.spark.sql.catalyst.types

[GitHub] spark pull request: [SPARK-2218] rename Equals to EqualsTo in Spar...

2014-06-19 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1146 [SPARK-2218] rename Equals to EqualsTo in Spark SQL expressions. Due to the existence of scala.Equals, it is very error prone to name the expression Equals, especially because we use a lot of partial

[GitHub] spark pull request: SparkSQL add SkewJoin

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-46648420 Do you mind reformatting the code to match the Spark coding style? https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide

[GitHub] spark pull request: [SQL] Improve Speed of InsertIntoHiveTable

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1130#issuecomment-46648984 Merging this in master and branch-1.0.

[GitHub] spark pull request: [SPARK-2177][SQL] describe table result contai...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1118#issuecomment-46649113 Ok I'm merging this in master and branch-1.0. Thanks!

[GitHub] spark pull request: SPARK-1293 [SQL] Parquet support for nested ty...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/360#issuecomment-46649473 Ok I'm going to merge this in master and branch-1.0 now. Kinda scary but the change is very isolated.

[GitHub] spark pull request: [SPARK-2185] Emit warning when task size excee...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1149#issuecomment-46649825 ``` error file=/home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala message=File must end with newline

[GitHub] spark pull request: [SPARK-2210] cast to boolean on boolean value ...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1144#issuecomment-46650094 Ok merging this in master and branch-1.0.

[GitHub] spark pull request: SPARK-2203: PySpark defaults to use same num r...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1138#issuecomment-46650603 Merging this in master

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46650863 BTW I really want this to go into 1.0.1, which will probably have a release candidate soon. So if you have a chance to rebase your PR and add the cast, please do. Thanks

[GitHub] spark pull request: [SPARK-2196] [SQL] Fix nullability of CaseWhen...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1133#issuecomment-46650936 Thanks. I'm merging this in master & branch-1.0. The test failure is not related to this change.

[GitHub] spark pull request: [SPARK-2209][SQL] Cast shouldn't do null check...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1143#issuecomment-46650243 Ok merging this in master & branch-1.0.

[GitHub] spark pull request: [SPARK-2218] rename Equals to EqualTo in Spark...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1146#issuecomment-46651438 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2218] rename Equals to EqualTo in Spark...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1146#issuecomment-46652247 Ok merging this in master & branch-1.0.

[GitHub] spark pull request: [SPARK-1412][SQL] Disable partial aggregation ...

2014-06-20 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1152 [SPARK-1412][SQL] Disable partial aggregation automatically when reduction factor is low - WIP This is just a prototype. Kinda ugly, doesn't properly connect with the config system yet, and have

[GitHub] spark pull request: [SPARK-1412][SQL] Disable partial aggregation ...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1152#issuecomment-46654388 @concretevitamin I find it hard to actually use config options in a physical operator. Any suggestions?

[GitHub] spark pull request: [SPARK-1412][SQL] Disable partial aggregation ...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1152#issuecomment-46654585 @pwendell / @mateiz should we actually build this into Spark directly (i.e. in Aggregator)?

[GitHub] spark pull request: [SPARK-2219][SQL] Fix add jar to execute with ...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1154#issuecomment-46706270 This needs to call Spark's addJar, doesn't it?

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46724012 I'm going to merge this in master & branch-1.0. I will create a separate ticket to track progress on HAVING. Basically there are two things missing: 1. HAVING

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46725494 BTW two follow up tickets created: https://issues.apache.org/jira/browse/SPARK-2225 https://issues.apache.org/jira/browse/SPARK-2226 Let me know

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46725451 There are databases that support that, and it seems to me a very simple change (actually just removing the check code you added is probably enough).

[GitHub] spark pull request: SPARK-2180: support HAVING clauses in Hive que...

2014-06-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1136#issuecomment-46726272 I actually did 2225 already. I will assign 2226 to you. Thanks!
