[GitHub] spark pull request: [SPARK-6806] [SPARKR] [DOCS] Add a new SparkR ...

2015-05-29 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/6490#discussion_r31304628 --- Diff: docs/sparkr.md --- @@ -0,0 +1,198 @@ +--- +layout: global +displayTitle: SparkR (R on Spark) +title: SparkR (R on Spark

[GitHub] spark pull request: [SPARK-7512] [SPARKR] Fix RDD's show method to...

2015-05-09 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/6035#issuecomment-100564609 LGTM! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-6991] [SparkR] Adds support for zipPart...

2015-04-27 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5568#issuecomment-96814334 LGTM. /cc @shivaram

[GitHub] spark pull request: [SPARK-6991] [SparkR] Adds support for zipPart...

2015-04-27 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5568#issuecomment-96751034 @hlin09 - cool, I don't have a strong opinion towards either approach for now. Did a pass and left some minor style comments.

[GitHub] spark pull request: [SPARK-6991] [SparkR] Adds support for zipPart...

2015-04-27 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/5568#discussion_r29169055 --- Diff: R/pkg/R/RDD.R --- @@ -1529,3 +1529,50 @@ setMethod(zipRDD, PipelinedRDD(zippedRDD, partitionFunc

[GitHub] spark pull request: [SPARK-6991] [SparkR] Adds support for zipPart...

2015-04-27 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/5568#discussion_r29168961 --- Diff: R/pkg/R/RDD.R --- @@ -1529,3 +1529,50 @@ setMethod(zipRDD, PipelinedRDD(zippedRDD, partitionFunc

[GitHub] spark pull request: [SPARK-6991] [SparkR] Adds support for zipPart...

2015-04-27 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/5568#discussion_r29169169 --- Diff: R/pkg/R/RDD.R --- @@ -1529,3 +1529,50 @@ setMethod(zipRDD, PipelinedRDD(zippedRDD, partitionFunc

[GitHub] spark pull request: [SPARK-7033][SPARKR] Clean usage of split. Use...

2015-04-23 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5628#issuecomment-95651218 Thanks, @sun-rui - doing a grep, could you also update test_rdd.R's line 124?

[GitHub] spark pull request: [SPARK-7033][SPARKR] Clean usage of split. Use...

2015-04-23 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5628#issuecomment-95775125 LGTM

[GitHub] spark pull request: [SPARK-6991] [SparkR] Adds support for zipPart...

2015-04-20 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5568#issuecomment-94624336 Thanks @hlin09 - one high-level question: do you have an intended use case in mind for arbitrary / large arity functions? In Spark's Scala API up to 4 RDDs

[GitHub] spark pull request: [Minor][SparkR] Minor refactor and removes red...

2015-04-13 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5495#issuecomment-92477984 Jenkins, test this please.

[GitHub] spark pull request: [Minor][SparkR] Minor refactor and removes red...

2015-04-13 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5495#issuecomment-92479065 Hey @pwendell I can't seem to summon Jenkins now :(

[GitHub] spark pull request: [Minor][SparkR] Minor refactor and removes red...

2015-04-13 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5495#issuecomment-92589456 LGTM

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-09 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5096#issuecomment-91123979 :+1:

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR into Apache Spar...

2015-03-18 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/5077#issuecomment-83271348 @shivaram What's the timeline we are looking at? If possible I'd like to take a close look at this next week / after next week.

[GitHub] spark pull request: [SPARK-3609][SQL] Adds sizeInBytes statistics ...

2014-09-19 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/2468#discussion_r17817992 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala --- @@ -122,6 +122,16 @@ object NativeType

[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

2014-08-31 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/2220#issuecomment-53995569 ok to test

[GitHub] spark pull request: SPARK-3329: [SQL] Don't depend on Hive SET pai...

2014-08-31 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/2220#issuecomment-53995880 Ah, so this problem *was* fixed in PR #1514, but it seems like this [PR](https://github.com/apache/spark/commit/a7a9d14479ea6421513a962ff0f45cb969368bab#diff-48

[GitHub] spark pull request: [SPARK-3252][SQL] Add missing condition for te...

2014-08-27 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/2159#issuecomment-53635984 This is not harmful yet it doesn't do anything either, since in the test query the two relations should be the same. On Wednesday, August 27, 2014

[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-07 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1839#issuecomment-51520540 Jenkins, this is okay to test.

[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

2014-08-07 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1801#issuecomment-51533012 @liancheng @andrewor14 @pwendell With this patch things like `./bin/spark-shell --master local[2]` error out (bad options: --master). I had to work around

[GitHub] spark pull request: [SPARK-2678][Core][SQL] A workaround for SPARK...

2014-08-07 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1801#issuecomment-51533306 Oh, it's been reported by #1825.

[GitHub] spark pull request: [SPARK-2406][SQL] Initial support for using Pa...

2014-08-06 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1819#discussion_r15913270 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala --- @@ -78,6 +78,14 @@ class HiveContext(sc: SparkContext) extends

[GitHub] spark pull request: [SPARK-2406][SQL] Initial support for using Pa...

2014-08-06 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1819#discussion_r15914067 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala --- @@ -78,6 +78,14 @@ class HiveContext(sc: SparkContext) extends

[GitHub] spark pull request: [SPARK-2860][SQL] Fix coercion of CASE WHEN.

2014-08-05 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1785#discussion_r15828435 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -336,28 +338,33 @@ trait HiveTypeCoercion

[GitHub] spark pull request: [SPARK-2860][SQL] Fix coercion of CASE WHEN.

2014-08-05 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1785#discussion_r15828596 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -336,28 +338,33 @@ trait HiveTypeCoercion

[GitHub] spark pull request: [SPARK-2860][SQL] Fix coercion of CASE WHEN.

2014-08-05 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1785#issuecomment-51232808 A few minor comments otherwise LGTM!

[GitHub] spark pull request: Set Spark SQL Hive compatibility test shuffle ...

2014-08-05 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1784#issuecomment-51234268 Can we reset the original value in afterAll()? There's a test in `SQLConfSuite` that depends on that option, and in the future people might easily add tests

[GitHub] spark pull request: [SQL] Tighten the visibility of various SQLCon...

2014-08-05 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1794#discussion_r15856595 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -17,17 +17,17 @@ package org.apache.spark.sql +import

[GitHub] spark pull request: [SQL] Tighten the visibility of various SQLCon...

2014-08-05 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1794#issuecomment-51295283 Hey @rxin -- I think this is good to go.

[GitHub] spark pull request: [SPARK-2179] [SQL] Public API for DataTypes an...

2014-08-04 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1774#discussion_r15790429 --- Diff: docs/sql-programming-guide.md --- @@ -152,6 +152,41 @@ val teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age

[GitHub] spark pull request: [SPARK-2179] [SQL] Public API for DataTypes an...

2014-08-04 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1774#discussion_r15790441 --- Diff: docs/sql-programming-guide.md --- @@ -152,6 +152,41 @@ val teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age

[GitHub] spark pull request: [SPARK-2179] [SQL] Public API for DataTypes an...

2014-08-04 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1774#discussion_r15790528 --- Diff: docs/sql-programming-guide.md --- @@ -225,6 +260,54 @@ List<String> teenagerNames = teenagers.map(new Function<Row, String

[GitHub] spark pull request: [SPARK-2179] [SQL] Public API for DataTypes an...

2014-08-04 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1774#discussion_r15790538 --- Diff: docs/sql-programming-guide.md --- @@ -259,6 +342,40 @@ for teenName in teenNames.collect(): print teenName {% endhighlight

[GitHub] spark pull request: [SPARK-2783][SQL] Basic support for analyze in...

2014-08-03 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1741#discussion_r15733255 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala --- @@ -21,12 +21,15 @@ import java.io.{BufferedReader, File

[GitHub] spark pull request: [SPARK-2783][SQL] Basic support for analyze in...

2014-08-03 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1741#discussion_r15733260 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -280,7 +281,7 @@ private[hive] case class

[GitHub] spark pull request: [SPARK-2783][SQL] Basic support for analyze in...

2014-08-03 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1741#discussion_r15733265 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -280,7 +281,7 @@ private[hive] case class

[GitHub] spark pull request: [SPARK-2783][SQL] Basic support for analyze in...

2014-08-03 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1741#discussion_r15733274 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala --- @@ -92,6 +95,64 @@ class HiveContext(sc: SparkContext) extends

[GitHub] spark pull request: [SPARK-2783][SQL] Basic support for analyze in...

2014-08-03 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1741#discussion_r15733277 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala --- @@ -92,6 +95,64 @@ class HiveContext(sc: SparkContext) extends

[GitHub] spark pull request: [SPARK-2675] Increase EVENT_QUEUE_CAPACITY by ...

2014-08-02 Thread concretevitamin
Github user concretevitamin closed the pull request at: https://github.com/apache/spark/pull/1579

[GitHub] spark pull request: [SPARK-2316] Avoid O(blocks) operations in lis...

2014-08-01 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1679#issuecomment-50946639 How many listeners are used in these benchmarks?

[GitHub] spark pull request: [SPARK-2531 SPARK-2436] [SQL] Optimize the B...

2014-07-31 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1448#issuecomment-50807987 I have rebased and made changes according to the previous review comments. Also updated the title and description of the PR, combining two JIRA tickets

[GitHub] spark pull request: [SPARK-2316] Avoid O(blocks) operations in lis...

2014-07-31 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1679#issuecomment-50827351 This patch is much appreciated -- thanks for working on this! On Thu, Jul 31, 2014 at 3:02 PM, Apache Spark QA notificati...@github.com wrote

[GitHub] spark pull request: [SPARK-2531] [SQL] Make BroadcastNestedLoopJoi...

2014-07-30 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1448#discussion_r15599752 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins.scala --- @@ -332,71 +342,88 @@ case class BroadcastNestedLoopJoin

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-29 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-50522265 Rebased and addressed review comments.

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-29 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-50522286 Jenkins test this please

[GitHub] spark pull request: [SPARK-2179][SQL] Public API for DataTypes and...

2014-07-28 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1346#issuecomment-50423690 @yhuai @marmbrus I am not sure if this has been discussed before, but what do you guys think about adding a version of `applySchema(RDD[Array[String

[GitHub] spark pull request: [SPARK-2179][SQL] Public API for DataTypes and...

2014-07-28 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1346#issuecomment-50423851 To add to this: for my own purpose, I can certainly hack something together based off this branch in a custom Spark build, but just want to throw this thought

[GitHub] spark pull request: [SPARK-2410][SQL] Merging Hive Thrift/JDBC ser...

2014-07-26 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1600#issuecomment-50224519 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2674] [SQL] [PySpark] support datetime ...

2014-07-26 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1601#discussion_r15434945 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala --- @@ -395,6 +395,11 @@ class SchemaRDD

[GitHub] spark pull request: [SPARK-2675] Increase EVENT_QUEUE_CAPACITY by ...

2014-07-25 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1579#issuecomment-50183155 Unfortunately I may not be able to find time to run an experiment soon. If anyone is interested, I think `org.apache.spark.util.SizeEstimator` is reasonable

[GitHub] spark pull request: [SPARK-2675] Increase EVENT_QUEUE_CAPACITY by ...

2014-07-24 Thread concretevitamin
GitHub user concretevitamin opened a pull request: https://github.com/apache/spark/pull/1579 [SPARK-2675] Increase EVENT_QUEUE_CAPACITY by 20x. JIRA ticket: https://issues.apache.org/jira/browse/SPARK-2675 @pwendell @andrewor You can merge this pull request into a Git

[GitHub] spark pull request: [SPARK-2675] Increase EVENT_QUEUE_CAPACITY by ...

2014-07-24 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1579#issuecomment-50065677 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2531] [SQL] Make BroadcastNestedLoopJoi...

2014-07-23 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1448#issuecomment-49911126 Thanks for the comments @chenghao-intel and @marmbrus. As Michael said I'll revisit this after the codegen PR.

[GitHub] spark pull request: [WIP][SPARK-2054][SQL] Code Generation for Exp...

2014-07-23 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/993#discussion_r15311508 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateProjection.scala --- @@ -0,0 +1,218

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-22 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1238#discussion_r15243150 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -66,8 +66,8 @@ abstract class SparkPlan extends QueryPlan

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-22 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1238#discussion_r15243796 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala --- @@ -26,6 +26,26 @@ import

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-22 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-49785200 I have addressed the latest round of review comments and rebased onto latest master.

[GitHub] spark pull request: Fix flakey HiveQuerySuite test

2014-07-21 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1514#issuecomment-49667765 Thanks for the fix. Looks good to me.

[GitHub] spark pull request: [SPARK-2561][SQL] Fix apply schema

2014-07-21 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1470#issuecomment-49686947 LGTM.

[GitHub] spark pull request: [SPARK-2523] [SQL] [WIP] Hadoop table scan bug...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1439#discussion_r15013611 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -241,4 +252,37 @@ private[hive] object HadoopTableReader

[GitHub] spark pull request: [SPARK-2190][SQL] Specialized ColumnType for T...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1440#discussion_r15018072 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnStats.scala --- @@ -344,21 +344,52 @@ private[sql] class StringColumnStats

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1238#discussion_r15019589 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala --- @@ -26,6 +26,28 @@ import

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-49210466 Jenkins, test this please. I think I have addressed the latest round of review comments, with the biggest changes being: - Remove statistics

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-49219047 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1238#discussion_r15024063 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala --- @@ -26,6 +26,28 @@ import

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-49222667 Jenkins, test this please

[GitHub] spark pull request: [SPARK-2531] [SQL] Make BroadcastNestedLoopJoi...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1448#issuecomment-49227975 Jenkins, this is okay to test.

[GitHub] spark pull request: [SQL] Add HiveDecimal HiveVarchar support in...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1436#issuecomment-49228593 Hey @chenghao-intel -- can you create a JIRA ticket for this?

[GitHub] spark pull request: [SPARK-2531] [SQL] Make BroadcastNestedLoopJoi...

2014-07-16 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1448#issuecomment-49239568 Jenkins, test this please...

[GitHub] spark pull request: [SQL] Synchronize on a lock when using scala r...

2014-07-15 Thread concretevitamin
GitHub user concretevitamin opened a pull request: https://github.com/apache/spark/pull/1423 [SQL] Synchronize on a lock when using scala reflection inside data type objects. You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] spark pull request: [SQL] Synchronize on a lock when using scala r...

2014-07-15 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1423#issuecomment-49105217 Jenkins, add to whitelist please.

[GitHub] spark pull request: [SQL] Synchronize on a lock when using scala r...

2014-07-15 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1423#issuecomment-49105361 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
Github user concretevitamin closed the pull request at: https://github.com/apache/spark/pull/1390

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1390#issuecomment-48935494 @yhuai suggested a much simpler fix -- I benchmarked this and it gave the same performance boost. I am closing this and opening a new PR.

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
GitHub user concretevitamin opened a pull request: https://github.com/apache/spark/pull/1408 [SPARK-2443][SQL] Fix slow read from partitioned tables This fix obtains a comparable performance boost as [PR #1390](https://github.com/apache/spark/pull/1390) by moving an array update

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1390#issuecomment-48936743 New PR here: #1408

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1408#issuecomment-48936856 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1390#discussion_r14894946 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -157,21 +161,60 @@ class HadoopTableReader(@transient _tableDesc

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-48941276 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1408#issuecomment-48954674 I think we should ask the users who reported the performance issue if this fix solves their problems. Otherwise the comments in the previous PR seem to only

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1390#discussion_r14902569 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -157,21 +161,60 @@ class HadoopTableReader(@transient _tableDesc

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1238#discussion_r14902843 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -263,6 +268,19 @@ private[hive] case class

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1390#discussion_r14902570 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -157,21 +161,60 @@ class HadoopTableReader(@transient _tableDesc

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1238#discussion_r14904711 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -47,6 +47,13 @@ private[sql] abstract class

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-14 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1238#discussion_r14906952 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala --- @@ -92,6 +114,8 @@ abstract class

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-13 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1238#discussion_r14857017 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -21,16 +21,27 @@ import java.util.Properties import

[GitHub] spark pull request: [SPARK-2443][SQL] Fix slow read from partition...

2014-07-12 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1390#issuecomment-48830080 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2393][SQL] Prototype implementation of ...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-48570761 I have addressed most of the review comments and rebased. Please review again and take a look at the TODOs. It might be better to leave the optimization

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-48630836 @rxin sure, done.

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1359#discussion_r14781579 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -19,8 +19,11 @@ package

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1359#discussion_r14781690 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -207,3 +210,64 @@ case class

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1359#discussion_r14781542 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -207,3 +210,64 @@ case class

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1359#discussion_r14782267 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -860,6 +860,7 @@ private[hive] object HiveQl { val BETWEEN

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1359#discussion_r14782583 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -207,3 +210,64 @@ case class

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1359#discussion_r14782682 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -207,3 +210,64 @@ case class

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1359#issuecomment-48642397 Hey @willb - thanks for working on this, which is going to be very useful for Spark SQL. I left a couple minor comments. Another general concern

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on a diff in the pull request: https://github.com/apache/spark/pull/1359#discussion_r14783724 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -207,3 +210,64 @@ case class

[GitHub] spark pull request: [SPARK-2393][SQL] Cost estimation optimization...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1238#issuecomment-48645064 To handle potential overflow (one last TODO), I think there are a couple alternatives: - A: Throw exceptions for overflowing operations. Similar to [1

[GitHub] spark pull request: SPARK-2407: Added internal implementation of S...

2014-07-10 Thread concretevitamin
Github user concretevitamin commented on the pull request: https://github.com/apache/spark/pull/1359#issuecomment-48648864 This sounds pretty cool!
