[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread zhzhan
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18433803 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -80,8 +81,10 @@ class StatisticsSuite extends QueryTest with BeforeAn

[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread zhzhan
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18433810 --- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala --- @@ -0,0 +1,158 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread zhzhan
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18433827 --- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim.scala --- @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread zhzhan
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18433838 --- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala --- @@ -0,0 +1,158 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread zhzhan
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18433848 --- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim.scala --- @@ -0,0 +1,158 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-3721] [PySpark] broadcast objects large...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2659#issuecomment-57929753 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21311/consoleFull) for PR 2659 at commit [`1c2d928`](https://github.com/a

[GitHub] spark pull request: [SPARK-3721] [PySpark] broadcast objects large...

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2659#issuecomment-57929754 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/2

[GitHub] spark pull request: SPARK-1656: Fix potential resource leaks

2014-10-05 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/577#discussion_r18433883 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -73,7 +73,21 @@ private[spark] class DiskStore(blockManager: BlockManager, diskMa

[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...

2014-10-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2513#discussion_r18434155 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -183,17 +184,15 @@ class SqlParser extends StandardTokenParsers w

[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread scwf
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18434383 --- Diff: pom.xml --- @@ -1260,7 +1259,18 @@ - + + hive-default + + +

[GitHub] spark pull request: [SPARK-3802][BUILD] Scala version is wrong in ...

2014-10-05 Thread sarutak
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/2661 [SPARK-3802][BUILD] Scala version is wrong in dev/audit-release/blank_sbt_build/build.sbt In dev/audit-release/blank_sbt_build/build.sbt, scalaVersion indicates 2.9.3 but I think 2.10.4 is correct.

[GitHub] spark pull request: [SPARK-3802][BUILD] Scala version is wrong in ...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2661#issuecomment-57932586 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21312/consoleFull) for PR 2661 at commit [`58a9666`](https://github.com/ap

[GitHub] spark pull request: [SPARK-3793][SQL]use HiveConf/Context when par...

2014-10-05 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2655#issuecomment-57932879 It should be fixed in https://github.com/apache/spark/pull/2241, so close this --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] spark pull request: [SPARK-3793][SQL]use HiveConf/Context when par...

2014-10-05 Thread scwf
Github user scwf closed the pull request at: https://github.com/apache/spark/pull/2655

[GitHub] spark pull request: [SPARK-3802][BUILD] Scala version is wrong in ...

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2661#issuecomment-57933979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/2

[GitHub] spark pull request: [SPARK-3802][BUILD] Scala version is wrong in ...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2661#issuecomment-57933977 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21312/consoleFull) for PR 2661 at commit [`58a9666`](https://github.com/a

[GitHub] spark pull request: [SPARK-3765][Doc] add testing with sbt to docs

2014-10-05 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2629#issuecomment-57936350 Since this is a doc change only, the test failure must be spurious and I think it's ignorable. (Although you might break your long line across two lines if you change the

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/2662 SPARK-3794 [CORE] Building spark core fails due to inadvertent dependency on Commons IO Remove references to Commons IO FileUtils and replace with pure Java version, which doesn't need to traverse t
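The description above is cut off, but it mentions replacing Commons IO `FileUtils` with a pure Java version. As a rough sketch only, assuming the goal is to check whether a directory contains anything modified after a cutoff and to return as soon as one match is found, such a helper could look like the following. The class and method names are hypothetical and are not taken from the pull request.

```java
import java.io.File;

// Hypothetical helper for illustration; not the code from the pull request.
final class NewFileCheck {
    private NewFileCheck() {}

    /**
     * Returns true if any file or directory below dir was modified
     * after cutoffMillis (epoch milliseconds). Returns at the first match
     * instead of listing the whole tree first.
     */
    static boolean containsEntryNewerThan(File dir, long cutoffMillis) {
        File[] children = dir.listFiles(); // null if dir is not a readable directory
        if (children == null) {
            return false;
        }
        for (File child : children) {
            if (child.lastModified() > cutoffMillis) {
                return true;
            }
            if (child.isDirectory() && containsEntryNewerThan(child, cutoffMillis)) {
                return true;
            }
        }
        return false;
    }
}
```

Only `java.io.File` is used, so the sketch adds no dependency beyond the JDK.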

[GitHub] spark pull request: [SPARK-3801] More efficient app dir cleanup

2014-10-05 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2660#issuecomment-57937489 I think this may be subsumed in https://github.com/apache/spark/pull/2662

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2662#issuecomment-57937669 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21313/consoleFull) for PR 2662 at commit [`4cd172f`](https://github.com/ap

[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2241#issuecomment-57939048 I tested with @pwendell's shaded hive-0.13.1, and it also has this problem: Exception in thread "main" java.lang.ClassNotFoundException: com.google.protobuf_spark.GeneratedMessa

[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2241#issuecomment-57939320 I made a shaded hive-0.13.1 version several days ago for testing (https://github.com/scwf/hive/tree/0.13.1-shaded). Hope it is useful :)

[GitHub] spark pull request: [SPARK-3801] More efficient app dir cleanup

2014-10-05 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/2660#issuecomment-57939614 @srowen, I think so.

[GitHub] spark pull request: [SPARK-3801] More efficient app dir cleanup

2014-10-05 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/2660

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2662#issuecomment-57939900 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/2

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2662#issuecomment-57939897 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21313/consoleFull) for PR 2662 at commit [`4cd172f`](https://github.com/a

[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...

2014-10-05 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2576#issuecomment-57941761 Updated, @marmbrus, can you take a look at this?

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/2663 [SPARK-3007][SQL] Fixes dynamic partitioning support for lower Hadoop versions This is a follow up of #2226 and #2616 to fix Jenkins build failures for lower Hadoop versions (1.0.x and 2.0.x).

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2663#issuecomment-57942933 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21314/consoleFull) for PR 2663 at commit [`0177dae`](https://github.com/ap

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2663#issuecomment-57943087 @yhuai Would you mind leaving some suggestions/comments? I reached the conclusion in the PR description by combing through the insertion- and loading-related code in Hive, wonderin

[GitHub] spark pull request: [SPARK-3597][Mesos] Implement `killTask`.

2014-10-05 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2453#issuecomment-57943748 Alright, thanks for testing @JoshRosen and @brndnmtthws. I'm merging this into master and 1.1.

[GitHub] spark pull request: [SPARK-3597][Mesos] Implement `killTask`.

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2453

[GitHub] spark pull request: SPARK-1656: Fix potential resource leaks

2014-10-05 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/577#issuecomment-57943959 Thanks @zsxwing I'm merging this. I'll try to get this into 1.1 as well but there might be merge conflicts.

[GitHub] spark pull request: SPARK-1656: Fix potential resource leaks

2014-10-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/577#discussion_r18436136 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -73,7 +73,21 @@ private[spark] class DiskStore(blockManager: BlockManager, dis

[GitHub] spark pull request: Event proration based on event timestamps.

2014-10-05 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2633#issuecomment-57944060 Yes, but please update the title of the PR. Jenkins retest this please

[GitHub] spark pull request: SPARK-1656: Fix potential resource leaks

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/577

[GitHub] spark pull request: Event proration based on event timestamps.

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2633#issuecomment-57944295 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21315/consoleFull) for PR 2633 at commit [`bfe9502`](https://github.com/ap

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2663#issuecomment-57944639 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21314/consoleFull) for PR 2663 at commit [`0177dae`](https://github.com/a

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2663#issuecomment-57944640 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/2

[GitHub] spark pull request: Event proration based on event timestamps.

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2633#issuecomment-57946571 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/2

[GitHub] spark pull request: Event proration based on event timestamps.

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2633#issuecomment-57946570 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21315/consoleFull) for PR 2633 at commit [`bfe9502`](https://github.com/a

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2663#issuecomment-57946672 I'm going to merge this so we can fix Jenkins. If @yhuai has comments they can be addressed in a follow up.

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2663

[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...

2014-10-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2513#discussion_r18436541 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -183,17 +184,15 @@ class SqlParser extends StandardTokenParsers wi

[GitHub] spark pull request: HOTFIX: Fix unicode error in merge script.

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2645

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/2663#issuecomment-57951207 I think it is good. Just a note: Hive also seems to [set the output committer to its NullOutputCommitter](https://github.com/apache/hive/blob/trunk/shims/0.20

[GitHub] spark pull request: SPARK-3805 Set spark.worker.cleanup.enabled to...

2014-10-05 Thread ash211
GitHub user ash211 opened a pull request: https://github.com/apache/spark/pull/2664 SPARK-3805 Set spark.worker.cleanup.enabled to true by default You can merge this pull request into a Git repository by running: $ git pull https://github.com/ash211/spark SPARK-3805 Alternati

[GitHub] spark pull request: SPARK-3805 Set spark.worker.cleanup.enabled to...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2664#issuecomment-57953537 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21316/consoleFull) for PR 2664 at commit [`60ba894`](https://github.com/ap

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/2662#issuecomment-57954626 I believe this was introduced in https://github.com/apache/spark/pull/2609 -- any idea why Jenkins didn't catch the build issue? cc @mccheah

[GitHub] spark pull request: [Spark] RDD take() method: overestimate too mu...

2014-10-05 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/2648#issuecomment-57955322 This seems right to me yingjie. Let's see if the tests work

[GitHub] spark pull request: SPARK-3805 Set spark.worker.cleanup.enabled to...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2664#issuecomment-57955684 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21316/consoleFull) for PR 2664 at commit [`60ba894`](https://github.com/a

[GitHub] spark pull request: SPARK-3805 Set spark.worker.cleanup.enabled to...

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2664#issuecomment-57955685 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/2

[GitHub] spark pull request: [SPARK-3166]: Allow custom serialiser to be sh...

2014-10-05 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/1890#issuecomment-57956057 I set the Target Version on SPARK-3166 to 1.2.0 so we can try to get this in

[GitHub] spark pull request: [SPARK-3166]: Allow custom serialiser to be sh...

2014-10-05 Thread GrahamDennis
Github user GrahamDennis commented on the pull request: https://github.com/apache/spark/pull/1890#issuecomment-57956359 @rxin, @ash211 It would be good to have a conversation about whether this is the best approach. My approach is a sort-of brute-force approach of just adding

[GitHub] spark pull request: [SPARK-2377] Python API for Streaming

2014-10-05 Thread giwa
Github user giwa commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-57957378 Why did this error happen in Jenkins? Is it because of the many commits? The error said ``` Fetching upstream changes from https://github.com/apache/spark.git

[GitHub] spark pull request: [SPARK-3721] [PySpark] broadcast objects large...

2014-10-05 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2659#issuecomment-57958523 SQL changes LGTM.

[GitHub] spark pull request: [Minor] Trivial fix to make codes more readabl...

2014-10-05 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2654#issuecomment-57958555 Good catch! Merged to master.

[GitHub] spark pull request: [Minor] Trivial fix to make codes more readabl...

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2654

[GitHub] spark pull request: [SPARK-3792][SQL]enable JavaHiveQLSuite

2014-10-05 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2652#issuecomment-57958721 Oh wow, good catch! I couldn't figure out why this wasn't working! Merged to master.

[GitHub] spark pull request: [SPARK-3792][SQL]enable JavaHiveQLSuite

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2652

[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...

2014-10-05 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2513#issuecomment-57958810 I'm going to merge this. Feel free to clean up the minor ";" issue as part of the other parser refactoring you are doing. Thanks :)

[GitHub] spark pull request: [SPARK-3645][SQL] Makes table caching eager by...

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2513

[GitHub] spark pull request: [SQL] SPARK-3776: Wrong conversion to Catalyst...

2014-10-05 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2641#issuecomment-57958918 Good catch, thanks for fixing this! Merged to master.

[GitHub] spark pull request: [SQL] SPARK-3776: Wrong conversion to Catalyst...

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2641

[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...

2014-10-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2561#discussion_r18439071 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -95,6 +95,22 @@ case class In(value: Expression, lis

[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...

2014-10-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2561#discussion_r18439078 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -95,6 +95,22 @@ case class In(value: Expression, lis

[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...

2014-10-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2561#discussion_r18439085 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -20,7 +20,7 @@ package org.apache.spark.sql.catalyst

[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...

2014-10-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2561#discussion_r18439096 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizedInSuite.scala --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Ap

[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...

2014-10-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2561#discussion_r18439341 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -226,6 +229,24 @@ object ConstantFolding extends Rule[L

[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...

2014-10-05 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2561#discussion_r18439369 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvaluationSuite.scala --- @@ -136,6 +138,18 @@ class ExpressionEva

[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...

2014-10-05 Thread zhzhan
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/2241#discussion_r18439490 --- Diff: pom.xml --- @@ -1260,7 +1259,18 @@ - + + hive-default + + +

[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...

2014-10-05 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57962393 FYI, this broke the build for some versions of Hadoop: ``` [INFO] Compiling 395 Scala sources and 29 Java sources to

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2662#issuecomment-57962605 @ash211 I'd guess that is dependent on the version of Hadoop that we are compiling with. It did cause failures on some versions of the master build. @srowen tha

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2662

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/2662#discussion_r18439704 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -710,18 +708,20 @@ private[spark] object Utils extends Logging { * Determines i

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1547: Adding Gradient Boos...

2014-10-05 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/2607#issuecomment-57966546 @jkbradley I meant multi-class classification. As you pointed out, binary classification should be similar to the regression case but I am not sure one can h

[GitHub] spark pull request: [SPARK-3007][SQL] Fixes dynamic partitioning s...

2014-10-05 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2663#issuecomment-57968435 @yhuai Thanks for pointing out the `NullOutputCommitter` part, that's the missing piece I was looking for :)

[GitHub] spark pull request: SPARK-3805 Set spark.worker.cleanup.enabled to...

2014-10-05 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2664#issuecomment-57969633 Back when we introduced this feature we decided to default it to false, based on the principle of least surprise. The concern was, say someone upgrades Spark and now without

[GitHub] spark pull request: Adding support of initial value for state upda...

2014-10-05 Thread soumitrak
GitHub user soumitrak opened a pull request: https://github.com/apache/spark/pull/2665 Adding support of initial value for state update. SPARK-3660 : Initial RDD for updateStateByKey transformation I have added a sample StatefulNetworkWordCountWithInitial inspired by Statef

[GitHub] spark pull request: Adding support of initial value for state upda...

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2665#issuecomment-57972339 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3806][SQL]Minor fix for CliSuite

2014-10-05 Thread scwf
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/2666 [SPARK-3806][SQL]Minor fix for CliSuite This fixes two issues in CliSuite: 1. CliSuite throws IndexOutOfBoundsException: Exception in thread "Thread-6" java.lang.IndexOutOfBoundsException: 6 at

[GitHub] spark pull request: [SPARK-3806][SQL]Minor fix for CliSuite

2014-10-05 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2666#issuecomment-57972926 cc @liancheng

[GitHub] spark pull request: SPARK-3794 [CORE] Building spark core fails du...

2014-10-05 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2662#discussion_r18440969 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -710,18 +708,20 @@ private[spark] object Utils extends Logging { * Determines if

[GitHub] spark pull request: [SPARK-3806][SQL]Minor fix for CliSuite

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2666#issuecomment-57972987 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3786] [PySpark] speedup tests

2014-10-05 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2646#discussion_r18440988 --- Diff: python/pyspark/tests.py --- @@ -152,7 +152,7 @@ def test_external_sort(self): self.assertGreater(shuffle.DiskBytesSpilled, last)

[GitHub] spark pull request: Rectify gereneric parameter names between Spar...

2014-10-05 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2637#issuecomment-57973031 Thanks @nkronenfeld and @srowen.

[GitHub] spark pull request: Rectify gereneric parameter names between Spar...

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2637

[GitHub] spark pull request: [SPARK-3765][Doc] add testing with sbt to docs

2014-10-05 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2629#issuecomment-57974125 Yep, I'll merge this.

[GitHub] spark pull request: [SPARK-3765][Doc] add testing with sbt to docs

2014-10-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2629

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-10-05 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2014#issuecomment-57974280 @nchammas that page won't appear until we actually update the live docs (something that happens for each release rather than when we push a PR).

[GitHub] spark pull request: [SPARK-2377] Python API for Streaming

2014-10-05 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-57974391 @giwa maybe this should be one of the hottest PRs (according to the number of commits) :-)

[GitHub] spark pull request: [SPARK-3786] [PySpark] speedup tests

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2646#issuecomment-57974527 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/2

[GitHub] spark pull request: SPARK-3568 [mllib] add ranking metrics

2014-10-05 Thread coderxiang
GitHub user coderxiang opened a pull request: https://github.com/apache/spark/pull/2667 SPARK-3568 [mllib] add ranking metrics Add common metrics for ranking algorithms (http://www-nlp.stanford.edu/IR-book/), including: - Mean Average Precision - Precision@n: top-n precisi

[GitHub] spark pull request: SPARK-3568 [mllib] add ranking metrics

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2667#issuecomment-57976456 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21318/consoleFull) for PR 2667 at commit [`3a5a6ff`](https://github.com/ap

[GitHub] spark pull request: [SPARK-2805] akka 2.3.4

2014-10-05 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/1685#issuecomment-57976544 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-3731] [PySpark] fix memory leak in Pyth...

2014-10-05 Thread davies
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/2668 [SPARK-3731] [PySpark] fix memory leak in PythonRDD The parent.getOrCompute() of PythonRDD is executed in a separate thread, which should release the memory reserved for shuffle and unrolling when it finishes.

[GitHub] spark pull request: [SPARK-3731] [PySpark] fix memory leak in Pyth...

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2668#issuecomment-57977968 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21319/consoleFull) for PR 2668 at commit [`ae98be2`](https://github.com/ap

[GitHub] spark pull request: [SPARK-2805] akka 2.3.4

2014-10-05 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/1685#issuecomment-57979326 LGTM. I have tested it locally by running the test suites (only the relevant ones). @pwendell, can you trigger Jenkins here? It should then be okay to merge.

[GitHub] spark pull request: SPARK-3568 [mllib] add ranking metrics

2014-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2667#issuecomment-57979727 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/2

[GitHub] spark pull request: SPARK-3568 [mllib] add ranking metrics

2014-10-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2667#issuecomment-57979724 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21318/consoleFull) for PR 2667 at commit [`3a5a6ff`](https://github.com/a
