[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213202828 @marmbrus @zsxwing --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feat

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213202897 @zsxwing

[GitHub] spark pull request: [SPARK-14768] [ML] [PySpark] removed expectedT...

2016-04-21 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12581#issuecomment-213202410 LGTM pending tests

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213202206 **[Test build #56626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56626/consoleFull)** for PR 12588 at commit [`c0b1e40`](https://gi

[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...

2016-04-21 Thread zjffdu
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/10242#discussion_r60680711 --- Diff: python/pyspark/__init__.py --- @@ -59,6 +59,8 @@ def since(version): indent_p = re.compile(r'\n( +)') def deco(f): +

[GitHub] spark pull request: [SPARK-14768] [ML] [PySpark] removed expectedT...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12581#issuecomment-213201504 **[Test build #2853 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2853/consoleFull)** for PR 12581 at commit [`7aa164f`](https://g

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213201193 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213201198 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213201183 **[Test build #56625 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56625/consoleFull)** for PR 12592 at commit [`48f1e6a`](https://g

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213200949 **[Test build #56625 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56625/consoleFull)** for PR 12592 at commit [`48f1e6a`](https://gi

[GitHub] spark pull request: [SPARK-14579][SQL]Fix the race condition in St...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12582#issuecomment-213200812 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14579][SQL]Fix the race condition in St...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12582#issuecomment-213200810 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14579][SQL]Fix the race condition in St...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12582#issuecomment-213200647 **[Test build #56612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56612/consoleFull)** for PR 12582 at commit [`6fadd0f`](https://g

[GitHub] spark pull request: [SPARK-14807] Create a compatibility module

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12580#issuecomment-213200452 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14807] Create a compatibility module

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12580#issuecomment-213200454 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-13330][PYSPARK] PYTHONHASHSEED is not p...

2016-04-21 Thread zjffdu
Github user zjffdu commented on the pull request: https://github.com/apache/spark/pull/11211#issuecomment-213200292 The executor here may be a little misleading. It means the Python worker rather than the Spark executor. The Python worker gets its env from PythonRDD, and PythonRDD gets th

[GitHub] spark pull request: [SPARK-14807] Create a compatibility module

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12580#issuecomment-213200311 **[Test build #56608 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56608/consoleFull)** for PR 12580 at commit [`8d54be3`](https://g

[GitHub] spark pull request: [SPARK-9778][SQL] remove unnecessary evaluatio...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8066#issuecomment-213198192 **[Test build #56624 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56624/consoleFull)** for PR 8066 at commit [`af1a83d`](https://gith

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12592#discussion_r60679517 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala --- @@ -95,20 +89,18 @@ trait StreamTest extends QueryTest with Timeouts { /

[GitHub] spark pull request: [SPARK-9778][SQL] remove unnecessary evaluatio...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/8066#issuecomment-213196419 retest this please.

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12592#discussion_r60679490 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala --- @@ -67,12 +67,6 @@ import org.apache.spark.util.Utils */ trait StreamTest

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213195344 **[Test build #56623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56623/consoleFull)** for PR 12591 at commit [`1c0d8cb`](https://gi

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/12592#discussion_r60679327 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -369,6 +369,7 @@ class StreamExecution( def awai

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213192672 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213192667 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213192403 **[Test build #56622 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56622/consoleFull)** for PR 12591 at commit [`e9c6d60`](https://g

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213192175 **[Test build #56611 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56611/consoleFull)** for PR 12588 at commit [`173bc45`](https://g

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213192420 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213192426 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12440#issuecomment-213191782 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213191681 **[Test build #56622 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56622/consoleFull)** for PR 12591 at commit [`e9c6d60`](https://gi

[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12440#issuecomment-213191780 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12440#issuecomment-213191287 **[Test build #56607 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56607/consoleFull)** for PR 12440 at commit [`facab2c`](https://g

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213190125 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213190118 **[Test build #56621 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56621/consoleFull)** for PR 12592 at commit [`ef65ad7`](https://g

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213190121 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14821][SQL] Implement AnalyzeTable in s...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12584#issuecomment-213189773 **[Test build #2852 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2852/consoleFull)** for PR 12584 at commit [`1caae34`](https://

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12592#issuecomment-213189615 **[Test build #56621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56621/consoleFull)** for PR 12592 at commit [`ef65ad7`](https://gi

[GitHub] spark pull request: [SPARK-14833][SQL][STREAMING][TEST] Refactor S...

2016-04-21 Thread tdas
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/12592 [SPARK-14833][SQL][STREAMING][TEST] Refactor StreamTests to test for source fault-tolerance correctly. ## What changes were proposed in this pull request? Current StreamTest allows testing of

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213188904 LGTM

[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function

2016-04-21 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-213188379 looks good except 1 minor doc comment

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-213188258 @viirya @davies Could I ask for your thoughts on this? I can make a JIRA for this.

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213188160 **[Test build #56620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56620/consoleFull)** for PR 12588 at commit [`3d65569`](https://gi

[GitHub] spark pull request: [SPARK-14824][SQL] Rename HiveContext object t...

2016-04-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12586

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213186906 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213186903 **[Test build #56619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56619/consoleFull)** for PR 12591 at commit [`d7efacb`](https://g

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213186907 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING] Refactor DataSou...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12591#issuecomment-213186733 **[Test build #56619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56619/consoleFull)** for PR 12591 at commit [`d7efacb`](https://gi

[GitHub] spark pull request: [SPARK-14824][SQL] Rename HiveContext object t...

2016-04-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12586#issuecomment-213186770 Merging in master.

[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12239#issuecomment-213186736 cc @liancheng to take another look

[GitHub] spark pull request: [SPARK-14824][SQL] Rename HiveContext object t...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12586#issuecomment-213185994 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14832][SQL][STREAMING]

2016-04-21 Thread tdas
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/12591 [SPARK-14832][SQL][STREAMING] ## What changes were proposed in this pull request? When creating a file stream using sqlContext.write.stream(), existing files are scanned twice for finding th

[GitHub] spark pull request: [SPARK-14824][SQL] Rename HiveContext object t...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12586#issuecomment-213185998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60677797 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apac

[GitHub] spark pull request: [SPARK-14824][SQL] Rename HiveContext object t...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12586#issuecomment-213185507 **[Test build #56610 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56610/consoleFull)** for PR 12586 at commit [`d4f1181`](https://g

[GitHub] spark pull request: [SPARK-14790] Always run scalastyle on sbt com...

2016-04-21 Thread jodersky
Github user jodersky commented on the pull request: https://github.com/apache/spark/pull/12555#issuecomment-213185109 Fair enough, test:compile is alright too. I would just avoid compile:compile On Apr 21, 2016 5:48 PM, "Reynold Xin" wrote: > hm most people don't run packa

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60677594 --- Diff: R/pkg/inst/worker/worker.R --- @@ -84,6 +84,10 @@ broadcastElap <- elapsedSecs() # as number of partitions to create. numPartitions <-

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-21 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60677525 --- Diff: R/pkg/R/generics.R --- @@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2) {standardGeneric("covar_samp") }) #' @export setGe

[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12239#discussion_r60677401 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -259,4 +261,78 @@ class InsertIntoHiveTableSuite extends

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213184557 **[Test build #56618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56618/consoleFull)** for PR 12588 at commit [`23e72ef`](https://gi

[GitHub] spark pull request: [SPARK-10496][SQL] Add DataFrame cumulative su...

2016-04-21 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12578#issuecomment-213184521 @rxin The [JIRA](https://issues.apache.org/jira/browse/SPARK-10496) says that a window function is not efficient for a large number of rows in this problem. @jk

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60677343 --- Diff: R/pkg/R/DataFrame.R --- @@ -1137,11 +1137,22 @@ setMethod("summarize", #' @rdname dapply #' @name dapply #' @export +#' @exam

[GitHub] spark pull request: [SPARK-14790] Always run scalastyle on sbt com...

2016-04-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12555#issuecomment-213184473 hm most people don't run package and just compile or test:compile, so adding it there would reduce the gain.

[GitHub] spark pull request: [SPARK-14828][SQL] Start SparkSession in REPL ...

2016-04-21 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/12589#issuecomment-213184305 Yeah, if there are problems we don't have to do that, but `val df = spark.read.json(...)` is pretty nice

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-213183696 @mathieulongtin I think it might be more sensible for the SQL OPTIONS clause to support `null` and some other types such as long, double and boolean. It [looks]

[GitHub] spark pull request: [SPARK-14821][SQL] Implement AnalyzeTable in s...

2016-04-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12584

[GitHub] spark pull request: [SPARK-14821][SQL] Implement AnalyzeTable in s...

2016-04-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12584#issuecomment-213182422 Merging in master.

[GitHub] spark pull request: [SPARK-14821][SQL] Implement AnalyzeTable in s...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12584#issuecomment-213181982 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14821][SQL] Implement AnalyzeTable in s...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12584#issuecomment-213181990 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14821][SQL] Implement AnalyzeTable in s...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12584#issuecomment-213181304 **[Test build #56600 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56600/consoleFull)** for PR 12584 at commit [`dc11d4b`](https://g

[GitHub] spark pull request: [SPARK-14664][SQL] Fix DecimalAggregates optim...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12421#issuecomment-213181115 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14664][SQL] Fix DecimalAggregates optim...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12421#issuecomment-213181121 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14664][SQL] Fix DecimalAggregates optim...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12421#issuecomment-213179995 **[Test build #56602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56602/consoleFull)** for PR 12421 at commit [`7f64e62`](https://g

[GitHub] spark pull request: [SPARK-14796][SQL] Add spark.sql.optimizer.inS...

2016-04-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12562#issuecomment-213177718 Hi, @rxin and @marmbrus. What do you think about the updated PR? It's just the first update. If there is more to do, please let me know. Thank you

[GitHub] spark pull request: [SPARK-14821][SQL] Implement AnalyzeTable in s...

2016-04-21 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/12584#issuecomment-213177586 LGTM2

[GitHub] spark pull request: [SPARK-14790] Always run scalastyle on sbt com...

2016-04-21 Thread jodersky
Github user jodersky commented on the pull request: https://github.com/apache/spark/pull/12555#issuecomment-213176921 Hmm, maybe you need to add the project or config too? As in "set lintOnCompile in project in Config := true"? Sbt isn't too friendly when setting command line options

[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...

2016-04-21 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/12239#discussion_r60676493 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -414,8 +414,42 @@ class Analyzer( }

[GitHub] spark pull request: [SPARK-14819] [SQL] Improve SET / SET -v comma...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12583#issuecomment-213176449 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14819] [SQL] Improve SET / SET -v comma...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12583#issuecomment-213176444 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14479] [ML] GLM supports output link pr...

2016-04-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12287

[GitHub] spark pull request: [SPARK-14819] [SQL] Improve SET / SET -v comma...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12583#issuecomment-213175920 **[Test build #56603 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56603/consoleFull)** for PR 12583 at commit [`2a077bd`](https://g

[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12239#discussion_r60676371 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -414,8 +414,42 @@ class Analyzer( }

[GitHub] spark pull request: [SPARK-14479] [ML] GLM supports output link pr...

2016-04-21 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/12287#issuecomment-213175477 LGTM. Merged into master. Thanks!

[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12239#discussion_r60676184 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -414,8 +414,42 @@ class Analyzer( }

[GitHub] spark pull request: [SPARK-14312] [ML] [SparkR] NaiveBayes model p...

2016-04-21 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/12573#discussion_r60676231 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/NaiveBayesWrapper.scala --- @@ -74,4 +83,41 @@ private[r] object NaiveBayesWrapper { .fit(data

[GitHub] spark pull request: [SPARK-14558][CORE] In ClosureCleaner, clean t...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12327#issuecomment-213170704 `cacheSize1` and `cacheSize2` are both the size after cleaning. The difference is that `cacheSize1` is the size after cleaning the data with line object reference, `c

[GitHub] spark pull request: [SPARK-14721][SQL] Remove HiveContext (part 2)

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12585#issuecomment-213170169 **[Test build #56617 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56617/consoleFull)** for PR 12585 at commit [`069c8b6`](https://gi

[GitHub] spark pull request: [TESTING] Load defaults in SharedSparkContext

2016-04-21 Thread andrewor14
Github user andrewor14 closed the pull request at: https://github.com/apache/spark/pull/12503

[GitHub] spark pull request: [SPARK-14803][SQL][Optimizer] A bug in Elimina...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12575#issuecomment-213169619 Can you add a test case to expose this bug? I also think a better way to fix this bug is to remove the alias-only project rather than stop the optimization.

[GitHub] spark pull request: [SPARK-14312] [ML] [SparkR] NaiveBayes model p...

2016-04-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/12573#discussion_r60675421 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/NaiveBayesWrapper.scala --- @@ -74,4 +83,41 @@ private[r] object NaiveBayesWrapper { .fit

[GitHub] spark pull request: SPARK-8398 hadoop input/output format advanced...

2016-04-21 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/6848#issuecomment-213169287 @koertkuipers nowadays we try to provide a description for our pull requests (sometimes it can be copied from the JIRA) for the eventual commit message - it might be goo

[GitHub] spark pull request: [SPARK-14828][SQL] Start SparkSession in REPL ...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12589#issuecomment-213168922 **[Test build #56616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56616/consoleFull)** for PR 12589 at commit [`45783de`](https://gi

[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function

2016-04-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/11569#discussion_r60675031 --- Diff: R/pkg/R/DataFrame.R --- @@ -2465,6 +2465,110 @@ setMethod("drop", base::drop(x) }) +#' This function c

[GitHub] spark pull request: SPARK-8398 hadoop input/output format advanced...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6848#issuecomment-213168019 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-21 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12402#issuecomment-213167994 The current plan of supporting gaussiansDF for now and gaussians later (in Python) sounds good. @wangmiao1981 Thanks for the updates; I had some more comments.

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-21 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12402#discussion_r60674866 --- Diff: python/pyspark/ml/clustering.py --- @@ -20,9 +20,150 @@ from pyspark.ml.wrapper import JavaEstimator, JavaModel from pyspark.ml.param.sh

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-21 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12402#discussion_r60674864 --- Diff: python/pyspark/ml/clustering.py --- @@ -20,9 +20,150 @@ from pyspark.ml.wrapper import JavaEstimator, JavaModel from pyspark.ml.param.sh

[GitHub] spark pull request: SPARK-8398 hadoop input/output format advanced...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6848#issuecomment-213168022 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/5

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-21 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/12402#discussion_r60674858 --- Diff: python/pyspark/ml/clustering.py --- @@ -20,9 +20,150 @@ from pyspark.ml.wrapper import JavaEstimator, JavaModel from pyspark.ml.param.sh

[GitHub] spark pull request: [SPARK-14369][SQL] Locality support for FileSc...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12527#issuecomment-213167863 **[Test build #56615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56615/consoleFull)** for PR 12527 at commit [`e0bfa3e`](https://gi
