[GitHub] spark issue #14624: Fix PySpark DataFrameWriter JDBC method docstring becaus...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14624 **[Test build #63788 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63788/consoleFull)** for PR 14624 at commit

[GitHub] spark issue #14624: Fix PySpark DataFrameWriter JDBC method docstring becaus...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14624 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63788/ Test FAILed. ---

[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.6+ spark-cloud mo...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12004 **[Test build #63789 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63789/consoleFull)** for PR 12004 at commit

[GitHub] spark issue #14624: Fix PySpark DataFrameWriter JDBC method docstring becaus...

2016-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14624 @mvervuurt Could you please double check this one? I am pretty sure I checked out this PR and then tested this correctly. --- If your project is set up for it, you can reply to this email and

[GitHub] spark issue #14640: [SPARK-17055] [MLLIB] add labelKFold to CrossValidator

2016-08-15 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14640 Thanks for making this issue and PR :) The first thing, before people are likely to have the bandwidth to review this, is that we are switching all new ML development to Spark ML from MLlib so it might be

[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14580 Rebased.

[GitHub] spark issue #12695: [SPARK-14914] Normalize Paths/URIs for windows.

2016-08-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/12695 Let's close this PR

[GitHub] spark issue #14628: [SPARK-17050][ML][MLLib] Improve kmean rdd.aggregate to ...

2016-08-15 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14628 Awesome, thanks for taking the time to do this. A few follow-up questions: 1) So this is happening with the default tree depth (2); did you try it with other depths? 2) Have you had a chance

[GitHub] spark issue #14639: [SPARK-18054][SPARKR] SparkR can not run in yarn-cluster...

2016-08-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14639 @zjffdu Let's discuss why this was introduced more in the JIRA. Regarding the code change, on my Mac `$HOME` is set without any custom changes on my side. Any ideas when this will not be the case?

[GitHub] spark pull request #14524: [SPARK-16832] [ML] [WIP] CrossValidator and Train...

2016-08-15 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14524#discussion_r74780485 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -126,13 +126,15 @@ private[python] class PythonMLLibAPI extends

[GitHub] spark issue #14624: Fix PySpark DataFrameWriter JDBC method docstring becaus...

2016-08-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14624 Jenkins add to whitelist

[GitHub] spark pull request #14649: [SPARK-17059][SQL] Allow FileFormat to specify pa...

2016-08-15 Thread andreweduffy
GitHub user andreweduffy opened a pull request: https://github.com/apache/spark/pull/14649 [SPARK-17059][SQL] Allow FileFormat to specify partition pruning strategy ## What changes were proposed in this pull request? Different FileFormat implementations may be able to make

[GitHub] spark pull request #14615: [SPARK-17029] make toJSON not go through rdd form...

2016-08-15 Thread robert3005
Github user robert3005 commented on a diff in the pull request: https://github.com/apache/spark/pull/14615#discussion_r74788858 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonFileFormat.scala --- @@ -84,7 +84,7 @@ class JsonFileFormat

[GitHub] spark pull request #14615: [SPARK-17029] make toJSON not go through rdd form...

2016-08-15 Thread robert3005
Github user robert3005 commented on a diff in the pull request: https://github.com/apache/spark/pull/14615#discussion_r74789001 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2494,16 +2494,18 @@ class Dataset[T] private[sql]( * @since 2.0.0

[GitHub] spark issue #14624: Fix PySpark DataFrameWriter JDBC method docstring becaus...

2016-08-15 Thread mvervuurt
Github user mvervuurt commented on the issue: https://github.com/apache/spark/pull/14624 My bad about the other pull request. This one is the correct one ;)

[GitHub] spark issue #14620: [SPARK-17032][SQL] Add test cases for methods in ParserU...

2016-08-15 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14620 @jiangxb1987 are you also adding tests for the remaining `ParserUtils` methods?

[GitHub] spark issue #14624: Fix PySpark DataFrameWriter JDBC method docstring becaus...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14624 **[Test build #63788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63788/consoleFull)** for PR 14624 at commit

[GitHub] spark issue #14568: [SPARK-10868] monotonicallyIncreasingId() supports offse...

2016-08-15 Thread tedyu
Github user tedyu commented on the issue: https://github.com/apache/spark/pull/14568 @rxin Can you take a look at the Python API one more time?

[GitHub] spark issue #14392: [SPARK-16446] [SparkR] [ML] Gaussian Mixture Model wrapp...

2016-08-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14392 `spark.gaussianMixture` sounds good to me.

[GitHub] spark pull request #14650: [SPARK-17062][MESOS] add conf option to mesos dis...

2016-08-15 Thread skonto
GitHub user skonto opened a pull request: https://github.com/apache/spark/pull/14650 [SPARK-17062][MESOS] add conf option to mesos dispatcher ## What changes were proposed in this pull request? Adds a --conf option to set Spark configuration properties in the Mesos dispatcher.

[GitHub] spark pull request #14182: [SPARK-16444][SparkR]: Isotonic Regression wrappe...

2016-08-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14182#discussion_r74798929 --- Diff: R/pkg/R/generics.R --- @@ -1279,6 +1279,11 @@ setGeneric("spark.naiveBayes", function(data, formula, ...) { standardGeneric("s #' @export

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14634 Sorry. What's the necessity to make this change?

[GitHub] spark issue #14650: [SPARK-17062][MESOS] add conf option to mesos dispatcher

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14650 **[Test build #63790 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63790/consoleFull)** for PR 14650 at commit

[GitHub] spark issue #11956: [SPARK-14098][SQL] Generate Java code that gets a float/...

2016-08-15 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/11956 @kiszk I'm sorry that I do not have the bandwidth to review this. https://github.com/apache/spark/pull/13899/files sounds like an easier approach (I have not looked into the details); what do you think

[GitHub] spark pull request #14522: [Spark-16508][SparkR] Split docs for arrange and ...

2016-08-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14522

[GitHub] spark pull request #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optim...

2016-08-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14580#discussion_r74811155 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1334,12 +1334,19 @@ object EliminateOuterJoin

[GitHub] spark issue #14644: [MESOS] Enable GPU support with Mesos

2016-08-15 Thread tnachen
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/14644 @srowen Mesos also supports node labels (which is how constraints are implemented in the Spark framework). However, GPUs are implemented as a resource (as we want to account for # of GPUs instead

[GitHub] spark issue #14469: [SPARK-16700] [PYSPARK] [SQL] create DataFrame from dict...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14469 **[Test build #63796 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63796/consoleFull)** for PR 14469 at commit

[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14580 **[Test build #63797 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63797/consoleFull)** for PR 14580 at commit

[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.6+ spark-cloud mo...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12004 **[Test build #63787 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63787/consoleFull)** for PR 12004 at commit

[GitHub] spark issue #14568: [SPARK-10868] monotonicallyIncreasingId() supports offse...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14568 **[Test build #63791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63791/consoleFull)** for PR 14568 at commit

[GitHub] spark pull request #14151: [SPARK-16496][SQL] Add wholetext as option for re...

2016-08-15 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14151#discussion_r74804217 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -533,6 +533,12 @@ object SQLConf {

[GitHub] spark issue #14522: [Spark-16508][SparkR] Split docs for arrange and orderBy...

2016-08-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14522 Yeah LGTM. Merging this to master, branch-2.0 -- Thanks @junyangq

[GitHub] spark issue #14468: [SPARK-16671][core][sql] Consolidate code to do variable...

2016-08-15 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14468 Given the overwhelming amount of feedback, merging to master.

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-15 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74805503 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #14579: [SPARK-16921][PYSPARK] RDD/DataFrame persist()/ca...

2016-08-15 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/14579#discussion_r74811199 --- Diff: python/pyspark/rdd.py --- @@ -188,6 +188,12 @@ def __init__(self, jrdd, ctx, jrdd_deserializer=AutoBatchedSerializer(PickleSeri

[GitHub] spark issue #14607: [SPARK-17063] [SQL] Improve performance of MSCK REPAIR T...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14607 **[Test build #63794 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63794/consoleFull)** for PR 14607 at commit

[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.6+ spark-cloud mo...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12004 Merged build finished. Test FAILed.

[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.6+ spark-cloud mo...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12004 Merged build finished. Test PASSed.

[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.6+ spark-cloud mo...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12004 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63787/ Test PASSed. ---

[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-08-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13758 Thank you for your comment. We are on the same page to address the problem. In this case, you are right. This is because we ultimately would like to write a primitive int array

[GitHub] spark issue #14626: [SPARK-16519][SPARKR] Handle SparkR RDD generics that cr...

2016-08-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14626 I had one minor question about partitionBy -- otherwise change LGTM. Thanks @felixcheung

[GitHub] spark pull request #14151: [SPARK-16496][SQL] Add wholetext as option for re...

2016-08-15 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14151#discussion_r74805700 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala --- @@ -39,6 +39,11 @@ class TextSuite extends QueryTest

[GitHub] spark issue #14644: [MESOS] Enable GPU support with Mesos

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14644 **[Test build #63792 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63792/consoleFull)** for PR 14644 at commit

[GitHub] spark issue #11956: [SPARK-14098][SQL] Generate Java code that gets a float/...

2016-08-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/11956 @davies, thank you for your comment. I hope that you will have bandwidth soon since Spark 2.0 was released. [this PR](https://github.com/apache/spark/pull/13899/files) does the same thing. In

[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.6+ spark-cloud mo...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12004 **[Test build #63789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63789/consoleFull)** for PR 12004 at commit

[GitHub] spark issue #12004: [SPARK-7481][build] [WIP] Add Hadoop 2.6+ spark-cloud mo...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12004 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63789/ Test FAILed. ---

[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14182 **[Test build #63795 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63795/consoleFull)** for PR 14182 at commit

[GitHub] spark issue #12436: [SPARK-14649][CORE] DagScheduler should not run duplicat...

2016-08-15 Thread markhamstra
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/12436 See https://issues.apache.org/jira/browse/SPARK-17064

[GitHub] spark issue #14469: [SPARK-16700] [PYSPARK] [SQL] create DataFrame from dict...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14469 **[Test build #63796 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63796/consoleFull)** for PR 14469 at commit

[GitHub] spark pull request #14468: [SPARK-16671][core][sql] Consolidate code to do v...

2016-08-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14468

[GitHub] spark issue #14607: [SPARK-17063] [SQL] Improve performance of MSCK REPAIR T...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14607 **[Test build #63793 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63793/consoleFull)** for PR 14607 at commit

[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-15 Thread mgummelt
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/11157 woot

[GitHub] spark pull request #14371: [SPARK-16736] Core+ SQL superfluous fs calls

2016-08-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/14371#discussion_r74814338 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/util/FileBasedWriteAheadLog.scala --- @@ -231,13 +232,17 @@ private[streaming] class

[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-15 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14580 @gatorsmile I am trying to understand your comment. Why shouldn't we use `full outer` in combination with `using`? I am under the impression that `using` is just a bit of syntactic sugar. For

[GitHub] spark issue #14557: [SPARK-16709][CORE] Kill the running task if stage faile...

2016-08-15 Thread markhamstra
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/14557 There are multiple issues with this PR. Some are at a more stylistic level, but some include deeper issues -- e.g. see SPARK-17064. Most fundamentally, this PR is the wrong solution at least

[GitHub] spark pull request #14568: [SPARK-10868] monotonicallyIncreasingId() support...

2016-08-15 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14568#discussion_r74801826 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/MonotonicallyIncreasingID.scala --- @@ -81,3 +91,12 @@ case class

[GitHub] spark pull request #14626: [SPARK-16519][SPARKR] Handle SparkR RDD generics ...

2016-08-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14626#discussion_r74803446 --- Diff: R/pkg/R/generics.R --- @@ -152,9 +146,9 @@ setGeneric("getNumPartitions", function(x) { standardGeneric("getNumPartitions") # @export

[GitHub] spark issue #14568: [SPARK-10868] monotonicallyIncreasingId() supports offse...

2016-08-15 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14568 @tedyu the Scala code is shaping up nicely. I do have a question regarding usage. How will this be used? The thing is that `monotonically_increasing_id` returns an id based on the

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14634 Based on my understanding, after this PR, we will respect the conf values of `hive.exec.dynamic.partition`, `hive.exec.dynamic.partition.mode` and `hive.exec.compress.output` that are specified

[GitHub] spark issue #14469: [SPARK-16700] [PYSPARK] [SQL] create DataFrame from dict...

2016-08-15 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/14469 Jenkins, retest this please.

[GitHub] spark issue #14469: [SPARK-16700] [PYSPARK] [SQL] create DataFrame from dict...

2016-08-15 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/14469 LGTM, so I'll merge this to master. Thanks @davies!

[GitHub] spark pull request #14579: [SPARK-16921][PYSPARK] RDD/DataFrame persist()/ca...

2016-08-15 Thread MechCoder
Github user MechCoder commented on a diff in the pull request: https://github.com/apache/spark/pull/14579#discussion_r74813935 --- Diff: python/pyspark/rdd.py --- @@ -188,6 +188,12 @@ def __init__(self, jrdd, ctx, jrdd_deserializer=AutoBatchedSerializer(PickleSeri

[GitHub] spark issue #14469: [SPARK-16700] [PYSPARK] [SQL] create DataFrame from dict...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14469 Merged build finished. Test PASSed.

[GitHub] spark issue #14469: [SPARK-16700] [PYSPARK] [SQL] create DataFrame from dict...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14469 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63796/ Test PASSed. ---

[GitHub] spark issue #14644: [MESOS] Enable GPU support with Mesos

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14644 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63792/ Test FAILed. ---

[GitHub] spark issue #14644: [MESOS] Enable GPU support with Mesos

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14644 Merged build finished. Test FAILed.

[GitHub] spark issue #14644: [MESOS] Enable GPU support with Mesos

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14644 **[Test build #63792 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63792/consoleFull)** for PR 14644 at commit

[GitHub] spark pull request #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optim...

2016-08-15 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14580#discussion_r74808006 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1334,12 +1334,19 @@ object EliminateOuterJoin

[GitHub] spark pull request #14579: [SPARK-16921][PYSPARK] RDD/DataFrame persist()/ca...

2016-08-15 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/14579#discussion_r74811837 --- Diff: python/pyspark/rdd.py --- @@ -188,6 +188,12 @@ def __init__(self, jrdd, ctx, jrdd_deserializer=AutoBatchedSerializer(PickleSeri

[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-08-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13758 You are right. I missed that `UnsafeArrayData` is a subclass of `ArrayData`. We can pass `UnsafeArrayData` to a projection. I have one question. When we directly generate `UnsafeArrayData`

[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-15 Thread junyangq
Github user junyangq commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74874081 --- Diff: R/pkg/R/SQLContext.R --- @@ -181,7 +181,7 @@ getDefaultSqlSource <- function() { #' @method createDataFrame default #' @note

[GitHub] spark pull request #14659: [SPARK-16757] Set up Spark caller context to HDFS

2016-08-15 Thread Sherry302
GitHub user Sherry302 opened a pull request: https://github.com/apache/spark/pull/14659 [SPARK-16757] Set up Spark caller context to HDFS ## What changes were proposed in this pull request? 1. Pass `jobId` to Task. 2. Invoke Hadoop APIs. A new function

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-15 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74876946 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/MultinomialLogisticRegression.scala --- @@ -0,0 +1,626 @@ +/* + * Licensed to the

[GitHub] spark pull request #14660: [SPARK-17071][SQL] Fetch Parquet schema without a...

2016-08-15 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14660 [SPARK-17071][SQL] Fetch Parquet schema without another Spark job when it is a single file to touch ## What changes were proposed in this pull request? It seems Spark executes

[GitHub] spark issue #14607: [SPARK-17063] [SQL] Improve performance of MSCK REPAIR T...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14607 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63794/ Test FAILed. ---

[GitHub] spark pull request #14151: [SPARK-16496][SQL] Add wholetext as option for re...

2016-08-15 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14151#discussion_r74831276 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -533,6 +533,12 @@ object SQLConf {

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-15 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74831972 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...

2016-08-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/13950#discussion_r74839612 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -332,6 +378,34 @@ private[spark] object JettyUtils extends Logging {

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-15 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74840360 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14182 **[Test build #63795 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63795/consoleFull)** for PR 14182 at commit

[GitHub] spark pull request #14469: [SPARK-16700] [PYSPARK] [SQL] create DataFrame fr...

2016-08-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14469

[GitHub] spark issue #14650: [SPARK-17062][MESOS] add conf option to mesos dispatcher

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14650 **[Test build #63790 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63790/consoleFull)** for PR 14650 at commit

[GitHub] spark issue #13796: [SPARK-7159][ML] Add multiclass logistic regression to S...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13796 **[Test build #63798 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63798/consoleFull)** for PR 13796 at commit

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-15 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74832288 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -945,13 +955,139 @@ class BinaryLogisticRegressionSummary

[GitHub] spark pull request #13796: [SPARK-7159][ML] Add multiclass logistic regressi...

2016-08-15 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13796#discussion_r74832879 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -930,10 +942,8 @@ class BinaryLogisticRegressionSummary

[GitHub] spark issue #14542: [SPARK-16930][yarn] Fix a couple of races in cluster app...

2016-08-15 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14542 Ping, I'll push this tomorrow unless someone complains.

[GitHub] spark issue #13796: [SPARK-7159][ML] Add multiclass logistic regression to S...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13796 **[Test build #63799 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63799/consoleFull)** for PR 13796 at commit

[GitHub] spark issue #14607: [SPARK-17063] [SQL] Improve performance of MSCK REPAIR T...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14607 **[Test build #63800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63800/consoleFull)** for PR 14607 at commit

[GitHub] spark issue #14580: [SPARK-16991][SQL] Fix `EliminateOuterJoin` optimizer to...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14580 Merged build finished. Test PASSed.

[GitHub] spark issue #13796: [SPARK-7159][ML] Add multiclass logistic regression to S...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13796 **[Test build #63798 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63798/consoleFull)** for PR 13796 at commit

[GitHub] spark issue #14651: [SPARK-17065][SQL]Improve the error message when encount...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14651 **[Test build #63801 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63801/consoleFull)** for PR 14651 at commit

[GitHub] spark issue #14651: [SPARK-17065][SQL]Improve the error message when encount...

2016-08-15 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/14651 The motivation here is, e.g., if the h2o library is in classpath, even if I don't use it (such as, run `Seq(1).toDS().write.format("parquet").save("foo")`), my code will still fail. The improved

[GitHub] spark issue #14651: [SPARK-17065][SQL]Improve the error message when encount...

2016-08-15 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/14651 An alternative fix is just ignoring broken libraries. However, if the user does want to use this library, the exception will be `Failed to find data source: $provider. Please find packages at

[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...

2016-08-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/13950#discussion_r74836512 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala --- @@ -244,7 +254,12 @@ private[ui] class MasterPage(parent: MasterWebUI)

[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...

2016-08-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/13950#discussion_r74840528 --- Diff: core/src/test/scala/org/apache/spark/deploy/master/MasterSuite.scala --- @@ -157,6 +157,34 @@ class MasterSuite extends SparkFunSuite }

[GitHub] spark issue #13796: [SPARK-7159][ML] Add multiclass logistic regression to S...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13796 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63799/ Test PASSed. ---

[GitHub] spark issue #13796: [SPARK-7159][ML] Add multiclass logistic regression to S...

2016-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13796 **[Test build #63799 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63799/consoleFull)** for PR 13796 at commit

[GitHub] spark issue #13796: [SPARK-7159][ML] Add multiclass logistic regression to S...

2016-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13796 Merged build finished. Test PASSed.

[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-08-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14182 This looks fine to me - @felixcheung feel free to merge this when you think it's good to go
