[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-13 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r78689714 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -303,6 +322,29 @@ class KMeans @Since("1.5.0") ( @Since("1.5.0")

[GitHub] spark issue #14961: [SPARK-17379] [BUILD] Upgrade netty-all to 4.0.41 final ...

2016-09-13 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/14961 Confirmed the issue was introduced by https://github.com/netty/netty/commit/d58dec8862e02fc2a98f8dcdb166db4b788be50a#diff-8d83d75ebf8a18cc48bf0a0b1183c188 Add

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78689452 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -330,14 +332,237 @@ class StatisticsSuite extends QueryTest with

[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-13 Thread eyalfa
Github user eyalfa commented on the issue: https://github.com/apache/spark/pull/1 @HyukjinKwon, thank you very much for your analysis. if you read the history of this PR you'd see that at some point @hvanhovell suggested that we completely remove CreateStruct and

[GitHub] spark issue #14834: [SPARK-17163][ML] Unified LogisticRegression interface

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14834 **[Test build #65355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65355/consoleFull)** for PR 14834 at commit

[GitHub] spark issue #14834: [SPARK-17163][ML] Unified LogisticRegression interface

2016-09-13 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/14834 @dbtsai Thanks for your review. I addressed all but one comment, which I left a follow up on. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark issue #14834: [SPARK-17163][ML] Unified LogisticRegression interface

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14834 **[Test build #65354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65354/consoleFull)** for PR 14834 at commit

[GitHub] spark issue #15092: [SPARK-17142][SQL] Complex query triggers binding error ...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15092 **[Test build #65353 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65353/consoleFull)** for PR 15092 at commit

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78688956 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to

[GitHub] spark pull request #15092: [SPARK-17142][SQL] Complex query triggers binding...

2016-09-13 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/15092 [SPARK-17142][SQL] Complex query triggers binding error in HashAggregateExec [BACKPORT 2.0] ## What changes were proposed in this pull request? This PR backports #14917 to branch-2.0.

[GitHub] spark issue #14926: [SPARK-17365][Core] Remove/Kill multiple executors toget...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14926 **[Test build #65352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65352/consoleFull)** for PR 14926 at commit

[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-13 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/14834#discussion_r78688637 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -595,55 +831,104 @@ class LogisticRegressionModel

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78688327 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to the

[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-13 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/14834#discussion_r78688210 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -508,11 +680,42 @@ object LogisticRegression extends

[GitHub] spark issue #15091: [Core][Doc]:remove redundant comment

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15091 **[Test build #65351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65351/consoleFull)** for PR 15091 at commit

[GitHub] spark pull request #15091: [Core][Doc]:remove redundant comment

2016-09-13 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/15091 [Core][Doc]:remove redundant comment ## What changes were proposed in this pull request? In the comment, there is a redundant `the estimated`. This PR simply removes the redundant

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78687900 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -98,8 +98,12 @@ class SparkSqlAstBuilder(conf: SQLConf) extends

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r7868 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -98,8 +98,12 @@ class SparkSqlAstBuilder(conf: SQLConf)

[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/1 cc @shivaram Would this be sensible if we print the results if R tests failed?

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78687294 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -98,8 +98,12 @@ class SparkSqlAstBuilder(conf: SQLConf) extends

[GitHub] spark pull request #14962: [SPARK-17402][SQL] separate the management of tem...

2016-09-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14962#discussion_r78687128 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala --- @@ -457,6 +457,20 @@ class DataFrameReaderWriterSuite

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78687147 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to

[GitHub] spark pull request #14962: [SPARK-17402][SQL] separate the management of tem...

2016-09-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14962#discussion_r78687123 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala --- @@ -322,6 +325,14 @@ class CatalogSuite assert(e2.message ==

[GitHub] spark pull request #14962: [SPARK-17402][SQL] separate the management of tem...

2016-09-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14962#discussion_r78687075 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2661,4 +2661,15 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14981 Merged build finished. Test FAILed.

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14981 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65345/ Test FAILed.

[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15090 Like Hive, I think we should implement a built-in function, `compute_stats`. Then, the implementation of `AnalyzeColumnCommand` will be much cleaner.

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14981 **[Test build #65345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65345/consoleFull)** for PR 14981 at commit

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78686975 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to the

[GitHub] spark pull request #14962: [SPARK-17402][SQL] separate the management of tem...

2016-09-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14962#discussion_r78686868 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala --- @@ -457,6 +457,20 @@ class DataFrameReaderWriterSuite

[GitHub] spark pull request #14962: [SPARK-17402][SQL] separate the management of tem...

2016-09-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14962#discussion_r78686835 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala --- @@ -322,6 +325,14 @@ class CatalogSuite assert(e2.message ==

[GitHub] spark pull request #14962: [SPARK-17402][SQL] separate the management of tem...

2016-09-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14962#discussion_r78686776 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2661,4 +2661,15 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78686462 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to

[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/1 I see. It seems using `struct(...)` does not print `struct(...)` but `named_struct(...)` as specified in `CreateNamedStruct`. So, the code below: ```scala scala>

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78685367 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78685328 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78685262 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78685116 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -98,8 +98,12 @@ class SparkSqlAstBuilder(conf: SQLConf)

[GitHub] spark issue #14962: [SPARK-17402][SQL] separate the management of temp views...

2016-09-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14962 Is it possible to first have a PR to fix the bugs?

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78684701 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -98,8 +98,12 @@ class SparkSqlAstBuilder(conf: SQLConf)

[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14971 **[Test build #65350 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65350/consoleFull)** for PR 14971 at commit

[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14118 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65343/ Test PASSed.

[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14118 Merged build finished. Test PASSed.

[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14118 **[Test build #65343 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65343/consoleFull)** for PR 14118 at commit

[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14971 @hvanhovell @cloud-fan Could you help me review this PR? https://github.com/apache/spark/pull/15090 is changing the same code path for column-level statistics. Thanks!

[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14971 retest this please

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r78683972 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -330,14 +332,237 @@ class StatisticsSuite extends QueryTest with

[GitHub] spark pull request #15026: [SPARK-17472] [PYSPARK] Better error message for ...

2016-09-13 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/15026#discussion_r78683777 --- Diff: python/pyspark/broadcast.py --- @@ -75,7 +75,13 @@ def __init__(self, sc=None, value=None, pickle_registry=None, path=None):

[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15090 **[Test build #65349 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65349/consoleFull)** for PR 15090 at commit

[GitHub] spark pull request #14962: [SPARK-17402][SQL] separate the management of tem...

2016-09-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14962#discussion_r78683471 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -439,7 +439,7 @@ class Analyzer( object

[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-13 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 @junyangq As we discussed before, let's open a new PR for 2.0?

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14980

[GitHub] spark issue #15073: [SPARK-17518] [SQL] Block Users to Specify the Internal ...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15073 **[Test build #65348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65348/consoleFull)** for PR 15073 at commit

[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15090 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65347/ Test FAILed.

[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15090 **[Test build #65347 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65347/consoleFull)** for PR 15090 at commit

[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15090 Merged build finished. Test FAILed.

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15085 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65342/ Test PASSed.

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15085 Merged build finished. Test PASSed.

[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15090 **[Test build #65347 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65347/consoleFull)** for PR 15090 at commit

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15085 **[Test build #65342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65342/consoleFull)** for PR 15085 at commit

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-13 Thread wzhfy
GitHub user wzhfy opened a pull request: https://github.com/apache/spark/pull/15090 [SPARK-17073] [SQL] generate column-level statistics ## What changes were proposed in this pull request? Generate basic column statistics for all the atomic types: - numeric types: max,

[GitHub] spark issue #15042: [SPARK-17449] [Documentation] [Relation between heartbea...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15042 **[Test build #65346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65346/consoleFull)** for PR 15042 at commit

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-13 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r78682701 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -303,6 +322,29 @@ class KMeans @Since("1.5.0") ( @Since("1.5.0")

[GitHub] spark issue #15059: [SPARK-17506][SQL] Improve the check double values equal...

2016-09-13 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/15059 Moving generic testing utils from mllib to common looks OK to me. Actually we have `TestingUtils` under both spark.ml.util and spark.mllib.util. If we would like to move, we should remove

[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-13 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 Thanks @junyangq and @felixcheung - Merging this into master once the AppVeyor check passes

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14981 **[Test build #65345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65345/consoleFull)** for PR 14981 at commit

[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14980 Merged build finished. Test PASSed.

[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14980 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65344/ Test PASSed.

[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14980 **[Test build #65344 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65344/consoleFull)** for PR 14980 at commit

[GitHub] spark issue #15000: [SPARK-17437] Add uiWebUrl to JavaSparkContext and pyspa...

2016-09-13 Thread apetresc
Github user apetresc commented on the issue: https://github.com/apache/spark/pull/15000 @srowen: Just to make sure I understand, are you asking me to remove the Java accessor here, and just plumb straight through to the Scala object from PySpark? Or is it fine as-is?

[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-13 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15035 We definitely shouldn't change SpecificMutableRow to do this upcast; otherwise we might introduce subtle bugs with type mismatches in the future. cc @sameeragarwal to see if there is a better

[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14980 **[Test build #65344 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65344/consoleFull)** for PR 14980 at commit

[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette

2016-09-13 Thread junyangq
Github user junyangq commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r78679227 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -385,22 +385,29 @@ head(result[order(result$max_mpg, decreasing = TRUE), ]) Similar to

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15085 **[Test build #65337 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65337/consoleFull)** for PR 15085 at commit

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15085 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65337/ Test FAILed.

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15085 Merged build finished. Test FAILed.

[GitHub] spark pull request #14974: [Trivial][ML] Remove unnecessary `new` before cas...

2016-09-13 Thread zhengruifeng
Github user zhengruifeng closed the pull request at: https://github.com/apache/spark/pull/14974

[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14118 **[Test build #65343 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65343/consoleFull)** for PR 14118 at commit

[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...

2016-09-13 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/14118 @HyukjinKwon thanks for the information! @srowen yea I still think this is good to go.

[GitHub] spark issue #14118: [SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV ca...

2016-09-13 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/14118 Jenkins retest this please

[GitHub] spark issue #15060: [SPARK-17507][ML][MLLib] check weight vector size in ANN

2016-09-13 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/15060 @srowen the `weight` will by default be randomly generated and will automatically match the size; only when it is specified by the user does it need this check... now the modification here seems

[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15043 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65341/ Test FAILed.

[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15043 Merged build finished. Test FAILed.

[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15089 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65340/ Test PASSed.

[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15043 **[Test build #65341 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65341/consoleFull)** for PR 15043 at commit

[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15089 Merged build finished. Test PASSed.

[GitHub] spark issue #15089: [SPARK-15621] [SQL] Support spilling for Python UDF

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15089 **[Test build #65340 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65340/consoleFull)** for PR 15089 at commit

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15085 **[Test build #65342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65342/consoleFull)** for PR 15085 at commit

[GitHub] spark issue #14834: [SPARK-17163][ML] Unified LogisticRegression interface

2016-09-13 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/14834 Only a couple of minor issues; otherwise, LGTM.

[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-13 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14834#discussion_r78674556 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/ProbabilisticClassifier.scala --- @@ -201,11 +201,24 @@ abstract class

[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-13 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14834#discussion_r7867 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -676,39 +936,54 @@ object LogisticRegressionModel extends

[GitHub] spark pull request #15085: [SPARK-17484] Prevent invalid block locations fro...

2016-09-13 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/15085#discussion_r78674398 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -857,9 +862,11 @@ private[spark] class BlockManager( val

[GitHub] spark pull request #15085: [SPARK-17484] Prevent invalid block locations fro...

2016-09-13 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/15085#discussion_r78674369 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -857,9 +862,11 @@ private[spark] class BlockManager( val

[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-13 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14834#discussion_r78674092 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -595,55 +831,104 @@ class LogisticRegressionModel

[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-13 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/14834#discussion_r78673689 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala --- @@ -508,11 +680,42 @@ object LogisticRegression extends

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15085 Merged build finished. Test PASSed.

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15085 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65339/ Test PASSed. ---

[GitHub] spark issue #15085: [SPARK-17484] Prevent invalid block locations from being...

2016-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15085 **[Test build #65339 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65339/consoleFull)** for PR 15085 at commit

[GitHub] spark issue #14691: [SPARK-16407][STREAMING] Allow users to supply custom st...

2016-09-13 Thread jodersky
Github user jodersky commented on the issue: https://github.com/apache/spark/pull/14691 I like the idea! This might not be the best place to start a discussion, but I reckon that the sink provider API could also eventually be used to provision built-in sinks. It would make the

[GitHub] spark issue #15088: SPARK-17532: Add lock debugging info to thread dumps.

2016-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15088 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65336/ Test PASSed. ---
