[GitHub] spark pull request #16776: [SPARK-19436][SQL] Add missing tests for approxQu...

2017-02-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16776#discussion_r100711390 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala --- @@ -63,44 +63,49 @@ final class DataFrameStatFunctions private

[GitHub] spark pull request #16741: [SPARK-19402][DOCS] Support LaTex inline formula ...

2017-02-13 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16741#discussion_r100900474 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -135,13 +135,13 @@ abstract class MLWriter extends BaseReadWrite with

[GitHub] spark issue #16512: [SPARK-18335][SPARKR] createDataFrame to support numPart...

2017-02-15 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16512 @shivaram No need to revert this, but why was this merged into 2.1? It's a new API, and we generally don't backport new APIs. --- If your project is set up for it, you can reply to

[GitHub] spark issue #16623: [SPARK-19066][SPARKR][Backport-2.1]:LDA doesn't set opti...

2017-02-16 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16623 @felixcheung Just saw this was backported to 2.1.1. Since this is a fairly significant behavioral change, I recommend we revert this backport. I could imagine workloads working with EM but

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698315 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698326 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698328 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698331 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698323 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698338 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698335 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698340 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698347 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698262 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698270 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-16 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r101698275 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,327 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark issue #16623: [SPARK-19066][SPARKR][Backport-2.1]:LDA doesn't set opti...

2017-02-17 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16623 Hm, true, this is a weird case, where it is somewhere between a behavior change and a bug fix. You're right---let's not revert this patch. I do worry about other patches in Sp

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-21 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102273266 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-21 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102273256 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 > wrap the old AssociationRules code Sorry, forget this comment from me; I was thinking that something like FPGrowthModel.transform had already been implemented, but it

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102505813 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535103 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535151 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102598358 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102599281 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535190 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102599552 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535137 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535131 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102598249 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102595600 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535118 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535156 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102598955 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102598078 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102598022 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102598758 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535175 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102594866 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102599019 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102535113 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102598489 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,341 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102792331 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102792088 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102792306 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102845588 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102845724 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r102856964 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark issue #16772: [SPARK-14772][PYTHON][ML] Fixed Params.copy method to ma...

2017-02-23 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16772 LGTM Merging with master @BryanCutler would you mind sending a backport PR against branch-2.1 to run Jenkins tests? Thank you!

[GitHub] spark issue #16772: [SPARK-14772][PYTHON][ML] Fixed Params.copy method to ma...

2017-02-23 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16772 jenkins test this please

[GitHub] spark issue #16772: [SPARK-14772][PYTHON][ML] Fixed Params.copy method to ma...

2017-02-23 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16772 well...will merge after new tests

[GitHub] spark issue #12888: [SPARK-14772][ML,PySpark]Python ML Params.copy treats ui...

2017-02-23 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/12888 @hujy Thank you for sending this PR, and apologies for not seeing it earlier. Since the other PR for this JIRA is ready to merge, could you please close this issue? Thanks again!

[GitHub] spark issue #16772: [SPARK-14772][PYTHON][ML] Fixed Params.copy method to ma...

2017-02-23 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16772 OK merging now. @BryanCutler do let me know if you don't have time to send a backport--thanks!

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 I agree that, if the set of rules is small (1-2 GB max), then collecting and broadcasting it is best. But for larger sets of rules, we'd have to keep it distributed. I'm very

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r103050909 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,347 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 I don't think we need to support the default prediction (for empty/null inputs) now. I agree we could use an imputer or add something as an option later on. Will take a final loo

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r103068619 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,339 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r103068598 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,339 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #17069: [MINOR][ML][DOC] Document default value for Gener...

2017-02-25 Thread jkbradley
GitHub user jkbradley opened a pull request: https://github.com/apache/spark/pull/17069 [MINOR][ML][DOC] Document default value for GeneralizedLinearRegression.linkPower Add Scaladoc for GeneralizedLinearRegression.linkPower default value Follow-up to https://github.com

[GitHub] spark issue #17069: [MINOR][ML][DOC] Document default value for GeneralizedL...

2017-02-25 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/17069 CC @actuaryzhang @yanboliang Just noticed that the default is missing in the Scaladoc. (Thanks btw for adding Tweedie support!)

[GitHub] spark issue #15770: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2017-02-25 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15770 Sorry for my absence from recent conversation! I agree there is no clear answer for handling input and output schema. Some options: * Option 1: same as RDD/GraphX-based API

[GitHub] spark issue #17048: [SPARK-14772][PYTHON][ML] Fixed Params.copy method to ma...

2017-02-25 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/17048 LGTM Merging with branch-2.1 Thank you!

[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...

2017-02-26 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16782 Thanks @BryanCutler for the patch! The fix looks reasonable to me, but let me try to check with @davies to confirm. If this is the right approach, then I think we should update the other

[GitHub] spark pull request #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSV...

2017-02-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16784#discussion_r103278729 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -203,6 +227,8 @@ class LinearSVCSuite extends SparkFunSuite

[GitHub] spark pull request #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSV...

2017-02-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16784#discussion_r103278715 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -234,7 +261,12 @@ object LinearSVCSuite { val yD

[GitHub] spark pull request #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSV...

2017-02-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16784#discussion_r103278706 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -220,12 +246,13 @@ object LinearSVCSuite

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15211 You have a good point about setting user expectations about speed and scalability. I don't think that the average user needs to understand the underlying implementation, but perfor

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93685938 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93686143 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93686377 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93686305 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93686501 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93686612 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93686797 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #12420: [SPARK-14585][ML][WIP] Provide accessor methods for Pipe...

2016-12-22 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/12420 @pravingadakh Will you be able to continue with this? Thanks!

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93955440 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,525 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93955620 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93955710 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93955672 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,525 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93955813 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,525 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93955845 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93955921 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15211 Thank you for the updates! I just followed up on a few items from the updates. I'll try to take a look at the internals later today.

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93956291 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15211 Btw, @hhbyyh @yanboliang do either of you know of resources proving convergence rates for OWL-QN for the hinge loss? The original OWL-QN paper's analysis does not handle hinge loss.

[GitHub] spark issue #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15211 Btw, I'm making a review pass now for the internals.

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93981891 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93975460 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93972718 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93982401 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93979950 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93975729 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93975041 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93981539 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93980975 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93974665 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93974973 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93972268 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93973041 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15211: [SPARK-14709][ML] spark.ml API for linear SVM

2016-12-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15211#discussion_r93982121 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -0,0 +1,558 @@ +/* + * Licensed to the Apache Software
