spark git commit: [SPARK-4588] ML Attributes

2015-03-12 Thread meng
/pull/4925/files?diff=unified#diff-95e7f5060429f189460b44a3f8731a35R24 More details can be found in the design doc. srowen Could you help review this PR? There are many lines but most of them are boilerplate code. Author: Xiangrui Meng m...@databricks.com Author: Sean Owen so...@cloudera.com

spark git commit: [mllib] [python] Add LassoModel to __all__ in regression.py

2015-03-12 Thread meng
K. Bradley jos...@databricks.com Closes #4970 from jkbradley/SPARK-6253 and squashes the following commits: c2cb533 [Joseph K. Bradley] Add LassoModel to __all__ in regression.py (cherry picked from commit 17c309c87e78da145dc358514150ec5700eed8f0) Signed-off-by: Xiangrui Meng m...@databricks.com

spark git commit: [mllib] [python] Add LassoModel to __all__ in regression.py

2015-03-12 Thread meng
16:46:29 2015 -0700 Committer: Xiangrui Meng m...@databricks.com Committed: Thu Mar 12 16:46:29 2015 -0700 -- python/pyspark/mllib/regression.py | 6 -- 1 file changed, 4 insertions(+), 2 deletions

spark git commit: [SPARK-5986][MLLib] Add save/load for k-means

2015-03-11 Thread meng
: 2672374 Author: Xusen Yin yinxu...@gmail.com Authored: Wed Mar 11 00:24:55 2015 -0700 Committer: Xiangrui Meng m...@databricks.com Committed: Wed Mar 11 00:24:55 2015 -0700 -- .../spark/mllib/clustering/KMeansModel.scala| 68

spark git commit: [branch-1.0][SPARK-4355] ColumnStatisticsAggregator doesn't merge mean correctly

2015-03-09 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.0 e751f8f26 -> 0afb04250 [branch-1.0][SPARK-4355] ColumnStatisticsAggregator doesn't merge mean correctly This backports the bug fix in #3220. Author: Xiangrui Meng m...@databricks.com Closes #3850 from mengxr/SPARK-4355-1.0
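
The snippet above says only that the mean was merged incorrectly; it does not describe the bug or the fix. As a rough illustration of what a correct merge of two partial column means generally looks like, here is a minimal Python sketch. The function `merge_means` and its parameters are hypothetical and are not taken from MLlib.

```python
def merge_means(mean_a, count_a, mean_b, count_b):
    """Combine two partial means, weighting each by its sample count.

    Hypothetical helper for illustration; not the MLlib implementation.
    """
    total = count_a + count_b
    if total == 0:
        # Nothing has been observed on either side.
        return 0.0
    # Incremental form: shift mean_a toward mean_b in proportion to count_b.
    return mean_a + (mean_b - mean_a) * (count_b / total)
```

Averaging the two means without the count weights would only be correct when both sides have seen the same number of samples.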

spark git commit: [SPARK-6090][MLLIB] add a basic BinaryClassificationMetrics to PySpark/MLlib

2015-03-05 Thread meng
are not supported in this PR. davies If we recognize Scala's `Product`s in Py4J, we can easily add wrappers for Scala methods that returns `RDD[(Double, Double)]`. Is it easy to register serializer for `Product` in PySpark? Author: Xiangrui Meng m...@databricks.com Closes #4863 from mengxr/SPARK-6090

spark git commit: [SPARK-6141][MLlib] Upgrade Breeze from 0.10 to 0.11 to fix convergence bug

2015-03-03 Thread meng
://github.com/scalanlp/breeze/pull/373#issuecomment-76879760 Author: Xiangrui Meng m...@databricks.com Author: DB Tsai dbt...@alpinenow.com Author: DB Tsai dbt...@dbtsai.com Closes #4879 from dbtsai/breeze and squashes the following commits: d848f65 [DB Tsai] Merge pull request #1 from mengxr

spark git commit: [SPARK-6121][SQL][MLLIB] simpleString for UDT

2015-03-02 Thread meng
Repository: spark Updated Branches: refs/heads/master e3a88d110 -> 2db6a853a [SPARK-6121][SQL][MLLIB] simpleString for UDT `df.dtypes` shows `null` for UDTs. This PR uses `udt` by default and `VectorUDT` overwrites it with `vector`. jkbradley davies Author: Xiangrui Meng m...@databricks.com
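
The default-plus-override behavior described above can be sketched in plain Python. This is an illustration under assumptions, not Spark code: the classes below are hypothetical stand-ins, `simple_string` is a Python-style rendering of the `simpleString` name in the commit title, and `PointUDT` is invented to show a type that keeps the default.

```python
class UserDefinedType:
    """Hypothetical stand-in for a UDT base class."""

    def simple_string(self):
        # Default name used when a subclass does not provide its own.
        return "udt"


class VectorUDT(UserDefinedType):
    def simple_string(self):
        # Vector columns report a more specific name.
        return "vector"


class PointUDT(UserDefinedType):
    """Invented UDT that keeps the default name."""


def dtypes(schema):
    """Return (column name, type name) pairs, in the spirit of `df.dtypes`."""
    return [(name, udt.simple_string()) for name, udt in schema]
```

With this shape, any UDT gets a readable name instead of `null`, and individual types can opt into something more specific.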

spark git commit: [SPARK-6121][SQL][MLLIB] simpleString for UDT

2015-03-02 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 ea69cf28e -> 1b8ab5752 [SPARK-6121][SQL][MLLIB] simpleString for UDT `df.dtypes` shows `null` for UDTs. This PR uses `udt` by default and `VectorUDT` overwrites it with `vector`. jkbradley davies Author: Xiangrui Meng m

spark git commit: [SPARK-5537] Add user guide for multinomial logistic regression

2015-03-02 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 1b8ab5752 -> 11389f026 [SPARK-5537] Add user guide for multinomial logistic regression This is based on #4801 from dbtsai. The linear method guide is re-organized a little bit for this change. Closes #4801 Author: Xiangrui Meng m

spark git commit: [SPARK-6097][MLLIB] Support tree model save/load in PySpark/MLlib

2015-03-02 Thread meng
Repository: spark Updated Branches: refs/heads/master 54d19689f -> 7e53a79c3 [SPARK-6097][MLLIB] Support tree model save/load in PySpark/MLlib Similar to `MatrixFactorizationModel`, we only need wrappers to support save/load for tree models in Python. jkbradley Author: Xiangrui Meng m

spark git commit: [SPARK-5537][MLlib][Docs] Add user guide for multinomial logistic regression

2015-03-02 Thread meng
: refs/heads/master Commit: b196056190c569505cc32669d1aec30ed9d70665 Parents: c2fe3a6 Author: DB Tsai dbt...@alpinenow.com Authored: Mon Mar 2 22:37:12 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Mon Mar 2 22:37:12 2015 -0800

spark git commit: [SPARK-6097][MLLIB] Support tree model save/load in PySpark/MLlib

2015-03-02 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 4e6e0086c -> 62c53be2a [SPARK-6097][MLLIB] Support tree model save/load in PySpark/MLlib Similar to `MatrixFactorizationModel`, we only need wrappers to support save/load for tree models in Python. jkbradley Author: Xiangrui Meng m

spark git commit: [SPARK-6080] [PySpark] correct LogisticRegressionWithLBFGS regType parameter for pyspark

2015-03-02 Thread meng
/pyspark_classification and squashes the following commits: 12db65a [Yanbo Liang] correct LogisticRegressionWithLBFGS regType parameter for pyspark (cherry picked from commit af2effdd7b54316af0c02e781911acfb148b962b) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org

spark git commit: [SPARK-6053][MLLIB] support save/load in PySpark's ALS

2015-03-01 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 bb1661872 -> b570d98e8 [SPARK-6053][MLLIB] support save/load in PySpark's ALS A simple wrapper to save/load `MatrixFactorizationModel` in Python. jkbradley Author: Xiangrui Meng m...@databricks.com Closes #4811 from mengxr/SPARK-5991

spark git commit: [SPARK-6053][MLLIB] support save/load in PySpark's ALS

2015-03-01 Thread meng
Repository: spark Updated Branches: refs/heads/master fd8d283ee -> aedbbaa3d [SPARK-6053][MLLIB] support save/load in PySpark's ALS A simple wrapper to save/load `MatrixFactorizationModel` in Python. jkbradley Author: Xiangrui Meng m...@databricks.com Closes #4811 from mengxr/SPARK-5991

spark git commit: [SPARK-6083] [MLLib] [DOC] Make Python API example consistent in NaiveBayes

2015-03-01 Thread meng
/3f00bb3e Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3f00bb3e Branch: refs/heads/master Commit: 3f00bb3ef1384fabf86a68180d40a1a515f6f5e3 Parents: aedbbaa Author: MechCoder manojkumarsivaraj...@gmail.com Authored: Sun Mar 1 16:28:15 2015 -0800 Committer: Xiangrui Meng m...@databricks.com

spark git commit: [SPARK-6083] [MLLib] [DOC] Make Python API example consistent in NaiveBayes

2015-03-01 Thread meng
[MechCoder] Add parse function 65bbbe9 [MechCoder] [SPARK-6083] Make Python API example consistent in NaiveBayes (cherry picked from commit 3f00bb3ef1384fabf86a68180d40a1a515f6f5e3) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit

spark git commit: [SPARK-4587] [mllib] [docs] Fixed save, load calls in ML guide examples

2015-02-27 Thread meng
and squashes the following commits: 83d369d [Joseph K. Bradley] added comment to save,load parts of ML guide examples 2841170 [Joseph K. Bradley] Fixed save,load calls in ML guide examples (cherry picked from commit d17cb2ba33b363dd346ac5a5681e1757decd0f4d) Signed-off-by: Xiangrui Meng m

spark git commit: [SPARK-4587] [mllib] [docs] Fixed save, load calls in ML guide examples

2015-02-27 Thread meng
: Fri Feb 27 13:00:36 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Fri Feb 27 13:00:36 2015 -0800 -- docs/mllib-collaborative-filtering.md | 10 +--- docs/mllib-decision-tree.md | 20

spark git commit: [SPARK-5996][SQL] Fix specialized outbound conversions

2015-02-25 Thread meng
] [SPARK-5996][SQL] Fix specialized outbound conversions (cherry picked from commit f84c799ea0b82abca6a4fad39532c2515743b632) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit

spark git commit: [SPARK-5996][SQL] Fix specialized outbound conversions

2015-02-25 Thread meng
/f84c799e Branch: refs/heads/master Commit: f84c799ea0b82abca6a4fad39532c2515743b632 Parents: dd077ab Author: Michael Armbrust mich...@databricks.com Authored: Wed Feb 25 10:13:40 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Wed Feb 25 10:13:40 2015 -0800

spark git commit: [SPARK-5974] [SPARK-5980] [mllib] [python] [docs] Update ML guide with save/load, Python GBT

2015-02-25 Thread meng
K. Bradley jos...@databricks.com Authored: Wed Feb 25 16:13:17 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Wed Feb 25 16:13:17 2015 -0800 -- docs/mllib-classification-regression.md | 9 +- docs/mllib

spark git commit: [SPARK-5974] [SPARK-5980] [mllib] [python] [docs] Update ML guide with save/load, Python GBT

2015-02-25 Thread meng
d20559b157743981b9c09e286f2aaff8cbefab59) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a1b4856e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a1b4856e Diff: http://git-wip-us.apache.org/repos/asf/spark

spark git commit: [SPARK-5976][MLLIB] Add partitioner to factors returned by ALS

2015-02-25 Thread meng
. In the new implementation, we didn't set partitioners in the factors returned by ALS, which would cause performance regression. srowen coderxiang Author: Xiangrui Meng m...@databricks.com Closes #4748 from mengxr/SPARK-5976 and squashes the following commits: 9373a09 [Xiangrui Meng] add

spark git commit: [MLLIB] Change x_i to y_i in Variance's user guide

2015-02-24 Thread meng
Repository: spark Updated Branches: refs/heads/master 6d2caa576 -> 105791e35 [MLLIB] Change x_i to y_i in Variance's user guide Variance is calculated on labels/responses. Author: Xiangrui Meng m...@databricks.com Closes #4740 from mengxr/patch-1 and squashes the following commits: 673317b
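
To make the x_i versus y_i point concrete: the variance is computed over the labels (responses) y_i, not over the feature values x_i. The sketch below assumes the population form, (1/N) * sum((y_i - mu)^2), and uses a hypothetical function name; it is an illustration, not the MLlib code.

```python
def label_variance(labels):
    """Population variance of the labels y_i: (1/N) * sum((y_i - mu)^2).

    Hypothetical helper; the 1/N normalization is an assumption.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("labels must be non-empty")
    mu = sum(labels) / n  # mean label
    return sum((y - mu) ** 2 for y in labels) / n
```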

spark git commit: [MLLIB] Change x_i to y_i in Variance's user guide

2015-02-24 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 eaf7bf98a -> a4ff445a9 [MLLIB] Change x_i to y_i in Variance's user guide Variance is calculated on labels/responses. Author: Xiangrui Meng m...@databricks.com Closes #4740 from mengxr/patch-1 and squashes the following commits

spark git commit: [SPARK-5939][MLLib] make FPGrowth example app take parameters

2015-02-23 Thread meng
651a1c019eb911005e234a46cc559d63da352377) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/33b90848 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/33b90848 Diff: http://git-wip-us.apache.org/repos

spark git commit: [SPARK-5939][MLLib] make FPGrowth example app take parameters

2015-02-23 Thread meng
Author: Jacky Li jacky.li...@huawei.com Authored: Mon Feb 23 08:47:28 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Mon Feb 23 08:47:28 2015 -0800 -- data/mllib/sample_fpgrowth.txt | 6

spark git commit: [SPARK-5912] [docs] [mllib] Small fixes to ChiSqSelector docs

2015-02-23 Thread meng
/spark/diff/59536cc8 Branch: refs/heads/master Commit: 59536cc87e10e5011560556729dd901280958f43 Parents: 28ccf5e Author: Joseph K. Bradley jos...@databricks.com Authored: Mon Feb 23 16:15:57 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Mon Feb 23 16:15:57 2015 -0800

spark git commit: [SPARK-5912] [docs] [mllib] Small fixes to ChiSqSelector docs

2015-02-23 Thread meng
to guide 3f3f9f4 [Joseph K. Bradley] small fixes to ChiSqSelector docs (cherry picked from commit 59536cc87e10e5011560556729dd901280958f43) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark

spark git commit: [SPARK-5958][MLLIB][DOC] update block matrix user guide

2015-02-23 Thread meng
Meng m...@databricks.com Closes #4737 from mengxr/update-block-matrix-user-guide and squashes the following commits: 70f53ac [Xiangrui Meng] update block matrix user guide (cherry picked from commit cf2e41653de778dc8db8b03385a053aae1152e19) Signed-off-by: Xiangrui Meng m...@databricks.com

spark git commit: [SPARK-5958][MLLIB][DOC] update block matrix user guide

2015-02-23 Thread meng
Repository: spark Updated Branches: refs/heads/master 1ed57086d -> cf2e41653 [SPARK-5958][MLLIB][DOC] update block matrix user guide * Removed SVD code from examples. * Corrected Java API doc link. * Updated variable names: `AtransposeA` -> `ata`. * Minor changes. brkyvz Author: Xiangrui Meng

[2/2] spark git commit: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] Doc cleanups for 1.3 release

2015-02-20 Thread meng
] updated programming guide for ml and mllib (cherry picked from commit 4a17eedb16343413e5b6f8bb58c6da8952ee7ab6) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8c12f311 Tree: http

[1/2] spark git commit: [SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] Doc cleanups for 1.3 release

2015-02-20 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 0382dcc0a -> 8c12f3114 http://git-wip-us.apache.org/repos/asf/spark/blob/8c12f311/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleTextClassificationPipeline.java

spark git commit: [SPARK-5902] [ml] Made PipelineStage.transformSchema public instead of private to ml

2015-02-19 Thread meng
of transformSchema protected as well fdaf26a [Joseph K. Bradley] Made PipelineStage.transformSchema protected instead of private[ml] (cherry picked from commit a5fed34355b403188ad50b567ab62ee54597b493) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf

spark git commit: [SPARK-5900][MLLIB] make PIC and FPGrowth Java-friendly

2015-02-19 Thread meng
`/`cluster` and `FreqItemset`/`items`/`freq`. Please let me know if there are better suggestions. CC: jkbradley Author: Xiangrui Meng m...@databricks.com Closes #4695 from mengxr/SPARK-5900 and squashes the following commits: 865b5ca [Xiangrui Meng] make Assignment serializable cffa96e

spark git commit: [SPARK-5900][MLLIB] make PIC and FPGrowth Java-friendly

2015-02-19 Thread meng
`/`cluster` and `FreqItemset`/`items`/`freq`. Please let me know if there are better suggestions. CC: jkbradley Author: Xiangrui Meng m...@databricks.com Closes #4695 from mengxr/SPARK-5900 and squashes the following commits: 865b5ca [Xiangrui Meng] make Assignment serializable cffa96e [Xiangrui

spark git commit: [SPARK-5507] Added documentation for BlockMatrix

2015-02-18 Thread meng
] [SPARK-5507] Added documentation for BlockMatrix (cherry picked from commit a8eb92dcb9ab1e6d8a34eed9a8fddeda645b5094) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/56f8f295

spark git commit: [SPARK-5519][MLLIB] add user guide with example code for fp-growth

2015-02-18 Thread meng
class to wrap the return pair to make it Java friendly. Author: Xiangrui Meng m...@databricks.com Closes #4661 from mengxr/SPARK-5519 and squashes the following commits: 58ccc25 [Xiangrui Meng] add user guide with example code for fp-growth (cherry picked from commit

spark git commit: [SPARK-5507] Added documentation for BlockMatrix

2015-02-18 Thread meng
/a8eb92dc Branch: refs/heads/master Commit: a8eb92dcb9ab1e6d8a34eed9a8fddeda645b5094 Parents: 85e9d09 Author: Burak Yavuz brk...@gmail.com Authored: Wed Feb 18 10:11:08 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Wed Feb 18 10:11:08 2015 -0800

spark git commit: [SPARK-5519][MLLIB] add user guide with example code for fp-growth

2015-02-18 Thread meng
class to wrap the return pair to make it Java friendly. Author: Xiangrui Meng m...@databricks.com Closes #4661 from mengxr/SPARK-5519 and squashes the following commits: 58ccc25 [Xiangrui Meng] add user guide with example code for fp-growth Project: http://git-wip-us.apache.org/repos/asf/spark

spark git commit: [SPARK-5879][MLLIB] update PIC user guide and add a Java example

2015-02-18 Thread meng
for this issue. Author: Xiangrui Meng m...@databricks.com Closes #4680 from mengxr/SPARK-5897 and squashes the following commits: 847d216 [Xiangrui Meng] apache header 87719a2 [Xiangrui Meng] remove PIC image 2dd921f [Xiangrui Meng] update PIC user guide and add a Java example (cherry picked from

spark git commit: [SPARK-5879][MLLIB] update PIC user guide and add a Java example

2015-02-18 Thread meng
for this issue. Author: Xiangrui Meng m...@databricks.com Closes #4680 from mengxr/SPARK-5897 and squashes the following commits: 847d216 [Xiangrui Meng] apache header 87719a2 [Xiangrui Meng] remove PIC image 2dd921f [Xiangrui Meng] update PIC user guide and add a Java example Project: http://git-wip

spark git commit: [SPARK-5858][MLLIB] Remove unnecessary first() call in GLM

2015-02-17 Thread meng
: Xiangrui Meng m...@databricks.com Closes #4647 from mengxr/SPARK-5858 and squashes the following commits: 036dc7f [Xiangrui Meng] remove unnecessary first() call 12c5548 [Xiangrui Meng] check numFeatures only once Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip

spark git commit: [SPARK-5858][MLLIB] Remove unnecessary first() call in GLM

2015-02-17 Thread meng
. Author: Xiangrui Meng m...@databricks.com Closes #4647 from mengxr/SPARK-5858 and squashes the following commits: 036dc7f [Xiangrui Meng] remove unnecessary first() call 12c5548 [Xiangrui Meng] check numFeatures only once (cherry picked from commit c76da36c2163276b5c34e59fbb139eeb34ed0faa) Signed

spark git commit: [SPARK-5802][MLLIB] cache transformed data in glm

2015-02-16 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 d0701d9bf -> dfe0fa01c [SPARK-5802][MLLIB] cache transformed data in glm If we need to transform the input data, we should cache the output to avoid re-computing feature vectors every iteration. dbtsai Author: Xiangrui Meng m
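
The reasoning above (transform once, cache, then iterate) can be shown with a toy example in plain Python. Everything here is hypothetical: `fit`, `scale`, and the call counter are illustration only, and in Spark the equivalent step would presumably be persisting the transformed RDD rather than building a list.

```python
def fit(data, transform, num_iterations):
    """Toy iterative fit that transforms the input once and reuses it."""
    # Materialize the transformed features up front (the "cache").
    cached = [transform(x) for x in data]
    total = 0.0
    for _ in range(num_iterations):
        # Each iteration reads the cached values instead of re-transforming.
        total += sum(cached)
    return total


calls = 0  # counts how often the transform actually runs


def scale(x):
    global calls
    calls += 1
    return 2.0 * x


result = fit([1.0, 2.0, 3.0], scale, num_iterations=5)
```

Here `scale` runs once per input element regardless of the iteration count; without the cached list it would run once per element per iteration.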

spark git commit: [SPARK-5802][MLLIB] cache transformed data in glm

2015-02-16 Thread meng
Repository: spark Updated Branches: refs/heads/master d380f324c -> fd84229e2 [SPARK-5802][MLLIB] cache transformed data in glm If we need to transform the input data, we should cache the output to avoid re-computing feature vectors every iteration. dbtsai Author: Xiangrui Meng m

spark git commit: [Ml] SPARK-5804 Explicitly manage cache in Crossvalidator k-fold loop

2015-02-16 Thread meng
: http://git-wip-us.apache.org/repos/asf/spark/diff/d51d6ba1 Branch: refs/heads/master Commit: d51d6ba1547ae75ac76c9e6d8ea99e937eb7d09f Parents: c78a12c Author: Peter Rudenko petro.rude...@gmail.com Authored: Mon Feb 16 00:07:23 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Mon

spark git commit: [Ml] SPARK-5804 Explicitly manage cache in Crossvalidator k-fold loop

2015-02-16 Thread meng
to declaration c5f3265 [Peter Rudenko] [Ml] SPARK-5804 Explicitly manage cache in Crossvalidator k-fold loop (cherry picked from commit d51d6ba1547ae75ac76c9e6d8ea99e937eb7d09f) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git

spark git commit: SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS

2015-02-15 Thread meng
/master Commit: 836577b382695558f5c97d94ee725d0156ebfad2 Parents: 61eb126 Author: Sean Owen so...@cloudera.com Authored: Sun Feb 15 09:15:48 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Sun Feb 15 09:15:48 2015 -0800

spark git commit: [MLLIB][SPARK-5502] User guide for isotonic regression

2015-02-15 Thread meng
: 61eb12674b90143388a01c22bf51cb7d02ab0447 Parents: c771e47 Author: martinzapletal zapletal-mar...@email.cz Authored: Sun Feb 15 09:10:03 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Sun Feb 15 09:10:03 2015 -0800

spark git commit: [MLLIB][SPARK-5502] User guide for isotonic regression

2015-02-15 Thread meng
and Java (cherry picked from commit 61eb12674b90143388a01c22bf51cb7d02ab0447) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d96e188c Tree: http://git-wip-us.apache.org/repos/asf

spark git commit: [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

2015-02-15 Thread meng
Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9cf7d708 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9cf7d708 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9cf7d708 Branch: refs
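
The idea in the commit title (skip the transform once no later stage needs its output) can be sketched as follows. This is a simplified illustration under assumptions: every stage is treated as an estimator, and `Scaler` and `fit_pipeline` are hypothetical names, not the spark.ml API.

```python
class Scaler:
    """Hypothetical estimator whose fitted model doubles every value."""

    def __init__(self):
        self.transform_calls = 0

    def fit(self, data):
        # For simplicity the estimator acts as its own fitted model.
        return self

    def transform(self, data):
        self.transform_calls += 1
        return [2 * x for x in data]


def fit_pipeline(stages, data):
    """Fit each stage in order; skip the transform after the last one."""
    models = []
    for i, stage in enumerate(stages):
        model = stage.fit(data)
        models.append(model)
        if i < len(stages) - 1:
            # Only stages with a successor need to produce output.
            data = model.transform(data)
    return models
```

The final stage is fitted but never asked to transform the training data, which saves one full pass over the dataset.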

spark git commit: [SPARK-5769] Set params in constructors and in setParams in Python ML pipelines

2015-02-15 Thread meng
arguments. The trade-off is discussed in the design doc of SPARK-4586. Generated doc: ![screen shot 2015-02-12 at 3 06 58 am](https://cloud.githubusercontent.com/assets/829644/6166491/9cfcd06a-b265-11e4-99ea-473d866634fc.png) CC: davies rxin Author: Xiangrui Meng m...@databricks.com Closes #4564

spark git commit: SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS

2015-02-15 Thread meng
DoubleMatrix (cherry picked from commit acf2558dc92901c342262c35eebb95f2a9b7a9ae) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/db3c539f Tree: http://git-wip-us.apache.org/repos

spark git commit: SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS

2015-02-15 Thread meng
: acf2558dc92901c342262c35eebb95f2a9b7a9ae Parents: cd4a153 Author: Sean Owen so...@cloudera.com Authored: Sun Feb 15 20:41:27 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Sun Feb 15 20:41:27 2015 -0800

spark git commit: [SPARK-5503][MLLIB] Example code for Power Iteration Clustering

2015-02-13 Thread meng
Branch: refs/heads/master Commit: e1a1ff8108463ca79299ec0eb555a0c8db9dffa0 Parents: c0ccd25 Author: sboeschhuawei stephen.boe...@huawei.com Authored: Fri Feb 13 09:45:57 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Fri Feb 13 09:45:57 2015 -0800

spark git commit: [SPARK-5503][MLLIB] Example code for Power Iteration Clustering

2015-02-13 Thread meng
[sboeschhuawei] placeholder for pic examples (cherry picked from commit e1a1ff8108463ca79299ec0eb555a0c8db9dffa0) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5e639422 Tree

spark git commit: SPARK-5805 Fixed the type error in documentation.

2015-02-13 Thread meng
from emres/SPARK-5805 and squashes the following commits: 1029f66 [Emre Sevinç] SPARK-5805 Fixed the type error in documentation. (cherry picked from commit 9f31db061019414a964aac432e946eac61f8307c) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf

spark git commit: SPARK-5805 Fixed the type error in documentation.

2015-02-13 Thread meng
/spark/tree/9f31db06 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9f31db06 Branch: refs/heads/master Commit: 9f31db061019414a964aac432e946eac61f8307c Parents: 077eec2 Author: Emre Sevinç emre.sev...@gmail.com Authored: Fri Feb 13 12:31:27 2015 -0800 Committer: Xiangrui Meng m

spark git commit: [SPARK-5806] re-organize sections in mllib-clustering.md

2015-02-13 Thread meng
Repository: spark Updated Branches: refs/heads/master 2e0c08452 -> cc56c8729 [SPARK-5806] re-organize sections in mllib-clustering.md Put example code close to the algorithm description. Author: Xiangrui Meng m...@databricks.com Closes #4598 from mengxr/SPARK-5806 and squashes the following

spark git commit: [SPARK-5806] re-organize sections in mllib-clustering.md

2015-02-13 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 d9d0250fc -> 965876328 [SPARK-5806] re-organize sections in mllib-clustering.md Put example code close to the algorithm description. Author: Xiangrui Meng m...@databricks.com Closes #4598 from mengxr/SPARK-5806 and squashes

spark git commit: [SPARK-5730][ML] add doc groups to spark.ml components

2015-02-13 Thread meng
/getters will be at the bottom. Preview: ![screen shot 2015-02-13 at 2 47 49 pm](https://cloud.githubusercontent.com/assets/829644/6196657/5740c240-b38f-11e4-94bb-bd8ef5a796c5.png) Author: Xiangrui Meng m...@databricks.com Closes #4600 from mengxr/SPARK-5730 and squashes the following commits

spark git commit: [SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays

2015-02-13 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 965876328 -> 356b798b3 [SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays because ArrayBuffer is not specialized. Author: Xiangrui Meng m...@databricks.com Closes #4594 from mengxr/SPARK-5803 and squashes the following

spark git commit: [SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays

2015-02-13 Thread meng
Repository: spark Updated Branches: refs/heads/master cc56c8729 -> d50a91d52 [SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays because ArrayBuffer is not specialized. Author: Xiangrui Meng m...@databricks.com Closes #4594 from mengxr/SPARK-5803 and squashes the following commits

spark git commit: [SPARK-5757][MLLIB] replace SQL JSON usage in model import/export by json4s

2015-02-12 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 e23c8f5c8 -> e26c14990 [SPARK-5757][MLLIB] replace SQL JSON usage in model import/export by json4s This PR detaches MLlib model import/export code from SQL's JSON support, and hence unblocks #4544 . yhuai Author: Xiangrui Meng m

spark git commit: [SPARK-5714][Mllib] Refactor initial step of LDA to remove redundant operations

2015-02-10 Thread meng
Hsieh vii...@gmail.com Authored: Tue Feb 10 21:51:15 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Tue Feb 10 21:51:15 2015 -0800 -- .../org/apache/spark/mllib/clustering/LDA.scala | 37 +++- 1

spark git commit: [SPARK-5714][Mllib] Refactor initial step of LDA to remove redundant operations

2015-02-10 Thread meng
f86a89a2e081ee4593ce03398c2283fd77daac6e) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ba3aa8fc Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ba3aa8fc Diff: http://git-wip-us.apache.org/repos/asf/spark

spark git commit: [SPARK-5021] [MLlib] Gaussian Mixture now supports Sparse Input

2015-02-10 Thread meng
: fd2c032f95bbee342ca539df9e44927482981659 Parents: f98707c Author: MechCoder manojkumarsivaraj...@gmail.com Authored: Tue Feb 10 14:05:55 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Tue Feb 10 14:05:55 2015 -0800

spark git commit: [SPARK-5021] [MLlib] Gaussian Mixture now supports Sparse Input

2015-02-10 Thread meng
(cherry picked from commit fd2c032f95bbee342ca539df9e44927482981659) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bba09539 Tree: http://git-wip-us.apache.org/repos/asf/spark

spark git commit: [SPARK-5597][MLLIB] save/load for decision trees and ensembles

2015-02-09 Thread meng
: Joseph K. Bradley jos...@databricks.com Author: Xiangrui Meng m...@databricks.com Closes #4493 from mengxr/SPARK-5597 and squashes the following commits: 75e3bb6 [Xiangrui Meng] fix style 2b0033d [Xiangrui Meng] update tree export schema and refactor the implementation 45873a2 [Joseph K. Bradley

spark git commit: [SPARK-5597][MLLIB] save/load for decision trees and ensembles

2015-02-09 Thread meng
. Author: Joseph K. Bradley jos...@databricks.com Author: Xiangrui Meng m...@databricks.com Closes #4493 from mengxr/SPARK-5597 and squashes the following commits: 75e3bb6 [Xiangrui Meng] fix style 2b0033d [Xiangrui Meng] update tree export schema and refactor the implementation 45873a2 [Joseph K

spark git commit: SPARK-4900 [MLLIB] MLlib SingularValueDecomposition ARPACK IllegalStateException

2015-02-09 Thread meng
-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ebf1df03 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ebf1df03 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ebf1df03

spark git commit: SPARK-4900 [MLLIB] MLlib SingularValueDecomposition ARPACK IllegalStateException

2015-02-09 Thread meng
...@cloudera.com Authored: Mon Feb 9 21:13:58 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Mon Feb 9 21:13:58 2015 -0800 -- .../org/apache/spark/mllib/linalg/EigenValueDecomposition.scala| 2 +- 1 file changed, 1

spark git commit: SPARK-5665 [DOCS] Update netlib-java documentation

2015-02-08 Thread meng
56aff4bd6c7c9d18f4f962025708f20a4a82dcf0) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c515634e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c515634e Diff: http://git-wip-us.apache.org

spark git commit: [SPARK-5539][MLLIB] LDA guide

2015-02-08 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 955f2863e -> 5782ee29e [SPARK-5539][MLLIB] LDA guide This is the LDA user guide from jkbradley with Java and Scala code example. Author: Xiangrui Meng m...@databricks.com Author: Joseph K. Bradley jos...@databricks.com Closes #4465

spark git commit: [SPARK-5539][MLLIB] LDA guide

2015-02-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 4575c5643 -> 855d12ac0 [SPARK-5539][MLLIB] LDA guide This is the LDA user guide from jkbradley with Java and Scala code example. Author: Xiangrui Meng m...@databricks.com Author: Joseph K. Bradley jos...@databricks.com Closes #4465 from

spark git commit: [SPARK-5660][MLLIB] Make Matrix apply public

2015-02-08 Thread meng
Repository: spark Updated Branches: refs/heads/master a052ed425 -> c17161189 [SPARK-5660][MLLIB] Make Matrix apply public This is #4447 with `override`. Closes #4447 Author: Joseph K. Bradley jos...@databricks.com Author: Xiangrui Meng m...@databricks.com Closes #4462 from mengxr/SPARK-5660

spark git commit: SPARK-4405 [MLLIB] Matrices.* construction methods should check for rows x cols overflow

2015-02-08 Thread meng
: c171611 Author: Sean Owen so...@cloudera.com Authored: Sun Feb 8 21:08:50 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Sun Feb 8 21:08:50 2015 -0800 -- .../org/apache/spark/mllib/linalg/Matrices.scala

spark git commit: SPARK-4405 [MLLIB] Matrices.* construction methods should check for rows x cols overflow

2015-02-08 Thread meng
4396dfb37f433ef186e3e0a09db9906986ec940b) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fa8ea48f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fa8ea48f Diff: http://git-wip

spark git commit: [SPARK-5598][MLLIB] model save/load for ALS

2015-02-08 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 42c56b6f1 - 9e4d58fe2 [SPARK-5598][MLLIB] model save/load for ALS following #4233. jkbradley Author: Xiangrui Meng m...@databricks.com Closes #4422 from mengxr/SPARK-5598 and squashes the following commits: a059394 [Xiangrui Meng

spark git commit: [SPARK-5598][MLLIB] model save/load for ALS

2015-02-08 Thread meng
Repository: spark Updated Branches: refs/heads/master 804949d51 - 5c299c58f [SPARK-5598][MLLIB] model save/load for ALS following #4233. jkbradley Author: Xiangrui Meng m...@databricks.com Closes #4422 from mengxr/SPARK-5598 and squashes the following commits: a059394 [Xiangrui Meng

spark git commit: [SPARK-5652][Mllib] Use broadcasted weights in LogisticRegressionModel

2015-02-06 Thread meng
80f3bcb58f836cfe1829c85bdd349c10525c8a5e) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6fda4c13 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6fda4c13 Diff: http://git-wip

spark git commit: [SPARK-5601][MLLIB] make streaming linear algorithms Java-friendly

2015-02-06 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 400580228 - 11b28b9b4 [SPARK-5601][MLLIB] make streaming linear algorithms Java-friendly Overload `trainOn`, `predictOn`, and `predictOnValues`. CC freeman-lab Author: Xiangrui Meng m...@databricks.com Closes #4432 from mengxr

spark git commit: [SPARK-5601][MLLIB] make streaming linear algorithms Java-friendly

2015-02-06 Thread meng
Repository: spark Updated Branches: refs/heads/master c4021401e - 0e23ca9f8 [SPARK-5601][MLLIB] make streaming linear algorithms Java-friendly Overload `trainOn`, `predictOn`, and `predictOnValues`. CC freeman-lab Author: Xiangrui Meng m...@databricks.com Closes #4432 from mengxr/streaming

spark git commit: [SPARK-5652][Mllib] Use broadcasted weights in LogisticRegressionModel

2015-02-06 Thread meng
Parents: 0d74bd7 Author: Liang-Chi Hsieh vii...@gmail.com Authored: Fri Feb 6 11:22:11 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Fri Feb 6 11:22:11 2015 -0800 -- .../spark/mllib/classification

spark git commit: [SPARK-5013] [MLlib] Added documentation and sample data file for GaussianMixture

2015-02-06 Thread meng
documentation and sample data file for GaussianMixture (cherry picked from commit 9ad56ad2a2a51df449040c4f4b7c66b104883312) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit

spark git commit: [SPARK-5460][MLlib] Wrapped `Try` around `deleteAllCheckpoints` - RandomForest.

2015-02-05 Thread meng
62371adaa5b9251579db7300504506975689610c) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/44768f58 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/44768f58 Diff: http://git-wip

spark git commit: [SPARK-5460][MLlib] Wrapped `Try` around `deleteAllCheckpoints` - RandomForest.

2015-02-05 Thread meng
: 4d8d070 Author: x1- viva...@gmail.com Authored: Thu Feb 5 15:02:04 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Thu Feb 5 15:02:04 2015 -0800 -- .../scala/org/apache/spark/mllib/tree/RandomForest.scala| 9

spark git commit: [SPARK-5604][MLLIB] remove checkpointDir from LDA

2015-02-05 Thread meng
they don't show up in the generated Java doc (SPARK-5610). jkbradley Author: Xiangrui Meng m...@databricks.com Closes #4390 from mengxr/SPARK-5604 and squashes the following commits: a34bb39 [Xiangrui Meng] remove checkpointDir from LDA Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit

spark git commit: [SPARK-5604][MLLIB] remove checkpointDir from LDA

2015-02-05 Thread meng
they don't show up in the generated Java doc (SPARK-5610). jkbradley Author: Xiangrui Meng m...@databricks.com Closes #4390 from mengxr/SPARK-5604 and squashes the following commits: a34bb39 [Xiangrui Meng] remove checkpointDir from LDA (cherry picked from commit

spark git commit: [SPARK-5585] Flaky test in MLlib python

2015-02-04 Thread meng
'master' of github.com:apache/spark into flaky_test ced499b [Davies Liu] add seed for test (cherry picked from commit 38a416f0360fa68fc445af14910fb253ff9ad493) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip

spark git commit: [SPARK-5585] Flaky test in MLlib python

2015-02-04 Thread meng
-wip-us.apache.org/repos/asf/spark/diff/38a416f0 Branch: refs/heads/master Commit: 38a416f0360fa68fc445af14910fb253ff9ad493 Parents: 5aa0f21 Author: Davies Liu dav...@databricks.com Authored: Wed Feb 4 08:54:20 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Wed Feb 4 08:54:20 2015

spark git commit: [SPARK-5596] [mllib] ML model import/export for GLMs, NaiveBayes

2015-02-04 Thread meng
Parents: c23ac03 Author: Joseph K. Bradley jos...@databricks.com Authored: Wed Feb 4 22:46:48 2015 -0800 Committer: Xiangrui Meng m...@databricks.com Committed: Wed Feb 4 22:46:48 2015 -0800 -- .../classification

spark git commit: [SPARK-5596] [mllib] ML model import/export for GLMs, NaiveBayes

2015-02-04 Thread meng
975bcef467b35586e5224171071355409f451d2d) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/885bcbb0 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/885bcbb0 Diff: http://git-wip

spark git commit: [SPARK-5599] Check MLlib public APIs for 1.3

2015-02-04 Thread meng
`. All other changes are documentation and annotations. The `Experimental` tag is removed from `ALS.setAlpha` and `Rating`. One issue not addressed in this PR is the `setCheckpointDir` in `LDA` (https://issues.apache.org/jira/browse/SPARK-5604). CC: srowen jkbradley Author: Xiangrui Meng m
