[jira] [Resolved] (SPARK-3156) DecisionTree: Order categorical features adaptively

2014-09-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3156. -- Resolution: Fixed Fix Version/s: 1.2.0 DecisionTree: Order categorical features

[jira] [Created] (SPARK-3443) Update the default values of some decision tree parameters

2014-09-08 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3443: Summary: Update the default values of some decision tree parameters Key: SPARK-3443 URL: https://issues.apache.org/jira/browse/SPARK-3443 Project: Spark

[jira] [Updated] (SPARK-3443) Update the default values of some decision tree parameters

2014-09-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3443: - Priority: Minor (was: Major) Target Version/s: 1.2.0 Update the default values of

[jira] [Commented] (SPARK-3249) Fix links in ScalaDoc that cause warning messages in `sbt/sbt unidoc`

2014-09-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126191#comment-14126191 ] Xiangrui Meng commented on SPARK-3249: -- I think we should point to the one with the

[jira] [Updated] (SPARK-3160) Simplify DecisionTree data structure for training

2014-09-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3160: - Assignee: Joseph K. Bradley Simplify DecisionTree data structure for training

[jira] [Resolved] (SPARK-3443) Update the default values of some decision tree parameters

2014-09-08 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3443. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2322

[jira] [Created] (SPARK-3459) MulticlassMetrics is not serializable

2014-09-09 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3459: Summary: MulticlassMetrics is not serializable Key: SPARK-3459 URL: https://issues.apache.org/jira/browse/SPARK-3459 Project: Spark Issue Type: Bug

[jira] [Closed] (SPARK-3459) MulticlassMetrics is not serializable

2014-09-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-3459. Resolution: Cannot Reproduce MulticlassMetrics is not serializable

[jira] [Resolved] (SPARK-3494) DecisionTree overflow error in calculating maxMemoryUsage

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3494. -- Resolution: Fixed Assignee: Joseph K. Bradley https://github.com/apache/spark/pull/2341

[jira] [Resolved] (SPARK-3160) Simplify DecisionTree data structure for training

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3160. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2341

[jira] [Updated] (SPARK-3494) DecisionTree overflow error in calculating maxMemoryUsage

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3494: - Fix Version/s: 1.2.0 DecisionTree overflow error in calculating maxMemoryUsage

[jira] [Resolved] (SPARK-2830) MLlib v1.1 documentation

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2830. -- Resolution: Fixed Fix Version/s: 1.1.0 MLlib v1.1 documentation

[jira] [Updated] (SPARK-2838) performance tests for feature transformations

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2838: - Target Version/s: 1.2.0 (was: 1.1.0) performance tests for feature transformations

[jira] [Updated] (SPARK-3249) Fix links in ScalaDoc that cause warning messages in `sbt/sbt unidoc`

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3249: - Target Version/s: 1.2.0 (was: 1.1.0) Fix links in ScalaDoc that cause warning messages in

[jira] [Updated] (SPARK-3436) [MLlib]Streaming SVM

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3436: - Assignee: Liquan Pei [MLlib]Streaming SVM - Key:

[jira] [Updated] (SPARK-2838) performance tests for feature transformations

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2838: - Assignee: (was: Xiangrui Meng) performance tests for feature transformations

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131299#comment-14131299 ] Xiangrui Meng commented on SPARK-1405: -- [~xusen] and [~gq] Thanks for working on LDA!

[jira] [Created] (SPARK-3530) Pipeline and Parameters

2014-09-15 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3530: Summary: Pipeline and Parameters Key: SPARK-3530 URL: https://issues.apache.org/jira/browse/SPARK-3530 Project: Spark Issue Type: Sub-task

[jira] [Resolved] (SPARK-3396) Change LogistricRegressionWithSGD's default regType to L2

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3396. -- Resolution: Fixed Change LogistricRegressionWithSGD's default regType to L2

[jira] [Resolved] (SPARK-3516) DecisionTree Python support for params maxInstancesPerNode, maxInfoGain

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3516. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2349

[jira] [Updated] (SPARK-3516) DecisionTree Python support for params maxInstancesPerNode, maxInfoGain

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3516: - Assignee: Joseph K. Bradley DecisionTree Python support for params maxInstancesPerNode,

[jira] [Commented] (SPARK-3366) Compute best splits distributively in decision tree

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134831#comment-14134831 ] Xiangrui Meng commented on SPARK-3366: -- It is more about communication than

[jira] [Created] (SPARK-3541) Improve ALS internal storage

2014-09-15 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3541: Summary: Improve ALS internal storage Key: SPARK-3541 URL: https://issues.apache.org/jira/browse/SPARK-3541 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-3530) Pipeline and Parameters

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134860#comment-14134860 ] Xiangrui Meng commented on SPARK-3530: -- [~srowen] Thanks for the comments! The new

[jira] [Updated] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3181: - Priority: Major (was: Critical) Add Robust Regression Algorithm with Huber Estimator

[jira] [Updated] (SPARK-3188) Add Robust Regression Algorithm with Tukey bisquare weight function (Biweight Estimates)

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3188: - Priority: Minor (was: Critical) Add Robust Regression Algorithm with Tukey bisquare weight

[jira] [Updated] (SPARK-3188) Add Robust Regression Algorithm with Tukey bisquare weight function (Biweight Estimates)

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3188: - Affects Version/s: (was: 1.0.2) Add Robust Regression Algorithm with Tukey bisquare weight

[jira] [Updated] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3181: - Affects Version/s: (was: 1.0.2) Add Robust Regression Algorithm with Huber Estimator

[jira] [Updated] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2014-09-15 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3181: - Target Version/s: 1.2.0 (was: 1.1.1, 1.2.0) Add Robust Regression Algorithm with Huber

[jira] [Updated] (SPARK-1503) Implement Nesterov's accelerated first-order method

2014-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1503: - Assignee: (was: Xiangrui Meng) Implement Nesterov's accelerated first-order method

[jira] [Updated] (SPARK-3357) Internal log messages should be set at DEBUG level instead of INFO

2014-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3357: - Assignee: (was: Xiangrui Meng) Internal log messages should be set at DEBUG level instead of

[jira] [Updated] (SPARK-3258) Python API for streaming MLlib algorithms

2014-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3258: - Assignee: (was: Xiangrui Meng) Python API for streaming MLlib algorithms

[jira] [Updated] (SPARK-1486) Support multi-model training in MLlib

2014-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1486: - Assignee: Burak Yavuz (was: Xiangrui Meng) Support multi-model training in MLlib

[jira] [Resolved] (SPARK-2944) sc.makeRDD doesn't distribute partitions evenly

2014-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2944. -- Resolution: Cannot Reproduce Closing this one now because I couldn't find an easy way to

[jira] [Updated] (SPARK-3066) Support recommendAll in matrix factorization model

2014-09-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3066: - Assignee: (was: Xiangrui Meng) Support recommendAll in matrix factorization model

[jira] [Updated] (SPARK-3568) Add metrics for ranking algorithms

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3568: - Priority: Minor (was: Major) Add metrics for ranking algorithms

[jira] [Updated] (SPARK-3568) Add metrics for ranking algorithms

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3568: - Assignee: Shuo Xiang Add metrics for ranking algorithms --

[jira] [Created] (SPARK-3569) Add metadata field to StructField

2014-09-17 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3569: Summary: Add metadata field to StructField Key: SPARK-3569 URL: https://issues.apache.org/jira/browse/SPARK-3569 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-3569) Add metadata field to StructField

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3569: - Component/s: MLlib ML Add metadata field to StructField

[jira] [Created] (SPARK-3572) Support register UserType in SQL

2014-09-17 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3572: Summary: Support register UserType in SQL Key: SPARK-3572 URL: https://issues.apache.org/jira/browse/SPARK-3572 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-3573) Dataset

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3573: - Shepherd: Michael Armbrust Dataset --- Key: SPARK-3573

[jira] [Updated] (SPARK-3569) Add metadata field to StructField

2014-09-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3569: - Description: Want to add a metadata field to StructField that can be used by other applications

[jira] [Updated] (SPARK-3270) Spark API for Application Extensions

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3270: - Issue Type: New Feature (was: Improvement) Spark API for Application Extensions

[jira] [Commented] (SPARK-3403) NaiveBayes crashes with blas/lapack native libraries for breeze (netlib-java)

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139461#comment-14139461 ] Xiangrui Meng commented on SPARK-3403: -- Sorry, it should be netlib-java, but the real

[jira] [Commented] (SPARK-3530) Pipeline and Parameters

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139600#comment-14139600 ] Xiangrui Meng commented on SPARK-3530: -- [~eustache] The default implementation of

[jira] [Comment Edited] (SPARK-3530) Pipeline and Parameters

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139600#comment-14139600 ] Xiangrui Meng edited comment on SPARK-3530 at 9/18/14 10:06 PM:

[jira] [Updated] (SPARK-3573) Dataset

2014-09-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3573: - Description: This JIRA is for discussion of ML dataset, essentially a SchemaRDD with extra

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Summary: RDD[Double] doesn't use primitive arrays for caching (was: RandomRDDs doesn't create

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Issue Type: Improvement (was: Bug) RDD[Double] doesn't use primitive arrays for caching

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Component/s: (was: MLlib) RDD[Double] doesn't use primitive arrays for caching

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3600: - Target Version/s: (was: 1.1.1, 1.2.0) RDD[Double] doesn't use primitive arrays for caching

[jira] [Commented] (SPARK-3573) Dataset

2014-09-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141271#comment-14141271 ] Xiangrui Meng commented on SPARK-3573: -- [~sandyr] SQL/Streaming/GraphX provide

[jira] [Assigned] (SPARK-3541) Improve ALS internal storage

2014-09-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-3541: Assignee: Xiangrui Meng Improve ALS internal storage

[jira] [Resolved] (SPARK-1484) MLlib should warn if you are using an iterative algorithm on non-cached data

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-1484. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2347

[jira] [Updated] (SPARK-1484) MLlib should warn if you are using an iterative algorithm on non-cached data

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1484: - Assignee: Aaron Staple MLlib should warn if you are using an iterative algorithm on non-cached

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148529#comment-14148529 ] Xiangrui Meng commented on SPARK-1405: -- [~Guoqiang Li] and [~pedrorodriguez], since

[jira] [Updated] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1405: - Assignee: Guoqiang Li (was: Xusen Yin) parallel Latent Dirichlet Allocation (LDA) atop of spark

[jira] [Updated] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1405: - Shepherd: Xiangrui Meng parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

[jira] [Commented] (SPARK-1241) Support sliding in RDD

2014-09-25 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148549#comment-14148549 ] Xiangrui Meng commented on SPARK-1241: -- This is implemented MLlib:

[jira] [Commented] (SPARK-3588) Gaussian Mixture Model clustering

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148900#comment-14148900 ] Xiangrui Meng commented on SPARK-3588: -- Please follow the instructions at

[jira] [Resolved] (SPARK-3614) Filter on minimum occurrences of a term in IDF

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3614. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2494

[jira] [Commented] (SPARK-2516) Bootstrapping

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149721#comment-14149721 ] Xiangrui Meng commented on SPARK-2516: -- The plan was to implement Bag of Little

[jira] [Updated] (SPARK-2516) Bootstrapping

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2516: - Assignee: Yu Ishikawa Bootstrapping - Key: SPARK-2516

[jira] [Updated] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1547: - Shepherd: Joseph K. Bradley Add gradient boosting algorithm to MLlib

[jira] [Updated] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1547: - Target Version/s: 1.2.0 Add gradient boosting algorithm to MLlib

[jira] [Updated] (SPARK-3700) Improve the performance of JSON parser

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3700: - Assignee: (was: Yin Huai) Improve the performance of JSON parser

[jira] [Updated] (SPARK-3701) Some clean-up work after the refactoring of MLlib's SerDe for PySpark

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3701: - Priority: Minor (was: Major) Some clean-up work after the refactoring of MLlib's SerDe for

[jira] [Created] (SPARK-3701) Some clean-up work after the refactoring of MLlib's SerDe for PySpark

2014-09-26 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3701: Summary: Some clean-up work after the refactoring of MLlib's SerDe for PySpark Key: SPARK-3701 URL: https://issues.apache.org/jira/browse/SPARK-3701 Project: Spark

[jira] [Updated] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3702: - Assignee: Joseph K. Bradley Standardize MLlib classes for learners, models

[jira] [Updated] (SPARK-1545) Add Random Forest algorithm to MLlib

2014-09-28 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1545: - Assignee: Joseph K. Bradley (was: Manish Amde) Add Random Forest algorithm to MLlib

[jira] [Resolved] (SPARK-1545) Add Random Forest algorithm to MLlib

2014-09-28 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-1545. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2435

[jira] [Updated] (SPARK-3366) Compute best splits distributively in decision tree

2014-09-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3366: - Assignee: Qiping Li Compute best splits distributively in decision tree

[jira] [Resolved] (SPARK-2885) All-pairs similarity via DIMSUM

2014-09-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2885. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 1778

[jira] [Created] (SPARK-3735) Sending the factor directly or AtA based on the cost in ALS

2014-09-29 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3735: Summary: Sending the factor directly or AtA based on the cost in ALS Key: SPARK-3735 URL: https://issues.apache.org/jira/browse/SPARK-3735 Project: Spark

[jira] [Commented] (SPARK-3434) Distributed block matrix

2014-09-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152636#comment-14152636 ] Xiangrui Meng commented on SPARK-3434: -- [~shivaram] Could you post the design of the

[jira] [Updated] (SPARK-3366) Compute best splits distributively in decision tree

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3366: - Target Version/s: 1.2.0 Compute best splits distributively in decision tree

[jira] [Updated] (SPARK-3436) Streaming SVM

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3436: - Summary: Streaming SVM (was: [MLlib]Streaming SVM ) Streaming SVM --

[jira] [Updated] (SPARK-3486) Add PySpark support for Word2Vec

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3486: - Summary: Add PySpark support for Word2Vec (was: [MLlib]Add PySpark support for Word2Vec) Add

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3158: - Target Version/s: 1.2.0 Avoid 1 extra aggregation for DecisionTree training

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3158: - Priority: Major (was: Minor) Avoid 1 extra aggregation for DecisionTree training

[jira] [Updated] (SPARK-3161) Cache example-node map for DecisionTree training

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3161: - Priority: Major (was: Minor) Target Version/s: 1.2.0 Cache example-node map for

[jira] [Commented] (SPARK-3541) Improve ALS internal storage

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153911#comment-14153911 ] Xiangrui Meng commented on SPARK-3541: -- I put the implementation at

[jira] [Resolved] (SPARK-3701) Some clean-up work after the refactoring of MLlib's SerDe for PySpark

2014-09-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3701. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2548

[jira] [Updated] (SPARK-3751) DecisionTreeRunner functionality improvement

2014-10-01 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3751: - Assignee: Joseph K. Bradley DecisionTreeRunner functionality improvement

[jira] [Resolved] (SPARK-3751) DecisionTreeRunner functionality improvement

2014-10-01 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3751. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2604

[jira] [Updated] (SPARK-3572) Support register UserType in SQL

2014-10-02 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3572: - Assignee: Joseph K. Bradley Support register UserType in SQL

[jira] [Resolved] (SPARK-3366) Compute best splits distributively in decision tree

2014-10-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3366. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2595

[jira] [Updated] (SPARK-1655) In naive Bayes, store conditional probabilities distributively.

2014-10-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1655: - Assignee: Aaron Staple In naive Bayes, store conditional probabilities distributively.

[jira] [Created] (SPARK-3820) Specialize columnSimilarity() without any threshold

2014-10-06 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3820: Summary: Specialize columnSimilarity() without any threshold Key: SPARK-3820 URL: https://issues.apache.org/jira/browse/SPARK-3820 Project: Spark Issue

[jira] [Commented] (SPARK-3803) ArrayIndexOutOfBoundsException found in executing computePrincipalComponents

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161281#comment-14161281 ] Xiangrui Meng commented on SPARK-3803: -- In `computeCovariance`, we generate a warning

[jira] [Closed] (SPARK-3370) The simple test error

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-3370. Resolution: Duplicate This is a known issue. We can fix it by checkpointing intermediate RDDs. For

[jira] [Updated] (SPARK-3424) KMeans Plus Plus is too slow

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3424: - Assignee: Derrick Burns KMeans Plus Plus is too slow

[jira] [Updated] (SPARK-3261) KMeans clusterer can return duplicate cluster centers

2014-10-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3261: - Assignee: Derrick Burns KMeans clusterer can return duplicate cluster centers

[jira] [Commented] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161540#comment-14161540 ] Xiangrui Meng commented on SPARK-3828: -- `text8` doesn't contain any line feed

[jira] [Commented] (SPARK-3434) Distributed block matrix

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162156#comment-14162156 ] Xiangrui Meng commented on SPARK-3434: -- [~shivaram] and [~ConcreteVitamin] Any

[jira] [Reopened] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-3828: -- Spark returns inconsistent results when building with different Hadoop version

[jira] [Commented] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162439#comment-14162439 ] Xiangrui Meng commented on SPARK-3828: -- I re-opened this because it may be a serious

[jira] [Created] (SPARK-3838) Python code example for Word2Vec in user guide

2014-10-07 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-3838: Summary: Python code example for Word2Vec in user guide Key: SPARK-3838 URL: https://issues.apache.org/jira/browse/SPARK-3838 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-3790) CosineSimilarity via DIMSUM example

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3790. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2622

[jira] [Resolved] (SPARK-3486) Add PySpark support for Word2Vec

2014-10-07 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3486. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 2356
