[jira] [Resolved] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2019-05-08 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-19208. -- Resolution: Not A Problem > MultivariateOnlineSummarizer performance optimization >

[jira] [Resolved] (SPARK-22320) ORC should support VectorUDT/MatrixUDT

2019-05-08 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-22320. -- Resolution: Not A Problem > ORC should support VectorUDT/MatrixUDT >

[jira] [Resolved] (SPARK-7008) An implementation of Factorization Machine (LibFM)

2019-05-08 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-7008. - Resolution: Not A Problem > An implementation of Factorization Machine (LibFM) >

[jira] [Updated] (SPARK-28399) Impl RobustScaler

2019-07-15 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28399: - Issue Type: New Feature (was: Improvement) > Impl RobustScaler > - > >

[jira] [Created] (SPARK-28399) Impl RobustScaler

2019-07-15 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28399: Summary: Impl RobustScaler Key: SPARK-28399 URL: https://issues.apache.org/jira/browse/SPARK-28399 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-28159) Make the transform natively in ml framework to avoid extra conversion

2019-06-25 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28159: - Description: It has been a long time since ML was released. However, there are still many TODOs

[jira] [Resolved] (SPARK-27656) Safely register class for GraphX

2019-06-25 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-27656. -- Resolution: Not A Problem > Safely register class for GraphX >

[jira] [Commented] (SPARK-28542) Document Stages page

2019-08-18 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910145#comment-16910145 ] zhengruifeng commented on SPARK-28542: -- [~planga82]  Just go ahead! Thanks! > Document Stages page

[jira] [Commented] (SPARK-28543) Document Spark Jobs page

2019-08-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905935#comment-16905935 ] zhengruifeng commented on SPARK-28543: -- [~planga82] I have just created a page in

[jira] [Commented] (SPARK-28373) Document JDBC/ODBC Server page

2019-08-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905937#comment-16905937 ] zhengruifeng commented on SPARK-28373: -- [~yumwang]  I have just created a page in

[jira] [Created] (SPARK-28780) Delete the incorrect setWeightCol method in LinearSVCModel

2019-08-20 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28780: Summary: Delete the incorrect setWeightCol method in LinearSVCModel Key: SPARK-28780 URL: https://issues.apache.org/jira/browse/SPARK-28780 Project: Spark

[jira] [Updated] (SPARK-28958) pyspark.ml function parity

2019-09-03 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28958: - Attachment: ML_SYNC.pdf > pyspark.ml function parity > -- > >

[jira] [Created] (SPARK-28958) pyspark.ml function parity

2019-09-03 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28958: Summary: pyspark.ml function parity Key: SPARK-28958 URL: https://issues.apache.org/jira/browse/SPARK-28958 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-28373) Document JDBC/ODBC Server page

2019-09-02 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921138#comment-16921138 ] zhengruifeng commented on SPARK-28373: -- [~planga82] Thanks!:D > Document JDBC/ODBC Server page >

[jira] [Commented] (SPARK-28372) Document Spark WEB UI

2019-09-02 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921151#comment-16921151 ] zhengruifeng commented on SPARK-28372: -- [~smilegator] I think we may need to add a subtask for

[jira] [Created] (SPARK-28968) Add HasNumFeatures in the scala side

2019-09-03 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28968: Summary: Add HasNumFeatures in the scala side Key: SPARK-28968 URL: https://issues.apache.org/jira/browse/SPARK-28968 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-28969) OneVsRestModel in the py side should not set WeightCol and Classifier

2019-09-03 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28969: Summary: OneVsRestModel in the py side should not set WeightCol and Classifier Key: SPARK-28969 URL: https://issues.apache.org/jira/browse/SPARK-28969 Project: Spark

[jira] [Commented] (SPARK-28373) Document JDBC/ODBC Server page

2019-09-01 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920599#comment-16920599 ] zhengruifeng commented on SPARK-28373: -- [~smilegator] [~yumwang]  I am afraid I have no time to do

[jira] [Created] (SPARK-28858) add tree-based transformation in the py side

2019-08-23 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28858: Summary: add tree-based transformation in the py side Key: SPARK-28858 URL: https://issues.apache.org/jira/browse/SPARK-28858 Project: Spark Issue Type:

[jira] [Commented] (SPARK-28969) OneVsRestModel in the py side should not set WeightCol and Classifier

2019-09-04 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922994#comment-16922994 ] zhengruifeng commented on SPARK-28969: -- friendly ping [~huaxingao] > OneVsRestModel in the py side

[jira] [Updated] (SPARK-28969) OneVsRestModel in the py side should not set WeightCol and Classifier

2019-09-04 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28969: - Parent: SPARK-28958 Issue Type: Sub-task (was: Improvement) > OneVsRestModel in the py

[jira] [Resolved] (SPARK-28968) Add HasNumFeatures in the scala side

2019-09-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-28968. -- Resolution: Resolved > Add HasNumFeatures in the scala side >

[jira] [Commented] (SPARK-28985) Pyspark ClassificationModel and RegressionModel support column setters/getters/predict

2019-09-11 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927498#comment-16927498 ] zhengruifeng commented on SPARK-28985: -- [~huaxingao] You can refer to my old prs

[jira] [Resolved] (SPARK-22797) Add multiple column support to PySpark Bucketizer

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-22797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-22797. -- Resolution: Done > Add multiple column support to PySpark Bucketizer >

[jira] [Created] (SPARK-29116) Refactor py classes related to DecisionTree

2019-09-17 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29116: Summary: Refactor py classes related to DecisionTree Key: SPARK-29116 URL: https://issues.apache.org/jira/browse/SPARK-29116 Project: Spark Issue Type:

[jira] [Commented] (SPARK-29116) Refactor py classes related to DecisionTree

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931179#comment-16931179 ] zhengruifeng commented on SPARK-29116: -- friendly ping [~huaxingao] , are you willing to work on

[jira] [Created] (SPARK-29118) Avoid redundant computation in GMM.transform

2019-09-17 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29118: Summary: Avoid redundant computation in GMM.transform Key: SPARK-29118 URL: https://issues.apache.org/jira/browse/SPARK-29118 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22796) Add multiple column support to PySpark QuantileDiscretizer

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-22796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931125#comment-16931125 ] zhengruifeng commented on SPARK-22796: -- [~huaxingao]  

[jira] [Updated] (SPARK-29118) Avoid redundant computation in GMM.transform && GLR.transform

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29118: - Summary: Avoid redundant computation in GMM.transform && GLR.transform (was: Avoid redundant

[jira] [Updated] (SPARK-29118) Avoid redundant computation in GMM.transform && GLR.transform

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29118: - Description: In SPARK-27944, the computation for output columns with empty name is skipped.
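
The description above mentions skipping the computation for output columns whose name is empty. A minimal sketch of that idea in plain Python, not Spark code: the `transform` helper, the `predict` and `probability` functions, and the row layout are all hypothetical names chosen for illustration, and the call counters exist only to show that the unnamed output is never evaluated.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]

# Hypothetical counters, used only to show which outputs were computed.
calls = {"prediction": 0, "probability": 0}


def predict(row: Row) -> float:
    calls["prediction"] += 1
    return 1.0 if row["x"] > 0 else 0.0


def probability(row: Row) -> float:
    calls["probability"] += 1
    return 0.9 if row["x"] > 0 else 0.1


def transform(rows: List[Row],
              outputs: List[Tuple[str, Callable[[Row], float]]]) -> List[Row]:
    # Drop every output whose column name is empty before touching the data,
    # so its function is never called.
    wanted = [(name, fn) for name, fn in outputs if name]
    return [{**row, **{name: fn(row) for name, fn in wanted}} for row in rows]


result = transform(
    [{"x": 1.0}, {"x": -2.0}],
    [("prediction", predict), ("", probability)],  # empty name: skipped
)
print(result)
print(calls)
```

Filtering the outputs once, up front, keeps the per-row work proportional to the number of columns the caller actually asked for.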

[jira] [Commented] (SPARK-9612) Add instance weight support for GBTs

2019-09-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924090#comment-16924090 ] zhengruifeng commented on SPARK-9612: - https://issues.apache.org/jira/browse/SPARK-19591 is now

[jira] [Commented] (SPARK-28927) ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances

2019-09-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923163#comment-16923163 ] zhengruifeng commented on SPARK-28927: -- [~JerryHouse]  As to AUC, which impl do you use?

[jira] [Reopened] (SPARK-23578) Add multicolumn support for Binarizer

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reopened SPARK-23578: -- this ticket is for Binarizer not Bucketizer > Add multicolumn support for Binarizer >

[jira] [Resolved] (SPARK-23578) Add multicolumn support for Binarizer

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-23578. -- Resolution: Duplicate > Add multicolumn support for Binarizer >

[jira] [Created] (SPARK-29143) Pyspark feature models support column setters/getters

2019-09-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29143: Summary: Pyspark feature models support column setters/getters Key: SPARK-29143 URL: https://issues.apache.org/jira/browse/SPARK-29143 Project: Spark Issue

[jira] [Created] (SPARK-29142) Pyspark clustering models support column setters/getters/predict

2019-09-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29142: Summary: Pyspark clustering models support column setters/getters/predict Key: SPARK-29142 URL: https://issues.apache.org/jira/browse/SPARK-29142 Project: Spark

[jira] [Commented] (SPARK-29144) Binarizer handel sparse vector incorrectly with negative threshold

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932272#comment-16932272 ] zhengruifeng commented on SPARK-29144: -- I prefer option 2, and will send a PR for this. >

[jira] [Created] (SPARK-29144) Binarizer handel sparse vector incorrectly with negative threshold

2019-09-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29144: Summary: Binarizer handel sparse vector incorrectly with negative threshold Key: SPARK-29144 URL: https://issues.apache.org/jira/browse/SPARK-29144 Project: Spark

[jira] [Updated] (SPARK-29144) Binarizer handle sparse vectors incorrectly with negative threshold

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29144: - Summary: Binarizer handle sparse vectors incorrectly with negative threshold (was: Binarizer

[jira] [Updated] (SPARK-29144) Binarizer handel sparse vector incorrectly with negative threshold

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29144: - Description: the processing of sparse vectors is wrong if threshold<0: {code:java} scala> val data =
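
The ticket's own Scala reproduction is cut off in this listing. As a rough illustration of why a negative threshold can go wrong on sparse input, here is a plain-Python sketch; it is not the Spark implementation. The function names and the dict-based sparse representation are assumptions made for the example. The point is that implicit zeros are greater than a negative threshold, so a routine that only visits stored entries leaves them at 0.0.

```python
from typing import Dict, List


def binarize_active_only(size: int, active: Dict[int, float],
                         threshold: float) -> List[float]:
    # Hypothetical naive version: only explicitly stored entries are compared,
    # so implicit zeros always stay 0.0.
    out = [0.0] * size
    for i, v in active.items():
        if v > threshold:
            out[i] = 1.0
    return out


def binarize_all(size: int, active: Dict[int, float],
                 threshold: float) -> List[float]:
    # Compares every position, treating missing entries as 0.0.
    return [1.0 if active.get(i, 0.0) > threshold else 0.0
            for i in range(size)]


vec_size = 4
vec_active = {1: 2.0, 3: -3.0}  # sparse vector (0.0, 2.0, 0.0, -3.0)

naive = binarize_active_only(vec_size, vec_active, -0.5)
full = binarize_all(vec_size, vec_active, -0.5)
print(naive)  # [0.0, 1.0, 0.0, 0.0]
print(full)   # [1.0, 1.0, 1.0, 0.0]
```

With a non-negative threshold the two functions agree, which is why the discrepancy only shows up when the threshold is below zero.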

[jira] [Created] (SPARK-29093) Remove automatically generated param setters in _shared_params_code_gen.py

2019-09-16 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29093: Summary: Remove automatically generated param setters in _shared_params_code_gen.py Key: SPARK-29093 URL: https://issues.apache.org/jira/browse/SPARK-29093 Project:

[jira] [Created] (SPARK-29094) Add extractInstances method

2019-09-16 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29094: Summary: Add extractInstances method Key: SPARK-29094 URL: https://issues.apache.org/jira/browse/SPARK-29094 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-29094) Add extractInstances method

2019-09-16 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29094. -- Resolution: Duplicate > Add extractInstances method > --- > >

[jira] [Created] (SPARK-29095) add extractInstances

2019-09-16 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29095: Summary: add extractInstances Key: SPARK-29095 URL: https://issues.apache.org/jira/browse/SPARK-29095 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-28985) Pyspark ClassificationModel and RegressionModel support column setters/getters/predict

2019-09-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28985: - Description: 1, add common abstract classes like JavaClassificationModel &

[jira] [Updated] (SPARK-28958) pyspark.ml function parity

2019-09-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28958: - Description: I looked into the hierarchy of both py and scala sides, and found that they are

[jira] [Created] (SPARK-28985) Pyspark ClassificationModel and RegressionModel support column setters/getters/predict

2019-09-05 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28985: Summary: Pyspark ClassificationModel and RegressionModel support column setters/getters/predict Key: SPARK-28985 URL: https://issues.apache.org/jira/browse/SPARK-28985

[jira] [Created] (SPARK-28579) MaxAbsScaler avoids conversion to breeze.vector

2019-07-31 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28579: Summary: MaxAbsScaler avoids conversion to breeze.vector Key: SPARK-28579 URL: https://issues.apache.org/jira/browse/SPARK-28579 Project: Spark Issue Type:

[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-07-24 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Description: It would be nice to be able to use RF and GBT for feature transformation: First

[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-07-24 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Description: It would be nice to be able to use RF and GBT for feature transformation: First

[jira] [Created] (SPARK-28499) Optimize MinMaxScaler

2019-07-24 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28499: Summary: Optimize MinMaxScaler Key: SPARK-28499 URL: https://issues.apache.org/jira/browse/SPARK-28499 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-28499) Optimize MinMaxScaler

2019-07-24 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28499: - Description: current impl of MinMaxScaler has some small places to be optimized: 1, avoid call

[jira] [Created] (SPARK-28421) SparseVector.apply performance optimization

2019-07-17 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28421: Summary: SparseVector.apply performance optimization Key: SPARK-28421 URL: https://issues.apache.org/jira/browse/SPARK-28421 Project: Spark Issue Type:

[jira] [Created] (SPARK-28514) Remove the redundant transformImpl method in RF & GBT

2019-07-25 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28514: Summary: Remove the redundant transformImpl method in RF & GBT Key: SPARK-28514 URL: https://issues.apache.org/jira/browse/SPARK-28514 Project: Spark Issue

[jira] [Created] (SPARK-29258) parity between ml.evaluator and mllib.metrics

2019-09-26 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29258: Summary: parity between ml.evaluator and mllib.metrics Key: SPARK-29258 URL: https://issues.apache.org/jira/browse/SPARK-29258 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-29258) parity between ml.evaluator and mllib.metrics

2019-09-26 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29258. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25940

[jira] [Commented] (SPARK-29212) Add common classes without using JVM backend

2019-09-26 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939082#comment-16939082 ] zhengruifeng commented on SPARK-29212: -- [~zero323] I had not noticed the base hierarchy without

[jira] [Resolved] (SPARK-29142) Pyspark clustering models support column setters/getters/predict

2019-09-26 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29142. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25859

[jira] [Created] (SPARK-29269) Pyspark ALSModel support getters/setters

2019-09-26 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29269: Summary: Pyspark ALSModel support getters/setters Key: SPARK-29269 URL: https://issues.apache.org/jira/browse/SPARK-29269 Project: Spark Issue Type:

[jira] [Created] (SPARK-29212) Add common classes without using JVM backend

2019-09-23 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29212: Summary: Add common classes without using JVM backend Key: SPARK-29212 URL: https://issues.apache.org/jira/browse/SPARK-29212 Project: Spark Issue Type:

[jira] [Commented] (SPARK-29212) Add common classes without using JVM backend

2019-09-25 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938205#comment-16938205 ] zhengruifeng commented on SPARK-29212: -- [~zero323] Would you like to help work on this? > Add

[jira] [Updated] (SPARK-27018) Checkpointed RDD deleted prematurely when using GBTClassifier

2019-06-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-27018: - Component/s: Spark Core > Checkpointed RDD deleted prematurely when using GBTClassifier >

[jira] [Created] (SPARK-29656) ML algs expose aggregationDepth

2019-10-30 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29656: Summary: ML algs expose aggregationDepth Key: SPARK-29656 URL: https://issues.apache.org/jira/browse/SPARK-29656 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-29751) Scalers use Summarizer instead of MultivariateOnlineSummarizer

2019-11-04 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29751: Summary: Scalers use Summarizer instead of MultivariateOnlineSummarizer Key: SPARK-29751 URL: https://issues.apache.org/jira/browse/SPARK-29751 Project: Spark

[jira] [Resolved] (SPARK-29656) ML algs expose aggregationDepth

2019-11-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29656. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26322

[jira] [Assigned] (SPARK-29656) ML algs expose aggregationDepth

2019-11-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29656: Assignee: zhengruifeng > ML algs expose aggregationDepth >

[jira] [Created] (SPARK-29808) StopWordsRemover should support multi-cols

2019-11-08 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29808: Summary: StopWordsRemover should support multi-cols Key: SPARK-29808 URL: https://issues.apache.org/jira/browse/SPARK-29808 Project: Spark Issue Type:

[jira] [Updated] (SPARK-16872) Impl Gaussian Naive Bayes Classifier

2019-11-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-16872: - Summary: Impl Gaussian Naive Bayes Classifier (was: Include Gaussian Naive Bayes Classifier)

[jira] [Updated] (SPARK-16872) Include Gaussian Naive Bayes Classifier

2019-11-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-16872: - Component/s: PySpark > Include Gaussian Naive Bayes Classifier >

[jira] [Reopened] (SPARK-16872) Include Gaussian Naive Bayes Classifier

2019-11-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reopened SPARK-16872: -- > Include Gaussian Naive Bayes Classifier > --- > >

[jira] [Assigned] (SPARK-29754) LoR/AFT/LiR/SVC use Summarizer instead of MultivariateOnlineSummarizer

2019-11-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29754: Assignee: zhengruifeng > LoR/AFT/LiR/SVC use Summarizer instead of

[jira] [Resolved] (SPARK-29754) LoR/AFT/LiR/SVC use Summarizer instead of MultivariateOnlineSummarizer

2019-11-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29754. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26396

[jira] [Resolved] (SPARK-29686) LinearSVC should persist instances if needed

2019-10-31 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29686. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26344

[jira] [Assigned] (SPARK-29645) ML add param RelativeError

2019-10-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29645: Assignee: zhengruifeng > ML add param RelativeError > -- > >

[jira] [Resolved] (SPARK-29645) ML add param RelativeError

2019-10-30 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29645. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26305

[jira] [Resolved] (SPARK-16872) Impl Gaussian Naive Bayes Classifier

2019-11-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-16872. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 26413

[jira] [Created] (SPARK-29942) Impl Complement Naive Bayes Classifier

2019-11-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29942: Summary: Impl Complement Naive Bayes Classifier Key: SPARK-29942 URL: https://issues.apache.org/jira/browse/SPARK-29942 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-29808) StopWordsRemover should support multi-cols

2019-11-11 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29808: Assignee: Huaxin Gao > StopWordsRemover should support multi-cols >

[jira] [Created] (SPARK-29914) ML models append metadata in `transform`/`transformSchema`

2019-11-15 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29914: Summary: ML models append metadata in `transform`/`transformSchema` Key: SPARK-29914 URL: https://issues.apache.org/jira/browse/SPARK-29914 Project: Spark

[jira] [Updated] (SPARK-29756) CountVectorizer forget to unpersist intermediate rdd

2019-11-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29756: - Description: {code:java} scala> val df = spark.createDataFrame(Seq( | (0, Array("a",

[jira] [Created] (SPARK-29756) CountVectorizer forget to unpersist intermediate rdd

2019-11-05 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29756: Summary: CountVectorizer forget to unpersist intermediate rdd Key: SPARK-29756 URL: https://issues.apache.org/jira/browse/SPARK-29756 Project: Spark Issue

[jira] [Created] (SPARK-29754) LoR/AFT/LiR/SVC use Summarizer instead of MultivariateOnlineSummarizer

2019-11-05 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29754: Summary: LoR/AFT/LiR/SVC use Summarizer instead of MultivariateOnlineSummarizer Key: SPARK-29754 URL: https://issues.apache.org/jira/browse/SPARK-29754 Project:

[jira] [Assigned] (SPARK-29756) CountVectorizer forget to unpersist intermediate rdd

2019-11-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29756: Assignee: zhengruifeng > CountVectorizer forget to unpersist intermediate rdd >

[jira] [Resolved] (SPARK-29756) CountVectorizer forget to unpersist intermediate rdd

2019-11-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29756. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26398

[jira] [Created] (SPARK-29801) ML models unify toString method

2019-11-08 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29801: Summary: ML models unify toString method Key: SPARK-29801 URL: https://issues.apache.org/jira/browse/SPARK-29801 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-29960) MulticlassClassificationEvaluator support hammingLoss

2019-11-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29960. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26597

[jira] [Assigned] (SPARK-29960) MulticlassClassificationEvaluator support hammingLoss

2019-11-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29960: Assignee: zhengruifeng > MulticlassClassificationEvaluator support hammingLoss >

[jira] [Updated] (SPARK-29942) Impl Complement Naive Bayes Classifier

2019-11-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29942: - Fix Version/s: (was: 3.1.0) 3.0.0 > Impl Complement Naive Bayes

[jira] [Resolved] (SPARK-29942) Impl Complement Naive Bayes Classifier

2019-11-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29942. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 26575

[jira] [Assigned] (SPARK-29942) Impl Complement Naive Bayes Classifier

2019-11-21 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29942: Assignee: zhengruifeng > Impl Complement Naive Bayes Classifier >

[jira] [Resolved] (SPARK-29914) ML models append metadata in `transform`/`transformSchema`

2019-12-04 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29914. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 26547

[jira] [Assigned] (SPARK-29914) ML models append metadata in `transform`/`transformSchema`

2019-12-04 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29914: Assignee: zhengruifeng > ML models append metadata in `transform`/`transformSchema` >

[jira] [Updated] (SPARK-30120) LSH approxNearestNeighbors should use TopByKeyAggregator when numNearestNeighbors is small

2019-12-03 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-30120: - Description: ping [~huaxingao] > LSH approxNearestNeighbors should use TopByKeyAggregator when

[jira] [Created] (SPARK-30109) PCA use BLAS.gemv with sparse vector

2019-12-03 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-30109: Summary: PCA use BLAS.gemv with sparse vector Key: SPARK-30109 URL: https://issues.apache.org/jira/browse/SPARK-30109 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-30044) MNB/CNB/BNB use empty matrix instead of null

2019-12-02 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-30044: Assignee: zhengruifeng > MNB/CNB/BNB use empty matrix instead of null >

[jira] [Resolved] (SPARK-30044) MNB/CNB/BNB use empty matrix instead of null

2019-12-02 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-30044. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26679

[jira] [Comment Edited] (SPARK-30144) MLP param map missing

2019-12-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991050#comment-16991050 ] zhengruifeng edited comment on SPARK-30144 at 12/9/19 1:40 AM: ---

[jira] [Commented] (SPARK-30144) MLP param map missing

2019-12-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991050#comment-16991050 ] zhengruifeng commented on SPARK-30144: -- [~huaxingao]  It seems like that

[jira] [Created] (SPARK-30178) RobustScaler support bigger numFeatures

2019-12-08 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-30178: Summary: RobustScaler support bigger numFeatures Key: SPARK-30178 URL: https://issues.apache.org/jira/browse/SPARK-30178 Project: Spark Issue Type:

[jira] [Created] (SPARK-30202) impl QuantileTransform

2019-12-10 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-30202: Summary: impl QuantileTransform Key: SPARK-30202 URL: https://issues.apache.org/jira/browse/SPARK-30202 Project: Spark Issue Type: Improvement
