[jira] [Created] (SPARK-29566) Imputer should support single-column input/ouput

2019-10-23 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29566: Summary: Imputer should support single-column input/ouput Key: SPARK-29566 URL: https://issues.apache.org/jira/browse/SPARK-29566 Project: Spark Issue Type:

[jira] [Created] (SPARK-29565) OneHotEncoder should support single-column input/ouput

2019-10-23 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29565: Summary: OneHotEncoder should support single-column input/ouput Key: SPARK-29565 URL: https://issues.apache.org/jira/browse/SPARK-29565 Project: Spark Issue

[jira] [Assigned] (SPARK-29093) Remove automatically generated param setters in _shared_params_code_gen.py

2019-10-23 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29093: Assignee: Huaxin Gao > Remove automatically generated param setters in

[jira] [Commented] (SPARK-29093) Remove automatically generated param setters in _shared_params_code_gen.py

2019-10-23 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957601#comment-16957601 ] zhengruifeng commented on SPARK-29093: -- [~huaxingao] Thanks! > Remove automatically generated

[jira] [Assigned] (SPARK-29232) RandomForestRegressionModel does not update the parameter maps of the DecisionTreeRegressionModels underneath

2019-10-22 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29232: Assignee: Huaxin Gao > RandomForestRegressionModel does not update the parameter maps of

[jira] [Resolved] (SPARK-29232) RandomForestRegressionModel does not update the parameter maps of the DecisionTreeRegressionModels underneath

2019-10-22 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29232. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26154

[jira] [Assigned] (SPARK-29489) ml.evaluation support log-loss

2019-10-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29489: Assignee: zhengruifeng > ml.evaluation support log-loss > --

[jira] [Resolved] (SPARK-29489) ml.evaluation support log-loss

2019-10-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29489. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26135

[jira] [Assigned] (SPARK-23578) Add multicolumn support for Binarizer

2019-10-16 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-23578: Assignee: zhengruifeng > Add multicolumn support for Binarizer >

[jira] [Resolved] (SPARK-23578) Add multicolumn support for Binarizer

2019-10-16 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-23578. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26064

[jira] [Created] (SPARK-29489) ml.evaluation support log-loss

2019-10-16 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29489: Summary: ml.evaluation support log-loss Key: SPARK-29489 URL: https://issues.apache.org/jira/browse/SPARK-29489 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-29381) Add 'private' _XXXParams classes for classification & regression

2019-10-15 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951657#comment-16951657 ] zhengruifeng commented on SPARK-29381: -- [~huaxingao]  Hi, I think we need another PR to add

[jira] [Assigned] (SPARK-29377) parity between scala ml tuning and python ml tuning

2019-10-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29377: Assignee: Huaxin Gao > parity between scala ml tuning and python ml tuning >

[jira] [Resolved] (SPARK-29377) parity between scala ml tuning and python ml tuning

2019-10-14 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29377. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26057

[jira] [Assigned] (SPARK-29380) RFormula avoid repeated 'first' jobs to get vector size

2019-10-12 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29380: Assignee: zhengruifeng > RFormula avoid repeated 'first' jobs to get vector size >

[jira] [Resolved] (SPARK-29380) RFormula avoid repeated 'first' jobs to get vector size

2019-10-12 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29380. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26052

[jira] [Resolved] (SPARK-29116) Refactor py classes related to DecisionTree

2019-10-12 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29116. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25929

[jira] [Assigned] (SPARK-29116) Refactor py classes related to DecisionTree

2019-10-12 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29116: Assignee: Huaxin Gao > Refactor py classes related to DecisionTree >

[jira] [Created] (SPARK-29381) Add 'private' _XXXParams classes for classification & regression

2019-10-08 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29381: Summary: Add 'private' _XXXParams classes for classification & regression Key: SPARK-29381 URL: https://issues.apache.org/jira/browse/SPARK-29381 Project: Spark

[jira] [Created] (SPARK-29380) RFormula avoid repeated 'first' jobs to get vector size

2019-10-08 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29380: Summary: RFormula avoid repeated 'first' jobs to get vector size Key: SPARK-29380 URL: https://issues.apache.org/jira/browse/SPARK-29380 Project: Spark

[jira] [Commented] (SPARK-29212) Add common classes without using JVM backend

2019-10-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946583#comment-16946583 ] zhengruifeng commented on SPARK-29212: -- [~zero323]  ??we should remove Java specific mixins, if

[jira] [Assigned] (SPARK-29269) Pyspark ALSModel support getters/setters

2019-10-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reassigned SPARK-29269: Assignee: Huaxin Gao > Pyspark ALSModel support getters/setters >

[jira] [Issue Comment Deleted] (SPARK-29269) Pyspark ALSModel support getters/setters

2019-10-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29269: - Comment: was deleted (was: It seems that I do not have the permission to assign a tickect: ```

[jira] [Commented] (SPARK-29269) Pyspark ALSModel support getters/setters

2019-10-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946524#comment-16946524 ] zhengruifeng commented on SPARK-29269: -- It seems that I do not have the permission to assign a

[jira] [Resolved] (SPARK-29269) Pyspark ALSModel support getters/setters

2019-10-08 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29269. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25947

[jira] [Resolved] (SPARK-29258) parity between ml.evaluator and mllib.metrics

2019-09-26 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29258. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25940

[jira] [Created] (SPARK-29269) Pyspark ALSModel support getters/setters

2019-09-26 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29269: Summary: Pyspark ALSModel support getters/setters Key: SPARK-29269 URL: https://issues.apache.org/jira/browse/SPARK-29269 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-29142) Pyspark clustering models support column setters/getters/predict

2019-09-26 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29142. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 25859

[jira] [Commented] (SPARK-29212) Add common classes without using JVM backend

2019-09-26 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939082#comment-16939082 ] zhengruifeng commented on SPARK-29212: -- [~zero323] I had not notice the base hierarchy without

[jira] [Created] (SPARK-29258) parity between ml.evaluator and mllib.metrics

2019-09-26 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29258: Summary: parity between ml.evaluator and mllib.metrics Key: SPARK-29258 URL: https://issues.apache.org/jira/browse/SPARK-29258 Project: Spark Issue Type:

[jira] [Commented] (SPARK-29212) Add common classes without using JVM backend

2019-09-25 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938205#comment-16938205 ] zhengruifeng commented on SPARK-29212: -- [~zero323] Would you like to help work on this? > Add

[jira] [Commented] (SPARK-29212) Add common classes without using JVM backend

2019-09-23 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935678#comment-16935678 ] zhengruifeng commented on SPARK-29212: -- It seems useful to impl some algs in pure python (like wrap

[jira] [Created] (SPARK-29212) Add common classes without using JVM backend

2019-09-23 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29212: Summary: Add common classes without using JVM backend Key: SPARK-29212 URL: https://issues.apache.org/jira/browse/SPARK-29212 Project: Spark Issue Type:

[jira] [Updated] (SPARK-29144) Binarizer handle sparse vectors incorrectly with negative threshold

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29144: - Summary: Binarizer handle sparse vectors incorrectly with negative threshold (was: Binarizer

[jira] [Commented] (SPARK-29144) Binarizer handel sparse vector incorrectly with negative threshold

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932272#comment-16932272 ] zhengruifeng commented on SPARK-29144: -- I prefer option 2, and will send a PR for this. >

[jira] [Updated] (SPARK-29144) Binarizer handel sparse vector incorrectly with negative threshold

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29144: - Description: the process on sparse vector is wrong if threshold<0: {code:java} scala> val data =

[jira] [Created] (SPARK-29144) Binarizer handel sparse vector incorrectly with negative threshold

2019-09-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29144: Summary: Binarizer handel sparse vector incorrectly with negative threshold Key: SPARK-29144 URL: https://issues.apache.org/jira/browse/SPARK-29144 Project: Spark

[jira] [Reopened] (SPARK-23578) Add multicolumn support for Binarizer

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reopened SPARK-23578: -- this ticket is for Binarizer not Bucketizer > Add multicolumn support for Binarizer >

[jira] [Resolved] (SPARK-23578) Add multicolumn support for Binarizer

2019-09-18 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-23578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-23578. -- Resolution: Duplicate > Add multicolumn support for Binarizer >

[jira] [Created] (SPARK-29143) Pyspark feature models support column setters/getters

2019-09-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29143: Summary: Pyspark feature models support column setters/getters Key: SPARK-29143 URL: https://issues.apache.org/jira/browse/SPARK-29143 Project: Spark Issue

[jira] [Created] (SPARK-29142) Pyspark clustering models support column setters/getters/predict

2019-09-18 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29142: Summary: Pyspark clustering models support column setters/getters/predict Key: SPARK-29142 URL: https://issues.apache.org/jira/browse/SPARK-29142 Project: Spark

[jira] [Updated] (SPARK-29118) Avoid redundant computation in GMM.transform && GLR.transform

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29118: - Description: In SPARK-27944, the computation for output columns with empty name is skipped.

[jira] [Updated] (SPARK-29118) Avoid redundant computation in GMM.transform && GLR.transform

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-29118: - Summary: Avoid redundant computation in GMM.transform && GLR.transform (was: Avoid redundant

[jira] [Created] (SPARK-29118) Avoid redundant computation in GMM.transform

2019-09-17 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29118: Summary: Avoid redundant computation in GMM.transform Key: SPARK-29118 URL: https://issues.apache.org/jira/browse/SPARK-29118 Project: Spark Issue Type:

[jira] [Commented] (SPARK-29116) Refactor py classes related to DecisionTree

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931179#comment-16931179 ] zhengruifeng commented on SPARK-29116: -- friendly ping [~huaxingao] , are you willing to work on

[jira] [Created] (SPARK-29116) Refactor py classes related to DecisionTree

2019-09-17 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29116: Summary: Refactor py classes related to DecisionTree Key: SPARK-29116 URL: https://issues.apache.org/jira/browse/SPARK-29116 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22796) Add multiple column support to PySpark QuantileDiscretizer

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-22796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931125#comment-16931125 ] zhengruifeng commented on SPARK-22796: -- [~huaxingao]  

[jira] [Resolved] (SPARK-22797) Add multiple column support to PySpark Bucketizer

2019-09-17 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-22797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-22797. -- Resolution: Done > Add multiple column support to PySpark Bucketizer >

[jira] [Resolved] (SPARK-29094) Add extractInstances method

2019-09-16 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-29094. -- Resolution: Duplicate > Add extractInstances method > --- > >

[jira] [Created] (SPARK-29095) add extractInstances

2019-09-16 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29095: Summary: add extractInstances Key: SPARK-29095 URL: https://issues.apache.org/jira/browse/SPARK-29095 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-29094) Add extractInstances method

2019-09-16 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29094: Summary: Add extractInstances method Key: SPARK-29094 URL: https://issues.apache.org/jira/browse/SPARK-29094 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-29093) Remove automatically generated param setters in _shared_params_code_gen.py

2019-09-16 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-29093: Summary: Remove automatically generated param setters in _shared_params_code_gen.py Key: SPARK-29093 URL: https://issues.apache.org/jira/browse/SPARK-29093 Project:

[jira] [Commented] (SPARK-28985) Pyspark ClassificationModel and RegressionModel support column setters/getters/predict

2019-09-11 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927498#comment-16927498 ] zhengruifeng commented on SPARK-28985: -- [~huaxingao] You can refer to my old prs

[jira] [Commented] (SPARK-9612) Add instance weight support for GBTs

2019-09-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924090#comment-16924090 ] zhengruifeng commented on SPARK-9612: - https://issues.apache.org/jira/browse/SPARK-19591 is now

[jira] [Resolved] (SPARK-28968) Add HasNumFeatures in the scala side

2019-09-06 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-28968. -- Resolution: Resolved > Add HasNumFeatures in the scala side >

[jira] [Updated] (SPARK-28985) Pyspark ClassificationModel and RegressionModel support column setters/getters/predict

2019-09-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28985: - Description: 1, add common abstract classes like JavaClassificationModel &

[jira] [Commented] (SPARK-28927) ArrayIndexOutOfBoundsException and Not-stable AUC metrics in ALS for datasets with 12 billion instances

2019-09-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923163#comment-16923163 ] zhengruifeng commented on SPARK-28927: -- [~JerryHouse]  As to AUC, which impl do you use?

[jira] [Updated] (SPARK-28958) pyspark.ml function parity

2019-09-05 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28958: - Description: I looked into the hierarchy of both py and scala sides, and found that they are

[jira] [Created] (SPARK-28985) Pyspark ClassificationModel and RegressionModel support column setters/getters/predict

2019-09-05 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28985: Summary: Pyspark ClassificationModel and RegressionModel support column setters/getters/predict Key: SPARK-28985 URL: https://issues.apache.org/jira/browse/SPARK-28985

[jira] [Updated] (SPARK-28969) OneVsRestModel in the py side should not set WeightCol and Classifier

2019-09-04 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28969: - Parent: SPARK-28958 Issue Type: Sub-task (was: Improvement) > OneVsRestModel in the py

[jira] [Commented] (SPARK-28969) OneVsRestModel in the py side should not set WeightCol and Classifier

2019-09-04 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922994#comment-16922994 ] zhengruifeng commented on SPARK-28969: -- friendly ping [~huaxingao] > OneVsRestModel in the py side

[jira] [Created] (SPARK-28969) OneVsRestModel in the py side should not set WeightCol and Classifier

2019-09-03 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28969: Summary: OneVsRestModel in the py side should not set WeightCol and Classifier Key: SPARK-28969 URL: https://issues.apache.org/jira/browse/SPARK-28969 Project: Spark

[jira] [Created] (SPARK-28968) Add HasNumFeatures in the scala side

2019-09-03 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28968: Summary: Add HasNumFeatures in the scala side Key: SPARK-28968 URL: https://issues.apache.org/jira/browse/SPARK-28968 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-28958) pyspark.ml function parity

2019-09-03 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28958: - Attachment: ML_SYNC.pdf > pyspark.ml function parity > -- > >

[jira] [Created] (SPARK-28958) pyspark.ml function parity

2019-09-03 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28958: Summary: pyspark.ml function parity Key: SPARK-28958 URL: https://issues.apache.org/jira/browse/SPARK-28958 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-28372) Document Spark WEB UI

2019-09-02 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921151#comment-16921151 ] zhengruifeng commented on SPARK-28372: -- [~smilegator] I think we may need to add a subtask for

[jira] [Commented] (SPARK-28373) Document JDBC/ODBC Server page

2019-09-02 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921138#comment-16921138 ] zhengruifeng commented on SPARK-28373: -- [~planga82] Thanks!:D > Document JDBC/ODBC Server page >

[jira] [Commented] (SPARK-28373) Document JDBC/ODBC Server page

2019-09-01 Thread zhengruifeng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920599#comment-16920599 ] zhengruifeng commented on SPARK-28373: -- [~smilegator] [~yumwang]  I am afraid I have no time to do

[jira] [Created] (SPARK-28858) add tree-based transformation in the py side

2019-08-23 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28858: Summary: add tree-based transformation in the py side Key: SPARK-28858 URL: https://issues.apache.org/jira/browse/SPARK-28858 Project: Spark Issue Type:

[jira] [Created] (SPARK-28780) Delete the incorrect setWeightCol method in LinearSVCModel

2019-08-20 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-28780: Summary: Delete the incorrect setWeightCol method in LinearSVCModel Key: SPARK-28780 URL: https://issues.apache.org/jira/browse/SPARK-28780 Project: Spark

[jira] [Commented] (SPARK-28542) Document Stages page

2019-08-18 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910145#comment-16910145 ] zhengruifeng commented on SPARK-28542: -- [~planga82]  Just go ahead! Thanks! > Document Stages page

[jira] [Commented] (SPARK-28373) Document JDBC/ODBC Server page

2019-08-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905937#comment-16905937 ] zhengruifeng commented on SPARK-28373: -- [~yumwang]  I had just create a page in

[jira] [Commented] (SPARK-28543) Document Spark Jobs page

2019-08-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905935#comment-16905935 ] zhengruifeng commented on SPARK-28543: -- [~planga82] I had just create a page in

[jira] [Created] (SPARK-28579) MaxAbsScaler avoids conversion to breeze.vector

2019-07-31 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28579: Summary: MaxAbsScaler avoids conversion to breeze.vector Key: SPARK-28579 URL: https://issues.apache.org/jira/browse/SPARK-28579 Project: Spark Issue Type:

[jira] [Created] (SPARK-28514) Remove the redundant transformImpl method in RF & GBT

2019-07-25 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28514: Summary: Remove the redundant transformImpl method in RF & GBT Key: SPARK-28514 URL: https://issues.apache.org/jira/browse/SPARK-28514 Project: Spark Issue

[jira] [Updated] (SPARK-28499) Optimize MinMaxScaler

2019-07-24 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28499: - Description: current impl of MinMaxScaler has some small places to be optimized: 1, avoid call

[jira] [Created] (SPARK-28499) Optimize MinMaxScaler

2019-07-24 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28499: Summary: Optimize MinMaxScaler Key: SPARK-28499 URL: https://issues.apache.org/jira/browse/SPARK-28499 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-07-24 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Description: It would be nice to be able to use RF and GBT for feature transformation: First

[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-07-24 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Description: It would be nice to be able to use RF and GBT for feature transformation: First

[jira] [Created] (SPARK-28421) SparseVector.apply performance optimization

2019-07-17 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28421: Summary: SparseVector.apply performance optimization Key: SPARK-28421 URL: https://issues.apache.org/jira/browse/SPARK-28421 Project: Spark Issue Type:

[jira] [Updated] (SPARK-28399) Impl RobustScaler

2019-07-15 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28399: - Issue Type: New Feature (was: Improvement) > Impl RobustScaler > - > >

[jira] [Created] (SPARK-28399) Impl RobustScaler

2019-07-15 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28399: Summary: Impl RobustScaler Key: SPARK-28399 URL: https://issues.apache.org/jira/browse/SPARK-28399 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-27656) Safely register class for GraphX

2019-06-25 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-27656. -- Resolution: Not A Problem > Safely register class for GraphX >

[jira] [Updated] (SPARK-28159) Make the transform natively in ml framework to avoid extra conversion

2019-06-25 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-28159: - Description: It is a long time since ML was released. However, there are still many TODOs

[jira] [Created] (SPARK-28159) Make the transform natively in ml framework to avoid extra conversion

2019-06-25 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28159: Summary: Make the transform natively in ml framework to avoid extra conversion Key: SPARK-28159 URL: https://issues.apache.org/jira/browse/SPARK-28159 Project: Spark

[jira] [Created] (SPARK-28154) GMM fix double caching

2019-06-24 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28154: Summary: GMM fix double caching Key: SPARK-28154 URL: https://issues.apache.org/jira/browse/SPARK-28154 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-28117) LDA and BisectingKMeans cache the input dataset if necessary

2019-06-19 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28117: Summary: LDA and BisectingKMeans cache the input dataset if necessary Key: SPARK-28117 URL: https://issues.apache.org/jira/browse/SPARK-28117 Project: Spark

[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-06-19 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Description: It would be nice to be able to use RF and GBT for feature transformation: First

[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-06-19 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Priority: Major (was: Minor) > Support Tree-Based Feature Transformation for ML >

[jira] [Commented] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-06-19 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867507#comment-16867507 ] zhengruifeng commented on SPARK-13677: -- I closed this ticket since the old pr was based on

[jira] [Reopened] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-06-19 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng reopened SPARK-13677: -- update the design > Support Tree-Based Feature Transformation for ML >

[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2019-06-19 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Description: It would be nice to be able to use RF and GBT for feature transformation: First

[jira] [Updated] (SPARK-27018) Checkpointed RDD deleted prematurely when using GBTClassifier

2019-06-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-27018: - Component/s: Spark Core > Checkpointed RDD deleted prematurely when using GBTClassifier >

[jira] [Resolved] (SPARK-27925) Better control numBins of curves in BinaryClassificationMetrics

2019-06-13 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng resolved SPARK-27925. -- Resolution: Not A Problem > Better control numBins of curves in BinaryClassificationMetrics >

[jira] [Created] (SPARK-28045) add missing RankingEvaluator

2019-06-13 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28045: Summary: add missing RankingEvaluator Key: SPARK-28045 URL: https://issues.apache.org/jira/browse/SPARK-28045 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-28044) MulticlassClassificationEvaluator support more metrics

2019-06-13 Thread zhengruifeng (JIRA)
zhengruifeng created SPARK-28044: Summary: MulticlassClassificationEvaluator support more metrics Key: SPARK-28044 URL: https://issues.apache.org/jira/browse/SPARK-28044 Project: Spark Issue

[jira] [Comment Edited] (SPARK-24875) MulticlassMetrics should offer a more efficient way to compute count by label

2019-06-11 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860922#comment-16860922 ] zhengruifeng edited comment on SPARK-24875 at 6/11/19 10:59 AM: The

[jira] [Commented] (SPARK-24875) MulticlassMetrics should offer a more efficient way to compute count by label

2019-06-11 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860922#comment-16860922 ] zhengruifeng commented on SPARK-24875: -- The dataset is usually much smaller than the training

[jira] [Commented] (SPARK-26185) add weightCol in python MulticlassClassificationEvaluator

2019-06-11 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860919#comment-16860919 ] zhengruifeng commented on SPARK-26185: -- Seems resolved? > add weightCol in python

[jira] [Commented] (SPARK-25360) Parallelized RDDs of Ranges could have known partitioner

2019-06-11 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860759#comment-16860759 ] zhengruifeng commented on SPARK-25360: -- But I think it maybe worth to impl a direct version of
