[jira] [Commented] (SPARK-22974) CountVectorModel does not attach attributes to output column

2019-05-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832691#comment-16832691 ] yuhao yang commented on SPARK-22974: On a business trip from April 29th to May 3rd.

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2019-03-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791938#comment-16791938 ] yuhao yang commented on SPARK-20082: Yuhao is taking family bonding leave from March

[jira] [Updated] (SPARK-25011) Add PrefixSpan to __all__ in fpm.py

2018-08-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-25011: --- Summary: Add PrefixSpan to __all__ in fpm.py (was: Add PrefixSpan to __all__) > Add PrefixSpan to

[jira] [Created] (SPARK-25011) Add PrefixSpan to __all__

2018-08-02 Thread yuhao yang (JIRA)
yuhao yang created SPARK-25011: -- Summary: Add PrefixSpan to __all__ Key: SPARK-25011 URL: https://issues.apache.org/jira/browse/SPARK-25011 Project: Spark Issue Type: Bug Components: M

[jira] [Commented] (SPARK-23742) Filter out redundant AssociationRules

2018-08-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566326#comment-16566326 ] yuhao yang commented on SPARK-23742: [~maropu] Can you be more specific about the su

[jira] [Commented] (SPARK-23742) Filter out redundant AssociationRules

2018-08-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564858#comment-16564858 ] yuhao yang commented on SPARK-23742: The redundant rule may have different confidenc

[jira] [Commented] (SPARK-15064) Locale support in StopWordsRemover

2018-06-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502929#comment-16502929 ] yuhao yang commented on SPARK-15064: Yuhao will be OOF from May 29th to June 6th (an

[jira] [Commented] (SPARK-22943) OneHotEncoder supports manual specification of categorySizes

2018-01-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328310#comment-16328310 ] yuhao yang commented on SPARK-22943: Thanks for the reply, yet I cannot see how can u

[jira] [Commented] (SPARK-22943) OneHotEncoder supports manual specification of categorySizes

2018-01-05 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314412#comment-16314412 ] yuhao yang commented on SPARK-22943: Feel free to work on this but I would suggest to

[jira] [Created] (SPARK-22943) OneHotEncoder supports manual specification of categorySizes

2018-01-02 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22943: -- Summary: OneHotEncoder supports manual specification of categorySizes Key: SPARK-22943 URL: https://issues.apache.org/jira/browse/SPARK-22943 Project: Spark Iss

[jira] [Commented] (SPARK-19053) Supporting multiple evaluation metrics in DataFrame-based API: discussion

2017-12-19 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297887#comment-16297887 ] yuhao yang commented on SPARK-19053: Plan for further development: 1. Initial API an

[jira] [Commented] (SPARK-8418) Add single- and multi-value support to ML Transformers

2017-12-02 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275723#comment-16275723 ] yuhao yang commented on SPARK-8418: --- second Nick's comments. > Add single- and multi-va

[jira] [Commented] (SPARK-22331) Make MLlib string params case-insensitive

2017-11-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269169#comment-16269169 ] yuhao yang commented on SPARK-22331: Thanks for the interest [~smurakozi]. I tried t

[jira] [Commented] (SPARK-22427) StackOverFlowError when using FPGrowth

2017-11-20 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259587#comment-16259587 ] yuhao yang commented on SPARK-22427: I tried with larger scale data but did not repro

[jira] [Commented] (SPARK-22427) StackOverFlowError when using FPGrowth

2017-11-12 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249017#comment-16249017 ] yuhao yang commented on SPARK-22427: Hi [~lyt] does increasing stack size resolve you

[jira] [Created] (SPARK-22502) OnlineLDAOptimizer variationalTopicInference might be able to handle empty documents

2017-11-12 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22502: -- Summary: OnlineLDAOptimizer variationalTopicInference might be able to handle empty documents Key: SPARK-22502 URL: https://issues.apache.org/jira/browse/SPARK-22502 Proj

[jira] [Commented] (SPARK-18755) Add Randomized Grid Search to Spark ML

2017-11-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247870#comment-16247870 ] yuhao yang commented on SPARK-18755: Thanks for all the interest. For anyone who w

[jira] [Commented] (SPARK-22427) StackOverFlowError when using FPGrowth

2017-11-02 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237174#comment-16237174 ] yuhao yang commented on SPARK-22427: Could you please try to increase the stack size,

[jira] [Commented] (SPARK-13030) Change OneHotEncoder to Estimator

2017-10-31 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227094#comment-16227094 ] yuhao yang commented on SPARK-13030: I see. Thanks for the response [~mlnick]. The E

[jira] [Commented] (SPARK-13030) Change OneHotEncoder to Estimator

2017-10-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226307#comment-16226307 ] yuhao yang commented on SPARK-13030: Sorry for jumping in so late. I can see there's b

[jira] [Created] (SPARK-22381) Add StringParam that supports valid options

2017-10-28 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22381: -- Summary: Add StringParam that supports valid options Key: SPARK-22381 URL: https://issues.apache.org/jira/browse/SPARK-22381 Project: Spark Issue Type: New Featu

[jira] [Commented] (SPARK-18755) Add Randomized Grid Search to Spark ML

2017-10-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221800#comment-16221800 ] yuhao yang commented on SPARK-18755: Thanks for sending the update here. Feel free

[jira] [Commented] (SPARK-22331) Make MLlib string params case-insensitive

2017-10-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16215489#comment-16215489 ] yuhao yang commented on SPARK-22331: Yes, I don't see the change will break any exist

[jira] [Commented] (SPARK-22331) Strength consistency for supporting string params: case-insensitive or not

2017-10-22 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214667#comment-16214667 ] yuhao yang commented on SPARK-22331: cc [~WeichenXu123] > Strength consistency for s

[jira] [Created] (SPARK-22331) Strength consistency for supporting string params: case-insensitive or not

2017-10-22 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22331: -- Summary: Strength consistency for supporting string params: case-insensitive or not Key: SPARK-22331 URL: https://issues.apache.org/jira/browse/SPARK-22331 Project: Spark

[jira] [Commented] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-17 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208614#comment-16208614 ] yuhao yang commented on SPARK-22289: Thanks for the reply. I'll start composing a PR.

[jira] [Commented] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-17 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207115#comment-16207115 ] yuhao yang commented on SPARK-22289: cc [~yanboliang] [~dbtsai] > Cannot save Logist

[jira] [Comment Edited] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207063#comment-16207063 ] yuhao yang edited comment on SPARK-22289 at 10/17/17 6:43 AM: -

[jira] [Comment Edited] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207063#comment-16207063 ] yuhao yang edited comment on SPARK-22289 at 10/17/17 6:28 AM: -

[jira] [Commented] (SPARK-22289) Cannot save LogisticRegressionClassificationModel with bounds on coefficients

2017-10-16 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207063#comment-16207063 ] yuhao yang commented on SPARK-22289: Thanks for reporting the issue. Should be a stra

[jira] [Comment Edited] (SPARK-22195) Add cosine similarity to org.apache.spark.ml.linalg.Vectors

2017-10-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193844#comment-16193844 ] yuhao yang edited comment on SPARK-22195 at 10/6/17 7:33 AM: -

[jira] [Commented] (SPARK-22195) Add cosine similarity to org.apache.spark.ml.linalg.Vectors

2017-10-05 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193844#comment-16193844 ] yuhao yang commented on SPARK-22195: Thanks for the feedback. I don't see the existi

[jira] [Created] (SPARK-22210) Online LDA variationalTopicInference should use random seed to have stable behavior

2017-10-05 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22210: -- Summary: Online LDA variationalTopicInference should use random seed to have stable behavior Key: SPARK-22210 URL: https://issues.apache.org/jira/browse/SPARK-22210 Proj

[jira] [Commented] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2017-10-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192217#comment-16192217 ] yuhao yang commented on SPARK-3181: --- Regarding whether to separate Huber loss an an i

[jira] [Commented] (SPARK-22195) Add cosine similarity to org.apache.spark.ml.linalg.Vectors

2017-10-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190884#comment-16190884 ] yuhao yang commented on SPARK-22195: Exactly, the implementation is straightforward,

[jira] [Created] (SPARK-22195) Add cosine similarity to org.apache.spark.ml.linalg.Vectors

2017-10-03 Thread yuhao yang (JIRA)
yuhao yang created SPARK-22195: -- Summary: Add cosine similarity to org.apache.spark.ml.linalg.Vectors Key: SPARK-22195 URL: https://issues.apache.org/jira/browse/SPARK-22195 Project: Spark Issu

[jira] [Commented] (SPARK-21866) SPIP: Image support in Spark

2017-10-03 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190239#comment-16190239 ] yuhao yang commented on SPARK-21866: My two cents, 1. In most scenarios, deep learni

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-08-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139023#comment-16139023 ] yuhao yang commented on SPARK-21535: Thanks for the comments. > Reduce memory req

[jira] [Resolved] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-08-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-21535. Resolution: Not A Problem The new implementation will load the evaluation dataset when training mod

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103547#comment-16103547 ] yuhao yang commented on SPARK-21535: It's not in my opinion. https://issues.apache.o

[jira] [Comment Edited] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100860#comment-16100860 ] yuhao yang edited comment on SPARK-21535 at 7/26/17 6:30 PM: -

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-26 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101870#comment-16101870 ] yuhao yang commented on SPARK-21535: The basic idea is that we should release the dri

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100860#comment-16100860 ] yuhao yang commented on SPARK-21535: https://github.com/apache/spark/pulls > Reduce

[jira] [Updated] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-21535: --- Description: CrossValidator and TrainValidationSplit both use {code}models = est.fit(trainingDataset

[jira] [Created] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-07-25 Thread yuhao yang (JIRA)
yuhao yang created SPARK-21535: -- Summary: Reduce memory requirement for CrossValidator and TrainValidationSplit Key: SPARK-21535 URL: https://issues.apache.org/jira/browse/SPARK-21535 Project: Spark

[jira] [Commented] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala

2017-07-25 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100447#comment-16100447 ] yuhao yang commented on SPARK-21087: Withdrawing my PR, anyone with interests please

[jira] [Commented] (SPARK-21524) ValidatorParamsSuiteHelpers generates wrong temp files

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099313#comment-16099313 ] yuhao yang commented on SPARK-21524: https://github.com/apache/spark/pull/18728 > Va

[jira] [Created] (SPARK-21524) ValidatorParamsSuiteHelpers generates wrong temp files

2017-07-24 Thread yuhao yang (JIRA)
yuhao yang created SPARK-21524: -- Summary: ValidatorParamsSuiteHelpers generates wrong temp files Key: SPARK-21524 URL: https://issues.apache.org/jira/browse/SPARK-21524 Project: Spark Issue Type

[jira] [Commented] (SPARK-14239) Add load for LDAModel that supports both local and distributedModel

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098948#comment-16098948 ] yuhao yang commented on SPARK-14239: Close overlooked stale jira. > Add load for LDA

[jira] [Resolved] (SPARK-14239) Add load for LDAModel that supports both local and distributedModel

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-14239. Resolution: Won't Do > Add load for LDAModel that supports both local and distributedModel > --

[jira] [Commented] (SPARK-12875) Add Weight of Evidence and Information value to Spark.ml as a feature transformer

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098946#comment-16098946 ] yuhao yang commented on SPARK-12875: Close stale jira. > Add Weight of Evidence and

[jira] [Resolved] (SPARK-12875) Add Weight of Evidence and Information value to Spark.ml as a feature transformer

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-12875. Resolution: Won't Do > Add Weight of Evidence and Information value to Spark.ml as a feature > tra

[jira] [Comment Edited] (SPARK-14760) Feature transformers should always invoke transformSchema in transform or fit

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098940#comment-16098940 ] yuhao yang edited comment on SPARK-14760 at 7/24/17 6:23 PM: -

[jira] [Commented] (SPARK-14760) Feature transformers should always invoke transformSchema in transform or fit

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098940#comment-16098940 ] yuhao yang commented on SPARK-14760: Close it since it's been overlooked for some tim

[jira] [Resolved] (SPARK-13223) Add stratified sampling to ML feature engineering

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-13223. Resolution: Not A Problem > Add stratified sampling to ML feature engineering > ---

[jira] [Commented] (SPARK-13223) Add stratified sampling to ML feature engineering

2017-07-24 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098933#comment-16098933 ] yuhao yang commented on SPARK-13223: Close it since it's been overlooked for some tim

[jira] [Commented] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

2017-07-21 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097062#comment-16097062 ] yuhao yang commented on SPARK-21086: sure, indices sounds fine. For the driver memor

[jira] [Updated] (SPARK-18724) Add TuningSummary for TrainValidationSplit and CountVectorizer

2017-07-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-18724: --- Summary: Add TuningSummary for TrainValidationSplit and CountVectorizer (was: Add TuningSummary for

[jira] [Comment Edited] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2017-07-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073987#comment-16073987 ] yuhao yang edited comment on SPARK-11069 at 7/4/17 6:32 PM:

[jira] [Comment Edited] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2017-07-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073987#comment-16073987 ] yuhao yang edited comment on SPARK-11069 at 7/4/17 6:31 PM:

[jira] [Commented] (SPARK-11069) Add RegexTokenizer option to convert to lowercase

2017-07-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073987#comment-16073987 ] yuhao yang commented on SPARK-11069: [~levente.torok.ge] use val regexTokenizer

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2017-06-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070883#comment-16070883 ] yuhao yang commented on SPARK-20082: I'm OK with only supporting initialModel for Onl

[jira] [Commented] (SPARK-19053) Supporting multiple evaluation metrics in DataFrame-based API: discussion

2017-06-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070849#comment-16070849 ] yuhao yang commented on SPARK-19053: Not sure if this is still wanted. cc [~josephkb]

[jira] [Commented] (SPARK-18441) Add Smote in spark mlib and ml

2017-06-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067494#comment-16067494 ] yuhao yang commented on SPARK-18441: Move the Smote code to https://gist.github.com/

[jira] [Commented] (SPARK-21152) Use level 3 BLAS operations in LogisticAggregator

2017-06-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065694#comment-16065694 ] yuhao yang commented on SPARK-21152: This is something that we should investigate any

[jira] [Created] (SPARK-21108) convert LinearSVC to aggregator framework

2017-06-15 Thread yuhao yang (JIRA)
yuhao yang created SPARK-21108: -- Summary: convert LinearSVC to aggregator framework Key: SPARK-21108 URL: https://issues.apache.org/jira/browse/SPARK-21108 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-21087) CrossValidator, TrainValidationSplit should preserve all models after fitting: Scala

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048723#comment-16048723 ] yuhao yang commented on SPARK-21087: I'd like to work on this if my [comment|https:/

[jira] [Comment Edited] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048647#comment-16048647 ] yuhao yang edited comment on SPARK-21086 at 6/14/17 5:22 AM: -

[jira] [Comment Edited] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048647#comment-16048647 ] yuhao yang edited comment on SPARK-21086 at 6/14/17 5:12 AM: -

[jira] [Commented] (SPARK-20988) Convert logistic regression to new aggregator framework

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048698#comment-16048698 ] yuhao yang commented on SPARK-20988: Eh.. I was trying to add the squared_hinge loss

[jira] [Resolved] (SPARK-20348) Support squared hinge loss (L2 loss) for LinearSVC

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang resolved SPARK-20348. Resolution: Duplicate Combine it with SPARK-20602 and resolve this as duplicate. > Support squared

[jira] [Commented] (SPARK-20602) Adding LBFGS optimizer and Squared_hinge loss for LinearSVC

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048663#comment-16048663 ] yuhao yang commented on SPARK-20602: Combining this with SPARK-20348. Support squared

[jira] [Updated] (SPARK-20602) Adding LBFGS optimizer and Squared_hinge loss for LinearSVC

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20602: --- Summary: Adding LBFGS optimizer and Squared_hinge loss for LinearSVC (was: Adding LBFGS as optimizer

[jira] [Commented] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048647#comment-16048647 ] yuhao yang commented on SPARK-21086: Sounds good. About the default path for saving d

[jira] [Updated] (SPARK-20602) Adding LBFGS as optimizer for LinearSVC

2017-06-13 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20602: --- Description: Currently LinearSVC in Spark only supports OWLQN as the optimizer ( check https://issue

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2017-05-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022379#comment-16022379 ] yuhao yang commented on SPARK-20082: refer to https://issues.apache.org/jira/browse/S

[jira] [Commented] (SPARK-20767) The training continuation for saved LDA model

2017-05-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022375#comment-16022375 ] yuhao yang commented on SPARK-20767: Note there's already an issue about setInitialMo

[jira] [Commented] (SPARK-20864) I tried to run spark mllib PIC algorithm, but got error

2017-05-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022345#comment-16022345 ] yuhao yang commented on SPARK-20864: [~yuanjie] Could you please provide more code to

[jira] [Commented] (SPARK-20768) PySpark FPGrowth does not expose numPartitions (expert) param

2017-05-18 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016116#comment-16016116 ] yuhao yang commented on SPARK-20768: Thanks for the ping. [~mlnick] We should just tr

[jira] [Commented] (SPARK-20797) mllib lda's LocalLDAModel's save: out of memory.

2017-05-18 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016061#comment-16016061 ] yuhao yang commented on SPARK-20797: [~d0evi1] Thanks for reporting the issue and pro

[jira] [Created] (SPARK-20670) Simplify FPGrowth transform

2017-05-08 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20670: -- Summary: Simplify FPGrowth transform Key: SPARK-20670 URL: https://issues.apache.org/jira/browse/SPARK-20670 Project: Spark Issue Type: Improvement Com

[jira] [Commented] (SPARK-20602) Adding LBFGS as optimizer for LinearSVC

2017-05-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997314#comment-15997314 ] yuhao yang commented on SPARK-20602: cc [~josephkb] > Adding LBFGS as optimizer for

[jira] [Created] (SPARK-20602) Adding LBFGS as optimizer for LinearSVC

2017-05-04 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20602: -- Summary: Adding LBFGS as optimizer for LinearSVC Key: SPARK-20602 URL: https://issues.apache.org/jira/browse/SPARK-20602 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-20526) Load doesn't work in PCAModel

2017-04-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989591#comment-15989591 ] yuhao yang commented on SPARK-20526: Can you please provide more context? like which

[jira] [Commented] (SPARK-20502) ML, Graph 2.2 QA: API: Experimental, DeveloperApi, final, sealed audit

2017-04-28 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989317#comment-15989317 ] yuhao yang commented on SPARK-20502: Check here https://issues.apache.org/jira/browse

[jira] [Created] (SPARK-20351) Add trait hasTrainingSummary to replace the duplicate code

2017-04-16 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20351: -- Summary: Add trait hasTrainingSummary to replace the duplicate code Key: SPARK-20351 URL: https://issues.apache.org/jira/browse/SPARK-20351 Project: Spark Issue

[jira] [Created] (SPARK-20348) Support squared hinge loss (L2 loss) for LinearSVC

2017-04-15 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20348: -- Summary: Support squared hinge loss (L2 loss) for LinearSVC Key: SPARK-20348 URL: https://issues.apache.org/jira/browse/SPARK-20348 Project: Spark Issue Type: Ne

[jira] [Commented] (SPARK-7128) Add generic bagging algorithm to spark.ml

2017-04-11 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965121#comment-15965121 ] yuhao yang commented on SPARK-7128: --- I would vote for adding this now. This is quite h

[jira] [Updated] (SPARK-20271) Add FuncTransformer to simplify custom transformer creation

2017-04-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20271: --- Description: Just to share some code I implemented to help easily create a custom Transformer in one

[jira] [Created] (SPARK-20271) Add FuncTransformer to simplify custom transformer creation

2017-04-09 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20271: -- Summary: Add FuncTransformer to simplify custom transformer creation Key: SPARK-20271 URL: https://issues.apache.org/jira/browse/SPARK-20271 Project: Spark Issu

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2017-04-06 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959368#comment-15959368 ] yuhao yang commented on SPARK-20082: Sorry I'm occupied by some internal project this

[jira] [Commented] (SPARK-20203) Change default maxPatternLength value to Int.MaxValue in PrefixSpan

2017-04-04 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955705#comment-15955705 ] yuhao yang commented on SPARK-20203: [~Syrux] Since you got some experiences using th

[jira] [Comment Edited] (SPARK-20180) Unlimited max pattern length in Prefix span

2017-04-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952377#comment-15952377 ] yuhao yang edited comment on SPARK-20180 at 4/1/17 8:14 PM: I

[jira] [Commented] (SPARK-20180) Unlimited max pattern length in Prefix span

2017-04-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952377#comment-15952377 ] yuhao yang commented on SPARK-20180: I assume user can achieve the same effect by set

[jira] [Comment Edited] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15944239#comment-15944239 ] yuhao yang edited comment on SPARK-20114 at 3/27/17 11:42 PM: -

[jira] [Commented] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15944239#comment-15944239 ] yuhao yang commented on SPARK-20114: Currently I prefer to implement the dummy Prefix

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20114: --- Description: Creating this jira to track the feature parity for PrefixSpan and sequential pattern mi

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-20114: --- Description: Creating this jira to track the feature parity for PrefixSpan and sequential pattern mi

[jira] [Created] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2017-03-27 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20114: -- Summary: spark.ml parity for sequential pattern mining - PrefixSpan Key: SPARK-20114 URL: https://issues.apache.org/jira/browse/SPARK-20114 Project: Spark Issue

[jira] [Commented] (SPARK-20083) Change matrix toArray to not create a new array when matrix is already column major

2017-03-27 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943857#comment-15943857 ] yuhao yang commented on SPARK-20083: So the result array will allow users to manipula
