[
https://issues.apache.org/jira/browse/SPARK-17704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen updated SPARK-17704:
------------------------------
Assignee: Yanbo Liang
> ChiSqSelector performance improvement.
> --------------------------------------
>
> Key: SPARK-17704
> URL: https://issues.apache.org/jira/browse/SPARK-17704
> Project: Spark
> Issue Type: Improvement
> Components: ML, MLlib
> Reporter: Yanbo Liang
> Assignee: Yanbo Liang
> Priority: Minor
> Fix For: 2.1.0
>
>
> Several performance improvements for {{ChiSqSelector}}:
> 1. Keep {{selectedFeatures}} sorted in ascending order.
> {{ChiSqSelectorModel.transform}} needs {{selectedFeatures}} to be sorted in
> order to make predictions. We should sort it when training the model rather
> than when making predictions, since users usually train a model once and
> then use it for prediction many times.
> 2. When training an {{fpr}} type {{ChiSqSelectorModel}}, it is not necessary
> to sort the ChiSq test results by statistic.
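
The two points above can be illustrated with a small sketch in plain Python. It is not Spark's implementation: SelectorModel, fit_top_k and fit_fpr are hypothetical names invented for this example, and the fpr rule is assumed to be "keep features whose p-value is below a threshold alpha".

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SelectorModel:
    """Toy stand-in for ChiSqSelectorModel (hypothetical, not Spark's class)."""
    selected_features: Tuple[int, ...]  # invariant: ascending order

    def transform(self, row: Sequence[float]) -> List[float]:
        # Relies on the invariant above, so no per-call sort is needed.
        return [row[i] for i in self.selected_features]


def fit_top_k(stats: Sequence[float], k: int) -> SelectorModel:
    """Keep the k features with the largest test statistic."""
    ranked = sorted(range(len(stats)), key=lambda i: stats[i], reverse=True)
    # Point 1: sort the chosen indices once, at training time.
    return SelectorModel(tuple(sorted(ranked[:k])))


def fit_fpr(p_values: Sequence[float], alpha: float) -> SelectorModel:
    """Keep every feature whose p-value is below alpha (assumed fpr rule)."""
    # Point 2: a plain filter suffices. No ranking by statistic is needed,
    # and scanning the indices in order already yields an ascending result.
    return SelectorModel(tuple(i for i, p in enumerate(p_values) if p < alpha))
```

With this layout the sorting cost is paid once per fit, and each transform call is a single pass over the stored indices.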