[
https://issues.apache.org/jira/browse/IGNITE-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Zinoviev updated IGNITE-12079:
-------------------------------------
Fix Version/s: 2.9 (was: 2.8)
> [ML][Umbrella] Add advanced preprocessing techniques
> ----------------------------------------------------
>
> Key: IGNITE-12079
> URL: https://issues.apache.org/jira/browse/IGNITE-12079
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Affects Versions: 2.9
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Fix For: 2.9
>
>
> *Main goal:*
> To reduce the gap between Apache Spark and Apache Ignite in preprocessing
> operations. Narrowing this gap would make it easier to load Spark ML
> Pipelines into Ignite ML.
>
> Next steps:
> # Add Frequency Encoder
> # Add more Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT,
> LEAST_FREQUENT)
> # Add RobustScaler (will be added in Spark 3.0)
> # Add CountVectorizer
> # Add FeatureHasher
> # Add QuantileDiscretizer
> # Add Locality Sensitive Hashing (LSH)
> # Add LabelEncoder
> # Add RevertStringIndexing
> # Add multi-column preprocessor
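
For readers unfamiliar with the items above, here is a minimal plain-Java sketch of two of them, frequency encoding and robust scaling. It does not use the Ignite ML API: the class and method names (PreprocessingSketch, frequencyEncoding, robustScale) are hypothetical, and the median/interquartile-range definition with linear-interpolation quantiles is an assumption about how a RobustScaler might behave, not the planned implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Apache Ignite ML.
final class PreprocessingSketch {
    /** Frequency encoding: maps each category to its relative frequency in the data. */
    static Map<String, Double> frequencyEncoding(List<String> values) {
        Map<String, Double> freq = new HashMap<>();
        for (String v : values)
            freq.merge(v, 1.0, Double::sum);      // count occurrences per category

        final double n = values.size();
        freq.replaceAll((k, cnt) -> cnt / n);     // turn counts into fractions
        return freq;
    }

    /** Linear-interpolation quantile of a non-empty sorted array, q in [0, 1]. */
    static double quantile(double[] sorted, double q) {
        double pos = q * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Robust scaling (assumed definition): (x - median) / IQR, so outliers barely shift the scale. */
    static double[] robustScale(double[] xs) {
        if (xs.length == 0)
            return new double[0];

        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        double median = quantile(sorted, 0.5);
        double iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25);

        double[] out = new double[xs.length];
        for (int i = 0; i < xs.length; i++)
            out[i] = iqr == 0 ? 0.0 : (xs[i] - median) / iqr;  // constant feature maps to 0
        return out;
    }

    public static void main(String[] args) {
        System.out.println(frequencyEncoding(List.of("a", "b", "a", "c")));
        System.out.println(Arrays.toString(robustScale(new double[] {1, 2, 3, 4, 100})));
    }
}
```

In this sketch the outlier 100 does not affect the median or the interquartile range, so the remaining values keep a sensible scale, which is the usual motivation for a robust scaler over min-max or standard scaling.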
--
This message was sent by Atlassian Jira
(v8.3.4#803005)