[ https://issues.apache.org/jira/browse/IGNITE-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-12079:
--------------------------------------
    Priority: Blocker  (was: Major)

> [ML][Umbrella] Add advanced preprocessing techniques
> ----------------------------------------------------
>
>                 Key: IGNITE-12079
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12079
>             Project: Ignite
>          Issue Type: New Feature
>          Components: ml
>    Affects Versions: 2.8
>            Reporter: Aleksey Zinoviev
>            Assignee: Aleksey Zinoviev
>            Priority: Blocker
>             Fix For: 2.8
>
>
> *Main goal:*
> To reduce the gap between Apache Spark and Apache Ignite in preprocessing
> operations. Reducing this gap could help with loading Spark ML Pipelines
> into Ignite ML.
>
> Next steps:
> # Add Frequency Encoder
> # Add Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT, LEAST_FREQUENT)
> # Add RobustScaler (will be added in Spark 3.0)
> # Add CountVectorizer
> # Add FeatureHasher
> # Add QuantileDiscretizer
> # Add Locality Sensitive Hashing (LSH)
> # Add LabelEncoder
> # Add RevertStringIndexing
> # Add multi-column preprocessor

--
This message was sent by Atlassian Jira
(v8.3.2#803003)