[ https://issues.apache.org/jira/browse/IGNITE-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy Pavlov updated IGNITE-8666:
-----------------------------------
Fix Version/s: 2.7 (was: 2.6)
> Add ability of filtering data during datasets creation
> ------------------------------------------------------
>
> Key: IGNITE-8666
> URL: https://issues.apache.org/jira/browse/IGNITE-8666
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Anton Dmitriev
> Priority: Major
> Fix For: 2.7
>
>
> So far we have used a straightforward strategy to feed data into a
> partition-based dataset: we retrieve all entries from an upstream cache
> partition, transform them, and write the result into the corresponding
> dataset partition (data and context). As a result, we cannot choose which
> entries are fed into the dataset and which are not. We need this ability to
> implement IGNITE-8667 (Splitting of dataset to test and training sets) and
> IGNITE-8668 (K-fold cross validation of models).
> The goal of this task is to add the ability to filter the data that is fed
> from a cache into a dataset. This will allow us to create different datasets
> (training, testing, k-fold, etc.) from a single cache.
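
A minimal sketch of the idea, in plain Java with no Ignite dependency. All names here (FilteredDatasetSketch, buildPartition, the filter and transform parameters) are hypothetical and are not the Ignite ML API; the upstream cache partition is stood in for by an ordinary Map, and the actual design in the patch may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;

/** Hypothetical illustration only; none of these names are Ignite API. */
final class FilteredDatasetSketch {
    /**
     * Builds one dataset partition from one upstream partition, keeping only
     * the entries accepted by the filter and transforming each kept entry.
     */
    static <K, V, R> List<R> buildPartition(
            Map<K, V> upstreamPartition,
            BiPredicate<K, V> filter,
            BiFunction<K, V, R> transform) {
        List<R> partitionData = new ArrayList<>();
        for (Map.Entry<K, V> e : upstreamPartition.entrySet()) {
            // Entries rejected by the filter never reach the dataset.
            if (filter.test(e.getKey(), e.getValue())) {
                partitionData.add(transform.apply(e.getKey(), e.getValue()));
            }
        }
        return partitionData;
    }

    public static void main(String[] args) {
        // Stand-in for a single upstream cache partition (assumption).
        Map<Integer, double[]> upstream = new LinkedHashMap<>();
        for (int k = 0; k < 10; k++) {
            upstream.put(k, new double[] {k, 2.0 * k});
        }

        // Deterministic split by key: every fifth entry goes to the test set.
        BiPredicate<Integer, double[]> isTest = (k, v) -> k % 5 == 0;

        List<double[]> train = buildPartition(upstream, isTest.negate(), (k, v) -> v);
        List<double[]> test = buildPartition(upstream, isTest, (k, v) -> v);

        System.out.println(train.size() + " training rows, " + test.size() + " test rows");
    }
}
```

Under the same assumptions, a k-fold split would reuse the one cache with a different predicate per fold, for example (k, v) -> k % folds == i for the held-out part of fold i and its negation for the training part.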
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)