[ 
https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634664#comment-14634664
 ] 

Till Rohrmann commented on FLINK-1901:
--------------------------------------

Hi Chengxiang,

good to hear that you want to work on this. I can assign you the ticket. 
However, it is not only about the sampling strategy but also about the 
integration within Flink. The reason is that we have to make sure that the 
sampling operator also works within iterations. This means that it has to be 
part of the dynamic path, so that it is re-executed in every iteration rather 
than computed once. This will need a special operator type.

But you can start with the sampling strategies and then continue with the 
iteration integration.
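
For illustration, here is a minimal sketch of what the two sampling 
strategies could look like on a plain in-memory list. This is not Flink API: 
the class and method names are hypothetical, and it assumes Bernoulli 
sampling for the case without replacement and Poisson sampling for the case 
with replacement. The fraction is the expected relative sample size, and the 
seed makes the result reproducible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper class, for illustration only (not part of Flink).
public class SamplingSketch {

    // Without replacement (Bernoulli sampling, an assumption): keep each
    // element independently with probability `fraction`, so no element
    // can appear more often than it does in the input.
    static <T> List<T> sampleWithoutReplacement(List<T> input, double fraction, long seed) {
        Random rnd = new Random(seed);
        List<T> out = new ArrayList<>();
        for (T element : input) {
            if (rnd.nextDouble() < fraction) {
                out.add(element);
            }
        }
        return out;
    }

    // With replacement (Poisson sampling, an assumption): emit each element
    // k times, where k is drawn from a Poisson distribution with mean
    // `fraction`. k is generated with Knuth's multiplication method.
    static <T> List<T> sampleWithReplacement(List<T> input, double fraction, long seed) {
        Random rnd = new Random(seed);
        List<T> out = new ArrayList<>();
        double limit = Math.exp(-fraction);
        for (T element : input) {
            int k = 0;
            double p = rnd.nextDouble();
            while (p > limit) {
                k++;
                p *= rnd.nextDouble();
            }
            for (int i = 0; i < k; i++) {
                out.add(element);
            }
        }
        return out;
    }
}
```

Both methods make a single pass over the data and return a sample whose 
size is only approximately fraction * n. Inside an iteration one would 
presumably have to vary the seed per superstep (for example by deriving it 
from the superstep number), since otherwise every iteration would draw the 
same sample.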

> Create sample operator for Dataset
> ----------------------------------
>
>                 Key: FLINK-1901
>                 URL: https://issues.apache.org/jira/browse/FLINK-1901
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Theodore Vasiloudis
>
> In order to be able to implement Stochastic Gradient Descent and a number of 
> other machine learning algorithms we need to have a way to take a random 
> sample from a Dataset.
> We need to be able to sample with or without replacement from the Dataset, 
> choose the relative size of the sample, and set a seed for reproducibility.


