[ 
https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636124#comment-14636124
 ] 

Chengxiang Li commented on FLINK-1901:
--------------------------------------

"Every point is sampled with probability 1/N" is only one sampling case 
(sampling with fraction, without replacement). Three other cases are also 
commonly used: "sampling with fraction, with replacement", "sampling with 
fixed size, without replacement", and "sampling with fixed size, with 
replacement". We should support all of them when we expose a sampling 
operator to users. 
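
To make the four cases concrete, here is a minimal single-machine sketch in 
Python. It is not Flink code and not a proposed API: all function names are 
hypothetical, and the techniques (Bernoulli selection, a Poisson count per 
element, reservoir sampling) are my assumption of one reasonable way to 
realize each case, not something this issue has decided on. Each function 
takes a seed so that results are reproducible.

```python
import math
import random


def _poisson(rng, lam):
    # Knuth's method for drawing a Poisson(lam) count; fine for small lam.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_fraction_without_replacement(data, fraction, seed=None):
    # Bernoulli sampling: keep each element independently with
    # probability `fraction`; no element can appear twice.
    rng = random.Random(seed)
    return [x for x in data if rng.random() < fraction]


def sample_fraction_with_replacement(data, fraction, seed=None):
    # Assumption: emit each element k times, k ~ Poisson(fraction), so the
    # expected sample size is fraction * len(data) and repeats are possible.
    rng = random.Random(seed)
    out = []
    for x in data:
        out.extend([x] * _poisson(rng, fraction))
    return out


def sample_fixed_size_without_replacement(data, size, seed=None):
    # Reservoir sampling (Algorithm R): one pass, at most `size` elements,
    # each input element ends up in the result with equal probability.
    rng = random.Random(seed)
    reservoir = []
    for i, x in enumerate(data):
        if i < size:
            reservoir.append(x)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < size:
                reservoir[j] = x
    return reservoir


def sample_fixed_size_with_replacement(data, size, seed=None):
    # `size` independent uniform draws; `data` must be a non-empty sequence.
    rng = random.Random(seed)
    return [rng.choice(data) for _ in range(size)]
```

Note that the two "fraction" variants return a sample whose size is only 
correct in expectation, while the two "fixed size" variants return exactly 
the requested number of elements (or fewer, without replacement, if the 
input is smaller than the requested size).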

> Create sample operator for Dataset
> ----------------------------------
>
>                 Key: FLINK-1901
>                 URL: https://issues.apache.org/jira/browse/FLINK-1901
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Theodore Vasiloudis
>            Assignee: Chengxiang Li
>
> In order to be able to implement Stochastic Gradient Descent and a number of 
> other machine learning algorithms we need to have a way to take a random 
> sample from a Dataset.
> We need to be able to sample with or without replacement from the Dataset, 
> choose the relative size of the sample, and set a seed for reproducibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
