[ https://issues.apache.org/jira/browse/FLINK-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736031#comment-14736031 ]
ASF GitHub Bot commented on FLINK-2533:
---------------------------------------
Github user ChengXiangLi commented on a diff in the pull request:
https://github.com/apache/flink/pull/1110#discussion_r39002678
--- Diff:
flink-java/src/main/java/org/apache/flink/api/java/sampling/BernoulliSampler.java
---
@@ -28,12 +28,16 @@
* Bernoulli experiment.
*
* @param <T> The type of sample.
+ * @see <a href="http://erikerlandson.github.io/blog/2014/09/11/faster-random-samples-with-gap-sampling/">Gap Sampling</a>
*/
public class BernoulliSampler<T> extends RandomSampler<T> {
private final double fraction;
private final Random random;
+ //THRESHOLD is a tuning parameter for choosing sampling method according to the fraction
--- End diff --
Format: add a blank after //, remove the extra blanks after THRESHOLD, and end
the comment with a period.
> Gap based random sample optimization
> ------------------------------------
>
> Key: FLINK-2533
> URL: https://issues.apache.org/jira/browse/FLINK-2533
> Project: Flink
> Issue Type: Improvement
> Components: Core
> Reporter: Chengxiang Li
> Priority: Minor
>
> For random samplers that take a fraction, such as BernoulliSampler and PoissonSampler,
> a gap-based random sampler could use an O(k) sampling implementation instead of the
> previous O(n) implementation, where n is the input size and k is the number of
> sampled elements. It should perform better when the sample fraction is very small. [This
> blog|http://erikerlandson.github.io/blog/2014/09/11/faster-random-samples-with-gap-sampling/]
> describes the gap-based random sampler in more detail.
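
To make the idea in the issue description concrete, here is a minimal sketch of gap-based Bernoulli sampling in Java. It is not the code from the pull request: the class name GapSamplingSketch and the method sample are hypothetical and chosen only for illustration. Instead of drawing one random number per input element, the sketch draws the number of elements to skip before the next accepted one from a geometric distribution, so the number of random draws is proportional to the sample size rather than the input size.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of Flink's sampling API.
public class GapSamplingSketch {

    /**
     * Returns a Bernoulli sample of the input, keeping each element
     * independently with probability `fraction`, by drawing skip lengths.
     */
    public static <T> List<T> sample(Iterator<T> input, double fraction, Random random) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        List<T> result = new ArrayList<>();
        if (fraction == 0.0) {
            return result; // nothing is ever selected
        }
        if (fraction == 1.0) {
            while (input.hasNext()) {
                result.add(input.next()); // everything is selected
            }
            return result;
        }
        // log(1 - fraction), computed with log1p so tiny fractions stay accurate.
        double logReject = Math.log1p(-fraction);
        while (true) {
            // u lies in (0, 1], so log(u) is finite and non-positive.
            double u = 1.0 - random.nextDouble();
            // Geometric gap: number of elements rejected before the next accepted one.
            long gap = (long) (Math.log(u) / logReject);
            for (long i = 0; i < gap && input.hasNext(); i++) {
                input.next(); // skip without drawing a random number
            }
            if (!input.hasNext()) {
                break;
            }
            result.add(input.next());
        }
        return result;
    }
}
```

With a plain Iterator the skipped elements are still traversed, so the saving in this sketch comes from the reduced number of random draws. For larger fractions the gaps are short and per-element sampling may be just as fast, which is presumably why the diff above introduces a THRESHOLD on the fraction to choose between the two methods.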
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)