Rodrigo Benenson created BEAM-3000:
--------------------------------------

Summary: No python equivalent of org.apache.beam.sdk.transforms.Sample.any(100)?
Key: BEAM-3000
URL: https://issues.apache.org/jira/browse/BEAM-3000
Project: Beam
Issue Type: Improvement
Components: sdk-py-core
Reporter: Rodrigo Benenson
Assignee: Ahmet Altay
Priority: Critical

Java's org.apache.beam.sdk.transforms.Sample.any returns a PCollection of bounded size (a filtering strategy). The closest Python equivalent is beam.Sample.FixedSizeGlobally(n); however, that version uses a combiner strategy and returns a single list of n elements, which does not scale if n is "bigger than what fits in memory".
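
To make the distinction between the two strategies concrete, here is a minimal plain-Python sketch. It does not use Beam at all: the function names sample_as_list and sample_as_stream are hypothetical, and the sketch only contrasts the memory behaviour of "collect n elements into one list" with "pass through at most n elements one at a time". It is not how either SDK is actually implemented.

```python
import itertools
import random


def sample_as_list(elements, n, seed=None):
    """Combiner-style sketch (hypothetical helper, not a Beam API).

    Reservoir-samples up to n elements and returns them as ONE list,
    so all n elements must fit in memory at the same time.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, element in enumerate(elements):
        if i < n:
            # Fill the reservoir with the first n elements.
            reservoir.append(element)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = element
    return reservoir


def sample_as_stream(elements, n):
    """Filter-style sketch (hypothetical helper, not a Beam API).

    Lazily passes through at most n elements, one at a time, so the
    output is a bounded stream rather than a single in-memory list.
    """
    return itertools.islice(elements, n)


if __name__ == "__main__":
    as_list = sample_as_list(range(1_000_000), 100, seed=0)
    print(type(as_list).__name__, len(as_list))

    as_stream = sample_as_stream(iter(range(1_000_000)), 100)
    print(sum(1 for _ in as_stream))
```

The first helper mirrors the shape of the result described for FixedSizeGlobally(n) (one value that is a list), while the second mirrors the shape described for Sample.any (many individual elements, capped at n). A Python counterpart to Sample.any would need the second shape to stay usable when n is large.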