[ https://issues.apache.org/jira/browse/BEAM-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Darshan Jani updated BEAM-9946:
-------------------------------
    Description:
Currently _Partition_ transform can partition a collection into n collections based on only _element_ value in _PartitionFn_ to decide on which partition a particular element belongs to.
{code:java}
public interface PartitionFn<T> extends Serializable {
    int partitionFor(T elem, int numPartitions);
}
public static <T> Partition<T> of(int numPartitions, PartitionFn<? super T> partitionFn) {
    return new Partition<>(new PartitionDoFn<T>(numPartitions, partitionFn));
}
{code}
It will be useful to introduce new API with additional _sideInputs_ provided to partition function. User will be able to write logic to use both _element_ value and _sideInputs_ to decide on which partition a particular element belongs to.
Proposed new API:
{code:java}
public interface PartitionWithSideInputsFn<T> extends Serializable {
    int partitionFor(T elem, int numPartitions, Requirements requirements);
}
public static <T> Partition<T> of(int numPartitions, PartitionWithSideInputsFn<? super T> partitionFn) {
    ...
}
{code}

  was:
Currently _Partition_ transform can partition a collection into n collections based on only _element_ value in _PartitionFn_ to decide on which partition a particular element belongs to.
{code:java}
public interface PartitionFn<T> extends Serializable {
    int partitionFor(T elem, int numPartitions);
}
public static <T> Partition<T> of(int numPartitions, PartitionFn<? super T> partitionFn) {
    return new Partition<>(new PartitionDoFn<T>(numPartitions, partitionFn));
}
{code}
It will be useful to have new API with additional _sideInputs_ provided to partition function. User will be able to write logic to use both _element_ value and _sideInputs_ to decide on which partition a particular element belongs to.
Proposed new API:
{code:java}
public interface PartitionWithSideInputsFn<T> extends Serializable {
    int partitionFor(T elem, int numPartitions, Requirements requirements);
}
public static <T> Partition<T> of(int numPartitions, PartitionWithSideInputsFn<? super T> partitionFn) {
    ...
}
{code}


> Enhance Partition transform to provide partitionfn with SideInputs
> ------------------------------------------------------------------
>
>                 Key: BEAM-9946
>                 URL: https://issues.apache.org/jira/browse/BEAM-9946
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Darshan Jani
>            Assignee: Darshan Jani
>            Priority: Major
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Currently _Partition_ transform can partition a collection into n collections
> based on only _element_ value in _PartitionFn_ to decide on which partition a
> particular element belongs to.
> {code:java}
> public interface PartitionFn<T> extends Serializable {
>     int partitionFor(T elem, int numPartitions);
> }
> public static <T> Partition<T> of(int numPartitions, PartitionFn<? super T>
> partitionFn) {
>     return new Partition<>(new PartitionDoFn<T>(numPartitions, partitionFn));
> }
> {code}
> It will be useful to introduce new API with additional _sideInputs_ provided
> to partition function. User will be able to write logic to use both _element_
> value and _sideInputs_ to decide on which partition a particular element
> belongs to.
> Proposed new API:
> {code:java}
> public interface PartitionWithSideInputsFn<T> extends Serializable {
>     int partitionFor(T elem, int numPartitions, Requirements requirements);
> }
> public static <T> Partition<T> of(int numPartitions,
> PartitionWithSideInputsFn<? super T> partitionFn) {
>     ...
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
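
For illustration only, the idea behind the proposed API can be sketched in plain Java without any Beam dependency. The sketch keeps the three-argument shape of `PartitionWithSideInputsFn` from the description, but everything else is an assumption: `SideInputs` is a hypothetical stand-in for the `Requirements` argument, `partition` is a hypothetical in-memory helper rather than `Partition.of`, and the threshold value is made up. It shows a partition function that consults both the element and a side value to pick a partition.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {

    /** Hypothetical stand-in for the side inputs; NOT a Beam class. */
    public static final class SideInputs implements Serializable {
        private final double threshold;

        public SideInputs(double threshold) {
            this.threshold = threshold;
        }

        public double threshold() {
            return threshold;
        }
    }

    /** Same shape as the proposed interface, with SideInputs in place of Requirements. */
    public interface PartitionWithSideInputsFn<T> extends Serializable {
        int partitionFor(T elem, int numPartitions, SideInputs sideInputs);
    }

    /** Hypothetical in-memory helper: routes each element to the list chosen by fn. */
    public static <T> List<List<T>> partition(
            List<T> input,
            int numPartitions,
            PartitionWithSideInputsFn<? super T> fn,
            SideInputs sideInputs) {
        List<List<T>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (T elem : input) {
            int index = fn.partitionFor(elem, numPartitions, sideInputs);
            // Reject indices outside [0, numPartitions).
            if (index < 0 || index >= numPartitions) {
                throw new IndexOutOfBoundsException(
                        "Partition index " + index + " is not in [0, " + numPartitions + ")");
            }
            partitions.get(index).add(elem);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Double> scores = Arrays.asList(10.0, 55.0, 90.0, 40.0);
        SideInputs side = new SideInputs(50.0); // made-up threshold
        // The partition index depends on both the element and the side value.
        List<List<Double>> parts =
                partition(scores, 2, (score, n, s) -> score < s.threshold() ? 0 : 1, side);
        System.out.println(parts); // prints [[10.0, 40.0], [55.0, 90.0]]
    }
}
```

In a real pipeline the side value would come from a `PCollectionView` resolved by the runner, not from a constructor argument; the sketch only demonstrates the decision logic the description asks for.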