Hi Ioannis,

with a flatMap operation that replicates elements and assigns each replica a suitable key, followed by a keyBy operation, you can generate practically any kind of partitioning.
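To make the replicate-then-group idea concrete, here is a minimal sketch in plain Python. It is not Flink code: the dictionary only stands in for keyBy, and the names `replicate`, `pairs_in_group`, `parallel_pairs` and `num_blocks` are hypothetical, as is the round-robin assignment of elements to blocks. Each element is copied once per block pairing, and each group then builds only the pairs it is responsible for, so every combination is produced exactly once.

```python
from collections import defaultdict
from itertools import combinations, product


def replicate(element, block, num_blocks):
    """flatMap stand-in: emit the element once per pairing of its block."""
    for other in range(num_blocks):
        # The group key is the unordered pair of block ids (assumption).
        group_key = (min(block, other), max(block, other))
        yield group_key, (block, element)


def pairs_in_group(group_key, members):
    """Per-group step: build the pairs this group is responsible for."""
    low, high = group_key
    if low == high:
        # Pairs whose two elements come from the same block.
        return list(combinations(sorted(e for _, e in members), 2))
    # Pairs with one element from each of the two blocks.
    left = sorted(e for b, e in members if b == low)
    right = sorted(e for b, e in members if b == high)
    return list(product(left, right))


def parallel_pairs(elements, num_blocks):
    groups = defaultdict(list)  # stands in for keyBy on the group key
    for index, element in enumerate(elements):
        block = index % num_blocks  # round-robin block assignment (assumption)
        for group_key, tagged in replicate(element, block, num_blocks):
            groups[group_key].append(tagged)
    return {key: pairs_in_group(key, members) for key, members in groups.items()}


result = parallel_pairs([1, 2, 3, 4], num_blocks=2)
for key, pairs in sorted(result.items()):
    print(key, pairs)
```

With the keys 1, 2, 3, 4 and two blocks, this yields three groups that together cover all six combinations. Each group can be processed independently of the others, which is where the parallelism would come from.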
So if you first collect the data in parallel windows, you can then replicate half of the data of each window for each other window (assigning the replicates for each other window a distinct key). Next you group on this key and calculate the Cartesian product for each resulting group. This should give you a parallel Cartesian product.

Cheers,
Till

On Thu, Feb 16, 2017 at 2:09 PM, Ioannis Kontopoulos <kls.yan...@gmail.com> wrote:

> Hello everyone,
>
> Given a stream of events (each event has a timestamp and a key), I want to
> create all possible combinations of the keys in a window (sliding, event
> time) and then process those combinations in parallel.
>
> For example, if the stream contains events with keys 1,2,3,4 in a given
> window and the possible combinations are:
>
> 1-2
> 1-3
> 1-4
> 2-3
> 2-4
> 3-4
>
> and if the parallelism is set to 2, I want to have events with these keys:
>
> 1-2    2-3
> 1-3    2-4
> 1-4    3-4
>
> You can see that there is some replication. So when I use the apply method
> on a window it will have the keys separated like the example above.
>
> Is there a way to do that?
>