Anyone?

On Fri, Apr 21, 2017 at 10:15 PM, Elias Levy <fearsome.lucid...@gmail.com> wrote:
> This is something that has come up before on the list, but in a different
> context. I have a need to rekey a stream but would prefer the stream to
> not be repartitioned. There is no gain to repartitioning, as the new
> partition key is a composite of the stream key, going from a key of A to a
> key of (A, B), so all values for the resulting streams are already being
> rerouted to the same node and repartitioning them to other nodes would
> simply generate unnecessary network traffic and serde overhead.
>
> Unlike previous use cases, I am not trying to perform aggregate
> operations. Instead I am executing CEP patterns. Some patterns apply to
> the stream keyed by A and some to the stream keyed by (A, B).
>
> The API does not appear to have an obvious solution to this situation.
> keyBy() will repartition and there isn't something like subKey() to
> subpartition a stream without repartitioning (e.g. keyBy(A).subKey(B)).
>
> I suppose I could accomplish it by using partitionCustom(), ignoring the
> second element in the key, and delegating to the default partitioner
> passing it only the first element, thus resulting in no change of task
> assignment.
>
> Thoughts?
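
To make the partitionCustom() idea in the last quoted paragraph concrete, here is a rough sketch in plain Java. It is only an illustration and does not use the Flink API: CompositeKey, defaultPartition() and partitionByFirst() are hypothetical names, and defaultPartition() is a simple hash-modulo stand-in, not a reproduction of how keyBy() actually assigns keys to tasks. The point is just that a partitioner which looks only at A sends (A, B) to the same place as A.

```java
import java.util.Objects;

public class SubKeyPartitioning {

    /** Hypothetical composite key (A, B); not a Flink type. */
    static final class CompositeKey {
        final String a;
        final String b;

        CompositeKey(String a, String b) {
            this.a = a;
            this.b = b;
        }
    }

    /** Stand-in for a default partitioner (assumption: plain hash modulo). */
    static int defaultPartition(Object key, int numPartitions) {
        return Math.floorMod(Objects.hashCode(key), numPartitions);
    }

    /** Ignores B and delegates to the default partitioner with A only. */
    static int partitionByFirst(CompositeKey key, int numPartitions) {
        return defaultPartition(key.a, numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 8;
        CompositeKey k1 = new CompositeKey("user-1", "x");
        CompositeKey k2 = new CompositeKey("user-1", "y");

        // Both composite keys share A, so they map to the partition of A.
        System.out.println(partitionByFirst(k1, numPartitions)
                == defaultPartition("user-1", numPartitions));
        System.out.println(partitionByFirst(k1, numPartitions)
                == partitionByFirst(k2, numPartitions));
    }
}
```

This only models the routing decision. Whether the result can then be treated as a stream keyed by (A, B) for CEP is a separate question that the sketch does not answer.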