Re: [Structured-Streaming][Beginner] Out of order messages with Spark kafka readstream from a specific partition

2018-05-10 Thread Cody Koeninger
As long as you aren't doing any Spark operations that involve a shuffle, the order you see in Spark should be the same as the order in the Kafka partition. Can you link to a minimal code example that reproduces the issue?

On Wed, May 9, 2018 at 7:05 PM, karthikjay wrote:
> On the
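
One way to check whether records really arrive out of order is to compare Kafka offsets within each partition, since offsets increase in the order records were written to that partition. The following is only a minimal sketch, not code from this thread: the helper name find_out_of_order is hypothetical, and it assumes each row exposes the topic, partition, and offset fields that the Spark Kafka source provides alongside key and value.

```python
def find_out_of_order(rows):
    """Return (topic, partition, previous_offset, offset) for every row whose
    offset is not greater than the previous offset seen for that partition.

    `rows` is any iterable of mappings with "topic", "partition" and "offset"
    keys, given in the order in which the rows were observed.
    """
    last_seen = {}      # (topic, partition) -> offset of the previous row
    violations = []
    for row in rows:
        key = (row["topic"], row["partition"])
        previous = last_seen.get(key)
        if previous is not None and row["offset"] <= previous:
            violations.append((key[0], key[1], previous, row["offset"]))
        last_seen[key] = row["offset"]
    return violations


# Hypothetical sample: topic "events", one partition, offsets observed as 0, 2, 1.
observed = [
    {"topic": "events", "partition": 0, "offset": 0},
    {"topic": "events", "partition": 0, "offset": 2},
    {"topic": "events", "partition": 0, "offset": 1},
]
problems = find_out_of_order(observed)
```

An empty result means the observed order matches the partition order. A non-empty result points at the rows to look at first, for example to see whether a shuffle sits between the read and the output.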

[Structured-Streaming][Beginner] Out of order messages with Spark kafka readstream from a specific partition

2018-05-09 Thread karthikjay
On the producer side, I make sure data for a specific user lands on the same partition. On the consumer side, I use a regular Spark Kafka readStream to read the data. I also use a console write stream to print out the Spark Kafka DataFrame. What I observe is that the data for a specific user (even