Partitioning is a property of the RDD itself, so streaming is essentially
no different: each batch is one RDD. You can call partitionBy on a
key-value RDD and pass your own custom partitioner to it.
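
As a rough sketch of what such a partitioner could look like in Python (the
category names, the lookup table and NUM_PARTITIONS are made up for
illustration, and the PySpark calls are shown only in comments so the
snippet runs without a cluster):

```python
# Hypothetical category tags; the real set would come from your own data.
CATEGORY_TO_PARTITION = {"video": 0, "music": 1, "books": 2}

# Assumption: one partition per known category plus one overflow partition.
NUM_PARTITIONS = 4


def category_partitioner(key):
    """Map a category key to a partition index in [0, NUM_PARTITIONS)."""
    # Unknown categories all land in the last (overflow) partition.
    return CATEGORY_TO_PARTITION.get(key, NUM_PARTITIONS - 1)


# With PySpark, on a pair RDD of (category, record) tuples:
#   partitioned = pairs.partitionBy(NUM_PARTITIONS, category_partitioner)
# and per batch on a DStream of such pairs:
#   stream.transform(
#       lambda rdd: rdd.partitionBy(NUM_PARTITIONS, category_partitioner))
```

The function only decides which partition a key goes to; the records have to
be keyed by category first, since partitionBy works on key-value pairs.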

One thing you should consider is how balanced your partitions are, i.e.
your partitioning scheme should not skew too much data into one partition.
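
One way to get a feel for the skew before committing to a scheme is to run
the partition function over a sample of keys and compare partition sizes.
The sketch below is plain Python; the sample keys, the lookup table and the
helper names are hypothetical. On a real RDD, rdd.glom().map(len).collect()
should give you the per-partition sizes directly.

```python
from collections import Counter


def partition_sizes(keys, num_partitions, partition_func):
    """Count how many of the given keys fall into each partition."""
    counts = Counter(partition_func(k) % num_partitions for k in keys)
    return [counts.get(i, 0) for i in range(num_partitions)]


def skew_ratio(sizes):
    """Largest partition size divided by the mean size; 1.0 means balanced."""
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean if mean else 0.0


# Hypothetical sample in which one category dominates the stream.
sample_keys = ["video"] * 8 + ["music"] * 1 + ["books"] * 1
lookup = {"video": 0, "music": 1, "books": 2}

sizes = partition_sizes(sample_keys, 3, lambda k: lookup[k])
print(sizes, skew_ratio(sizes))
```

If one category dominates like this, partitioning purely by category will
leave most of the work in a single partition, so you may want to spread the
heavy category over several partitions (for example by adding a second
component to the key).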

Best
Ayan

On Wed, Aug 12, 2015 at 9:06 AM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:

> How does partitioning in spark work when it comes to streaming? What's the
> best way to partition a time series data grouped by a certain tag like
> categories of product video, music etc.
>



-- 
Best Regards,
Ayan Guha
