I think one of the primary cases where mapPartitions is useful is when you
are doing setup work that can be reused across elements. That way the setup
work only needs to be done once per partition rather than once per element
(for example, creating a Joda-Time instance).
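
To make the setup-cost point concrete, here is a minimal sketch in plain
Python (no Spark involved). CountingParser is a hypothetical stand-in for an
expensive helper object, and the two lists are hypothetical partitions; the
counter just records how many times the helper gets built.

```python
from datetime import datetime


class CountingParser:
    """Hypothetical stand-in for a helper that is costly to create."""
    instances_created = 0

    def __init__(self, fmt):
        CountingParser.instances_created += 1
        self.fmt = fmt

    def parse(self, text):
        return datetime.strptime(text, self.fmt)


def parse_per_element(text):
    # map-style: the parser is rebuilt for every element.
    return CountingParser("%Y-%m-%d").parse(text).year


def parse_per_partition(rows):
    # mapPartitions-style: one parser per partition, reused for every row.
    parser = CountingParser("%Y-%m-%d")
    for text in rows:
        yield parser.parse(text).year


# Two hypothetical partitions of date strings.
partitions = [
    ["2015-06-23", "2015-06-24"],
    ["2014-01-01", "2013-12-31", "2015-02-02"],
]

CountingParser.instances_created = 0
years_map = [parse_per_element(t) for part in partitions for t in part]
created_by_map = CountingParser.instances_created  # one per element

CountingParser.instances_created = 0
years_map_partitions = [
    y for part in partitions for y in parse_per_partition(iter(part))
]
created_by_map_partitions = CountingParser.instances_created  # one per partition
```

Both approaches give the same results; only the number of parser instances
differs. With PySpark, functions shaped like these could be passed to
rdd.map(parse_per_element) and rdd.mapPartitions(parse_per_partition)
respectively.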

Both map and mapPartitions are implemented using the MapPartitionsRDD.
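
As a rough illustration of that relationship (this is not Spark's source,
just plain Python with the hypothetical names map_partitions and
map_elements), a per-element map can be written as a per-partition function
that applies the element function across the partition's iterator:

```python
def map_partitions(partitions, f):
    # Call f once per partition, passing an iterator over that partition.
    # f must return an iterable of results.
    return [list(f(iter(part))) for part in partitions]


def map_elements(partitions, g):
    # map expressed via map_partitions: wrap g so it runs over the iterator.
    return map_partitions(partitions, lambda it: (g(x) for x in it))


# Hypothetical data split into two partitions.
partitions = [[1, 2, 3], [4, 5]]

doubled = map_elements(partitions, lambda x: x * 2)
sums = map_partitions(partitions, lambda it: [sum(it)])
```

Here doubled keeps one output per input element, while sums shows something
only the per-partition form can do directly: produce a single value for each
partition.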

In general, if your logic is easily expressed with map and there isn't any
setup work that could be shared, using map instead of mapPartitions tends to
result in more readable code, which is valuable in and of itself.

On Tue, Jun 23, 2015 at 4:57 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

> I know when to use a map () but when should i use mapPartitions() ?
>
> Which is faster ?
>
> --
> Deepak
>
>


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
Linked In: https://www.linkedin.com/in/holdenkarau
