Could you give more details of your code?
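
In case it helps while you put the details together, here is how I read the
pattern you describe, as a minimal runnable sketch for Spark 1.6 in Scala.
The DataFrame contents and the accumulator (standing in for your
ZooKeeper-based counter) are my assumptions, not your actual code:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object MapCountSketch {
  def main(args: Array[String]): Unit = {
    // Local master so the sketch runs standalone; drop it for spark-submit.
    val sc = new SparkContext(
      new SparkConf().setAppName("MapCountSketch").setMaster("local[2]"))
    val sqlContext = new SQLContext(sc)

    // Toy DataFrame with 992 rows standing in for your real data.
    val dataframe = sqlContext.range(0, 992)

    // Accumulator standing in for the ZooKeeper-based distributed count.
    val invocations = sc.accumulator(0L, "mapper invocations")

    // map is a lazy transformation: the function only runs when an action
    // (count, collect, save, ...) is called on the result.
    val mapped = dataframe.map { r =>
      invocations += 1L
      r.getLong(0)
    }

    println("rows seen by the action: " + mapped.count())
    println("mapper invocations seen: " + invocations.value)

    sc.stop()
  }
}

One thing worth checking in the full code: because map is lazy, the function
can run more than once per row if several actions are called on the result or
if tasks are retried or run speculatively, so how and when the counter is read
matters.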


On Wed, Mar 22, 2017 at 2:40 AM, Shashank Mandil <mandil.shash...@gmail.com>
wrote:

> Hi All,
>
> I have a Spark DataFrame that contains 992 rows.
> When I run a map over this DataFrame, I expect the map function to be
> invoked for all 992 rows.
>
> Since the mapper runs on executors across the cluster, I did a distributed
> count of the number of rows the mapper is invoked on.
>
> dataframe.map { r =>
>   // distributed count inside here using ZooKeeper
> }
>
> I have found that this distributed count inside the mapper is not exactly
> 992, and the number varies across runs.
>
> Does anybody have any idea what might be happening? By the way, I am
> using Spark 1.6.1.
>
> Thanks,
> Shashank
>
>
