Try local[m], where m is the number of worker threads. For tests, local[2]
should be ideal. This is generally how tests for Spark code are written.
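
For illustration, here is a minimal sketch of such a test, assuming Spark
0.8+ (the org.apache.spark package) on the classpath. The names
LowerCaseOrder, LocalBroadcastCheck and countMatches are made up for this
example, and the tiny data set only stands in for the real treemap and input
rdd. It runs in a single JVM, so it may not exercise the same broadcast path
as a real cluster.

```scala
import java.util.{Comparator, TreeMap}
import org.apache.spark.SparkContext

// Hypothetical stand-in for the "dummy lowercase" ordering from the question.
// It is Serializable because a TreeMap serializes its comparator with its entries.
class LowerCaseOrder extends Comparator[String] with Serializable {
  override def compare(a: String, b: String): Int =
    a.toLowerCase.compareTo(b.toLowerCase)
}

object LocalBroadcastCheck {
  // Broadcasts a TreeMap built from `keys` and counts the elements of `input`
  // that the map contains under the case-insensitive ordering.
  def countMatches(master: String, keys: Seq[String], input: Seq[String]): Long = {
    val sc = new SparkContext(master, "broadcast-check")
    try {
      val lookup = new TreeMap[String, String](new LowerCaseOrder)
      keys.foreach(k => lookup.put(k, k))
      val bc = sc.broadcast(lookup)
      sc.parallelize(input, 2).filter(s => bc.value.containsKey(s)).count()
    } finally {
      sc.stop() // always release the context so later tests can create their own
    }
  }

  def main(args: Array[String]): Unit = {
    // "apple" and "BANANA" match case-insensitively; "cherry" is not in the map.
    val n = countMatches("local[2]", Seq("Apple", "banana"),
      Seq("apple", "BANANA", "cherry"))
    assert(n == 2L, s"expected 2 matches, got $n")
  }
}
```

Stopping the context in a finally block matters in a test suite, since only
one SparkContext should be active per JVM at a time.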


On Tue, Nov 19, 2013 at 10:03 PM, Eugen Cepoi <[email protected]> wrote:

> Maybe a bug with HttpBroadcast or maybe my fault but can't find where :)
>
> The problem:
>   At the beginning, a job computes a TreeMap(string, someobject) with a
> custom ordering (a dummy lowercase comparison); this TreeMap is broadcast.
>   Then I use this map to do some matching against the input RDD (excluding
> those that don't exist).
>   What happens? In local mode (where broadcast is not used), or when passing
> the whole TreeMap without broadcast, I get more than 3M matches; with
> broadcast it falls to 20K.
>
>  Replacing HttpBroadcastFactory with TreeBroadcastFactory solves the
> problem (I obtain expected results). I am trying to implement a test case
> to reproduce it, but it is quite tricky in that case...
>
> BTW is there a way to reproduce the broadcast mechanism in local (I see
> that the SparkEnv instance is shared as static, so I guess there is no easy
> way)?
>
> Thanks,
> Eugen
>



-- 
It's just about how deep your longing is!
