Try local[m], where m is the number of worker threads. For tests, local[2] should be ideal. This is generally how tests for Spark code are written.
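A minimal sketch of such a test, assuming the two-argument SparkContext(master, appName) constructor. The helper name countMatches, the app name, and the sample data are made up for illustration. Note that local[2] still runs inside a single JVM, so I am not certain it exercises the HttpBroadcast transfer path at all; treat it as a starting point for a reproduction, not a confirmed one.

```scala
import org.apache.spark.SparkContext

// Hypothetical helper, not from the thread: counts the input names that are
// present in a broadcast TreeMap with a case-insensitive key order.
def countMatches(master: String): Long = {
  val sc = new SparkContext(master, "broadcast-lookup-test")
  try {
    // A TreeMap with a custom (case-insensitive) comparator, standing in
    // for the treemap(string, someobject) described in the original mail.
    val lookup = new java.util.TreeMap[String, Int](java.lang.String.CASE_INSENSITIVE_ORDER)
    lookup.put("Alice", 1)
    lookup.put("Bob", 2)

    val bc = sc.broadcast(lookup)

    // Keep only the records whose key exists in the broadcast map.
    sc.parallelize(Seq("alice", "BOB", "carol"))
      .filter(name => bc.value.containsKey(name))
      .count()
  } finally {
    // Always stop the context so the next test can create its own.
    sc.stop()
  }
}
```

Calling countMatches("local[2]") and comparing against the count obtained without the broadcast (passing the map directly in the closure) gives the same kind of check as the 3M vs 20K comparison, just on a tiny input.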
On Tue, Nov 19, 2013 at 10:03 PM, Eugen Cepoi <[email protected]> wrote:

> Maybe a bug with HttpBroadcast or maybe my fault but can't find where :)
>
> The problem:
> At the beginning a job computes a treemap(string, someobject) with a
> custom order (some dummy lowercase), this treemap is broadcasted.
> Then i use this map to do some matching against input rdd (excluding
> those that don't exist).
> What happens? In local (bc is in that case not used) or by passing all
> the treemap without broadcast I got more than 3M matchings, after broadcast
> it falls to 20K.
>
> Replacing HttpBroadcastFactory with TreeBroadcastFactory solves the
> problem (I obtain expected results). I am trying to implement a test case
> to reproduce it, but it is quite tricky in that case...
>
> BTW is there a way to reproduce the broadcast mechanism in local (I see
> that the SparkEnv instance is shared as static, so I guess there is no easy
> way)?
>
> Thanks,
> Eugen
>

--
It's just about how deep your longing is!
