Hi all,

I have a large file (> 5 GB) that I need to use as a lookup table. Since each
worker needs to search the hashmap (built from the file) in parallel, I need
to broadcast the file. I was wondering whether broadcasting such a huge file
is really a good idea. Are there any benchmarks for broadcast variables? I am
on a Standalone cluster, and machine configuration is not a constraint at the
moment.
Has anyone used broadcast variables at this scale?

Thanks,
Purav
