Have not upgraded yet...
On Wed, Mar 12, 2014 at 3:06 PM, Aureliano Buendia <buendia...@gmail.com> wrote:
> Thanks, Ryan. Was your problem solved in Spark 0.9?
>
> On Wed, Mar 12, 2014 at 9:59 PM, Ryan Compton <compton.r...@gmail.com> wrote:
>>
>> In 0.8 I had problems broadcasting variables around that size; for
>> more info see here:
>>
>> https://mail-archives.apache.org/mod_mbox/incubator-spark-user/201310.mbox/%3ccamgysq9sivs0j9dhv9qgdzp9qxgfadqkrd58b3ynbnhdgkp...@mail.gmail.com%3E
>>
>> On Wed, Mar 12, 2014 at 2:12 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>> > You should try Torrent for this one; it will be faster. It's still
>> > experimental, but I believe it works pretty well and just needs more
>> > testing to become the default.
>> >
>> > Matei
>> >
>> > On Mar 12, 2014, at 1:12 PM, Aureliano Buendia <buendia...@gmail.com> wrote:
>> >
>> > Is TorrentBroadcastFactory out of beta? Is it preferred over
>> > HttpBroadcastFactory for large broadcasts?
>> >
>> > What are the benefits of HttpBroadcastFactory as the default factory?
>> >
>> > On Wed, Mar 12, 2014 at 7:09 PM, Stephen Boesch <java...@gmail.com> wrote:
>> >>
>> >> Hi Josh,
>> >> So then 2^31 (~2.1 billion) elements * 2^3 bytes per double = 2^34
>> >> bytes = 16 GB would be the maximum byte length of an Array[Double]?
>> >>
>> >> 2014-03-12 11:30 GMT-07:00 Josh Marcus <jmar...@meetup.com>:
>> >>
>> >>> Aureliano,
>> >>>
>> >>> Just to answer your second question (unrelated to Spark): arrays in
>> >>> Java and Scala can't be larger than the maximum value of an Integer
>> >>> (Integer.MAX_VALUE), which means that arrays are limited to about
>> >>> 2.1 billion elements.
>> >>>
>> >>> --j
>> >>>
>> >>> On Wed, Mar 12, 2014 at 1:08 PM, Aureliano Buendia <buendia...@gmail.com> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I asked a similar question a while ago but didn't get any answers.
>> >>>>
>> >>>> I'd like to share a 10 GB double array among 50 to 100 workers. The
>> >>>> physical memory of each worker is over 40 GB, so the array fits in
>> >>>> each worker's memory. The reason I'm sharing this array is that a
>> >>>> cartesian operation is applied to it, and I want to avoid network
>> >>>> shuffling.
>> >>>>
>> >>>> 1. Is Spark broadcast built for pushing variables of GB size? Does
>> >>>> it need special configuration (e.g. Akka settings) to work under
>> >>>> these conditions?
>> >>>>
>> >>>> 2. (Not directly related to Spark) Is there an upper limit for
>> >>>> Scala/Java arrays other than physical memory? Do they stop working
>> >>>> when the element count exceeds a certain number?
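
A minimal sketch of the setup being discussed, assuming the Spark 0.9 SparkConf API; the master URL, app name, array size, and RDD contents are illustrative stand-ins, not taken from the thread:

import org.apache.spark.{SparkConf, SparkContext}

object BroadcastSketch {
  def main(args: Array[String]) {
    // Switch from the default HttpBroadcastFactory to the experimental
    // TorrentBroadcastFactory that Matei mentions.
    val conf = new SparkConf()
      .setMaster("local[2]") // replace with the real cluster master URL
      .setAppName("broadcast-sketch")
      .set("spark.broadcast.factory",
        "org.apache.spark.broadcast.TorrentBroadcastFactory")
    val sc = new SparkContext(conf)

    // Illustrative stand-in for the ~10 GB array (kept tiny here).
    val bigArray: Array[Double] = Array.tabulate(1000)(_.toDouble)

    // One read-only copy is shipped to each worker, instead of a copy
    // travelling with every task.
    val bc = sc.broadcast(bigArray)

    // Map-side "cartesian": pair every RDD element with every entry of
    // the broadcast array locally, with no shuffle of the array itself.
    val pairs = sc.parallelize(0 until 100)
      .flatMap(i => bc.value.map(d => (i, d)))

    println(pairs.count()) // 100 * 1000 = 100000
    sc.stop()
  }
}

Note that the torrent factory only changes how the broadcast bytes fan out across the cluster; the array still has to fit in the driver's heap and in each worker's heap.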
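
And a short sketch of the arithmetic behind Josh's and Stephen's exchange; plain Scala with no Spark dependency, object name purely illustrative:

object ArrayLimitSketch extends App {
  // JVM arrays are indexed by Int, so the hard cap is Integer.MAX_VALUE
  // elements (2^31 - 1, about 2.1 billion), regardless of physical memory.
  val maxElements: Long = Int.MaxValue.toLong

  // A double occupies 8 bytes (64 bits), so the largest possible
  // Array[Double] is about 2^34 bytes = 16 GB; 128 GB only comes out if
  // the 64 bits per element are mistakenly counted as bytes.
  val bytesPerDouble = 8L
  val maxBytes = maxElements * bytesPerDouble

  println(s"max elements: $maxElements")        // 2147483647
  println(s"max bytes:    $maxBytes")           // 17179869176
  println(s"approx GB:    ${maxBytes >> 30}")   // 15 (just under 16)
}

In practice the usable maximum is a few elements below Int.MaxValue on most JVMs, so a 10 GB Array[Double] (roughly 1.3 billion elements) stays within the limit, while anything above ~16 GB would have to be split across multiple arrays.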