That works great. Thanks!
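For reference, the working pattern is to pull out the backing Array and use ordinary array operations on it. A minimal sketch in current Julia (1.x), where the SharedArrays stdlib exports an `sdata` accessor in place of the old `loc_shmarr` field:

```julia
using SharedArrays

# Create a 4x4 shared array and fill it.
S = SharedArray{Float64}(4, 4)
S .= 1.0

# sdata returns the plain Array backing S (what S.loc_shmarr exposed in
# 0.3-era Julia), so every Array method applies to it.
A = sdata(S)
B = A * ones(4, 4)   # the multiplication that failed on S itself back then
```

(In current Julia, `S * ones(4, 4)` also works directly, since SharedArray now behaves like a regular DenseArray.)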
On Thu, Jan 23, 2014 at 8:39 PM, Amit Murthy <amit.mur...@gmail.com> wrote:

> The SharedArray object has a field loc_shmarr which represents the
> backing array, so S.loc_shmarr should work everywhere. But you are right,
> we need to ensure that the SharedArray can be used just as a regular
> array.
>
> On Fri, Jan 24, 2014 at 9:00 AM, Madeleine Udell <madeleine.ud...@gmail.com> wrote:
>
>> Even more problematic: I can't multiply by my SharedArray:
>>
>>     no method *(SharedArray{Float64,2}, Array{Float64,2})
>>
>> On Thursday, January 23, 2014 7:22:59 PM UTC-8, Madeleine Udell wrote:
>>
>>> Thanks! I'm trying out a SharedArray solution now, but wondered if you
>>> can tell me whether there's an easy way to reimplement many of the
>>> convenience wrappers on arrays for shared arrays. E.g., I get the
>>> following errors:
>>>
>>>     >> shared_array[1,:]
>>>     no method getindex(SharedArray{Float64,2}, Float64, Range1{Int64})
>>>
>>>     >> repmat(shared_array,2,1)
>>>     no method similar(SharedArray{Float64,2}, Type{Float64}, (Int64,Int64))
>>>     in repmat at abstractarray.jl:1043
>>>
>>> I'm surprised these aren't inherited from AbstractArray!
>>>
>>> On Wednesday, January 22, 2014 8:05:45 PM UTC-8, Amit Murthy wrote:
>>>
>>>> 1. The SharedArray object can be sent to any of the processes that
>>>> mapped the shared memory segment during construction. The backing
>>>> array is not copied.
>>>> 2. User-defined composite types are fine as long as isbits(T) is true.
>>>>
>>>> On Thu, Jan 23, 2014 at 1:01 AM, Madeleine Udell <madelei...@gmail.com> wrote:
>>>>
>>>>> That's not a problem for me; all of my data is numeric. To summarize
>>>>> a long post, I'm interested in understanding:
>>>>>
>>>>> 1) good programming paradigms for using shared memory together with
>>>>> parallel maps.
>>>>> In particular, can a shared array and other non-shared data
>>>>> structures be combined into a single data structure and "passed" in
>>>>> a remote call without unnecessarily copying the shared array? And
>>>>>
>>>>> 2) possibilities for extending shared memory in Julia to other data
>>>>> types, and even to user-defined types.
>>>>>
>>>>> On Tuesday, January 21, 2014 11:17:10 PM UTC-8, Amit Murthy wrote:
>>>>>
>>>>>> I have not gone through your post in detail, but would like to
>>>>>> point out that SharedArray can only be used for bitstypes.
>>>>>>
>>>>>> On Wed, Jan 22, 2014 at 12:23 PM, Madeleine Udell <madelei...@gmail.com> wrote:
>>>>>>
>>>>>>> # Say I have a list of tasks, e.g. tasks i = 1:n.
>>>>>>> # For each task I want to call a function foo that depends on that
>>>>>>> # task and some fixed data. I have many types of fixed data:
>>>>>>> # e.g. arrays, dictionaries, integers, etc.
>>>>>>> #
>>>>>>> # Imagine the data comes from, e.g., loading a file based on user
>>>>>>> # input, so we can't hard-code the data into the function foo,
>>>>>>> # although it's constant during program execution.
>>>>>>> #
>>>>>>> # If I were doing this in serial, I'd do the following:
>>>>>>>
>>>>>>> type MyData
>>>>>>>     myint
>>>>>>>     mydict
>>>>>>>     myarray
>>>>>>> end
>>>>>>>
>>>>>>> function foo(task, data::MyData)
>>>>>>>     data.myint + data.myarray[data.mydict[task]]
>>>>>>> end
>>>>>>>
>>>>>>> n = 10
>>>>>>> const data = MyData(rand(), Dict(1:n, randperm(n)), randperm(n))
>>>>>>>
>>>>>>> results = zeros(n)
>>>>>>> for i = 1:n
>>>>>>>     results[i] = foo(i, data)
>>>>>>> end
>>>>>>>
>>>>>>> # What's the right way to do this in parallel?
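The quoted serial example uses 0.2-era syntax. A rough current-Julia (1.x) equivalent, where `struct` replaces `type`, `randperm` now lives in `Random`, and `Dict(zip(...))` replaces the old two-argument `Dict` constructor:

```julia
using Random

# Fixed data bundled into one composite type, as in the post.
struct MyData
    myint::Float64
    mydict::Dict{Int,Int}
    myarray::Vector{Int}
end

foo(task, data::MyData) = data.myint + data.myarray[data.mydict[task]]

n = 10
data = MyData(rand(), Dict(zip(1:n, randperm(n))), randperm(n))

# Serial version: one foo call per task.
results = [foo(i, data) for i in 1:n]
```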
>>>>>>> # Here are a number of ideas.
>>>>>>> #
>>>>>>> # To use @parallel or pmap, we have to first copy all the code and
>>>>>>> # data everywhere. I'd like to avoid that, since the data is huge
>>>>>>> # (10-100 GB).
>>>>>>>
>>>>>>> @everywhere begin
>>>>>>>     type MyData
>>>>>>>         myint
>>>>>>>         mydict
>>>>>>>         myarray
>>>>>>>     end
>>>>>>>
>>>>>>>     function foo(task, data::MyData)
>>>>>>>         data.myint + data.myarray[data.mydict[task]]
>>>>>>>     end
>>>>>>>
>>>>>>>     n = 10
>>>>>>>     const data = MyData(rand(), Dict(1:n, randperm(n)), randperm(n))
>>>>>>> end
>>>>>>>
>>>>>>> ## @parallel
>>>>>>> results = zeros(n)
>>>>>>> @parallel for i = 1:n
>>>>>>>     results[i] = foo(i, data)
>>>>>>> end
>>>>>>>
>>>>>>> ## pmap
>>>>>>> @everywhere foo(task) = foo(task, data)
>>>>>>> results = pmap(foo, 1:n)
>>>>>>>
>>>>>>> # To avoid copying data, I can make myarray a shared array. In
>>>>>>> # that case, I don't want to use @everywhere to put data on each
>>>>>>> # processor, since that would reinstantiate the shared array.
>>>>>>> # My current solution is to rewrite my data structure to *not*
>>>>>>> # include myarray, and pass the array to the function foo
>>>>>>> # separately. But the code gets much less pretty as I tear apart
>>>>>>> # my data structure, especially if I have a large number of shared
>>>>>>> # arrays. Is there a way for me to avoid this while using shared
>>>>>>> # memory? Really, I'd like to be able to define my own shared
>>>>>>> # memory data types...
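One caveat with the `@parallel` variant above: writes into the plain `results` array are silently lost, because each worker mutates its own copy. A sketch of the same idea in current Julia (assuming 1.x, where `@distributed` from the `Distributed` stdlib replaces `@parallel`), with the results themselves in a SharedArray so all processes write to the same memory:

```julia
using Distributed, Random
addprocs(2)                      # two local workers on this host
@everywhere using SharedArrays

@everywhere struct MyData
    myint::Float64
    mydict::Dict{Int,Int}
    myarray::Vector{Int}
end
@everywhere foo(task, data::MyData) = data.myint + data.myarray[data.mydict[task]]

n = 10
data = MyData(rand(), Dict(zip(1:n, randperm(n))), randperm(n))

# results is shared, so workers' writes are visible on the master.
# data is still serialized to each worker when the loop body is shipped.
results = SharedArray{Float64}(n)
@sync @distributed for i in 1:n
    results[i] = foo(i, data)
end
```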
>>>>>>> @everywhere begin
>>>>>>>     type MySmallerData
>>>>>>>         myint
>>>>>>>         mydict
>>>>>>>     end
>>>>>>>
>>>>>>>     function foo(task, data::MySmallerData, myarray::SharedArray)
>>>>>>>         data.myint + myarray[data.mydict[task]]
>>>>>>>     end
>>>>>>>
>>>>>>>     n = 10
>>>>>>>     const data = MySmallerData(rand(), Dict(1:n, randperm(n)))
>>>>>>> end
>>>>>>>
>>>>>>> myarray = SharedArray(randperm(n))
>>>>>>>
>>>>>>> ## @parallel
>>>>>>> results = zeros(n)
>>>>>>> @parallel for i = 1:n
>>>>>>>     results[i] = foo(i, data, myarray)
>>>>>>> end
>>>>>>>
>>>>>>> ## pmap
>>>>>>> @everywhere foo(task) = foo(task, data, myarray)
>>>>>>> results = pmap(foo, 1:n)
>>>>>>>
>>>>>>> # Finally, what can I do to avoid copying mydict to each
>>>>>>> # processor? Is there a way to use shared memory for it? Once
>>>>>>> # again, I'd really like to be able to define my own shared memory
>>>>>>> # data types...
>
> --
> Madeleine Udell
> PhD Candidate in Computational and Mathematical Engineering
> Stanford University
> www.stanford.edu/~udell
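A possible answer to the question of combining shared and non-shared data, following Amit's earlier point: a SharedArray can live inside an ordinary composite type, because serializing a SharedArray to a worker that mapped the segment sends only its metadata, not the backing memory. A sketch assuming current Julia (1.x); `MyData2` and its fields are illustrative names:

```julia
using Distributed, Random
addprocs(2)
@everywhere using SharedArrays

# The cheap-to-copy fields and the shared array live in one struct.
@everywhere struct MyData2
    myint::Float64
    mydict::Dict{Int,Int}
    myarray::SharedArray{Int,1}
end
@everywhere foo(task, d::MyData2) = d.myint + d.myarray[d.mydict[task]]

n = 10
A = SharedArray{Int}(n)
A .= randperm(n)                 # fill the shared segment from the master
data = MyData2(rand(), Dict(zip(1:n, randperm(n))), A)

# Shipping data to a worker copies myint and mydict, but only the
# metadata of myarray; the backing segment is mmapped, not re-sent.
results = pmap(i -> foo(i, data), 1:n)
```

The dict is still copied to each worker per Amit's bitstype restriction, but the large array is not.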