The SharedArray object has a field loc_shmarr, which represents the backing array, so S.loc_shmarr should work everywhere. But you are right, we need to ensure that a SharedArray can be used just like a regular array.
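
As a rough illustration of going through the backing array, here is a minimal sketch. It assumes a current Julia release, where the SharedArrays standard library exports the accessor sdata; neither sdata nor the constructor syntax below comes from this thread, and the names S, A, backing, P and row are made up for the example. On the 2014 build discussed here, the field access S.loc_shmarr would play the role of sdata(S).

```julia
# Sketch only: assumes a current Julia release with the SharedArrays stdlib.
# `sdata` stands in for the 2014-era field access `S.loc_shmarr`.
using SharedArrays

# A 3x2 shared Float64 array, mapped on the local process.
S = SharedArray{Float64}((3, 2))
S .= reshape(1.0:6.0, 3, 2)     # fill column by column with 1.0 through 6.0

A = [1.0 0.0; 0.0 1.0]          # an ordinary 2x2 Array (identity matrix)

backing = sdata(S)              # the plain Array backing the shared segment
P = backing * A                 # matrix product via the backing array
row = backing[1, :]             # first row via the backing array
```

Operations that are missing a SharedArray method can be routed through the backing array this way until the SharedArray type itself supports them.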

On Fri, Jan 24, 2014 at 9:00 AM, Madeleine Udell <[email protected]> wrote:

> even more problematic: I can't multiply by my SharedArray:
>
> no method *(SharedArray{Float64,2}, Array{Float64,2})
>
>
> On Thursday, January 23, 2014 7:22:59 PM UTC-8, Madeleine Udell wrote:
>>
>> Thanks! I'm trying out a SharedArray solution now, but wondered if you
>> can tell me if there's an easy way to reimplement many of the convenience
>> wrappers on arrays for shared arrays. Eg I get the following errors:
>>
>> >> shared_array[1,:]
>> no method getindex(SharedArray{Float64,2}, Float64, Range1{Int64})
>>
>> >> repmat(shared_array,2,1)
>> no method similar(SharedArray{Float64,2}, Type{Float64}, (Int64,Int64))
>> in repmat at abstractarray.jl:1043
>>
>> I'm surprised these aren't inherited properties from AbstractArray!
>>
>> On Wednesday, January 22, 2014 8:05:45 PM UTC-8, Amit Murthy wrote:
>>>
>>> 1. The SharedArray object can be sent to any of the processes that
>>> mapped the shared memory segment during construction. The backing array is
>>> not copied.
>>> 2. User defined composite types are fine as long as isbits(T) is true.
>>>
>>>
>>> On Thu, Jan 23, 2014 at 1:01 AM, Madeleine Udell
>>> <[email protected]> wrote:
>>>
>>>> That's not a problem for me; all of my data is numeric. To summarize a
>>>> long post, I'm interested in understanding
>>>>
>>>> 1) good programming paradigms for using shared memory together with
>>>> parallel maps. In particular, can a shared array and other nonshared data
>>>> structure be combined into a single data structure and "passed" in a remote
>>>> call without unnecessarily copying the shared array? and
>>>> 2) possibilities for extending shared memory in julia to other data
>>>> types, and even to user defined types.
>>>>
>>>>
>>>> On Tuesday, January 21, 2014 11:17:10 PM UTC-8, Amit Murthy wrote:
>>>>
>>>>> I have not gone through your post in detail, but would like to point
>>>>> out that SharedArray can only be used for bitstypes.
>>>>>
>>>>>
>>>>> On Wed, Jan 22, 2014 at 12:23 PM, Madeleine Udell <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> # Say I have a list of tasks, eg tasks i=1:n
>>>>>> # For each task I want to call a function foo
>>>>>> # that depends on that task and some fixed data
>>>>>> # I have many types of fixed data: eg, arrays, dictionaries, integers, etc
>>>>>>
>>>>>> # Imagine the data comes from eg loading a file based on user input,
>>>>>> # so we can't hard code the data into the function foo
>>>>>> # although it's constant during program execution
>>>>>>
>>>>>> # If I were doing this in serial, I'd do the following
>>>>>>
>>>>>> type MyData
>>>>>>     myint
>>>>>>     mydict
>>>>>>     myarray
>>>>>> end
>>>>>>
>>>>>> function foo(task,data::MyData)
>>>>>>     data.myint + data.myarray[data.mydict[task]]
>>>>>> end
>>>>>>
>>>>>> n = 10
>>>>>> const data = MyData(rand(),Dict(1:n,randperm(n)),randperm(n))
>>>>>>
>>>>>> results = zeros(n)
>>>>>> for i = 1:n
>>>>>>     results[i] = foo(i,data)
>>>>>> end
>>>>>>
>>>>>> # What's the right way to do this in parallel? Here are a number of ideas
>>>>>> # To use @parallel or pmap, we have to first copy all the code and data everywhere
>>>>>> # I'd like to avoid that, since the data is huge (10 - 100 GB)
>>>>>>
>>>>>> @everywhere begin
>>>>>>     type MyData
>>>>>>         myint
>>>>>>         mydict
>>>>>>         myarray
>>>>>>     end
>>>>>>
>>>>>>     function foo(task,data::MyData)
>>>>>>         data.myint + data.myarray[data.mydict[task]]
>>>>>>     end
>>>>>>
>>>>>>     n = 10
>>>>>>     const data = MyData(rand(),Dict(1:n,randperm(n)),randperm(n))
>>>>>> end
>>>>>>
>>>>>> ## @parallel
>>>>>> results = zeros(n)
>>>>>> @parallel for i = 1:n
>>>>>>     results[i] = foo(i,data)
>>>>>> end
>>>>>>
>>>>>> ## pmap
>>>>>> @everywhere foo(task) = foo(task,data)
>>>>>> results = pmap(foo,1:n)
>>>>>>
>>>>>> # To avoid copying data, I can make myarray a shared array
>>>>>> # In that case, I don't want to use @everywhere to put data on each processor
>>>>>> # since that would reinstantiate the shared array.
>>>>>> # My current solution is to rewrite my data structure to *not* include myarray,
>>>>>> # and pass the array to the function foo separately.
>>>>>> # But the code gets much less pretty as I tear apart my data structure,
>>>>>> # especially if I have a large number of shared arrays.
>>>>>> # Is there a way for me to avoid this while using shared memory?
>>>>>> # really, I'd like to be able to define my own shared memory data types...
>>>>>>
>>>>>> @everywhere begin
>>>>>>     type MySmallerData
>>>>>>         myint
>>>>>>         mydict
>>>>>>     end
>>>>>>
>>>>>>     function foo(task,data::MySmallerData,myarray::SharedArray)
>>>>>>         data.myint + myarray[data.mydict[task]]
>>>>>>     end
>>>>>>
>>>>>>     n = 10
>>>>>>     const data = MySmallerData(rand(),Dict(1:n,randperm(n)))
>>>>>> end
>>>>>>
>>>>>> myarray = SharedArray(randperm(n))
>>>>>>
>>>>>> ## @parallel
>>>>>> results = zeros(n)
>>>>>> @parallel for i = 1:n
>>>>>>     results[i] = foo(i,data,myarray)
>>>>>> end
>>>>>>
>>>>>> ## pmap
>>>>>> @everywhere foo(task) = foo(task,data,myarray)
>>>>>> results = pmap(foo,1:n)
>>>>>>
>>>>>> # Finally, what can I do to avoid copying mydict to each processor?
>>>>>> # Is there a way to use shared memory for it?
>>>>>> # Once again, I'd really like to be able to define my own shared memory data types...
>>>>>>
