Thanks Tim. 

For me `elty=Float32', so if I use `CudaArray(elty, ones(10))' or
`CudaArray(elty, ones(10)...)' I get a conversion error. [I am running Julia
0.5.0-dev+749.]
The result of my CudaArray creation above looks like:

julia> to_host(CudaArray(map(elty, ones(10))))'
1x10 Array{Float32,2}:
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
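
A small sketch of an equivalent way to build the host array, using only Base functions: `ones(elty, n)' allocates with the requested element type directly, so the `map' step is not needed. That `CudaArray' treats the result the same way as the `map' version is an assumption, based only on both being plain `Array{Float32,1}' values.

```julia
elty = Float32

# What the call above does: build Float64 ones, then convert each element.
a = map(elty, ones(10))

# Alternative: allocate with the target element type from the start.
b = ones(elty, 10)

# Both are host arrays with the same element type and contents.
same_type = (eltype(a) == eltype(b))
same_data = (a == b)
```

Either `a' or `b' could then be passed to the one-argument `CudaArray(...)' form used in this post.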

I tried putting a `device_synchronize()' call in the `p2' block above, like 
so. It was probably needed anyway, but it doesn't fix the error:

julia> p2 = quote 
           elty = eltype(d_M)
           n1, n2 = size(d_M)
           d_dots = CudaArray(map(elty, ones(n1)))
           dev = device(d_dots)
           dotf = cudakernels.ptxdict[(dev, "sqrownorms", elty)]
           numblox = Int(ceil(n1/cudakernels.maxBlock))
           CUDArt.launch(dotf, numblox, cudakernels.maxBlock, (d_M, n1, n2, d_dots))
           device_synchronize()
           dots = to_host(d_dots)
           free(d_dots)
           dots
       end

julia> sow(reps[3], :d_M, :(residual_shared(Y,A_init,S_init,1,sig)))
RemoteRef{Channel{Any}}(51,1,40341)

julia> reap(reps[3], :(string(d_M)))
Dict{Int64,Any} with 1 entry:
  51 => "CUDArt.CudaArray{Float32,2}(CUDArt.CudaPtr{Float32}(Ptr{Float32} @0x0000000b041e0000),(4000,2500),0)"

julia> reap(reps[3], p2)
ERROR: On worker 51:
"an illegal memory access was encountered"
 [inlined code] from essentials.jl:111
 in checkerror at /home/mcp50/.julia/v0.5/CUDArt/src/libcudart-6.5.jl:16
 [inlined code] from /home/mcp50/.julia/v0.5/CUDArt/src/../gen-6.5/gen_libcudart.jl:16
 in device_synchronize at /home/mcp50/.julia/v0.5/CUDArt/src/device.jl:28
 in anonymous at multi.jl:892
 in run_work_thunk at multi.jl:645
 [inlined code] from multi.jl:892
 in anonymous at task.jl:59
 in remotecall_fetch at multi.jl:731
 [inlined code] from multi.jl:368
 in remotecall_fetch at multi.jl:734
 in anonymous at task.jl:443
 in sync_end at ./task.jl:409
 [inlined code] from task.jl:418
 in reap at /home/mcp50/.julia/v0.5/ClusterUtils/src/ClusterUtils.jl:203
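
For reference, a minimal sketch of the launch-size arithmetic in `p2'. The row count `n1 = 4000' comes from the `(4000,2500)' size reported for `d_M' above; `maxBlock = 1024' is a hypothetical value, not the actual `cudakernels.maxBlock'.

```julia
# n1 matches the row count reported for d_M; maxBlock is an ASSUMED block size.
n1 = 4000
maxBlock = 1024

# Block count as computed in p2.
numblox = Int(ceil(n1 / maxBlock))

# Integer ceiling division gives the same count without going through floats.
numblox_cld = cld(n1, maxBlock)

# The grid is rounded up to whole blocks, so some threads have no row to process.
launched = numblox * maxBlock
surplus = launched - n1
```

Under these assumed numbers the grid starts more threads than there are rows, so any thread whose index is at or beyond `n1' would need to return early inside the kernel. Whether `sqrownorms' does that is not shown in this thread.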

One thing I have noted is that a remote process crashes if I ever attempt 
to move a `CudaArray' type/pointer from it to the host. 
That shouldn't be happening in the above, but I wonder if, inadvertently, 
something similar is happening.

If I try calling the kernel on another process on the same machine, I don't 
get the error:

julia> sow(62, :d_M, :(residual_shared($Y_init,$A_init,$S_init,1,$sig)))
RemoteRef{Channel{Any}}(62,1,40936)

julia> sum(reap(62, p2)[62])
5.149127f6

Hmm...



