Thanks Tim.
For me `elty = Float32', so if I use `CudaArray(elty, ones(10))' or
`CudaArray(elty, ones(10)...)' I get a conversion error. [I am running Julia
0.5.0-dev+749.]
The result of my CudaArray creation above looks like:
julia> to_host(CudaArray(map(elty, ones(10))))'
1x10 Array{Float32,2}:
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
I tried adding a `device_synchronize()' call to the `p2' block, as shown
below. It was probably needed anyway, but it doesn't fix the error:
julia> p2 = quote
           elty = eltype(d_M)
           n1, n2 = size(d_M)
           d_dots = CudaArray(map(elty, ones(n1)))
           dev = device(d_dots)
           dotf = cudakernels.ptxdict[(dev, "sqrownorms", elty)]
           numblox = Int(ceil(n1/cudakernels.maxBlock))
           CUDArt.launch(dotf, numblox, cudakernels.maxBlock, (d_M, n1, n2, d_dots))
           device_synchronize()
           dots = to_host(d_dots)
           free(d_dots)
           dots
       end
julia> sow(reps[3], :d_M, :(residual_shared(Y,A_init,S_init,1,sig)))
RemoteRef{Channel{Any}}(51,1,40341)
julia> reap(reps[3], :(string(d_M)))
Dict{Int64,Any} with 1 entry:
  51 => "CUDArt.CudaArray{Float32,2}(CUDArt.CudaPtr{Float32}(Ptr{Float32} @0x0000000b041e0000),(4000,2500),0)"
julia> reap(reps[3], p2)
ERROR: On worker 51:
"an illegal memory access was encountered"
[inlined code] from essentials.jl:111
in checkerror at /home/mcp50/.julia/v0.5/CUDArt/src/libcudart-6.5.jl:16
[inlined code] from /home/mcp50/.julia/v0.5/CUDArt/src/../gen-6.5/gen_libcudart.jl:16
in device_synchronize at /home/mcp50/.julia/v0.5/CUDArt/src/device.jl:28
in anonymous at multi.jl:892
in run_work_thunk at multi.jl:645
[inlined code] from multi.jl:892
in anonymous at task.jl:59
in remotecall_fetch at multi.jl:731
[inlined code] from multi.jl:368
in remotecall_fetch at multi.jl:734
in anonymous at task.jl:443
in sync_end at ./task.jl:409
[inlined code] from task.jl:418
in reap at /home/mcp50/.julia/v0.5/ClusterUtils/src/ClusterUtils.jl:203
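A side note on the launch size in `p2': the block count is a ceiling division, so the final block can contain threads past row `n1'. Below is a minimal sketch of that arithmetic only, taking `n1 = 4000' from the `d_M' size printed above and assuming a hypothetical `maxBlock' of 1024 (the actual value of `cudakernels.maxBlock' isn't shown in this thread).

```julia
# Launch-size arithmetic only; no GPU is needed to run this.
n1 = 4000          # number of rows, from size(d_M) == (4000, 2500) above
maxBlock = 1024    # ASSUMPTION: stand-in for cudakernels.maxBlock (real value not shown)

numblox  = cld(n1, maxBlock)     # integer ceiling division; equals Int(ceil(n1/maxBlock))
launched = numblox * maxBlock    # total threads the launch would start
surplus  = launched - n1         # threads that have no row to work on

println("blocks = ", numblox, ", threads = ", launched, ", surplus = ", surplus)
# prints: blocks = 4, threads = 4096, surplus = 96
```

With these assumed numbers, 96 threads would have no row to process, so a kernel indexed by global thread id would need a bounds check along the lines of `if (row < n1)'. Whether `sqrownorms' has one isn't shown here, so this is only something to rule out.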
One thing I have noticed is that a remote process crashes whenever I attempt
to move a `CudaArray' type/pointer from it to the host.
That shouldn't be happening in the above, but I wonder whether something
similar is happening inadvertently.
If I try calling the kernel on another process on the same machine, I don't
get the error:
julia> sow(62, :d_M, :(residual_shared($Y_init,$A_init,$S_init,1,$sig)))
RemoteRef{Channel{Any}}(62,1,40936)
julia> sum(reap(62, p2)[62])
5.149127f6
Hmm...