Dear Evgeny,
I did something similar a while back and remember having some difficulties.
I used another library called PyCULA (for which there is no support) and
solved the above equation using a truncated eigendecomposition.
Attached is my source code, where you can see both the CPU and GPU
implementations and their run times. For my code I used
cula.gpu_devsyevx_index
which returns a NumPy array. I remember there is another function that
returns a GPU array.
Another issue is that PyCULA and Scikit GPU arrays are not compatible (at
least they were not when I was doing this).
I hope this helps.
Cheers,
Mohsen
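
P.S. In case it helps to see the idea without any GPU library, here is a minimal CPU-only sketch of the truncated eigendecomposition solve mentioned above. It assumes K is symmetric and that the r largest eigenvalues are nonzero; the function name truncated_eig_solve and the random test matrix are made up for illustration and are not taken from the attached file.

```python
import numpy as np


def truncated_eig_solve(k, y, r):
    """Approximate c in K c = y using only the r largest eigenpairs of symmetric K."""
    w, v = np.linalg.eigh(k)          # eigenvalues are returned in ascending order
    w_r, v_r = w[-r:], v[:, -r:]      # keep the r largest eigenpairs
    # c = V_r * diag(1 / w_r) * V_r^T * y, without forming the diagonal matrix
    return v_r.dot(v_r.T.dot(y) / w_r)


# Hypothetical test problem: a symmetric positive definite 6 x 6 matrix.
rng = np.random.RandomState(0)
a = rng.randn(6, 6)
k = a.dot(a.T) + 6.0 * np.eye(6)
y = rng.randn(6)

c_full = truncated_eig_solve(k, y, 6)   # all eigenpairs: the exact solution
c_top3 = truncated_eig_solve(k, y, 3)   # three largest eigenpairs: an approximation
```

With r equal to the matrix size this reproduces the ordinary solution of K c = y; a smaller r trades accuracy for less work, which is the point of the truncation.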
On Mon, Feb 24, 2014 at 8:56 PM, Evgeny Lazutkin <[email protected]> wrote:
> Dear all,
>
> sorry for the delayed answer; I had problems with the installation. But now
> everything is fine.
>
> So, I have installed Scikit (from GitHub, as was suggested) and CULA.
>
> I am confused. I'd like to solve a very simple system A*X = B, but it
> raises the error:
> TypeError: only length-1 arrays can be converted to Python scalars
> Could you please tell me what is going wrong?
>
> I suppose that I am doing everything wrong. Even if it works, how do I
> obtain parallelization? In his example, Andreas used SourceModule with C
> code, and there it is obvious to me what is happening.
>
> But here I cannot understand it. I have tried to write my "own"
> SourceModule and call functions from CULA, but when I try to manipulate
> memory or write a function, I get an error saying that I cannot do that
> from __device__/__global__ code.
>
> Oh... I am stuck :(
>
> Could you please correct the code and give me some answers? Please find
> the py-file attached.
>
> Best regards,
> Evgeny
>
>
> On 23.02.2014 15:03, Lev Givon wrote:
>
> Received from Evgeny Lazutkin on Sun, Feb 23, 2014 at 03:53:12AM EST:
>
> Dear Andreas, dear all,
>
> thank you very much! I will install this package and perform the
> sample code! I hope after that you can correct me.
>
> Best regards,
> Evgeny
>
> I suggest that you install the latest revision of the package from GitHub
> rather than the tarball on PyPI. If you encounter any problems, feel free to
> submit a report via the project's GitHub issue tracker (scikits.cuda is
> developed separately from pycuda).
>
>
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda
>
>
--
Mohsen
#===============================================================================
# CUDA libraries
#===============================================================================
#import pycuda.gpuarray as gpuarray
#import pycuda.autoinit
#import pycuda.driver as cuda
#import PyCULA.cula as cula #@UnresolvedImport
#import scikits.cuda.linalg as la
#===============================================================================
# Routine libraries
#===============================================================================
import time

import numpy as np
import scipy.linalg as spla
from numpy.lib import stride_tricks

#cula.culaInitialize()
#cula.mixed_init()
#la.init()
#===============================================================================
# Largest Eigenvalue
#===============================================================================
def cpu(k_, y_, lo, hi, flag):
    """Solve k_ * c = y_ using only the eigenpairs with indices lo..hi.

    flag == "t" prints and returns the run time in seconds;
    flag == "r" returns the solution c.
    """
    # Wall-clock timing stands in for the commented-out cuda.Event timing,
    # which left `time` undefined on the "t" path.
    start = time.time()
    # Eigenpairs lo..hi (ascending order) of the symmetric matrix k_.
    # SciPy older than 1.5 spells this eigvals=(lo, hi).
    w, v = spla.eigh(k_, subset_by_index=[lo, hi])
    # Truncated inverse: V * diag(1/w) * V^T
    temp = np.dot(np.dot(v, np.diag(1.0 / w)), v.T)
    c = np.dot(temp, y_)
    elapsed = time.time() - start

    if flag == "t":
        value = round(elapsed, 5)
        print('cpu: %f' % value)
    elif flag == "r":
        value = c
    else:
        raise ValueError('unrecognized flag')
    return value
#def gpu(k_, y_, il, iu, flag):
#    k_gpu = cula.cula_gpuarray_like(k_)
#    n = k_.shape[0]
#    cuda.Context.synchronize()
#    start = cuda.Event()
#    end = cuda.Event()
#    start.record()
#
#    w, v = cula.gpu_devsyevx_index(k_gpu, il, iu, vectors=True, uplo='L')
#    newShape = (n, iu - il + 1)
#
#    elmSize = v.itemsize
#    z = stride_tricks.as_strided(v, shape=newShape, strides=(elmSize, elmSize * newShape[0]))
#
#    w = np.delete(w, np.s_[iu - il + 1:], 0)  # delete zeros from eigenvalues
#    temp = np.dot(np.dot(z, np.diag(1.0 / w)), z.T)
#    c = np.dot(temp, y_)
#
#    end.record()
#    end.synchronize()
#    time = start.time_till(end) * 1e-3
#
#    # delete zero cols. from vectors
##    v = np.delete(v_gpu.get().T, np.s_[iu-il+1:], 1)
#
#    if flag == "t":
#        value = round(time, 5)
#        print('gpu: %f' % value)
#    elif flag == "r":
#        value = c
#    else:
#        raise ValueError('unrecognized flag')
#    return value
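
The as_strided call in the commented-out gpu() routine above reinterprets the returned buffer as an n x (iu-il+1) matrix stored column by column. The sketch below shows that trick on plain NumPy data. It assumes the buffer is a flat, contiguous 1-D array in column-major order, which is only what the stride arithmetic suggests; I have not checked it against PyCULA. The helper name as_column_major and the arange test data are made up for illustration.

```python
import numpy as np
from numpy.lib import stride_tricks


def as_column_major(flat, n, k):
    """View the first n*k entries of a flat 1-D buffer as an (n, k) column-major matrix."""
    item = flat.itemsize
    # Moving down a column advances one element; moving to the next column advances n.
    return stride_tricks.as_strided(flat, shape=(n, k), strides=(item, item * n))


n, k = 4, 2
flat = np.arange(n * n, dtype=np.float64)   # stand-in for the flat eigenvector buffer

z_strided = as_column_major(flat, n, k)
# The same view written with an ordinary Fortran-order reshape.
z_reshaped = flat[:n * k].reshape((n, k), order='F')
```

Both forms give the same matrix; the reshape version is probably the safer choice, since as_strided does no bounds checking.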