The daemon of forgotten attachment strikes again. Corrected. Sorry about that.

On Sat, Jul 17, 2010 at 5:39 PM, Julien Cornebise
<[email protected]> wrote:
> Awesome, thanks Andreas! Indeed, test_driver.py works perfectly now.
> test_gpuarray.py still fails 10 out of 23 -- do they fail for the same reason?
> [output attached if you want to have a look].
>
> In any case, that means that the GPU is indeed working well, and so
> is PyCUDA -- thank you!
> Have fun at the SIAM meeting -- here, I'm enjoying the SAMSI Summer
> Workshop, where the possibility of using GPUs to speed up Bayesian
> analysis of Pharmacokinetics/Pharmacodynamics (PK/PD) data seems to
> have interested the PK/PD people, who are often hampered by the
> amount of time Monte-Carlo methods require.
>
> Cheers,
>
> Julien
>
> On Fri, Jul 16, 2010 at 12:07 PM, Andreas Kloeckner
> <[email protected]> wrote:
>> Hi Julien,
>>
>> On Fri, 9 Jul 2010 21:16:36 -0700, Julien Cornebise 
>> <[email protected]> wrote:
>>> New to PyCUDA, and very excited by the possibilities, I'm
>>> unfortunately running into a LaunchError in test_driver.py. I have
>>> tried to trace it down using printf() and such, and it seems that
>>> the last push over the cliff is in cuda.hpp,
>>> context::prepare_context_switch(), line 505, where cuCtxPopCurrent
>>> returns CUDA_LAUNCH_FAILED -- although the CUDA 3.1 Reference
>>> Manual (p. 412) does not list that as a possible return value, it
>>> adds that the call "may also return error codes from previous,
>>> asynchronous launches", which I therefore assume is the case here.
>>>
>>> My rusty C/C++ skills do not allow me to go further, nor even to be
>>> sure that's the real problem.
>>>
>>> I'm using PyCUDA as checked out from git tonight, with CUDA 3.1 on
>>> Ubuntu 10.04 Linux, Python 2.6.5, and a brand-new GeForce GTX 470.
>>> The GPU is fully functional: I can run all of the CUDA SDK examples
>>> without a problem.
>>> I also ran PyCUDA's test_math.py with no problem (20 tests passed).
>>> Attached is the output of running
>>> python test_driver.py &>> out
>>> with CUDA_TRACE = 1.
>>>
>>> That's as far as I've been able to get by myself, so now I'm
>>> turning to you: any help is most welcome, thank you! :)
>>
>> Sorry for the late reply--at SIAM AN10 currently. (Btw Bryan: I'll be at
>> your talk later. Anyone else want to meet up?)
>>
>> This should be fixed in git -- Fermi GPUs apparently mind misaligned
>> access, which is what one test in there was doing.
>>
>> HTH,
>> Andreas
>>
>>
>
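For reference, the failing assertions in the attached output compare GPU reductions against NumPy host references: test_minmax requires exact equality, while test_sum accepts a relative error below 1e-4. Here is a host-only sketch of those acceptance criteria, with NumPy reductions standing in for the gpuarray results (`gpuarray.min(a_gpu).get()` and `gpuarray.sum(a_gpu).get()`) so no GPU is required:

```python
import numpy

# 200000 random float32 values, mirroring the curand((200000,)) input
# used by test_minmax and test_sum in the attached output.
a = numpy.random.rand(200000).astype(numpy.float32)

# test_minmax-style check: the reduction result must match numpy exactly.
# (a.min() stands in here for gpuarray.min(a_gpu).get().)
assert a.min() == numpy.min(a)

# test_sum-style check: float32 summation reorders additions, so the test
# accepts a relative error below 1e-4 rather than exact equality.
# (The float64 sum stands in here for gpuarray.sum(a_gpu).get().)
sum_a = numpy.sum(a)
sum_a_gpu = numpy.sum(a, dtype=numpy.float64)
assert abs(sum_a_gpu - sum_a) / abs(sum_a) < 1e-4
```

Note that the comparison-operator failures (test_array_gt, test_array_eq, etc.) are of a different kind: as the tracebacks show, `__gt__`, `__eq__`, and friends simply raise NotImplementedError in this 0.94rc gpuarray, independent of any launch problem.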
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxCreate
cuCtxGetDevice
==================================================== test session starts ====================================================
platform linux2 -- Python 2.6.5 -- pytest-1.3.2
test path 1: test_gpuarray.py

test_gpuarray.py ...F.FF.FF.....F........F..F.F..F

========================================================= FAILURES ==========================================================
_________________________________________________ TestGPUArray.test_minmax __________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x936baac>

    @mark_cuda_test
    def test_minmax(self):
        from pycuda.curandom import rand as curand
    
        if has_double_support():
            dtypes = [numpy.float64, numpy.float32, numpy.int32]
        else:
            dtypes = [numpy.float32, numpy.int32]
    
        for what in ["min", "max"]:
            for dtype in dtypes:
                a_gpu = curand((200000,), dtype)
                a = a_gpu.get()
    
                op_a = getattr(numpy, what)(a)
                op_a_gpu = getattr(gpuarray, what)(a_gpu).get()
    
>               assert op_a_gpu == op_a, (op_a_gpu, op_a, dtype, what)
E               AssertionError: (array(0.00043310853652656078, dtype=float32), 6.633345e-07, <type 'numpy.float32'>, 'min')

test_gpuarray.py:424: AssertionError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuCtxGetDevice
cuMemAlloc
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceComputeCapability
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (md5_rng_float)
cuParamSetSize (md5_rng_float)
cuFuncSetBlockShape (md5_rng_float)
cuParamSetv (md5_rng_float)
cuLaunchGrid (md5_rng_float)
cuModuleUnload
cuMemcpyDtoH
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (reduce_kernel_stage1)
cuParamSetSize (reduce_kernel_stage1)
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (reduce_kernel_stage2)
cuParamSetSize (reduce_kernel_stage2)
cuMemAlloc
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceComputeCapability
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuParamSetv (reduce_kernel_stage1)
cuLaunchGrid (reduce_kernel_stage1)
cuMemAlloc
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceComputeCapability
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuParamSetv (reduce_kernel_stage2)
cuLaunchGrid (reduce_kernel_stage2)
cuMemFree
cuMemcpyDtoH
cuMemFree
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
________________________________________________ TestGPUArray.test_array_gt _________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x9374fec>

    @mark_cuda_test
    def test_array_gt(self):
        """Test whether array contents are > the other array's
            contents"""
    
        a = numpy.array([5,10]).astype(numpy.float32)
        a_gpu = gpuarray.to_gpu(a)
        b = numpy.array([2,10]).astype(numpy.float32)
        b_gpu = gpuarray.to_gpu(b)
>       result = (a_gpu > b_gpu).get()

test_gpuarray.py:242: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x9374b4c>
other = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x937480c>

    def __gt__(self, other):
>       raise NotImplementedError
E       NotImplementedError

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/gpuarray.py:549: NotImplementedError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemAlloc
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceComputeCapability
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemcpyHtoD
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
cuMemcpyDtoH
cuMemcpyDtoH
________________________________________________ TestGPUArray.test_array_eq _________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x938522c>

    @mark_cuda_test
    def test_array_eq(self):
        """Test whether array contents are == the other array's
            contents"""
    
        a = numpy.array([5,10]).astype(numpy.float32)
        a_gpu = gpuarray.to_gpu(a)
        b = numpy.array([2,10]).astype(numpy.float32)
        b_gpu = gpuarray.to_gpu(b)
>       result = (a_gpu == b_gpu).get()

test_gpuarray.py:296: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x938574c>
other = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x93858cc>

    def __eq__(self, other):
>       raise NotImplementedError
E       NotImplementedError

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/gpuarray.py:534: NotImplementedError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
cuMemcpyDtoH
cuMemcpyDtoH
________________________________________________ TestGPUArray.test_array_ge _________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x943e28c>

    @mark_cuda_test
    def test_array_ge(self):
        """Test whether array contents are >= the other array's
            contents"""
    
        a = numpy.array([5,10,1]).astype(numpy.float32)
        a_gpu = gpuarray.to_gpu(a)
        b = numpy.array([2,10,2]).astype(numpy.float32)
        b_gpu = gpuarray.to_gpu(b)
>       result = (a_gpu >= b_gpu).get()

test_gpuarray.py:282: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x9374eac>
other = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x9374c2c>

    def __ge__(self, other):
>       raise NotImplementedError
E       NotImplementedError

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/gpuarray.py:543: NotImplementedError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemAlloc
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceComputeCapability
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemcpyHtoD
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
cuMemcpyDtoH
cuMemcpyDtoH
______________________________________________ TestGPUArray.test_subset_minmax ______________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x9385bcc>

    @mark_cuda_test
    def test_subset_minmax(self):
        from pycuda.curandom import rand as curand
    
        l_a = 200000
        gran = 5
        l_m = l_a - l_a // gran + 1
    
        if has_double_support():
            dtypes = [numpy.float64, numpy.float32, numpy.int32]
        else:
            dtypes = [numpy.float32, numpy.int32]
    
        for dtype in dtypes:
            a_gpu = curand((l_a,), dtype)
            a = a_gpu.get()
    
            meaningful_indices_gpu = gpuarray.zeros(l_m, dtype=numpy.int32)
            meaningful_indices = meaningful_indices_gpu.get()
            j = 0
            for i in range(len(meaningful_indices)):
                meaningful_indices[i] = j
                j = j + 1
                if j % gran == 0:
                    j = j + 1
    
            meaningful_indices_gpu = gpuarray.to_gpu(meaningful_indices)
            b = a[meaningful_indices]
    
            min_a = numpy.min(b)
            min_a_gpu = gpuarray.subset_min(meaningful_indices_gpu, a_gpu).get()
    
>           assert min_a_gpu == min_a
E           assert array(6.3671031966805458e-05, dtype=float32) == 1.9656261e-05

test_gpuarray.py:458: AssertionError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuCtxGetDevice
cuMemAlloc
cuCtxGetDevice
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (md5_rng_float)
cuParamSetSize (md5_rng_float)
cuFuncSetBlockShape (md5_rng_float)
cuParamSetv (md5_rng_float)
cuLaunchGrid (md5_rng_float)
cuModuleUnload
cuMemcpyDtoH
cuMemAlloc
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceComputeCapability
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (fill)
cuParamSetSize (fill)
cuFuncSetBlockShape (fill)
cuParamSetv (fill)
cuLaunchGrid (fill)
cuMemcpyDtoH
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuMemFree
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (reduce_kernel_stage1)
cuParamSetSize (reduce_kernel_stage1)
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (reduce_kernel_stage2)
cuParamSetSize (reduce_kernel_stage2)
cuMemAlloc
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceComputeCapability
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuParamSetv (reduce_kernel_stage1)
cuLaunchGrid (reduce_kernel_stage1)
cuMemAlloc
cuCtxGetDevice
cuParamSetv (reduce_kernel_stage2)
cuLaunchGrid (reduce_kernel_stage2)
cuMemFree
cuMemcpyDtoH
cuMemFree
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
___________________________________________________ TestGPUArray.test_sum ___________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x9443b4c>

    @mark_cuda_test
    def test_sum(self):
        from pycuda.curandom import rand as curand
        a_gpu = curand((200000,))
        a = a_gpu.get()
    
        sum_a = numpy.sum(a)
    
        from pycuda.reduction import get_sum_kernel
        sum_a_gpu = gpuarray.sum(a_gpu).get()
    
>       assert abs(sum_a_gpu-sum_a)/abs(sum_a) < 1e-4
E       assert (abs((array(1547.6207275390625, dtype=float32) - 99663.914)) / abs(99663.914)) < 0.0001

test_gpuarray.py:405: AssertionError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemAlloc
cuCtxGetDevice
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (md5_rng_float)
cuParamSetSize (md5_rng_float)
cuFuncSetBlockShape (md5_rng_float)
cuParamSetv (md5_rng_float)
cuLaunchGrid (md5_rng_float)
cuModuleUnload
cuMemcpyDtoH
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (reduce_kernel_stage1)
cuParamSetSize (reduce_kernel_stage1)
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (reduce_kernel_stage2)
cuParamSetSize (reduce_kernel_stage2)
cuMemAlloc
cuCtxGetDevice
cuParamSetv (reduce_kernel_stage1)
cuLaunchGrid (reduce_kernel_stage1)
cuMemAlloc
cuCtxGetDevice
cuParamSetv (reduce_kernel_stage2)
cuLaunchGrid (reduce_kernel_stage2)
cuMemFree
cuMemcpyDtoH
cuMemFree
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
________________________________________________ TestGPUArray.test_array_le _________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x9447b2c>

    @mark_cuda_test
    def test_array_le(self):
        """Test whether array contents are <= the other array's
            contents"""
    
        a = numpy.array([5,10, 1]).astype(numpy.float32)
        a_gpu = gpuarray.to_gpu(a)
        b = numpy.array([2,10, 2]).astype(numpy.float32)
        b_gpu = gpuarray.to_gpu(b)
>       result = (b_gpu <= a_gpu).get()

test_gpuarray.py:268: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x94475ac>
other = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x944790c>

    def __le__(self, other):
>       raise NotImplementedError
E       NotImplementedError

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/gpuarray.py:540: NotImplementedError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
cuMemcpyDtoH
cuMemcpyDtoH
________________________________________________ TestGPUArray.test_array_ne _________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x9448bac>

    @mark_cuda_test
    def test_array_ne(self):
        """Test whether array contents are != the other array's
            contents"""
    
        a = numpy.array([5,10]).astype(numpy.float32)
        a_gpu = gpuarray.to_gpu(a)
        b = numpy.array([2,10]).astype(numpy.float32)
        b_gpu = gpuarray.to_gpu(b)
>       result = (a_gpu != b_gpu).get()

test_gpuarray.py:309: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x9448f8c>
other = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x945212c>

    def __ne__(self, other):
>       raise NotImplementedError
E       NotImplementedError

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/gpuarray.py:537: NotImplementedError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
cuMemcpyDtoH
cuMemcpyDtoH
________________________________________________ TestGPUArray.test_array_lt _________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x945280c>

    @mark_cuda_test
    def test_array_lt(self):
        """Test whether array contents are < the other array's
            contents"""
    
        a = numpy.array([5,10]).astype(numpy.float32)
        a_gpu = gpuarray.to_gpu(a)
        b = numpy.array([2,10]).astype(numpy.float32)
        b_gpu = gpuarray.to_gpu(b)
>       result = (b_gpu < a_gpu).get()

test_gpuarray.py:255: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x9452bec>
other = <[LogicError("cuMemcpyDtoH failed: invalid value") raised in repr()] SafeRepr object at 0x9452ccc>

    def __lt__(self, other):
>       raise NotImplementedError
E       NotImplementedError

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/gpuarray.py:546: NotImplementedError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuMemAlloc
cuCtxGetDevice
cuMemcpyHtoD
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
cuMemcpyDtoH
cuMemcpyDtoH
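[Side note, not from the test suite: the comparison test_array_lt expects can be reproduced on the CPU with plain NumPy, independent of the GPU path. A minimal sketch, using the same inputs as the failing test:]

```python
import numpy

# Same inputs as test_array_lt, elementwise "less than" done on the CPU.
a = numpy.array([5, 10]).astype(numpy.float32)
b = numpy.array([2, 10]).astype(numpy.float32)

result = b < a  # elementwise: 2 < 5 is True, 10 < 10 is False
print(result)   # -> [ True False]
```

[So the GPU path is expected to return [True, False] here; the NotImplementedError above means gpuarray's `__lt__` never got that far.]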
___________________________________________________ TestGPUArray.test_dot ___________________________________________________

    def f(*args, **kwargs):
        import pycuda.driver
        # appears to be idempotent, i.e. no harm in calling it more than once
        pycuda.driver.init()
    
        ctx = make_default_context()
        try:
            assert isinstance(ctx.get_device().name(), str)
            assert isinstance(ctx.get_device().compute_capability(), tuple)
            assert isinstance(ctx.get_device().get_attributes(), dict)
>           inner_f(*args, **kwargs)

/usr/local/lib/python2.6/dist-packages/pycuda-0.94rc-py2.6-linux-i686.egg/pycuda/tools.py:502:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <test_gpuarray.TestGPUArray instance at 0x945818c>

    @mark_cuda_test
    def test_dot(self):
        from pycuda.curandom import rand as curand
        a_gpu = curand((200000,))
        a = a_gpu.get()
        b_gpu = curand((200000,))
        b = b_gpu.get()
    
        dot_ab = numpy.dot(a, b)
    
        dot_ab_gpu = gpuarray.dot(a_gpu, b_gpu).get()
    
>       assert abs(dot_ab_gpu-dot_ab)/abs(dot_ab) < 1e-4
E       assert (abs((array(777.8582763671875, dtype=float32) - 50083.789)) / abs(50083.789)) < 0.0001

test_gpuarray.py:472: AssertionError
------------------------------------------------------ Captured stderr ------------------------------------------------------
cuInit
cuDeviceGetCount
cuDeviceGetCount
cuDeviceGet
cuCtxPopCurrent
cuCtxCreate
cuCtxGetDevice
cuDeviceGetName
cuCtxGetDevice
cuDeviceComputeCapability
cuCtxGetDevice
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuDeviceGetAttribute
cuMemAlloc
cuCtxGetDevice
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (md5_rng_float)
cuParamSetSize (md5_rng_float)
cuFuncSetBlockShape (md5_rng_float)
cuParamSetv (md5_rng_float)
cuLaunchGrid (md5_rng_float)
cuModuleUnload
cuMemcpyDtoH
cuMemAlloc
cuCtxGetDevice
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (md5_rng_float)
cuParamSetSize (md5_rng_float)
cuFuncSetBlockShape (md5_rng_float)
cuParamSetv (md5_rng_float)
cuLaunchGrid (md5_rng_float)
cuModuleUnload
cuMemcpyDtoH
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (reduce_kernel_stage1)
cuParamSetSize (reduce_kernel_stage1)
cuCtxGetDevice
cuDeviceComputeCapability
cuModuleLoadDataEx
cuModuleGetFunction
cuFuncSetBlockShape (reduce_kernel_stage2)
cuParamSetSize (reduce_kernel_stage2)
cuMemAlloc
cuCtxGetDevice
cuParamSetv (reduce_kernel_stage1)
cuLaunchGrid (reduce_kernel_stage1)
cuMemAlloc
cuCtxGetDevice
cuParamSetv (reduce_kernel_stage2)
cuLaunchGrid (reduce_kernel_stage2)
cuMemFree
cuMemcpyDtoH
cuMemFree
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxPushCurrent
cuCtxDetach
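[Side note on the tolerance in test_dot: the same relative-error criterion can be evaluated purely on the CPU. Single-precision rounding alone stays far inside the 1e-4 bound, so a discrepancy like 777 vs. 50083 points to a failed/silently-broken kernel launch rather than accumulation error. A minimal sketch; the array size and seed are arbitrary choices, not from the test:]

```python
import numpy

numpy.random.seed(0)
a = numpy.random.rand(1000).astype(numpy.float32)
b = numpy.random.rand(1000).astype(numpy.float32)

# Double-precision reference vs. single-precision result,
# checked with the same relative-error formula as test_dot.
dot64 = numpy.dot(a.astype(numpy.float64), b.astype(numpy.float64))
dot32 = numpy.dot(a, b)

rel_err = abs(dot32 - dot64) / abs(dot64)
assert rel_err < 1e-4  # float32 rounding error is orders of magnitude smaller
```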
=========================================== 10 failed, 23 passed in 12.76 seconds ===========================================
cuCtxPopCurrent
cuCtxPushCurrent
cuCtxDetach
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
