Re: [PyCUDA] Installation error with pycuda-2015.1.3

2015-10-05 Thread Cheung, Samson H. (ARC-TN)[Computer Sciences Corporation]
I believe the problem is the
   -Dboost=pycudaboost
define. I believe it points to the wrong place, but the configure.py --help
output is not clear on how to deal with this!
Thanks,

~SC

From: "Cheung, Samson H. (ARC-TN)[Computer Sciences Corporation]"
Date: Monday, October 5, 2015 at 1:07 PM
To: "pycuda@tiker.net"
Subject: Installation error with pycuda-2015.1.3

Hello,

I encountered the following error while installing PyCUDA (during make). The
command I used is
%> python configure.py --boost-compiler=gcc-4.9.2

I also tried
%> python configure.py \
  --boost-inc-dir=/nasa/pkgsrc/2015Q2/include/boost \
  --boost-lib-dir=/nasa/pkgsrc/2015Q2/lib \
  --boost-python-libname=boost-python-1.58.0 \
  --boost-compiler=gcc-4.9.2



. . .  < the error > . . .
. . .
gcc -pthread -Wno-unused-result -pipe -I/usr/include 
-I/nasa/pkgsrc/2015Q2/include/db4 -I/nasa/pkgsrc/2015Q2/include 
-I/nasa/pkgsrc/2015Q2/include/ncurses -O3 -DNDEBUG -fPIC -DHAVE_CURAND=1 
-DBOOST_ALL_NO_LIB=1 -DPYGPU_PYCUDA=1 
-DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION=1 -DBOOST_PYTHON_SOURCE=1 
-Dboost=pycudaboost -DBOOST_THREAD_BUILD_DLL=1 -DPYGPU_PACKAGE=pycuda 
-DBOOST_THREAD_DONT_USE_CHRONO=1 -Isrc/cpp -Ibpl-subset/bpl_subset 
-I/nasa/cuda/7.0/include 
-I/home3/old_home3/scheung/opt/python3.4.3/lib/python3.4/site-packages/numpy/core/include
 
-I/home3/old_home3/scheung/opt/python3.4.3/lib/python3.4/site-packages/numpy/core/include
 -I/home3/old_home3/scheung/opt/python3.4.3/include/python3.4 -c 
bpl-subset/bpl_subset/libs/system/src/error_code.cpp -o 
build/temp.linux-x86_64-3.4/bpl-subset/bpl_subset/libs/system/src/error_code.o
gcc -pthread -Wno-unused-result -pipe -I/usr/include 
-I/nasa/pkgsrc/2015Q2/include/db4 -I/nasa/pkgsrc/2015Q2/include 
-I/nasa/pkgsrc/2015Q2/include/ncurses -O3 -DNDEBUG -fPIC -DHAVE_CURAND=1 
-DBOOST_ALL_NO_LIB=1 -DPYGPU_PYCUDA=1 
-DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION=1 -DBOOST_PYTHON_SOURCE=1 
-Dboost=pycudaboost -DBOOST_THREAD_BUILD_DLL=1 -DPYGPU_PACKAGE=pycuda 
-DBOOST_THREAD_DONT_USE_CHRONO=1 -Isrc/cpp -Ibpl-subset/bpl_subset 
-I/nasa/cuda/7.0/include 
-I/home3/old_home3/scheung/opt/python3.4.3/lib/python3.4/site-packages/numpy/core/include
 
-I/home3/old_home3/scheung/opt/python3.4.3/lib/python3.4/site-packages/numpy/core/include
 -I/home3/old_home3/scheung/opt/python3.4.3/include/python3.4 -c 
bpl-subset/bpl_subset/libs/thread/src/pthread/once.cpp -o 
build/temp.linux-x86_64-3.4/bpl-subset/bpl_subset/libs/thread/src/pthread/once.o
In file included from /nasa/pkgsrc/2015Q2/include/boost/atomic/atomic.hpp:19:0,
 from /nasa/pkgsrc/2015Q2/include/boost/atomic.hpp:12,
 from 
/nasa/pkgsrc/2015Q2/include/boost/thread/pthread/once_atomic.hpp:20,
 from /nasa/pkgsrc/2015Q2/include/boost/thread/once.hpp:20,
 from bpl-subset/bpl_subset/libs/thread/src/pthread/once.cpp:7:
/nasa/pkgsrc/2015Q2/include/boost/atomic/capabilities.hpp:22:63: fatal error: 
pycudaboost/atomic/detail/caps_gcc_atomic.hpp: No such file or directory
 #include BOOST_ATOMIC_DETAIL_HEADER(boost/atomic/detail/caps_)
   ^
compilation terminated.
error: command 'gcc' failed with exit status 1
make: *** [all] Error 1


I would be grateful if someone could give me a hint!
Thanks,

~Samson

___
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
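
[Archive note] The "pycudaboost/atomic/detail/caps_gcc_atomic.hpp: No such file
or directory" failure above is typical of mixing PyCUDA's shipped Boost (whose
namespace is renamed via -Dboost=pycudaboost) with system Boost headers picked
up from the include path -- note the error is raised from
/nasa/pkgsrc/2015Q2/include/boost, not from bpl-subset. A hedged sketch of the
two consistent configurations follows; the --no-use-shipped-boost option name
is an assumption about the aksetup-based configure.py, so verify it against
your own `python configure.py --help`. Also note that --boost-inc-dir should
point at the parent include directory (sources say #include <boost/...>), and
the Boost.Python library name usually uses an underscore:

```shell
# Option 1: use only the shipped Boost (bpl-subset). Pass no system
# Boost paths at all, so the renamed pycudaboost headers are found.
python configure.py

# Option 2: use only the system Boost and disable the shipped copy.
python configure.py --no-use-shipped-boost \
    --boost-inc-dir=/nasa/pkgsrc/2015Q2/include \
    --boost-lib-dir=/nasa/pkgsrc/2015Q2/lib \
    --boost-python-libname=boost_python
```

After changing options, remove the generated siteconf.py and the build/
directory before re-running, so stale settings are not reused.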


[PyCUDA] Questions on pinned memory

2015-10-05 Thread Walter White
Hello,

I have a question about pinned memory and hope that you can help me.

I found out that copying data from device to host takes
a very big part of my runtime, so I read about the issue
and came across "pinned memory".

There are several examples on the mailing list but I am not
sure if I am doing this the right way.

Do I need to initialize with drv.ctx_flags.MAP_HOST
or is this automatically activated if one of the
functions below is used?

drv.init()
dev = drv.Device(0)
ctx = dev.make_context(drv.ctx_flags.SCHED_AUTO | drv.ctx_flags.MAP_HOST)


Is drv.mem_host_register_flags.DEVICEMAP also needed if
the context is initialized with drv.ctx_flags.MAP_HOST ?

I found several methods that should do this
but none of them seems to work.
Are they all equivalent?

--
x = drv.register_host_memory(x, flags=drv.mem_host_register_flags.DEVICEMAP)
x_gpu_ptr = np.intp(x.base.get_device_pointer())

--
x = drv.pagelocked_empty(shape=x.shape, dtype=np.float32,
mem_flags=drv.mem_host_register_flags.DEVICEMAP)
--

from pycuda.tools import PageLockedMemoryPool
pool = PageLockedMemoryPool()
x_ptr = pool.allocate(dest.shape , np.float32)
--


If I use
np.intp(x.base.get_device_pointer())
and
drv.memcpy_dtoh(a_gpu, x_ptr)

there is an error message

"BufferError: Object is not writable."

Kind regards,
Joe


Re: [PyCUDA] Questions on pinned memory

2015-10-05 Thread Andreas Kloeckner
Walter White  writes:

> Hello,
>
> I have a question about pinned memory and hope that you can help me.
>
> I found out that copying data from device to host takes
> a very big part of my runtime, so I read about the issue
> and came across "pinned memory".
>
> There are several examples on the mailing list but I am not
> sure if I am doing this the right way.
>
> Do I need to initialize with drv.ctx_flags.MAP_HOST
> or is this automatically activated if one of the
> functions below is used?
>
> drv.init()
> dev = drv.Device(0)
> ctx = dev.make_context(drv.ctx_flags.SCHED_AUTO | drv.ctx_flags.MAP_HOST)

No, it is not automatic; the MAP_HOST flag is necessary.

> Is drv.mem_host_register_flags.DEVICEMAP also needed if
> the context is initialized with drv.ctx_flags.MAP_HOST ?
>
> I found several methods that should do this
> but none of them seems to work.
> Are they all equivalent?
>
> --
> x = drv.register_host_memory(x, flags=drv.mem_host_register_flags.DEVICEMAP)
> x_gpu_ptr = np.intp(x.base.get_device_pointer())
>
> --
> x = drv.pagelocked_empty(shape=x.shape, dtype=np.float32,
> mem_flags=drv.mem_host_register_flags.DEVICEMAP)
> --
>
> from pycuda.tools import PageLockedMemoryPool
> pool = PageLockedMemoryPool()
> x_ptr = pool.allocate(dest.shape , np.float32)
> --

The former two are equivalent. The latter just uses 'page-locked' memory
(which *can* be pinned, but normally isn't).
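
To make the distinction concrete, a minimal sketch (requires a CUDA-capable
device, so it is not runnable here; assumes pycuda.autoinit for context setup).
Note that pagelocked_empty takes its flags from drv.host_alloc_flags, not from
drv.mem_host_register_flags as in the snippet quoted above:

```python
import numpy as np
import pycuda.driver as drv
import pycuda.autoinit  # noqa: F401 -- creates a default context
from pycuda.tools import PageLockedMemoryPool

# Mapped *and* pinned: the device can address this host allocation directly.
a = drv.pagelocked_empty((1024,), np.float32,
                         mem_flags=drv.host_alloc_flags.DEVICEMAP)
a_dev_ptr = np.intp(a.base.get_device_pointer())  # pass this to kernels

# Page-locked only: speeds up host<->device copies, but there is no
# device-side pointer; you still transfer explicitly with memcpy_*.
pool = PageLockedMemoryPool()
b = pool.allocate((1024,), np.float32)
```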

> If I use
> np.intp(x.base.get_device_pointer())
> and
> drv.memcpy_dtoh(a_gpu, x_ptr)
>
> there is an error message
>
> "BufferError: Object is not writable."

This is a sign that it worked--the memory is no longer writable host-side.
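
As a follow-up sketch (again requires a CUDA device, so untested here; assumes
the context was created with MAP_HOST as in the question): with device-mapped
memory there is no memcpy_dtoh at all -- the kernel writes through the mapped
pointer, and the results appear in the host array after synchronization:

```python
import numpy as np
import pycuda.driver as drv
import pycuda.autoinit  # noqa: F401 -- note: create the context with
                        # MAP_HOST yourself if autoinit's default lacks it
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void twice(float *x) { x[threadIdx.x] *= 2.0f; }
""")
twice = mod.get_function("twice")

x = np.arange(16, dtype=np.float32)
x = drv.register_host_memory(x, flags=drv.mem_host_register_flags.DEVICEMAP)
x_gpu_ptr = np.intp(x.base.get_device_pointer())

# The kernel writes straight into the pinned host array; no memcpy_dtoh.
twice(x_gpu_ptr, block=(16, 1, 1))
drv.Context.synchronize()
# x now holds the doubled values.
```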

Andreas
