Hello pocl-developers,
I've had some success implementing part of a new device driver in
pocl, where I'll be running on an x86_64 host and the device will be
ARM-ish based.
I've got a laundry list of questions, but overall it's gone quite
smoothly, and has been a lot of fun.
--------------------------------------------------------------------------------
It seems that cl_device_id.uninit() is never called?
--------------------------------------------------------------------------------
I notice in devices.c:pocl_init_devices(), that there is either an
environment variable:
device_list = getenv(POCL_DEVICES_ENV);
or a fallback to a hard coded list:
device_list = <list>;
I was wondering what the thoughts were on adding a device API like
.discover() or .probe() so that the driver can report whether it found
any of its devices ?
i.e. something like:
pocl_device_types[i].discover();
Also, I've been toying with the idea of having multiple devices of the
same type, e.g. like if you plugged two GPUs into the same board. Any
thoughts on that?
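A minimal sketch of what I have in mind, assuming discover() returns the number of devices found so that one driver can report several. The struct, the stand-in probe functions and every example_* name are hypothetical, not the existing pocl device structure.

```c
#include <stddef.h>

/* Hypothetical per-driver operations table; pocl's real struct differs. */
struct example_device_ops {
  const char *name;
  /* Returns the number of devices of this type found on the system. */
  unsigned int (*discover)(void);
};

/* Stand-in probes: a real driver would query its hardware here. */
static unsigned int example_discover_none(void) { return 0; }
static unsigned int example_discover_two(void) { return 2; }

static const struct example_device_ops example_device_types[] = {
  { "absent-device", example_discover_none },
  { "dual-gpu", example_discover_two },
};

#define EXAMPLE_NUM_DEVICE_TYPES \
  (sizeof(example_device_types) / sizeof(example_device_types[0]))

/* Sum what every driver reports, so a single device type may
   contribute zero, one or several devices. */
static unsigned int example_count_devices(void)
{
  unsigned int total = 0;
  for (size_t i = 0; i < EXAMPLE_NUM_DEVICE_TYPES; ++i)
    total += example_device_types[i].discover();
  return total;
}
```

The init code could then size its device array from the total instead of parsing a fixed list.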
--------------------------------------------------------------------------------
I'm building the host side for a staged install on x86_64, and after
installing pocl onto my staged system I got this ICD file:
bash$ cat /etc/OpenCL/vendors/pocl.icd
libpocl.so.1.2.0
yet annoyingly the ICD can't find the pocl lib when it does dlopen()
on it since dlopen only checks:
o The cache file /etc/ld.so.cache (maintained by ldconfig(8)) is
checked to see whether it contains an entry for filename.
o The directories /lib and /usr/lib are searched (in that order).
And I have:
bash$ ldconfig -p | grep pocl
libpoclu.so.1 (libc6,x86-64) => /usr/local/lib/libpoclu.so.1
libpoclu.so (libc6,x86-64) => /usr/local/lib/libpoclu.so
libpocl.so.1 (libc6,x86-64) => /usr/local/lib/libpocl.so.1
libpocl.so (libc6,x86-64) => /usr/local/lib/libpocl.so
Changing pocl.icd.in to:
@libdir@/libpocl.so.VER
seemed to help me out there...
--------------------------------------------------------------------------------
Looking at this hunk in lib/CL/clCreateBuffer.c:
device_ptr = device->malloc(device->data, flags, size, host_ptr);
...
if (flags & (CL_MEM_ALLOC_HOST_PTR | CL_MEM_USE_HOST_PTR))
mem->mem_host_ptr = host_ptr;
I'm confused by the _ALLOC_HOST_PTR. From the OpenCL doc:
This flag specifies that the application wants the OpenCL
implementation to allocate memory from host accessible memory.
i.e. I don't see host_ptr being set anywhere, and I'm pretty sure
host_ptr will be NULL in this case, so mem_host_ptr would end up NULL
too?
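To make the question concrete, here is a rough sketch of how I'd expect the two flags to differ. Everything here is made up for illustration: example_pick_host_ptr() is a hypothetical helper, not pocl code, and the EX_* macros are local stand-ins for the flag bits in CL/cl.h.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins for the cl_mem_flags bits from CL/cl.h. */
#define EX_MEM_USE_HOST_PTR   ((uint64_t)1 << 3)
#define EX_MEM_ALLOC_HOST_PTR ((uint64_t)1 << 4)

/* Hypothetical helper: choose the host-side pointer for a new buffer.
   Sets *owned to 1 when the implementation allocated the memory itself
   (and so must free it on release). Returns NULL when no host pointer
   applies or the allocation fails. */
static void *example_pick_host_ptr(uint64_t flags, size_t size,
                                   void *host_ptr, int *owned)
{
  *owned = 0;
  if (flags & EX_MEM_USE_HOST_PTR)
    return host_ptr;          /* application-provided storage */
  if (flags & EX_MEM_ALLOC_HOST_PTR) {
    void *p = malloc(size);   /* implementation-provided storage */
    if (p != NULL)
      *owned = 1;
    return p;
  }
  return NULL;
}
```

With something like this, mem_host_ptr would only be taken from the caller for _USE_HOST_PTR, and would be freshly allocated for _ALLOC_HOST_PTR.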
--------------------------------------------------------------------------------
What are your thoughts on debian and/or rpm packaging of libpocl ?
--------------------------------------------------------------------------------
I noticed cellspu.c seems to be using:
al = &(kernel->dyn_arguments[i]);
Whereas all the other devices are using run.arguments ?
--------------------------------------------------------------------------------
There is this hunk in the OpenCL spec:
If the argument is declared with the __local qualifier, the
arg_value entry must be NULL
yet I can generate:
arg[0]: Local arg size 8 (0x92d838)
Should this be detectable as an error somewhere?
i.e. the user was able to call:
clSetKernelArg(kernel, idx, sizeof(ptr), ptr);
For a __local void *ptr argument.
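Something like the following check is what I was expecting to hit. It is only a sketch: example_check_local_arg() is a hypothetical helper, the EX_* values are local stand-ins for the error codes in CL/cl.h, and it assumes the implementation already knows the argument is __local.

```c
#include <stddef.h>

/* Local stand-ins mirroring the error codes in CL/cl.h. */
#define EX_SUCCESS             0
#define EX_INVALID_ARG_VALUE (-50)
#define EX_INVALID_ARG_SIZE  (-51)

/* Hypothetical check for an argument known to be declared __local:
   the value must be NULL and the size must be non-zero. */
static int example_check_local_arg(size_t arg_size, const void *arg_value)
{
  if (arg_value != NULL)
    return EX_INVALID_ARG_VALUE;
  if (arg_size == 0)
    return EX_INVALID_ARG_SIZE;
  return EX_SUCCESS;
}
```

With a check like this the call above would be rejected instead of being recorded as a local argument of size 8.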
--------------------------------------------------------------------------------
When I compile, I see a few gcc warnings fly past, any plans for
"-Wall -Werror" ?
--------------------------------------------------------------------------------
I notice in pocl_device.h there is :
typedef void (*pocl_workgroup) (void **, struct pocl_context *);
Leading to:
void *arguments[kernel->num_args + kernel->num_locals];
w (arguments, pc);
Does this mean all arguments are effectively promoted to sizeof(void *)?
i.e. I guess there is no problem with 16-bit and 8-bit scalar
arguments here.
What happens if the host has 64-bit pointers and the device has 32-bit
pointers ? I guess *I* should carefully set up the argument list before
sending it to the device, constructing it from 32-bit quantities.
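Roughly what I imagine doing on the host side, as a sketch only. The struct and example_pack_args32() are hypothetical, the layout ignores any alignment or padding the device ABI might want, and values are written in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one kernel argument on the host. */
struct example_arg {
  int is_pointer;        /* 1: device address, 0: by-value scalar */
  uint64_t device_addr;  /* used when is_pointer is 1 */
  const void *value;     /* used when is_pointer is 0 */
  size_t size;           /* scalar size in bytes */
};

/* Pack arguments into a buffer for a device with 32-bit pointers.
   Returns the number of bytes written, or 0 if the buffer is too
   small or a device address does not fit in 32 bits. */
static size_t example_pack_args32(const struct example_arg *args, size_t n,
                                  unsigned char *out, size_t out_size)
{
  size_t off = 0;
  for (size_t i = 0; i < n; ++i) {
    if (args[i].is_pointer) {
      uint32_t addr32;
      if (args[i].device_addr > UINT32_MAX || out_size - off < sizeof addr32)
        return 0;
      addr32 = (uint32_t)args[i].device_addr;
      memcpy(out + off, &addr32, sizeof addr32);
      off += sizeof addr32;
    } else {
      if (out_size - off < args[i].size)
        return 0;
      memcpy(out + off, args[i].value, args[i].size);
      off += args[i].size;
    }
  }
  return off;
}
```

This keeps the host's void * array out of the picture entirely: only device addresses and scalar bytes cross over.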
--------------------------------------------------------------------------------
Since I'm building the host side for a staged install on x86_64, as
well as compiling for an ARM target, I'm passing the following to
configure:
LLVM_CONFIG=/home/alun/local-llvm/usr/local/bin/llvm-config
(Since I'm doing a DESTDIR:=~/local-llvm/ install, then tarball)
This of course means that I have a problem with @CLANG@
config.h:
/* LLVM compiler executable. */
#define LLC "/home/alun/local-llvm/usr/local/bin/llc"
/* clang executable. */
#define CLANG "/home/alun/local-llvm/usr/local/bin/clang"
So far I've just been hacking the scripts/pocl-* but some nicer
solution would be good...?
- I guess I don't care about these paths:
*** lib/CL/devices/common.c:llvm_codegen()
CLANG " -target %s %s -c -o %s.o %s",
LLC " " HOST_LLC_FLAGS " -o %s %s",
LINK_CMD " -target "OCL_KERNEL_TARGET
Unless I begin to try to call llvm_codegen() from my new
device.run()... but more about that in a minute...
--------------------------------------------------------------------------------
So that was the trivial stuff :) Now onto the harder issue.
I'm reasonably familiar with the autoconf host/build/target
selections, but I know it's a pain when you have multiple targets,
i.e. like when I've been building LLVM with:
--enable-targets=arm,x86,x86_64
For building pocl, it seems like the goal of configure.ac is to get
this triple of variables:
OCL_TARGETS, KERNEL_DIR, OCL_KERNEL_TARGET
And the default AC $TARGET stuff is unused.
KERNEL_DIR/ OCL_KERNEL_TARGET then get used as the llvm_target_triplet
and such in :
lib/CL/devices/basic/basic.h
lib/CL/devices/pthread/pocl-pthread.h
But they also get used in
*** lib/CL/devices/common.c:llvm_codegen()
CLANG " -target %s %s -c -o %s.o %s",
LLC " " HOST_LLC_FLAGS " -o %s %s",
LINK_CMD " -target "OCL_KERNEL_TARGET
- certainly meaning I can't use llvm_codegen() unless I refactor it to
require the target and options to be passed in.
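As a sketch of that refactoring, the command string could be built from per-device values rather than the compile-time macros. example_build_compile_cmd() and its parameters are hypothetical. Only the format string mirrors the one in llvm_codegen().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build the compile command from per-device
   settings instead of the CLANG / OCL_KERNEL_TARGET macros.
   Returns 0 on success, -1 if the command did not fit in buf. */
static int example_build_compile_cmd(char *buf, size_t buf_size,
                                     const char *clang_path,
                                     const char *target_triple,
                                     const char *extra_flags,
                                     const char *output_base,
                                     const char *input)
{
  int n = snprintf(buf, buf_size, "%s -target %s %s -c -o %s.o %s",
                   clang_path, target_triple, extra_flags,
                   output_base, input);
  return (n < 0 || (size_t)n >= buf_size) ? -1 : 0;
}
```

Each device driver would then pass in its own triple and flags, e.g. "arm-linux-gnueabihf" and "-mcpu=cortex-a9" for my case.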
Then I end up with a problem in lib/kernel/arm/Makefile.am
Since I have:
KERNEL_TARGET=@OCL_KERNEL_TARGET@
TARGET_DIR=arm
and that'll get me issues like:
,----
| /home/alun/local-llvm/usr/local/bin/clang -Xclang
|   -ffake-address-space-map -emit-llvm -fsigned-char -c -target
|   x86_64-unknown-linux-gnu -o add_sat.cl.bc -x cl ./../add_sat.cl
|   -include ../../../include/arm/types.h
|   -include /home/alun/work/pocl-2/include/_kernel.h
| In file included from <built-in>:158:
| In file included from <command line>:2:
| /home/alun/work/pocl-2/include/_kernel.h:223:1: error: static_assert failed "size_t"
| _CL_STATIC_ASSERT(size_t, sizeof(size_t) == sizeof(void*));
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| /home/alun/work/pocl-2/include/_kernel.h:76:37: note: expanded from macro '_CL_STATIC_ASSERT'
| # define _CL_STATIC_ASSERT(_t, _x) _Static_assert(_x, #_t)
| ^
`----
I notice cellspu is avoiding this world with :
CLANGFLAGS += -target cellspu-v0
CLANG_DEFAULT_INCLUDES = -include $(top_builddir)/include/cellspu/types.h
I guess this is the issue of ARM being a host and device target, and I
was wondering what thoughts you had there?
Currently I've tweaked the arm/Makefile.am to :
KERNEL_TARGET:=arm-linux-gnueabihf
TARGET_DIR=arm
EXTRA_CLANGFLAGS:= \
-mcpu=cortex-a9 \
-mfloat-abi=softfp \
-mfpu=neon \
-mfpu=vfpv3
But this then means that while the kernel-arm-linux-gnueabihf.bc gets
made correctly, the kernel doesn't, so I get:
,----
| WARNING: Linking two modules of different target triples:
| /usr/local/lib/pocl/arm/kernel-arm-linux-gnueabihf.bc:
| 'armv7-linux-gnueabihf' and 'armv4t-linux-gnueabihf'
`----
i.e. I think that, in addition to the llvm target triple, the device
driver needs some way to set the compilation flags? I wonder if it
shouldn't be something like:
Usage: $0 [-t <llvm_target_triplet> -f <flags>] -o output input
btw I had to remove this section in pocl-kernel.in:
,----
| #pure clang doesn't allow "-target tce-tut-llvm"
| case $target in
| tce-*)
| target_flags="" ;;
| *)
| target_flags="-target $target";;
| esac
| @CLANG@ @HOST_CLANG_FLAGS@ $target_flags -c -o ${output_file}.o -x c - <<EOF
`----
Since I was getting:
,----
| /usr/local/bin/clang -target arm-linux-gnueabihf -c -o /tmp/poclN2oGr9/newdev/dot_product/descriptor.so.o -x c -
| /usr/bin/as: unrecognized option '-mfloat-abi=softfp'
`----
i.e. I'm not sure we want -target here for a host compilation ?
One last point, I also noticed that configure.ac declares
TARGET_LLC_FLAGS, but it seems to be unused.
--------------------------------------------------------------------------------
thanks for all the good work,
A.
--
Alun Evans
_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel