Hi,

On 10/09/2012 03:33 PM, Sam Parker wrote:
> I am creating a custom target which will run just the kernels, I want
> the kernels to be compiled and linked statically with everything needed
> to run the kernel included. So I believe the standalone method of
> compilation is right, is this correct?
>
> I am using a configurable VLIW so I will want to combine work items to
> expose ILP but the device is also multi-core. I am compiling the simple
> standalone example and just would like to know how get_global_id is
> calculated in the produced bytecode? And how I should use the
> _*kernel*_workgroup and _*kernel*_workgroup_fast functions? I'm trying
> to go through the code, but without comments, I am not making very fast
> progress.
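For orientation on the get_global_id question, here is a minimal sketch in plain C. It is not pocl's generated code. It only uses the OpenCL definition of the global ID (group id * local size + local id + global offset) and shows a work group function as a loop over the work items of one group. The wg_context struct, its field names, and the scale_* functions are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-launch context. The layout and field names are
   illustrative only and do not mirror pocl's internal structures. */
typedef struct {
    size_t group_id[3];      /* which work group is being executed */
    size_t local_size[3];    /* work items per group, per dimension */
    size_t global_offset[3]; /* offset passed at enqueue time */
} wg_context;

/* OpenCL definition:
   global id = group id * local size + local id + global offset */
size_t global_id(const wg_context *ctx, unsigned dim, size_t local_id)
{
    return ctx->group_id[dim] * ctx->local_size[dim]
         + local_id + ctx->global_offset[dim];
}

/* Body of an example 1-D kernel for a single work item:
   out[gid] = 2 * in[gid] */
void scale_work_item(const wg_context *ctx, size_t local_x,
                     const int *in, int *out)
{
    size_t gid = global_id(ctx, 0, local_x);
    out[gid] = 2 * in[gid];
}

/* Work group function: executes every work item of one group in a
   loop. The iterations are independent, so a static scheduler is
   free to overlap them to expose ILP. */
void scale_workgroup(const wg_context *ctx, const int *in, int *out)
{
    assert(ctx->local_size[0] > 0);
    for (size_t x = 0; x < ctx->local_size[0]; ++x)
        scale_work_item(ctx, x, in, out);
}
```

Under these assumptions, a launcher on a multi-core device would call scale_workgroup once per work group, with a different group_id for each call, possibly on different cores.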
It sounds like you want to do exactly what we have done at TUT, which
was the original use case for the pocl kernel compiler passes before
pocl was published a year ago. The main difference, I suppose, is that
we use TTA as a processor template instead of a traditional "OTA" VLIW.

The problematic part of the "standalone mode" is the host API, unless
you create a custom launcher for your kernel, which is not "official
OpenCL". Standalone compilation of the host API together with the
kernel binary is not supported in pocl, but I implemented the APIs I
need in the TCE libraries in the TCE source tree. Basically, the host
API stubs assume that the kernels are linked with the program, and thus
clBuildProgram etc. are dummy no-operations.

The pocl-standalone script generates the work group function, assuming
you have the required work group attribute in place. The work group
function is then called through a "trampoline function" that the
compiler driver script glues in. Thus, check the TCE sources and its
tcecc compiler driver (a python script) (http://tce.cs.tut.fi).

The standalone mode of TCE is an incomplete proof-of-concept, and I
think the best way to make it more robust is to reuse the pocl
implementations for the standalone mode as well. It should be possible
to make it almost transparent to the OpenCL app whether it's compiled
in the standalone mode offline or with an online compiler.

There's a quick tutorial in the TCE user manual:
http://tce.cs.tut.fi/user_manual/TCE/node21.html

Some papers we have written on this subject are available on the
http://tce.cs.tut.fi/publications.html page. "OpenCL-based Design
Methodology for Application-Specific Processors" and "TCEMC: A
Co-Design Flow for Application-Specific Multicores" are the most
relevant ones.

Do you have any publications of your work, BTW?

BR,
--
--Pekka
