The clpp project is also out there. The cantankerous issue with OpenCL is
getting the workgroups mapped properly to each compute unit, so that in each
phase they can stream independently.
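
To make the phases concrete, here is a rough CPU-side sketch in NumPy. It
is not clpp's code or anyone's actual kernel; the function name
radix_sort_pairs and its bits and block parameters are made up for
illustration. Each block stands in for a workgroup: it histograms its own
slice, a scan over all the histograms gives every block its own output
offsets, and then each block scatters without touching another block's
slots.

```python
import numpy as np

def radix_sort_pairs(keys, values, bits=4, block=256):
    """LSD radix sort of uint32 keys with attached values (illustrative only).

    `block` plays the role of a workgroup's slice of the input.
    """
    keys = np.asarray(keys, dtype=np.uint32).copy()
    values = np.asarray(values).copy()
    n = keys.size
    radix = 1 << bits
    nblocks = max(1, -(-n // block))  # ceiling division

    for shift in range(0, 32, bits):
        digits = ((keys >> np.uint32(shift)) & np.uint32(radix - 1)).astype(np.int64)

        # Phase 1: every block counts the digits in its own slice.
        hist = np.zeros((radix, nblocks), dtype=np.int64)
        for b in range(nblocks):
            hist[:, b] = np.bincount(digits[b * block:(b + 1) * block],
                                     minlength=radix)

        # Phase 2: exclusive scan in digit-major order, so all keys with
        # digit 0 come first, ordered by block, then digit 1, and so on.
        flat = hist.ravel()
        offsets = (np.cumsum(flat) - flat).reshape(radix, nblocks)

        # Phase 3: every block scatters into its own reserved output range.
        out_k = np.empty_like(keys)
        out_v = np.empty_like(values)
        for b in range(nblocks):
            lo = b * block
            pos = offsets[:, b].copy()
            for i, d in enumerate(digits[lo:lo + block]):
                out_k[pos[d]] = keys[lo + i]
                out_v[pos[d]] = values[lo + i]
                pos[d] += 1
        keys, values = out_k, out_v

    return keys, values
```

On a GPU the two loops over b would run as separate workgroups, and the
pain is in making phases 1 and 3 stream well per compute unit. The result
should match keys[np.argsort(keys, kind="stable")], which is a handy check.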

I CC'd Sean Baxter, who is the current guru. I will be traveling to San
Diego later this week, so I won't be able to help out until next Monday.

-Chad
On Jan 16, 2012 2:19 PM, "Dieter Morgenroth" <[email protected]>
wrote:

>  Hi,
> I also worked on a radix sort implementation. I had a rough working
> version, but I found that numpy.argsort was much faster on my
> machine, so I have put that task on hold for now. If someone comes up
> with a fast generic solution, I would also be interested.
> I used the sorting for an SPH simulation:
> http://youtu.be/1hHELRSCIm8
> I only have a notebook with an ATI graphics card. At least on that
> machine, the numpy sort was about 5 times faster, even on several
> million entries.
> Dieter
>
> On 15.01.2012 22:22, Ian Johnson wrote:
>
> Hi Andreas,
>
>  That code is the latest. I haven't touched it in a long time, since my
> work has taken me away from OpenCL for the time being. As for the
> licensing, I put in an MIT license, so it's free as far as I'm concerned.
> Some of the radix code comes straight from the NVIDIA SDK example. We had
> to modify it a good bit to sort keys and values, but I'm not sure what
> their license is.
>
>  This is also definitely not the best implementation of radix, as there
> is a much faster (and open) CUDA implementation. I would have hoped it
> would be ported to OpenCL by now, and there is this project:
> http://code.google.com/p/ocl-radix-sort/ which is GPL.
>
>  Good luck! I'd like to hear about any improvements that come along!
> Ian
>
> On Sun, Jan 15, 2012 at 9:35 AM, Andreas Kloeckner <
> [email protected]> wrote:
>
>> Hi Ian,
>>
>> On Sun, 17 Apr 2011 22:29:41 -0400, Ian Johnson <[email protected]>
>> wrote:
>> > I finally bit the bullet and got radix working in PyOpenCL :)
>> > It's also improved over the SDK example because it does keys and values,
>> > mostly thanks to my advisor.
>> > Additionally, this sort will handle any array size as long as it is a
>> > power of 2. The shipped example does not allow for arrays smaller than
>> > 32768, but I've hooked up their naive scan to allow all smaller arrays.
>> >
>> >
>> > https://github.com/enjalot/adventures_in_opencl/tree/master/experiments/radix/nv
>> > all you really need are radix.py, RadixSort.cl and Scan_b.cl
>> >
>> > some simple tests are at the bottom of radix.py
>> >
>> > I hammered this out because I need it for a project. It's not all that
>> > clean, and I didn't add support for sorting on keys only (although it
>> > wouldn't take much to add that, and I intend to at a later time when I
>> > need the functionality). Hopefully this helps someone else out there.
>> > I'll also be porting it using my own OpenCL C++ wrappers to include in
>> > my fluid simulation library at some point.
>> >
>> > I also began looking at AMD's radix from their SPH tutorial, but they
>> > use local atomics, which are not supported on my 9600M.
>>
>>  Out of personal need, I'm thinking of bringing some kind of sort
>> functionality into PyOpenCL. I saw that you made a number of
>> enhancements to your sort code since you sent the announcement. Is your
>> most recent sort code still in the repo above? What is the license for
>> that code?  More generally, what course of action would you recommend?
>>
>> Thanks in advance for your help,
>> Andreas
>>
>
>
>
>  --
> Ian Johnson
> http://enja.org
>
>
>
> _______________________________________________
> PyOpenCL mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pyopencl
>