On Tue, Oct 7, 2014 at 3:12 AM, Kalle Raiskila <[email protected]> wrote:
>
>> > If not, one has to set the feature set (-triple and -march passed to
>> > Clang/llc) to the lowest possible, which results in much slower
>> > compiled kernels (no SIMD or other special instructions used). That
>> > is not nice, as OpenCL is all about performance.
>> >
>> > Do you know if there is such a minimal ISA feature set defined
>> > for CPU architectures in Debian?
>>
>> I will try to find them. Another possibility, though, is to require an
>> extended feature set and bail out at run time with a clear error
>> message if the current CPU does not support it. I think that, at
>> least for x86-32, this would be the better option. I know from a
>> previous Debian discussion that the minimal ISA feature set on this
>> architecture is pretty low, but the typical use case would probably
>> be x86-64 processors running 32-bit software...
>
> For x86_64 it seems all processors are supported on Debian:
> http://www.debian.org/releases/stable/amd64/ch02s01.html.en
> I think the minimum set is not that bad, it has e.g. SSE2. If I
> understand correctly, this is the "x86-64" CPU for LLVM (note the
> underscore in the arch vs. dash in the CPU variant...)
>
> For x86, the minimum requirement seems to be the 486:
> http://www.debian.org/releases/stable/i386/ch02s01.html.en
> which kills performance.
>
> ARM is supported on "any ARM CPU".
> http://www.debian.org/releases/stable/armhf/ch02s01.html.en
> This suggests some really old ARM ISA - but pocl is exclusively tested
> on ARMv7 (Cortex-series). To add to the confusion, the NEON SIMD
> extension is, to my knowledge, not mandated even by ARMv7, so to be
> portable even here, SIMD would have to be disabled.
>
> And then there are all the other CPU architectures pocl 0.10 is not
> tested on...
>
> Bailing out at run time sounds like a nice solution, but would probably
> need quite a bit of work, or at least quite a bit of testing.
> The ultimate solution here is to compile the OpenCL kernel run-time at
> run time. But for performance reasons, this would need a bit of code to
> selectively pick just the few needed kernel functions to compile.
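
For the bail-out variant, the check itself could be fairly small. A
minimal sketch, not taken from pocl: it assumes GCC or Clang on x86
(where the __builtin_cpu_supports builtin is available), and the names
REQUIRED_FEATURE, cpu_has_required_feature and check_cpu are made up
for illustration.

```c
#include <stdio.h>

/* Hypothetical: the feature the prebuilt kernel library requires. */
#define REQUIRED_FEATURE "sse2"

/* Returns 1 if the running CPU offers the required feature, else 0.
   Only x86 with GCC/Clang is probed; elsewhere this sketch is
   conservative and reports 0. */
int cpu_has_required_feature(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return 0;
#endif
}

/* Returns 0 and leaves buf empty if the feature is present.
   Otherwise writes a diagnostic into buf and returns -1. */
int check_cpu(int has_feature, char *buf, size_t len)
{
    if (len > 0)
        buf[0] = '\0';
    if (has_feature)
        return 0;
    snprintf(buf, len, "pocl: this build requires a CPU with %s support",
             REQUIRED_FEATURE);
    return -1;
}
```

A caller would run check_cpu(cpu_has_required_feature(), ...) once at
startup, print the message and refuse to create the device if it
returns -1. The testing effort mentioned above is not addressed by this.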

We could probably pick a few (three?) CPU feature sets, and then
choose the highest supported one at run time. My suggestion for these
would be i486, SSE2, and AVX.
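
Roughly what I have in mind, as an untested sketch rather than actual
pocl code. It assumes GCC or Clang on x86 for the __builtin_cpu_supports
builtin; the enum, the function names and the tier strings are all
hypothetical, and mapping a tier to real Clang/llc flags is left out.

```c
/* The three tiers suggested above, lowest first. */
enum cpu_tier { TIER_I486 = 0, TIER_SSE2 = 1, TIER_AVX = 2 };

/* Pure selection logic: pick the highest tier whose requirements
   are met. AVX is only chosen when SSE2 is present as well. */
enum cpu_tier pick_tier(int has_sse2, int has_avx)
{
    if (has_sse2 && has_avx)
        return TIER_AVX;
    if (has_sse2)
        return TIER_SSE2;
    return TIER_I486;
}

/* Probe the running CPU. Builds for other architectures or other
   compilers fall back to the lowest tier in this sketch. */
enum cpu_tier detect_tier(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return pick_tier(__builtin_cpu_supports("sse2") != 0,
                     __builtin_cpu_supports("avx") != 0);
#else
    return TIER_I486;
#endif
}

/* Hypothetical label, e.g. for choosing among prebuilt kernel
   libraries. */
const char *tier_name(enum cpu_tier t)
{
    switch (t) {
    case TIER_AVX:  return "avx";
    case TIER_SSE2: return "sse2";
    default:        return "i486";
    }
}
```

The idea would be to build the kernel library once per tier and load
the one matching tier_name(detect_tier()) when the device is set up.
Keeping pick_tier separate from the probing makes the choice easy to
test without access to each kind of CPU.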

-erik

-- 
Erik Schnetter <[email protected]>
http://www.perimeterinstitute.ca/personal/eschnetter/
AIM: eschnett247, Skype: eschnett, Google Talk: [email protected]
