On 09/08/2013 11:53 PM, Erik Schnetter wrote:
> Some CPU attributes influence the ABI. These need to be set correctly at
> all times, otherwise the executable won't work. This influences e.g. the
> calling conventions for functions, which is explicitly represented in
> bytecode. That is, a fully generic bytecode library is not possible, but we
> may be able to get away with using just a few per architecture.

Ah, true, the ABI.

> One would probably also need to make sure that earlier optimizations don't
> already expand builtins, since a different CPU may offer a more efficient
> implementation in terms of a CPU instruction that exists only on some CPUs
> (e.g. popcount, clz).

Yes, this is the idea of the intrinsics: the backend expands them to
whatever is the most efficient implementation for the target.
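
As a rough illustration (a sketch only, not code from pocl or Vecmathlib),
the difference between keeping a builtin and expanding it early could look
like this in C. The function names are hypothetical, and the
__builtin_popcount/__builtin_clz builtins assume a GCC- or Clang-compatible
compiler:

```c
#include <limits.h>

/* Hypothetical helper: keeps the operation as a compiler builtin, so the
 * backend can pick a single instruction where the target CPU has one. */
static unsigned popcount_builtin(unsigned x)
{
    return (unsigned)__builtin_popcount(x);
}

/* Hypothetical helper: the same operation expanded by hand into generic
 * bit manipulation. Once the code has this shape, the backend may no
 * longer recognize it as a population count. */
static unsigned popcount_expanded(unsigned x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* __builtin_clz is undefined for 0, so that case is handled explicitly. */
static unsigned clz_builtin(unsigned x)
{
    return x ? (unsigned)__builtin_clz(x)
             : (unsigned)(sizeof(unsigned) * CHAR_BIT);
}
```

Both popcount variants return the same value; only the first leaves the
choice of instruction to the target-specific code generator.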

> Apart from this -- implementing the kernel library purely with scalar
> functions and builtins is possible. We would have to experiment with how to
> present this to the vectorizer to make things as easy as possible.
> Currently, we split e.g. int16 into two int8 operations; this is a nicely
> recursive implementation, but the vectorizer may prefer a loop instead.

Currently the WG autovectorizer reuses LLVM's loop vectorizer.
It wants to see the work-group's parallel regions as parallel loops whose
bodies are scalar code with as few control constructs as possible. Thus, I
think a fully scalarized implementation (no loops) might give the best
results with the current WG autovectorization.
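
To make the shape concrete, here is a hand-written sketch in plain C of
what such a parallel region might look like to a loop vectorizer. It is
not the code pocl generates; saxpy_item and saxpy_wg_region are
hypothetical names chosen for illustration:

```c
#include <stddef.h>

/* Hypothetical scalar body for one work-item: straight-line code with no
 * branches or inner loops. */
static inline float saxpy_item(float a, float x, float y)
{
    return a * x + y;
}

/* The work-group's parallel region written as a loop over local ids.
 * Each iteration is independent of the others, and restrict tells the
 * compiler that the arrays do not alias, so a loop vectorizer is free to
 * process several work-items per vector instruction. */
void saxpy_wg_region(size_t local_size, float a,
                     const float *restrict x,
                     const float *restrict y,
                     float *restrict out)
{
    for (size_t lid = 0; lid < local_size; ++lid)
        out[lid] = saxpy_item(a, x[lid], y[lid]);
}
```

If the kernel library function called from the body itself contained a
loop, the work-item loop would become a loop nest, which is presumably
harder for the vectorizer to handle.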

> I should introduce an option to Vecmathlib to do this. This would easily
> allow comparing performance, and could give hints to shortcomings of the
> vectorizer (and conversely, of Vecmathlib) that could then be addressed.

Good. It would be ideal to provide two versions of the kernel lib: one
optimized for "intra vector usage", for cases where WG autovectorization is
hopeless but it would still be nice to use vector instructions (e.g., a WG
of size 1), and another for WG autovectorization. OTOH, for the former, the
BB vectorizer might do the trick automatically. Another thing to try.
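
A minimal sketch of the two flavours, for a 4-wide multiply-add. It
assumes the GCC/Clang vector_size extension, and the names float4,
madd_intra_vector and madd_scalarized are made up for this example; they
are not the kernel library's actual functions:

```c
/* 4 x float vector type via the GCC/Clang vector extension (assumption:
 * a compiler that supports __attribute__((vector_size))). */
typedef float float4 __attribute__((vector_size(16)));

/* "Intra vector" flavour: operates on the whole vector at once, so vector
 * instructions can be used even for a work-group of size 1. */
static float4 madd_intra_vector(float4 a, float4 b, float4 c)
{
    return a * b + c;
}

/* Fully scalarized flavour: one scalar statement per component, with no
 * loops, leaving all vectorization to the WG autovectorizer. */
static float4 madd_scalarized(float4 a, float4 b, float4 c)
{
    float4 r;
    r[0] = a[0] * b[0] + c[0];
    r[1] = a[1] * b[1] + c[1];
    r[2] = a[2] * b[2] + c[2];
    r[3] = a[3] * b[3] + c[3];
    return r;
}
```

Both functions compute the same result, so a build-time switch between
them would allow the performance comparison discussed above.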

-- 
Pekka

_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel
