On Fri, 2009-07-24 at 09:45 -0700, Zack Rusin wrote:
> On Friday 24 July 2009 10:01:49 José Fonseca wrote:
> > On Thu, 2009-07-23 at 16:38 -0700, Zack Rusin wrote:
> > > I thought about that and discarded that for the following reasons:
> > > 1) it doesn't solve the main/core problem of the representation: how to
> > > represent vectors.
> >
> > Aren't LLVM vector types (http://llvm.org/docs/LangRef.html#t_vector)
> > good enough?
> 
> Yes, they are, but that's not what we're trying to answer. Looking at 
> DCL IN[0]
> DCL OUT[0]
> MOV OUT[0], IN[0].xzzz
> 
> in aos that will be
> decl <4 x float> out
> decl <4 x float> in
> out = shuffle(in, 0, 2, 2, 2);
> 
> for soa4 that will be
> decl <4 x float> outx
> decl <4 x float> outy
> decl <4 x float> outz
> decl <4 x float> outw
> decl <4 x float> inx
> decl <4 x float> iny
> decl <4 x float> inz
> decl <4 x float> inw
> outx = inx
> outy = inz
> outz = inz
> outw = inz
> 
> for soa16 that will be
> decl <16 x float> outx
> decl <16 x float> outy
> decl <16 x float> outz
> decl <16 x float> outw
> decl <16 x float> inx
> decl <16 x float> iny
> decl <16 x float> inz
> decl <16 x float> inw
> outx = inx
> outy = inz
> outz = inz
> outw = inz
> 
> And that's for a trivial mov. Each path obviously generates very different 
> code. The code that is currently in gallivm basically creates a new compiler 
> for each of these. Which is one way of dealing with it. I, personally, didn't 
> like it at all, but at the time I didn't have a better solution.
> 
> Furthermore it's not just the compilation - inputs and outputs need to be 
> swizzled differently for each of those paths as well, so a preamble and 
> postamble have to be generated for them as well.
> 
> 
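
To make the AoS/SoA difference in the listings above concrete, here is a minimal C sketch of the same MOV OUT[0], IN[0].xzzz in both layouts, plus the input/output re-swizzling (preamble and postamble) that the SoA path needs. All names (vec4, soa4, mov_xzzz_aos, mov_xzzz_soa, aos_to_soa, soa_to_aos) are made up for illustration; this is plain scalar C, not gallivm code and not LLVM IR.

```c
#include <assert.h>
#include <stddef.h>

#define N 4  /* elements processed together in the soa4 path */

/* AoS: one register holds x, y, z, w of a single element. */
typedef struct { float c[4]; } vec4;

/* SoA: one register per channel, each holding N elements. */
typedef struct { float x[N], y[N], z[N], w[N]; } soa4;

/* AoS path: the .xzzz swizzle is a shuffle inside one register. */
static vec4 mov_xzzz_aos(vec4 in)
{
    vec4 out = { { in.c[0], in.c[2], in.c[2], in.c[2] } };
    return out;
}

/* SoA path: the swizzle is only a choice of source register per channel. */
static soa4 mov_xzzz_soa(const soa4 *in)
{
    soa4 out;
    assert(in != NULL);
    for (size_t i = 0; i < N; i++) {
        out.x[i] = in->x[i];
        out.y[i] = in->z[i];
        out.z[i] = in->z[i];
        out.w[i] = in->z[i];
    }
    return out;
}

/* Preamble for the SoA path: transpose N AoS inputs into SoA form. */
static soa4 aos_to_soa(const vec4 *in)
{
    soa4 out;
    assert(in != NULL);
    for (size_t i = 0; i < N; i++) {
        out.x[i] = in[i].c[0];
        out.y[i] = in[i].c[1];
        out.z[i] = in[i].c[2];
        out.w[i] = in[i].c[3];
    }
    return out;
}

/* Postamble for the SoA path: transpose the SoA result back to AoS. */
static void soa_to_aos(const soa4 *in, vec4 *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < N; i++) {
        out[i].c[0] = in->x[i];
        out[i].c[1] = in->y[i];
        out[i].c[2] = in->z[i];
        out[i].c[3] = in->w[i];
    }
}
```

Both paths compute the same values, but the instruction bodies share nothing, and only the SoA path needs the two transposes.
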
> > The vector width could be a global parameter computed before starting
> > the TGSI -> LLVM IR translation, which takes in account not only the
> > target platform but the input/output data types (e.g. SSE2 has different
> > vector widths for different data types).
> >
> > For mimd vs simd we could have two variations -- SoA and AoS. Again, we
> > could have this as an initial parameter, or as two abstract classes derived
> > from Instruction, from which the driver would then derive.
> 
> Yes, that's pretty much exactly what the code in gallivm does right now. It 
> lets you pick the representation (aos/soa) and the vector width, and then tries 
> to generate the code as it was told. It's not very successful at it though =) 
> (mainly because the actual generation paths for one are completely different 
> from the other, so the fact that something works in aos says nothing about soa)
> 
> > My suggestion of an abstract Instruction class with virtual methods was
> > just for the sake of argument. You can achieve the same thing with a C
> > structure of function pointers together with the included LLVM C
> > bindings (http://llvm.org/svn/llvm-project/llvm/trunk/include/llvm-c/)
> > which appears to fully cover the IR generation interfaces.
> 
> To be honest I don't think that using LLVM from C for our purposes is a good 
> idea. Mainly because a pure C approach will be impossible: the backends (the 
> actual code-generators for given hardware) will need to be C++ anyway, so 
> we'll again end up with a mix of C++ and C. It's just that the interface from 
> C++ is a lot nicer than the stuff that was wrapped in C.

I understand that the C bindings might be more cumbersome and limited.
But I don't understand how using them or not avoids mixing C and C++: if
LLVM is in C++ and Gallium is in C, something will have to bridge the two.
It is a matter of what that bridge is and how thick it is.
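
For the sake of argument, the "C structure of function pointers" variant could look roughly like the sketch below. Everything in it is hypothetical (struct emit_ops, emit_mov_aos, emit_mov_soa, aos_ops, soa_ops), and it writes the pseudo-code from the listings earlier in the thread into a string instead of calling LLVM. In a real bridge the functions behind the pointers could just as well be C++ exported with extern "C".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-representation emitter table (not an existing API). */
struct emit_ops {
    const char *name;
    /* Emit "MOV dst, src.<swz>" as text; returns the number of lines written. */
    int (*emit_mov)(char *buf, size_t size,
                    const char *dst, const char *src, const int swz[4]);
};

/* AoS: one shuffle on a single 4-wide register. */
static int emit_mov_aos(char *buf, size_t size,
                        const char *dst, const char *src, const int swz[4])
{
    int n = snprintf(buf, size, "%s = shuffle(%s, %d, %d, %d, %d);\n",
                     dst, src, swz[0], swz[1], swz[2], swz[3]);
    return (n >= 0 && (size_t)n < size) ? 1 : 0;
}

/* SoA: one plain move per channel, the swizzle only picks the source. */
static int emit_mov_soa(char *buf, size_t size,
                        const char *dst, const char *src, const int swz[4])
{
    static const char chan[] = "xyzw";
    size_t used = 0;
    for (int i = 0; i < 4; i++) {
        assert(swz[i] >= 0 && swz[i] < 4);
        int n = snprintf(buf + used, size - used, "%s%c = %s%c\n",
                         dst, chan[i], src, chan[swz[i]]);
        if (n < 0 || (size_t)n >= size - used)
            return i;  /* buffer too small: report the lines that fit */
        used += (size_t)n;
    }
    return 4;
}

static const struct emit_ops aos_ops = { "aos", emit_mov_aos };
static const struct emit_ops soa_ops = { "soa", emit_mov_soa };
```

A caller would pick aos_ops or soa_ops once, before translation starts, and then only go through the table.
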

> > > That wouldn't work because LLVM wouldn't know what to do with them which
> > > would defeat the whole reason for using LLVM (i.e. it would make
> > > optimization passes do nothing).
> >
> > Good point. But can't the same argument be made for intrinsics? The
> > existing optimization passes don't know what to do with them either.
> 
> Essentially they can treat them just like any other instruction with known 
> semantics,
> e.g.  for
> <4 x float> madd(const <4 x float> a, const <4 x float> b)
> all it needs to know is that it's an instruction that takes two inputs, 
> doesn't modify them and returns an output. So optimizations still work. 
> Generating assembly /before/ the code generator runs, on the other hand, 
> means that there's a black hole in the code that can technically do, well, 
> anything really.

Ok. Intrinsics are more powerful than I thought. Ignore my existential
comments about LLVM.
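
As I understand the point about intrinsics, it can be illustrated with a toy dead-code pass: an intrinsic that is known to only read its inputs and write one output can be dropped when its result is unused, while an opaque blob of assembly has to be kept because it may do anything. The sketch below is a made-up miniature IR in C (struct op, mark_live, MAX_VALUES are all hypothetical), not an LLVM pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VALUES 16  /* arbitrary limit for this toy IR */

enum op_kind {
    OP_INTRINSIC,   /* known semantics: reads src, writes dst, nothing else */
    OP_OPAQUE_ASM   /* unknown semantics: may have any side effect */
};

struct op {
    enum op_kind kind;
    int dst;      /* value number written, or -1 for none */
    int src[2];   /* value numbers read, -1 if unused */
    bool live;    /* filled in by mark_live() */
};

/*
 * Walk the ops backwards and mark the ones that must be kept.
 * Assumes SSA-like order: each value is defined once, before its uses.
 * `result` is the value number the program returns.
 * Returns the number of ops kept.
 */
static size_t mark_live(struct op *ops, size_t n, int result)
{
    bool needed[MAX_VALUES] = { false };
    size_t kept = 0;

    assert(result >= 0 && result < MAX_VALUES);
    needed[result] = true;

    for (size_t i = n; i-- > 0; ) {
        struct op *o = &ops[i];
        assert(o->dst < MAX_VALUES);
        bool unknown_effects = (o->kind == OP_OPAQUE_ASM);
        o->live = unknown_effects || (o->dst >= 0 && needed[o->dst]);
        if (!o->live)
            continue;
        kept++;
        for (int k = 0; k < 2; k++) {
            assert(o->src[k] < MAX_VALUES);
            if (o->src[k] >= 0)
                needed[o->src[k]] = true;
        }
    }
    return kept;
}
```

With two madd-style intrinsics where only the first result is returned, the pass keeps one op; if the second op is turned into OP_OPAQUE_ASM, it has to keep both.
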

Jose


_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
