Hi Yavor,

On Thu, 22 Mar 2018 12:04:33 +0200, Yavor Doganov <[email protected]> wrote:
> Frédéric Bonnard wrote:
> > I'm not an altivec expert but I was interested to look into this and
> > maybe help.
>
> Many thanks for taking the time and effort, this is exactly the
> response I was hoping for.
>
> > When I compile :
> > __vector unsigned char Va = vec_pack(vec_pack(vec_ctu(Vperma, 0),
> > vec_splat_u32(0)), vec_splat_u16(0));
> > __vector unsigned char Vb = vec_pack(vec_pack(vec_ctu(Vpermb, 0),
> > vec_splat_u32(0)), vec_splat_u16(0));
> >
> > and print those, I have :
> > Va : 3 3 8 8 0 0 0 0 0 0 0 0 0 0 0 0
> > Vb : 10 10 10 10 0 0 0 0 0 0 0 0 0 0 0 0
>
> This looks bogus; either a bug in my code or the unfortunate result of
> the type conversion. Knowing me, I would definitely bet it's the
> former :)
>
> Out of curiosity, how do you print vectors? Is there a specific
> function or you write your own?
Good question : I used libvecpf which I linked against.
Then %vf displays the floats and %vlX the hexadecimal representation.
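For anyone without libvecpf at hand, here is a minimal sketch of a
hand-written alternative. It is not what I used: format_vec_bytes is a
hypothetical helper name, and it only assumes that the vector is 16
bytes wide and can be copied out with memcpy, so it prints the bytes in
memory order.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libvecpf): format the 16 bytes of a
 * 128-bit vector as space-separated hex values, in memory order.
 * `out` must hold at least 48 bytes (16 * 2 digits + 15 spaces + NUL). */
static void format_vec_bytes(const void *vec, char *out)
{
    unsigned char bytes[16];
    size_t pos = 0;

    /* Copy instead of casting, to stay clear of aliasing problems. */
    memcpy(bytes, vec, sizeof bytes);

    for (int i = 0; i < 16; i++)
        pos += (size_t)sprintf(out + pos, i ? " %x" : "%x", bytes[i]);
}
```

With AltiVec enabled, one would pass the address of the vector, e.g.
format_vec_bytes(&Va, buf), and then print buf with a plain printf.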
> > So with this :
> > vector unsigned char Va = { 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 4, 5, 6, 7
> > };
> > vector unsigned char Vb = { 8, 9, 10, 11, 8, 9, 10, 11, 12, 13, 14, 15, 12,
> > 13, 14, 15};
> > we get the following indexes :
> > Va : 0 1 2 3 0 1 2 3 4 5 6 7 4 5 6 7
> > Vb : 8 9 a b 8 9 a b c d e f c d e f
> >
> > which extracts good looking floats.
>
> Thanks, I will use your version of the patch.
Maybe some representation closer to the initial four hex would be nice
too.
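To make the effect of those index vectors easier to check without
AltiVec hardware, here is a small scalar model of the byte selection
that vec_perm performs: each result byte is picked from the 32-byte
concatenation of the two source vectors using the low 5 bits of the
index byte. This is only a sketch: perm_model is a hypothetical name,
it assumes 4-byte floats, it works in memory byte order and so says
nothing about how the compiler handles element order on ppc64el, and I
am assuming the indexes are used as a vec_perm control vector.

```c
#include <string.h>

/* Scalar model of vec_perm(a, b, idx): result byte i is byte
 * (idx[i] & 0x1f) of the 32-byte concatenation of a and b.
 * Assumes sizeof(float) == 4; bytes are taken in memory order. */
static void perm_model(const float a[4], const float b[4],
                       const unsigned char idx[16], float out[4])
{
    unsigned char src[32], dst[16];

    memcpy(src, a, 16);        /* bytes 0..15 come from a */
    memcpy(src + 16, b, 16);   /* bytes 16..31 come from b */

    for (int i = 0; i < 16; i++)
        dst[i] = src[idx[i] & 0x1f];

    memcpy(out, dst, 16);
}
```

In this model the Va indexes select the first two floats of the first
source, each duplicated, and the Vb indexes select its last two floats,
each duplicated. Since every index is below 16, the second source is
never read.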
> > I also extracted part of the computation code to test the computation
> > done with some random floats and check if the results make some sense which
> > seems to be the case (in terms of addition, multiplication, load/store).
> > I also did this on powerpc, ppc64 and ppc64el to see if I had some
> > endianness issue, but I got the same results on the 3 archs.
>
> Thanks for doing this, I was going to ask if it's possible to check
> for endianness bugs as well but thought it would be too pushy. As
> ppc64el is a relatively new architecture it's quite common to see code
> assuming big endian (powerpc/ppc64).
>
> If my grasp of GCC's configuration is correct, this particular snippet
> is conditionally compiled only on ppc64el because -mvsx is the default
> which implies -maltivec. This is not the case for the other PowerPC
> ports which (should) support machines without AltiVec, so it would be
> a bug in the package if it assumed AltiVec everywhere.
Indeed, I had to pass -maltivec to powerpc and ppc64.
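As a rough illustration of that conditional compilation (a sketch, not
the package's actual code): GCC predefines __ALTIVEC__ when AltiVec code
generation is enabled, so a guard like the one below tells which path a
given build took. built_with_altivec is a hypothetical name.

```c
/* Returns 1 when the compiler was asked to generate AltiVec code
 * (GCC predefines __ALTIVEC__ in that case, e.g. under -maltivec),
 * and 0 otherwise, so that callers can fall back to a scalar path. */
static int built_with_altivec(void)
{
#ifdef __ALTIVEC__
    return 1;
#else
    return 0;
#endif
}
```

On powerpc and ppc64 this would presumably return 0 unless -maltivec is
passed explicitly, which matches what I saw.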
> Anyway, it's good to know that it works as expected on the other
> PowerPC ports.
Well, consistency seems preserved at least: either it's all correct on
the 3 or it's all wrong :) It needs more testing or some knowledge of
the computations done (see below).
> > The best would be to test all this for real by running lynkeos.app's
> > deconvolution, or at least compile part of the original code on Mac
> > OS X and check if the indexes used here give the same results
> > compared to the original ones.
>
> I don't have access to Mac OS X and wouldn't want to use it even if I
> had.
:D
> It is natural to ask the upstream author to perform this test
> but his email bounces. And I never received a response from the
> person who ported the 1.x series to GNUstep.
Pity
> If you have some spare time and direct access to a GNU/Linux PowerPC
> machine, perhaps you can compare the results (visually) with another
> common architecture by processing the same image? Or on powerpc/ppc64
> with and without AltiVec.
Will try. I feel somewhat unsatisfied to have no certainty at this point,
but wanted to give you some feedback early.
F.

