Re: [E-devel] patch - imlib2 blend in AMD64

Tiago Victor Gehring Thu, 25 Aug 2005 02:45:37 -0700

Hi,
thanks for the info and help - nothing like hearing from the person who
wrote the code...
I applied your patch and put the original lines back, with the aligned
copy, but I now get the same problem as before: segmentation faults.
The first line that gives me a problem is #570; here is a piece of the
code so you can find it:


        LEAVE
SIZE(imlib_amd64_blend_rgba_to_rgb)
PR_(imlib_amd64_blend_rgba_to_rgba):
        ENTER

        pxor %xmm4, %xmm4
        movdqa c1(%rip), %xmm5          -> *** here's the first problem
        xorq %rax, %rax
        movdqa mX000X000X000X000(%rip), %xmm6
        movq [EMAIL PROTECTED](%rip), %r13

And I confirmed that I have now compiled it with the ".align 16" in the
.text segment.
And just out of curiosity, could you perhaps explain what an instruction
like the one above does? I mean, what does the "mask" in front of the
%rip do: do you take the address of the next instruction, apply (AND) a
mask and move the result to %xmm5? Sorry, just curious...

Thanks,
Tiago



On Wed, 2005-08-24 at 18:45 -0600, John Slaten wrote:

> 
> Since I wrote the original code, I thought I'd weigh in on this.
> 
> 1. The memory that is causing the errors is statically allocated data in
> the .text segment, which I assumed would be correctly aligned and forgot
> to supply the .align directive. Adding this (as in the attached patch)
> _should_ fix the problem, though I have not tried to reproduce/test it.
> The code should then work with the aligned instructions as in the
> original.
> 2. The existing code jumps through quite a lot of hoops to make best use
> of the aligned instructions. For instance, when it encounters an odd
> pixel address, it will process a single pixel at the start of the loop
> to force alignment for the destination address (which uses both read and
> write, and is thus more important). In fact, due to the possibility of
> odd scanline pitch, the alignment is checked at the start of each
> scanline, and the correct instructions are used accordingly.
> 3. The code was built to handle weird input that is only 1 byte aligned.
> Thus, it should handle any alignment that is thrown at it, and if it
> doesn't that's a bug, but it should be fixable.
> 4. I don't recall the exact statistics, but I ran tests on aligned vs
> unaligned instructions while I was writing the code, and using the
> aligned instructions gives a large speed boost. I think it was about
> 20%, but I might be wrong. The key is that movdqa is a double path
> instruction and movdqu is a vector path instruction, and double path
> instructions are a whole lot quicker than vector path ones.
> 
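The per-scanline alignment handling described in point 2 can be sketched in
C roughly as follows. This is only an illustration under stated assumptions,
not the imlib2 assembly: the names head_pixels, split_scanline and
scanline_split are hypothetical, pixels are assumed to be 4 bytes wide, and
the aligned path is assumed to handle 4 pixels (16 bytes) per step, since
movdqa requires a 16-byte-aligned memory operand. A destination that is not
even 4-byte aligned can never reach a 16-byte boundary in whole-pixel steps,
so the sketch simply sends the whole scanline down the single-pixel path
(the real code is said to pick unaligned instructions in that case).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: how one scanline of n pixels is split up. */
typedef struct {
    size_t head;   /* pixels handled one at a time before alignment   */
    size_t blocks; /* 16-byte (4-pixel) steps on the aligned path      */
    size_t tail;   /* leftover pixels after the last full block        */
} scanline_split;

/* Number of leading 4-byte pixels to process before dst_addr reaches a
 * 16-byte boundary. Returns n when the boundary is unreachable. */
static size_t head_pixels(uintptr_t dst_addr, size_t n)
{
    size_t misalign = (size_t)(dst_addr & 15u);

    if (misalign == 0)
        return 0;                 /* already aligned */
    if (misalign % 4 != 0)
        return n;                 /* 4-byte steps never hit a multiple of 16 */

    size_t head = (16 - misalign) / 4;
    return head < n ? head : n;   /* a short scanline may end first */
}

/* Decide, for one scanline, how many pixels go to each phase. Because the
 * pitch may be odd, this would be re-evaluated for every scanline. */
static scanline_split split_scanline(uintptr_t dst_addr, size_t n)
{
    scanline_split s;

    s.head = head_pixels(dst_addr, n);
    s.blocks = (n - s.head) / 4;
    s.tail = (n - s.head) % 4;

    /* Every pixel is accounted for exactly once. */
    assert(s.head + 4 * s.blocks + s.tail == n);
    return s;
}
```

For example, with a destination address of 0x1004 and 10 pixels, the split
is 3 head pixels, 1 aligned block and 3 tail pixels; with 0x1000 it is
0 head pixels, 2 aligned blocks and 2 tail pixels.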


        
        
                



_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
