Hi,

Thanks for the info and help. Nothing like hearing from the person who wrote the code...

I applied your patch and restored the original lines that use the aligned copy, but I am getting the same problem as before: segmentation faults. The first line that gives me a problem is #570. Here is a piece of the code so you can find it:

    LEAVE
    SIZE(imlib_amd64_blend_rgba_to_rgb)
    PR_(imlib_amd64_blend_rgba_to_rgba):
        ENTER

        pxor %xmm4, %xmm4
        movdqa c1(%rip), %xmm5      -> *** here's the first problem
        xorq %rax, %rax
        movdqa mX000X000X000X000(%rip), %xmm6
        movq [EMAIL PROTECTED](%rip), %r13

I also confirmed that I have now compiled it with the ".align 16" in the .text segment.

Just out of curiosity, could you explain what an instruction like the one above does? I mean, what does the "mask" in front of %rip do? Do you take the address of the next instruction, apply (AND) a mask, and move the result to %xmm5?

Sorry, just curious...

Thanks,
Tiago

On Wed, 2005-08-24 at 18:45 -0600, John Slaten wrote:
>
> Since I wrote the original code, I thought I'd weigh in on this.
>
> 1. The memory that is causing the errors is statically allocated data in
> the .text segment, which I assumed would be correctly aligned and forgot
> to supply the .align directive. Adding this (as in the attached patch)
> _should_ fix the problem, though I have not tried to reproduce/test it.
> The code should then work with the aligned instructions as in the
> original.
> 2. The existing code jumps through quite a lot of hoops to make best use
> of the aligned instructions. For instance, when it encounters an odd
> pixel address, it will process a single pixel at the start of the loop
> to force alignment for the destination address (which uses both read and
> write, and is thus more important). In fact, due to the possibility of
> odd scanline pitch, the alignment is checked at the start of each
> scanline, and the correct instructions are used accordingly.
> 3. The code was built to handle weird input that is only 1 byte aligned.
> Thus, it should handle any alignment that is thrown at it, and if it
> doesn't that's a bug, but it should be fixable.
> 4. I don't recall the exact statistics, but I ran tests on aligned vs
> unaligned instructions while I was writing the code, and using the
> aligned instructions gives a large speed boost.
> I think it was about 20%, but I might be wrong. The key is that movdqa is a
> double path instruction and movdqu is a vector path instruction, and double
> path instructions are a whole lot quicker than vector path ones.

_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
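
To make point 2 of the quoted message concrete, here is a rough C sketch of the "process single pixels until the destination is aligned" idea. It is not the imlib2 code: the helper name leading_pixels is hypothetical, and it assumes 32-bit (4-byte) pixels and a 16-byte alignment requirement, which is what movdqa needs for its memory operand.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from imlib2.
 * Assumes 4-byte pixels and a 16-byte alignment requirement.
 * Returns how many pixels to handle one at a time before dst
 * sits on a 16-byte boundary. If dst is not even 4-byte aligned,
 * stepping by whole pixels can never reach such a boundary, so
 * the whole scanline (n pixels) is left to the unaligned path. */
static size_t leading_pixels(const void *dst, size_t n)
{
    uintptr_t addr = (uintptr_t)dst;

    if (addr % 4 != 0)
        return n;                       /* unaligned path for everything */

    /* Bytes up to the next 16-byte boundary, converted to pixels. */
    size_t lead = ((16 - addr % 16) % 16) / 4;

    return lead < n ? lead : n;         /* never more than the scanline */
}
```

Because the scanline pitch may be odd, a check like this would have to be repeated at the start of every scanline, as the quoted message describes, with the aligned or unaligned instructions chosen from its result.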