Gallium: Fix for tgsi_emit_sse2()
Hi,

Here is my first patch for the Nouveau project: a fix for 3D software rendering using the SSE2 instruction set.

The problem is that Gallium doesn't save/restore the registers it uses (eax, edx, ecx, esi in my case). So I added push/pop in tgsi_emit_sse2().

My patch is quick and dirty, e.g. tgsi_emit_sse2_fs() is not fixed. I don't know Gallium well enough to write a better fix ;-)

Victor Stinner aka haypo

diff --git a/src/gallium/auxiliary/draw/draw_vs_sse.c b/src/gallium/auxiliary/draw/draw_vs_sse.c
index a4503c1..f11f9c6 100644
diff --git a/src/gallium/auxiliary/tgsi/exec/tgsi_sse2.c b/src/gallium/auxiliary/tgsi/exec/tgsi_sse2.c
index 4e80597..b0bf49a 100755
--- a/src/gallium/auxiliary/tgsi/exec/tgsi_sse2.c
+++ b/src/gallium/auxiliary/tgsi/exec/tgsi_sse2.c
@@ -1998,6 +1998,18 @@ emit_instruction(
    case TGSI_OPCODE_RET:
    case TGSI_OPCODE_END:
+      emit_pop(
+         func,
+         get_temp_base() );
+      emit_pop(
+         func,
+         get_const_base() );
+      emit_pop(
+         func,
+         get_output_base() );
+      emit_pop(
+         func,
+         get_input_base() );
 #ifdef WIN32
       emit_retw( func, 16 );
 #else
@@ -2248,22 +2260,35 @@ tgsi_emit_sse2(
    func->csr = func->store;
+   emit_push(
+      func,
+      get_input_base() );
+   emit_push(
+      func,
+      get_output_base() );
+   emit_push(
+      func,
+      get_const_base() );
+   emit_push(
+      func,
+      get_temp_base() );
+
    emit_mov(
       func,
       get_input_base(),
-      get_argument( 0 ) );
+      get_argument( 0+4 ) );
    emit_mov(
       func,
       get_output_base(),
-      get_argument( 1 ) );
+      get_argument( 1+4 ) );
    emit_mov(
       func,
       get_const_base(),
-      get_argument( 2 ) );
+      get_argument( 2+4 ) );
    emit_mov(
       func,
       get_temp_base(),
-      get_argument( 3 ) );
+      get_argument( 3+4 ) );
    tgsi_parse_init( parse, tokens );

Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
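The `+4` added to each get_argument() index in the patch follows from stack arithmetic: every push in the prologue moves ESP down by one 4-byte slot, so each stack argument ends up four slots further from ESP after four pushes. The sketch below only illustrates that arithmetic. The helper names are hypothetical and not part of Mesa, and it assumes a 32-bit cdecl frame in which arguments are addressed relative to ESP, with the return address at [esp] on entry.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 32-bit x86, so every stack slot is 4 bytes wide. */
enum { SLOT_SIZE = 4 };

/* Hypothetical helper (not a Mesa function): byte offset from ESP of
 * stack argument `index`, after `pushed` registers have been saved with
 * `push` in the prologue. On entry [esp] holds the return address, so
 * argument 0 sits one slot above it. */
size_t arg_offset_from_esp(unsigned index, unsigned pushed)
{
    /* +1 skips the return address; each push adds one more slot. */
    return (size_t)(index + 1 + pushed) * SLOT_SIZE;
}

/* Hypothetical helper: the argument index to request once `pushed`
 * registers are on the stack, mirroring "0+4", "1+4", ... in the patch. */
unsigned shifted_index(unsigned index, unsigned pushed)
{
    unsigned shifted = index + pushed;
    /* Both forms must address the same stack slot. */
    assert(arg_offset_from_esp(shifted, 0) == arg_offset_from_esp(index, pushed));
    return shifted;
}
```

With four saved registers, argument 0 moves from offset 4 to offset 20, which is the slot that index 0+4 addresses when nothing has been pushed. The matching pops must run before the return so that ESP points at the return address again.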
Re: DRM QWS
On Wednesday 26 March 2008 19:32:22 Kristian Høgsberg wrote:
> On Wed, Mar 26, 2008 at 1:50 PM, Tom Cooksey [EMAIL PROTECTED] wrote:
> ...
> > I guess what I was thinking about was a single API which can be used on 3D-less (or legacy, if you want) hardware and on modern hardware. If the graphics hardware is a simple pointer to a main-memory buffer which is scanned out to the display, then you're right, you might as well just use user-space shared memory, as we currently do. A new API would only be useful for devices with video memory and a hardware blitter. There are still new devices coming out with this kind of hardware; the Marvell PXA3x0 and Freescale i.MX27, for example, spring to mind.
>
> I agree with you that it probably doesn't make sense to use gallium/mesa on everything everywhere. There are still small devices or early boot scenarios (you mention initramfs) where gallium isn't appropriate. However, there is no need to put a 2D engine into the kernel. What the drm ttm gives us is a nice abstraction for memory management and command buffer submission, and drm modesetting builds on this to let the kernel set a graphics mode. And that's all that we need in the kernel. Building a small userspace library on top of this to accelerate blits and fills should be pretty easy.

I had a think about this last night. I think Zack is probably right about future graphics hardware. There are always going to be devices with simple graphics, having a framebuffer in main memory and a few registers for configuration. I think in the future, if more advanced graphics are needed, it will take the form of programmable 3D hardware. Take the set-top-box example I gave: while I stand by the fact that a low-power 3D core can't render at 1920×1080, a software-only graphics stack also can't render at this resolution. I'm just thinking about the problems I've been trying to solve getting Qt to perform well on the Neo1973 with its 480x640 display and 266 MHz CPU.

So for simple, linear framebuffer devices we have fbdev. For programmable 3D, we have gallium/DRM. There's still the issue of early boot for 3D devices, but as Jesse mentioned, the DRM drivers can include an fbdev interface, as the intel driver already does.

OK, I'm satisfied. Thanks to all. :-)

Cheers,
Tom
[Bug 15203] r300 lockup
http://bugs.freedesktop.org/show_bug.cgi?id=15203

--- Comment #5 from Jerome Glisse [EMAIL PROTECTED] 2008-03-27 07:39:36 PST ---
(In reply to comment #4)
> Created an attachment (id=15493)
> --> (http://bugs.freedesktop.org/attachment.cgi?id=15493) [details]
> mesa patch
>
> This patch makes the lockup gone. I have absolutely no clue why. I just knew RADEON_DEBUG=sync fixes the lockup. So I peppered the source with radeonWaitForIdle until I found this hack

This means that the lockup is due to a sync problem. My wild guess is that we write a register that shouldn't be written before a flush (like changing the vertex program while fragments are still being rendered with the old vertex program). I think we could apply the patch you point to as a temporary fix.

--
Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
Re: Gallium: Fix for tgsi_emit_sse2()
Victor Stinner wrote:
| Here is my first patch for Nouveau project: a fix for 3D software
| rendering using SSE2 instruction set.
|
| The problem is that Gallium doesn't save/restore used registers (eax,
| edx, ecx, esi in my case). So I added push/pop in tgsi_emit_sse2().

Doesn't the ABI say that those are scratch registers? That is, they don't need to be saved by the callee.
[Bug 15203] r300 lockup
http://bugs.freedesktop.org/show_bug.cgi?id=15203

--- Comment #6 from Markus Amsler [EMAIL PROTECTED] 2008-03-27 10:52:37 PST ---
It has a 30% performance impact on Wow, but only 1% on glxgears. It also doesn't fix the Wow lockup I was after in the first place. It's also possible that it doesn't fix the bug but only changes the timing, so the bug is no longer triggered. I'm gonna investigate this further.
[Bug 15203] r300 lockup
http://bugs.freedesktop.org/show_bug.cgi?id=15203

--- Comment #7 from Michel Dänzer [EMAIL PROTECTED] 2008-03-27 11:19:26 PST ---
Synchronously waiting for idle on each flush is a pretty large hammer... does just emitting wait-for-idle commands to the CP instead help as well?
Re: Gallium: Fix for tgsi_emit_sse2()
>> Here is my first patch for Nouveau project: a fix for 3D software rendering using SSE2 instruction set. The problem is that Gallium doesn't save/restore used registers (eax, edx, ecx, esi in my case). So I added push/pop in tgsi_emit_sse2().
>
> Doesn't the ABI say that those are scratch registers? That is, they don't need to be saved by the callee.

On Linux, the calling convention is cdecl. Marcheu told me that EAX, ECX and EDX can be used freely, but not ESI (ESI must be saved). So here is a smaller patch: it only saves/restores ESI (temp base).

About the crash: it occurs with the Nouveau driver (yesterday's Git version) and the NeverBall game. The bug only occurs with gcc 4.2, not with gcc 4.1. Gallium (Mesa) is compiled with -O (-O1).

Victor Stinner

PS: I just subscribed to the mailing list

diff --git a/src/gallium/auxiliary/draw/draw_vs_sse.c b/src/gallium/auxiliary/draw/draw_vs_sse.c
index a4503c1..f11f9c6 100644
diff --git a/src/gallium/auxiliary/tgsi/exec/tgsi_sse2.c b/src/gallium/auxiliary/tgsi/exec/tgsi_sse2.c
index 4e80597..2d4e707 100755
--- a/src/gallium/auxiliary/tgsi/exec/tgsi_sse2.c
+++ b/src/gallium/auxiliary/tgsi/exec/tgsi_sse2.c
@@ -1998,6 +1998,9 @@ emit_instruction(
    case TGSI_OPCODE_RET:
    case TGSI_OPCODE_END:
+      emit_pop(
+         func,
+         get_temp_base() );
 #ifdef WIN32
       emit_retw( func, 16 );
 #else
@@ -2248,22 +2251,26 @@ tgsi_emit_sse2(
    func->csr = func->store;
+   emit_push(
+      func,
+      get_temp_base() );
+
    emit_mov(
       func,
       get_input_base(),
-      get_argument( 0 ) );
+      get_argument( 0+1 ) );
    emit_mov(
       func,
       get_output_base(),
-      get_argument( 1 ) );
+      get_argument( 1+1 ) );
    emit_mov(
       func,
       get_const_base(),
-      get_argument( 2 ) );
+      get_argument( 2+1 ) );
    emit_mov(
       func,
       get_temp_base(),
-      get_argument( 3 ) );
+      get_argument( 3+1 ) );
    tgsi_parse_init( parse, tokens );
@@ -2327,22 +2334,26 @@ tgsi_emit_sse2_fs(
    func->csr = func->store;
    /* DECLARATION phase, do not load output argument.
     */
+   emit_push(
+      func,
+      get_temp_base() );
+
    emit_mov(
       func,
       get_input_base(),
-      get_argument( 0 ) );
+      get_argument( 0+1 ) );
    emit_mov(
       func,
       get_const_base(),
-      get_argument( 2 ) );
+      get_argument( 2+1 ) );
    emit_mov(
       func,
       get_temp_base(),
-      get_argument( 3 ) );
+      get_argument( 3+1 ) );
    emit_mov(
       func,
       get_coef_base(),
-      get_argument( 4 ) );
+      get_argument( 4+1 ) );
    tgsi_parse_init( parse, tokens );
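The reasoning behind the smaller patch can be written down as a lookup: under the 32-bit cdecl convention used on Linux (the i386 System V ABI), EAX, ECX and EDX are scratch registers, while EBX, ESI, EDI and EBP must be preserved by the callee. The sketch below is only an illustration of that rule. The enum and function names are hypothetical and do not come from Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* General-purpose registers of 32-bit x86 (names are illustrative). */
enum x86_reg {
    REG_EAX, REG_ECX, REG_EDX, REG_EBX,
    REG_ESP, REG_EBP, REG_ESI, REG_EDI
};

/* True if a cdecl callee that clobbers `reg` has to save and restore it.
 * ESP is left out on purpose: it is restored by keeping pushes and pops
 * balanced, not by pushing it. */
bool is_callee_saved(enum x86_reg reg)
{
    switch (reg) {
    case REG_EBX:
    case REG_EBP:
    case REG_ESI:
    case REG_EDI:
        return true;
    default: /* EAX, ECX, EDX are scratch registers */
        return false;
    }
}

/* Number of push/pop pairs generated code needs for the registers it
 * clobbers. Assumes each register is listed at most once. */
size_t count_required_pushes(const enum x86_reg *used, size_t n)
{
    size_t pushes = 0;
    assert(used != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (is_callee_saved(used[i]))
            pushes++;
    }
    return pushes;
}
```

For the four registers named in this thread (eax, edx, ecx, esi) the count is one, which matches the single push of the temp base and the `+1` applied to every get_argument() index.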
[Bug 10344] New: [2.6.25-rc6] possible regression: X server dying
http://bugzilla.kernel.org/show_bug.cgi?id=10344

           Summary: [2.6.25-rc6] possible regression: X server dying
           Product: Drivers
           Version: 2.5
     KernelVersion: 2.6.25-rc6
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI)
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]
OtherBugsDependingOnThis: 9832
        Regression: 1

Subject    : [2.6.25-rc6] possible regression: X server dying
Submitter  : Tilman Schmidt [EMAIL PROTECTED]
Date       : 2008-03-24 23:38
References : http://lkml.org/lkml/2008/3/24/260
Handled-By : Dave Airlie [EMAIL PROTECTED]

This entry is being used for tracking a regression from 2.6.24. Please don't close it until the problem is fixed in the mainline.

--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
[Bug 15194] glPolygonOffset has no effect on glPolygonMode(GL_LINE)
http://bugs.freedesktop.org/show_bug.cgi?id=15194

Eric Anholt [EMAIL PROTECTED] changed:

           What    |Removed                     |Added
           Summary |glPolygonOffset not change  |glPolygonOffset has no
                   |pixel's depth component     |effect on
                   |value                       |glPolygonMode(GL_LINE)

--- Comment #1 from Eric Anholt [EMAIL PROTECTED] 2008-03-27 15:46:08 PST ---
I played with this a bit, and noted that making apply_offset do arbitrary changes to the vertex data had no effect, so we're probably writing to the wrong place or just not reaching that part of the loop.
[Bug 15194] glPolygonOffset has no effect on glPolygonMode(GL_LINE)
http://bugs.freedesktop.org/show_bug.cgi?id=15194

--- Comment #2 from haihao [EMAIL PROTECTED] 2008-03-27 18:04:09 PST ---
Depth offset is only applied to TRIANGLE objects in the WM state. The CLIP thread tries to do that for glPolygonMode(GL_LINE). Unfortunately, it doesn't take effect.
[Bug 15193] mesa xdemo 'glthreads' draw nothing
http://bugs.freedesktop.org/show_bug.cgi?id=15193

--- Comment #7 from Colin.Joe [EMAIL PROTECTED] 2008-03-27 19:28:39 PST ---
We always build the X server without xcb.
[Bug 15193] mesa xdemo 'glthreads' draw nothing
http://bugs.freedesktop.org/show_bug.cgi?id=15193

Zou Nan hai [EMAIL PROTECTED] changed:

           What       |Removed           |Added
           AssignedTo |[EMAIL PROTECTED] |mesa3d-
                      |                  |[EMAIL PROTECTED]
           Component  |Drivers/DRI/i915  |GLX

--- Comment #8 from Zou Nan hai [EMAIL PROTECTED] 2008-03-27 19:41:07 PST ---
Using the -l option, which holds a mutex around Xlib calls, makes the issue disappear, and you will also see the issue with software rendering. So it should be a GLX issue.
[Bug 15152] [i965] glean case texCombine Segmentation fault
http://bugs.freedesktop.org/show_bug.cgi?id=15152

Shuang He [EMAIL PROTECTED] changed:

           What    |Removed   |Added
           Status  |RESOLVED  |VERIFIED

--- Comment #5 from Shuang He [EMAIL PROTECTED] 2008-03-27 20:37:41 PST ---
Verified, thanks, Haihao. The segmentation fault is gone, though this glean case still fails. I will open another bug for that.
[Bug 15203] r300 lockup
http://bugs.freedesktop.org/show_bug.cgi?id=15203

--- Comment #8 from Oliver McFadden [EMAIL PROTECTED] 2008-03-27 22:23:03 PST ---
I would rather *NOT* see this patch committed to Mesa, as it really just hides the issue: we're doing something wrong, which should be fixed. This must be a sync problem because more aggressive flushing can solve the problem, though it's not the correct solution.

A discussion on IRC between MrCooper and glisse mentioned the NQWAIT_UNTIL (0xe50) register. I've seen the blob write this register in Revenge dumps, but it's never used by our drivers.

I wouldn't mind getting some comments from AMD describing the precisely correct way to flush and synchronize the ASIC. We're obviously missing something. I guess tcore might provide this information once it's released...