I launched Doom III and waited for it to freeze, then I ssh'ed into the system and ran top.

The topmost process was X (at around 99.9% CPU usage). Sometimes doom.x86 (the Doom III process, of course) would rise to the top for a few moments, but then drop down again. The relevant lines from top:

When X was on top:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11515 root      25   0  305m  32m 8824 R 99.8  2.2   0:45.56 X
    1 root      16   0  2548  536  452 S  0.0  0.0   0:01.15 init
    2 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0

When doom.x86 was on top:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13237 keithvas  23   0  802m 715m 8320 R 97.3 47.7   2:04.93 doom.x86
11515 root      25   0  305m  32m 8824 R  2.4  2.2   0:36.50 X
13235 keithvas  16   0 10424 1276  928 R  0.2  0.1   0:00.38 top

My load average just kept rising and rising. At one point it was 20.86, 8.14, 3.16 (whatever that means).
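For what it's worth, the three figures are the 1-, 5- and 15-minute load averages, i.e. roughly how many tasks were running or waiting to run over each window. The sketch below shows one way to read them. The CPU count of 1 is an assumption (the Athlon 64 3000+ is a single-core part), and `describe_load` is only an illustrative helper name.

```python
def describe_load(load_avgs, cpu_count=1):
    """Return the load per CPU for the 1-, 5- and 15-minute averages."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    windows = ("1 min", "5 min", "15 min")
    return {w: round(load / cpu_count, 2) for w, load in zip(windows, load_avgs)}


# Figures quoted above; cpu_count=1 is an assumption about this machine.
per_cpu = describe_load((20.86, 8.14, 3.16), cpu_count=1)
for window, value in per_cpu.items():
    print(f"{window}: {value} tasks per CPU")
```

Anything well above 1.0 per CPU means tasks are queuing up. A 1-minute figure far higher than the 15-minute one, as here, suggests the backlog built up quickly rather than being a long-standing condition.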

Before the game freezes, the topmost process is doom.x86 (with around 85% CPU usage) and X comes in at third or fourth place.

Whilst the game is frozen, I occasionally hear sound from my speakers. Other times I hear nothing. When I'm hearing sounds, I'm able to type in commands via SSH. When I'm not hearing sounds, nothing appears and the system grinds to a halt. This suggests that (perhaps) the system tries to recover but then freezes again.

When I run strace on X, I keep getting the following:
rt_sigreturn(0xe)                       = 36984
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigreturn(0xe)                       = 36984

Doom III spawns several processes; here's what strace shows for them:
[ Process PID=13250 runs in 32 bit mode. ]
select(19, [18], NULL, NULL, NULL
[ Process PID=13251 runs in 32 bit mode. ]
getppid()                               = 13250
poll([{fd=15, events=POLLIN}], 1, 2000) = 0

[ Process PID=13253 runs in 32 bit mode. ]
gettimeofday({2811813141633945, 586448462527099936}, NULL) = 0
nanosleep({38654705664000000, 586448610090393576}, NULL) = 0
gettimeofday({2868712868371353, 586448630030824480}, NULL) = 0
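The huge numbers in the gettimeofday and nanosleep lines look wrong, but they may simply be a decoding artifact. One possible reading, and it is only an assumption, is that a 64-bit strace is printing the 32-bit process's two 32-bit struct fields as a single 64-bit number. Under that assumption the low 32 bits would be tv_sec and the high 32 bits tv_usec (or tv_nsec for nanosleep). The sketch below splits the first number of each call that way; `split_packed` is a made-up helper name.

```python
from datetime import datetime, timezone

MASK32 = 0xFFFFFFFF


def split_packed(value):
    """Split a 64-bit number into (low 32 bits, high 32 bits)."""
    return value & MASK32, (value >> 32) & MASK32


# First field of each call, exactly as printed by strace above.
first_tod = split_packed(2811813141633945)    # read as (tv_sec, tv_usec)
sleep_req = split_packed(38654705664000000)   # read as (tv_sec, tv_nsec)
second_tod = split_packed(2868712868371353)   # read as (tv_sec, tv_usec)

print("first gettimeofday :", first_tod)
print("nanosleep request  :", sleep_req)
print("second gettimeofday:", second_tod)

# If the reading is right, tv_sec should be a plausible wall-clock time.
print("wall clock (UTC)   :", datetime.fromtimestamp(first_tod[0], tz=timezone.utc))
print("elapsed usec       :", second_tod[1] - first_tod[1])
```

Read this way, tv_sec lands on the afternoon of 16 November 2005, the nanosleep request is 9 ms, and about 13 ms pass between the two gettimeofday calls. That would make this process an ordinary timer loop rather than something that has gone haywire. The second number in each pair does not decode to anything obvious and is left alone here.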

I tried to reboot the PC. I did kill -9 (the X pid) and kill -9 (the doom III pid), but when I looked at the machine's monitor it just turned off (i.e. no output from the video card). I then tried to reboot via SSH. The system looked like it tried to reboot, but then a garbled screen appeared and nothing more happened.

The second time I tried this, kill -9 (the X pid) stopped sound coming from my speakers, but the image just froze (i.e. frozen, not turned off). I then killed doom, but the machine remained frozen. After this, the machine completely froze (i.e. no commands, even from SSH).

Keith

Andrew Cilia wrote:
Keith,
    just a point on something you wrote in an earlier email. If you can
ssh in, then the system has not crashed as such. It would seem that your
video subsystem is stuck. Given that you can get in via ssh, is there
anything that is hogging the CPU that you can see via top. If so, get an
strace on it as this will help in debugging.
Another thing. Are you running in SMP mode?

On Wed, 2005-11-16 at 07:26 +0100, Keith Vassallo wrote:
When I bought the card, I was using kernel 2.6.12, I re-compiled that, and it still didn't work. I also compiled 2.6.13 - still didn't work.

glxgears works. Some games, like GTA: Vice City, also work with no problems. And if I set the graphics quality in Doom III to medium, the game works.

It seems like only very intensive use makes the box die. It could be something to do with the bug Andrew mentioned earlier (the bug is only a problem with very intensive use). It can't be a hardware problem, since everything worked under Windows.

Keith

Jean Azzopardi wrote:
I will not pretend to be an expert on this, but have you tried recompiling the kernel? Also, do simpler 3D apps work? Such as glxgears? Try running glxgears in a terminal...

On Tuesday 15 November 2005 6:30 pm, Keith Vassallo wrote:
The messages I get from dmesg after start-up show no errors - just that AGPGART is being loaded, finds a graphics card and everything seems normal.

I had already checked out the thread you pointed me to. The thread provides a patch to fix the problem, but I'm supposed to already have that patch - Gentoo have included it in their ebuild (of which I have the latest, 7676).

I tried to figure out what they're talking about when they mention global_flush_tlb(), and found this interesting post: http://marc.theaimsgroup.com/?l=linux-kernel&m=112928307319954&w=2

It seems, as is also said in the nvidia post, that this is being worked on in kernel 2.6.14. Unfortunately, the thread I found mentions delays - not total freezes - being caused by this "bug". The patch is against 2.6.14-rc4. I don't totally understand kernel terminology - but if the patch is against rc4, does that mean it will be included in rc5? Whichever it's included in, the latest (testing) kernel available on Portage is 2.6.14-r2.

Seems like this will be a waiting game, unless anyone else has suggestions.

Keith

Andrew Cilia wrote:
Do you get any messages after the agpgart module kicks in during startup? Did
you try some other forums besides gentoo? For example, I found this:

http://www.nvnews.net/vbulletin/showthread.php?t=57990

Cheers


On Tue, 2005-11-15 at 16:01 +0100, Keith Vassallo wrote:
Hey Guys,

I've recently upgraded to an XFX GeForce 6800 GT from my previous card,
a GeForce FX 5200. Since having done so, I'm having problems playing
games.
When starting Doom3 in "Ultra high" or "high" quality mode, the game
locks up either seconds, or minutes, after the game begins. The whole
system freezes; neither CTRL+ALT+BACKSPACE nor anything else works.
Starting doom in "medium" quality mode seems to stop the problem from
happening, although I haven't played the game for longer than 30 minutes.
When starting Half Life 2, the game freezes a few seconds after the menu
is displayed. With Counter-Strike: Source, the game loads a map, then
sends me back to the desktop.

Here's some information you may need:

Gentoo running amd64 on AMD Athlon 64 3000+ (Socket 939)
XFX GeForce 6800 GT (AGP)
1.5GB DDR RAM

cat /proc/driver/nvidia/agp/status:

Status: Enabled
Driver: AGPGART
AGP Rate: 8x
Fast Writes: Disabled
SBA: Enabled

Kernel: 2.6.13-gentoo-r3
nvidia-kernel: 1.0.7676
nvidia-glx: 1.0.7676-r2

I've also used nvidia-settings to check the card temperature. 15 minutes
after boot, nvidia-settings reports:

Core Temperature: 44C
Ambient Temperature: 36C

I've searched through the Gentoo forums for similar problems, and this
has been reported once or twice, but none of those people found a
solution (or posted about it). I looked on my motherboard for capacitor
decay (as described in another post) and haven't found any. I also don't
have X Composite extensions enabled.

Whenever the PC crashes, I can see the following in /var/log/messages (I
have to SSH in to do this, as the machine is too frozen to launch a
terminal locally):

Nov 3 22:05:06 silver NVRM: Xid: 25, L0 -> L0
Nov 3 22:05:06 silver NVRM: Xid: 6, PE0000 1f08 00000000 00000000 00f1efeb 00000000
Nov 3 22:05:09 silver NVRM: Xid: 8, Channel 00000020
Nov 3 22:05:17 silver NVRM: Xid: 8, Channel 00000020

etc...
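Since the Xid lines keep repeating, it can help to tally them by code before asking around. The following is a minimal sketch that does so for the lines quoted above. The regular expression only assumes the "NVRM: Xid: <number>, <details>" layout seen in this log, and `xid_events` is an illustrative name; what each code means is not decoded here and would need to be looked up in NVIDIA's own documentation.

```python
import re
from collections import Counter

# Matches the "NVRM: Xid: <code>, <details>" layout seen in the log above.
XID_RE = re.compile(r"NVRM: Xid: (\d+), (.*)")


def xid_events(lines):
    """Return (code, details) for every line that carries an Xid report."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append((int(match.group(1)), match.group(2).strip()))
    return events


# The log lines quoted above (the wrapped Xid 6 line joined back up).
sample = [
    "Nov 3 22:05:06 silver NVRM: Xid: 25, L0 -> L0",
    "Nov 3 22:05:06 silver NVRM: Xid: 6, PE0000 1f08 00000000 00000000 00f1efeb 00000000",
    "Nov 3 22:05:09 silver NVRM: Xid: 8, Channel 00000020",
    "Nov 3 22:05:17 silver NVRM: Xid: 8, Channel 00000020",
]

events = xid_events(sample)
counts = Counter(code for code, _ in events)
for code, seen in sorted(counts.items()):
    print(f"Xid {code}: seen {seen} time(s)")
```

To run it against the real log, the `sample` list could be replaced by the lines read from /var/log/messages, which usually needs root.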

I've installed Windows on this machine and a few games just to test -
everything worked fine, so it can't be a hardware problem.

Any help would be greatly appreciated.

_______________________________________________
MLUG-list mailing list
[email protected]
http://mailserv.megabyte.net/mailman/listinfo/mlug-list

