SUMMARY

Some of our Windows 7 PCs are going into a partial machine hang
condition (locked up/not responding/wedged/etc).  It's intermittent,
with no trigger or pattern I have been able to discern.  Definitely a
persistent, repeating problem, though.  It seems to be related to the
Microsoft networking (SMB) layer.  I'm wondering if there is anything
that can help me narrow down the cause.

Ideally, I'm hoping for logging options, or something like Driver
Verifier.  Failing that, is there a way to force a bugcheck so I can
get a kernel dump and examine what the system was doing when it went
into extreme-navel-gazing mode?  Better ideas welcomed.

GORY DETAILS

Only affecting a handful of people, as far as I know.  One of them is
me.  Different users, PCs, PC models, user job roles, software usage,
locations within the building.  Some of the PCs are less than a year
old, some are up to ~4 years old.  At least one of the PCs (mine) is
on a UPS.

All affected PCs are Dell, running Windows 7 64-bit with latest
updates.  All had OS installed from our WDS server.  All had other
software installed from the same server as all other PCs.  Should be a
relatively homogeneous environment, although we have a lot of one-off
apps that only a few people run, some of which are in the affected
population (but nothing common to all of them).

Only affecting Windows 7 PCs.  Seems to have started with our
migration to Win 7 (from XP), which we started at the beginning of
this year.  It's almost all Win 7 PCs now.  So the question, "Has
anything changed recently?" is unfortunately answered with "Yes,
almost everything".  :-/  New OS version, all new installs, different
drivers, new MS Office version, in some cases other new app versions
too.  Hasn't hit any XP machines.  ;-)

Since I'm one of the affected users, I can provide some first-hand
observations.

The first symptom I see always seems to be in association with network
activity.  Reading or writing a file on a server, or browsing a folder
(reading directory) on a server.  The program I'm using will just
hang.  For GUI programs, it's generally a total app hang: the entire
app window gets grayed out, and the title changes to include "(Not
responding)".  For command prompt windows, the command I'm running
will hang and never come back.

Once this happens, the rest of the system quickly grinds to a halt.
It seems like at some point, the network just dies, and anything that
tries to use networking is dragged down with it.  Since most
everything uses the network to some degree, it doesn't take long for
the machine to become unusable.  As soon as Windows Explorer/shell
touches anything on the network, it hangs too, and after that there's
not much one can do.

But it's only killing things using Microsoft networking.  Just now,
when it happened again, I happened to have a PuTTY window open,
connected via SSH to a Linux box, and that kept working dandy.  At
least a couple other apps were hung (one was Excel), but as long as I
didn't touch Explorer, the PuTTY window kept working.

I can also ping the affected PC from other PCs.  "NET VIEW" against
the dying PC returns "Network path not found" (code 53).  PSLIST fails
in a similar way.

Using Samba tools from a Linux box, "nmblookup -S" (NetBIOS node
status) can get the PC's name list.  But "smbclient -L" (list shares)
returns an error to the effect that the connection failed.  (I was a
bad admin, and didn't write down the exact message.)
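
To capture something more exact next time, a small probe run from
another machine could record how the SMB-related TCP ports respond
while the PC is hung.  This is only a sketch, assuming Python 3 is
available on the probing box; the function name probe_ports is made up
for illustration, and 139/445 are the usual NetBIOS session and
SMB-over-TCP ports.  It only tests whether a TCP connect succeeds, so
it does not speak SMB at all.

```python
import socket

# TCP ports used by Microsoft networking:
# 139 = NetBIOS session service, 445 = SMB directly over TCP.
SMB_PORTS = (139, 445)


def probe_ports(host, ports=SMB_PORTS, timeout=3.0):
    """Attempt a TCP connect to each port; return {port: outcome}."""
    results = {}
    for port in ports:
        try:
            # A completed three-way handshake means something is
            # still listening and accepting at the TCP level.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = "open"
        except socket.timeout:
            # No answer within the timeout: packets dropped or the
            # listener is not completing the handshake.
            results[port] = "timeout"
        except OSError as exc:
            # Includes "connection refused" and name-resolution errors.
            results[port] = "error: %s" % exc
    return results
```

Calling probe_ports with the hung PC's name or IP address and printing
the result would show, per port, "open", "timeout", or the OS error
text.  A refused connection versus a timeout versus a successful
connect would at least hint at whether the listener is gone, wedged,
or still accepting while something above it is stuck.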

The mouse pointer has remained responsive, as have the CAPS/NUM LOCK
keys on the keyboard.  Sometimes the system will beep/chirp when I try
to type.

At least once I've had a Process Explorer window open, and when the
system hung, I didn't see anything obvious in any of the graphs, e.g.,
no CPU or memory spikes.  Unfortunately it seems like Process Explorer
(and Task Manager) get caught up in whatever happens, so I haven't
been able to use them to examine the hung system in any detail.

-- Ben

