Re: stop scheduler on panic patches updated

2011-09-06 Thread Eugene Grosbein
02.09.2011 23:22, Andriy Gapon wrote:
 
 The patches can be found at the same locations:
 head - http://people.freebsd.org/~avg/stop_scheduler_on_panic.diff
 stable/8 - http://people.freebsd.org/~avg/stop_scheduler_on_panic.8.x.diff
 
 Additionally, if you use a USB keyboard, then the following patch is required
 for proper operation in post-panic environment:
 http://people.freebsd.org/~avg/stop_scheduler_on_panic.usb.diff
 The patch is the same for both head and stable/8.
 It shouldn't hurt if you don't use USB devices or use other USB devices.
 

Thanks, I'm going to try it right now because my dummynet panics are back
with the start of the new school year :-)

Eugene Grosbein
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: UFS_DIRHASH panics on a dozen server within 30 hours

2011-09-06 Thread John Baldwin
On Monday, September 05, 2011 5:15:42 am Andreas Longwitz wrote:
 Hi,
 
 a week ago a dozen of my FreeBSD servers crashed within a time span of
 30 hours. The servers run very different applications, and some of them
 were only on standby. All servers have the same kernel with FreeBSD 6 STABLE,
 and there had been no problems for years until that black Monday.
 
 Yes, I know that FreeBSD 6 is out of date now, but I don't like to
 change a system that runs very well. Another reason is that my hardware
 needs the amr driver, and because the amr_ioctl problem described in
 kern/155658 is still unresolved, it is not possible for me
 to upgrade my production systems without changing hardware.

Hmm, the patch in that PR should still apply to newer versions.  Also, you 
could just change the malloc() call to always allocate the maximum size 
(instead of using a static buffer) for a smaller diff.  It seems though that a 
specific command is overrunning its buffer.
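
A minimal userland sketch of the "always allocate the maximum size" idea, assuming a fixed upper bound on the command buffer. `MAX_IOCTL_BUF` and `alloc_cmd_buffer` are hypothetical names for illustration, not taken from the amr driver or the PR; the real driver would use the kernel malloc(9) interface and its own limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical upper bound on a command's data buffer (not the amr driver's real limit). */
#define MAX_IOCTL_BUF 4096

/*
 * Always allocate a zeroed buffer of the maximum size, regardless of the
 * length the caller says it needs, so a command that writes more than it
 * declared still stays inside the allocation as long as it respects the bound.
 * Returns NULL if the request exceeds the bound or the allocation fails.
 */
void *alloc_cmd_buffer(size_t requested_len, size_t *allocated_len)
{
    if (requested_len > MAX_IOCTL_BUF)
        return NULL;                    /* refuse oversized requests */

    void *buf = calloc(1, MAX_IOCTL_BUF);
    if (buf != NULL && allocated_len != NULL)
        *allocated_len = MAX_IOCTL_BUF; /* report the real capacity */
    return buf;
}
```

This trades a little memory per call for a smaller diff than a static buffer would need; it does not identify which command overruns its buffer.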

 Now I have a dozen core dumps and am trying to understand what happened.
 All dumps look very similar, and the panic is always a page fault
 in _mtx_lock_sleep called from ufsdirhash_recycle or ufsdirhash_free,
 because the mtx_object in use has been overwritten with zeros by someone
 before _mtx_lock_sleep is called.

I don't know of anything in particular that would explain this, esp. as to
why you would see them all occur at the same time.  Maybe look to see if the
machines were doing something unusual at that time (a cron job, etc.)?

-- 
John Baldwin


FS corruption between 8-STABLE and 7-STABLE

2011-09-06 Thread Arnaud Lacombe
Hi,

I got a strange boot failure when booting FreeBSD 7-STABLE (with a
FreeBSD 7.4 kernel) after having run an 8-STABLE kernel:

pid 100 (fsck_ufs), uid 0: exited on signal 8
pid 101 (fsck_ufs), uid 0: exited on signal 8
WARNING: R/W mount of / denied.  Filesystem is not clean - run fsck
WARNING: R/W mount of / denied.  Filesystem is not clean - run fsck

It would seem that running 8-STABLE changes the filesystem in such a way
that 7-STABLE cannot be booted afterwards.
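
For readers decoding the log lines above: on FreeBSD, signal 8 is SIGFPE (arithmetic exception, for example an integer division by zero), so both fsck_ufs processes died from an arithmetic fault rather than a memory fault. A minimal C sketch of that mapping follows; the helper name is hypothetical, and the log itself does not say which operation faulted.

```c
#include <assert.h>
#include <signal.h>

/*
 * Return 1 if the given signal number is SIGFPE, the arithmetic-exception
 * signal. On FreeBSD SIGFPE has the value 8, which matches the
 * "exited on signal 8" lines in the boot log.
 */
int is_arithmetic_exception(int signo)
{
    return signo == SIGFPE;
}
```

Comparing against the `SIGFPE` macro rather than the literal 8 keeps the check correct on systems that number signals differently.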

Thanks guys ...

 - Arnaud

P.S. Please CC me; I'm not subscribed to this list.


Re: panic: spin lock held too long (RELENG_8 from today)

2011-09-06 Thread Hiroki Sato
Attilio Rao atti...@freebsd.org wrote
  in CAJ-FndAChGndC=lkzni7i6mot+spw3-ofto9rh0+5wnnvwz...@mail.gmail.com:

at> This should be enough for someone NFS-aware to look into it.
at>
at> Were you also able to get a core?

 Yes.  But as kib@ pointed out, it seems to be a deadlock in ZFS.  Some
 experiments I did showed that this deadlock can be triggered at least
 by doing rm -rf against a local directory that has a large number
 of files/sub-directories.

 Then, I updated the kernel with the latest 8-STABLE + WITNESS option
 because a fix for LOR of spa_config lock was committed and tracking
 locks without WITNESS was hard.  The deadlock can still be triggered
 after that.

 During this investigation a disk had to be replaced, and resilvering
 it is now in progress.  A deadlock and a forced reboot after that
 make recovery of the ZFS datasets take a long time (for committing
 logs, I think), so I will try to reproduce the deadlock and get a
 core dump after the resilvering has finished.

 If the old kernel and core of the deadlock I reported on Saturday are
 still useful for debugging, I can put them somewhere you can
 access.

-- Hiroki
