
Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

2012-10-01 Thread guy . helmer

On Wednesday, June 6, 2012 8:36:04 PM UTC-5, Mark Felder wrote:
> Hi guys I'm excitedly posting this from my phone. Good news for you guys, bad 
> news for us -- we were building HA storage on vmware for a client and can now 
> replicate the crash on demand. I'll be posting details when I get home to my 
> PC tonight, but this hopefully is enough to replicate the crash for any 
> curious followers:
> 
> ESXi 5
> 9 or 9-STABLE
> HAST
> 1 cpu is fine
> 1GB of ram
> UFS SUJ on HAST device
> No special loader.conf, sysctl, etc
> No need for VMWare tools
> Run Bonnie++ on the HAST device
> 
> We can get the crash to happen on the first run of bonnie++ right now. I'll 
> post the exact specs and precise command run in the PR. We found an old post 
> from 2004 when we looked up the process state obtained from CTRL+T -- flswai 
> -- which describes the symptoms nearly perfectly.
> 
>  http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2004-02/0250.html 
> 
> Hopefully this gets us closer to a fix...

Is this a crash or a hang? Over the past couple of weeks, I've been working 
with a FreeBSD 9.1RC1 system under VMware ESXi 5.0 with a 64GB UFS root FS and 
2TB ZFS filesystem mounted via a virtual LSI SAS interface. Sometimes during 
heavy I/O load (rsync from other servers) on the ZFS FS, this shows up in 
/var/log/messages:

Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 5 ee 60 16 0 1 0 0
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): Retrying command
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 3 ef 42 51 0 1 0 0
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): Retrying command
Sep 21 02:18:48 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 3 ef 64 51 0 1 0 0
Sep 21 02:18:48 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:18:48 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 02:18:48 backups kernel: (da1:mpt0:0:1:0): Retrying command
Sep 21 02:18:49 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 3 ef 66 51 0 1 0 0
Sep 21 02:18:49 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:18:49 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
...
Sep 21 05:06:18 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 41 f3 94 99 0 1 0 0
Sep 21 05:06:18 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 05:06:18 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 05:06:18 backups kernel: (da1:mpt0:0:1:0): Retrying command

These have been happening roughly every other day.
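
For anyone wanting to dig through similar logs, here is a minimal Python sketch (not something from the original report) that decodes the WRITE(10) CDB bytes printed by CAM and tallies "SCSI status: Busy" lines per syslog date. The function names decode_write10 and busy_events_per_day are made up for illustration. The decoding assumes the standard 10-byte WRITE(10) layout: opcode 0x2a, a 4-byte big-endian LBA in bytes 2-5, and a 2-byte big-endian transfer length in bytes 7-8. Under that layout the "0 1 0 0" tail of each CDB in the log is a transfer length of 0x0100, that is, 256 blocks.

```python
from collections import Counter


def decode_write10(cdb_text):
    """Decode a WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks). Assumes the standard 10-byte layout.
    """
    cdb = [int(b, 16) for b in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(bytes(cdb[2:6]), "big")     # bytes 2..5
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")  # bytes 7..8
    return lba, blocks


def busy_events_per_day(lines):
    """Count 'SCSI status: Busy' lines per syslog date, e.g. 'Sep 21'."""
    counts = Counter()
    for line in lines:
        if "SCSI status: Busy" in line:
            month, day = line.split()[:2]
            counts[f"{month} {day}"] += 1
    return counts


if __name__ == "__main__":
    # First CDB from the log excerpt above.
    lba, blocks = decode_write10("2a 0 5 ee 60 16 0 1 0 0")
    print(f"LBA {lba:#x}, {blocks} blocks")
```

Feeding the lines of /var/log/messages to busy_events_per_day() gives a quick view of how often the Busy status recurs, which is easier than eyeballing the "every other day" pattern.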

mpt0 and em0 were sharing IRQ 18, so today I put 
hint.mpt.0.msi_enable="1"
into /boot/device.hints and rebooted; now mpt0 is using IRQ 256. I'll see if 
it helps.
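
A quick way to check for shared interrupt lines before and after such a change is to scan the output of "vmstat -i". The Python sketch below is an illustration only: shared_irqs is a hypothetical helper, and it assumes each interrupt line looks like "irqN: dev1 dev2 <total> <rate>", with devices sharing a line listed together after the colon. Output from other FreeBSD versions may differ, so treat the pattern as an assumption to verify.

```python
import re

# Assumed line shape: "irq18: em0 mpt0      12345     12"
_IRQ_LINE = re.compile(r"irq(\d+):\s+(.*?)\s+\d+\s+\d+\s*$")


def shared_irqs(vmstat_output):
    """Return {irq_number: [devices]} for IRQs listing more than one device."""
    shared = {}
    for line in vmstat_output.splitlines():
        m = _IRQ_LINE.match(line)
        if not m:
            continue  # header, cpu timer lines, or an unexpected format
        devices = m.group(2).split()
        if len(devices) > 1:
            shared[int(m.group(1))] = devices
    return shared
```

Passing the captured text of "vmstat -i" to shared_irqs() should return an empty dict once mpt0 has its own MSI vector, provided nothing else on the system shares a line.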

Guy
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"

6.2-amd64 Hang at reboot on Supermicro X7DBR-i+

2007-03-15 Thread Guy Helmer

I'm investigating a problem where a pretty much stock 6.2 SMP kernel 
randomly hangs on multiple Supermicro X7DBR-i+ and X7DBR-8+ systems.  
The system syncs the filesystems and prints "Uptime: ...", then hangs.


So far, I've narrowed it down to the MOD_SHUTDOWN request to the 
"rootbus" module. Adding a printf() before and after the 
"device_shutdown(child);" line in the bus_generic_shutdown() function 
in subr_bus.c seems to make the problem go away, as does running a 
kernel with INVARIANTS, WITNESS, and DDB/KDB. I'm trying to reproduce 
the hang on a plain SMP kernel with just DDB/KDB, but it hasn't hung 
yet.


Any ideas?

Guy

--
Guy Helmer, Ph.D.
Chief System Architect
Palisade Systems, Inc.

