It looks to me like both the glm and ncrs drivers are missing

    cmd->cmd_cdblen = (uchar_t)cdblen;
    cmd->cmd_scblen = (uchar_t)statuslen;
    cmd->cmd_privlen = (uchar_t)tgtlen;

in their glm_pkt_alloc_extern() implementations. This error appears to have
been lurking for a long time (since glm_pkt_alloc_extern() was introduced).
You may be seeing it now because you are now using a disk that requires
CDBs longer than 12 bytes.
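
To illustrate why a missing length assignment could lead to a "buffer freed
to wrong cache" panic, here is a minimal user-space sketch. It is not the
glm/ncrs source: struct demo_cmd, the demo_* functions, and the 12-byte
default are hypothetical stand-ins. The only point is that the destroy path
frees with whatever length is stored in the command, so that length has to
be recorded at allocation time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed size of the CDB area built into the command (hypothetical). */
#define DEMO_DEFAULT_CDBLEN 12

/* Hypothetical stand-in for the driver's per-command structure. */
struct demo_cmd {
    void          *ext_cdb;     /* externally allocated CDB buffer */
    size_t         alloc_size;  /* what was really allocated (bookkeeping) */
    unsigned char  cmd_cdblen;  /* length the destroy path will free with */
};

/*
 * Allocate an external CDB buffer. The length is recorded in the command
 * only when record_len is nonzero, which models the missing assignment.
 */
int
demo_pkt_alloc_extern(struct demo_cmd *cmd, size_t cdblen, int record_len)
{
    cmd->ext_cdb = calloc(1, cdblen);
    if (cmd->ext_cdb == NULL)
        return (-1);
    cmd->alloc_size = cdblen;
    if (record_len)
        cmd->cmd_cdblen = (unsigned char)cdblen;
    return (0);
}

/*
 * Release the buffer. Returns 1 if the stored length matches the allocated
 * size, 0 if it does not. A size-checked allocator such as kmem_free()
 * with kmem_flags set would panic in the mismatch case.
 */
int
demo_pkt_destroy_extern(struct demo_cmd *cmd)
{
    int sizes_match;

    assert(cmd->ext_cdb != NULL);
    sizes_match = ((size_t)cmd->cmd_cdblen == cmd->alloc_size);
    free(cmd->ext_cdb);
    cmd->ext_cdb = NULL;
    return (sizes_match);
}

/* Allocate and destroy one command; returns the destroy result or -1. */
int
demo_roundtrip(size_t cdblen, int record_len)
{
    struct demo_cmd cmd = { NULL, 0, DEMO_DEFAULT_CDBLEN };

    if (demo_pkt_alloc_extern(&cmd, cdblen, record_len) != 0)
        return (-1);
    return (demo_pkt_destroy_extern(&cmd));
}
```

In this sketch, demo_roundtrip(16, 1) reports matching sizes, while
demo_roundtrip(16, 0) reports a mismatch because the stale default length
is used on the free path.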
-Chris
jwa wrote:
> [ cross posted from help:
> http://opensolaris.org/jive/thread.jspa?threadID=65070 ]
>
>
> Hi there. I'm getting kernel panics, and I don't know why.
>
> I have a 6 drive SCSI multipack connected to an LSI Logic / Symbios Logic
> 53c875 (using the ncrs driver). The box itself is an older Dell 1600SC with
> 1.GB RAM. (32-bit Xeon). The box, SCSI card, and multipack have been rock
> solid for the past 7 years.
>
> I installed opensolaris 2008.05 (snv_86) and created a ZFS volume (raid 1+0)
> across the 6 drives. When I copy files across the network to the volume, the
> machine will eventually (anywhere between 5 minutes and 2 hours) panic.
>
> Interestingly, I have the same model card, another SCSI disk pack, and
> another machine (PowerEdge SC440, core2 duo). On this box, I'm also running
> opensolaris 2008.05. I get identical panics, whether using the 64 bit (glm?)
> driver or the 32 bit ncrs driver.
>
> I upgraded the Dell 1600SC to snv_91 in the hope that the problem would
> magically go away. It didn't :-(
>
> I added "set kmem_flags=0xf" to /etc/system & here's the most recent panic:
>
> Jun 26 21:31:03 barcelona genunix: [ID 478202 kern.notice] kernel memory
> allocator:
> Jun 26 21:31:03 barcelona genunix: [ID 432124 kern.notice] buffer freed to
> wrong cache
> Jun 26 21:31:03 barcelona genunix: [ID 815666 kern.notice] buffer was
> allocated from kmem_alloc_320,
> Jun 26 21:31:03 barcelona genunix: [ID 530907 kern.notice] caller attempting
> free to kmem_alloc_8.
> Jun 26 21:31:03 barcelona genunix: [ID 563406 kern.notice] buffer=e52c7400
> bufctl=e5279200 cache: kmem_alloc_8
> Jun 26 21:31:03 barcelona genunix: [ID 341866 kern.notice] previous
> transaction on buffer e52c7400:
> Jun 26 21:31:03 barcelona genunix: [ID 991227 kern.notice] thread=e12e7ce0
> time=T-0.013422618 slab=e509c088 cache: kmem_alloc_320
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice]
> kmem_cache_alloc_debug+258
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice] kmem_cache_alloc+8d
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice] kmem_zalloc+4b
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice]
> glm_pkt_alloc_extern+83
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice]
> glm_scsi_init_pkt+129
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice] scsi_init_pkt+48
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice]
> sd_initpkt_for_uscsi+9e
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice] sd_start_cmds+15f
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice] sd_core_iostart+158
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice]
> sd_uscsi_strategy+108
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice] default_physio+31b
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice] physio+1d
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice]
> scsi_uscsi_handle_cmd+16d
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice]
> sd_send_scsi_cmd+13f
> Jun 26 21:31:03 barcelona genunix: [ID 851371 kern.notice] sdioctl+c86
> Jun 26 21:31:03 barcelona unix: [ID 836849 kern.notice]
> Jun 26 21:31:03 barcelona ^Mpanic[cpu0]/thread=d391cde0:
> Jun 26 21:31:03 barcelona genunix: [ID 812275 kern.notice] kernel heap
> corruption detected
> Jun 26 21:31:03 barcelona unix: [ID 100000 kern.notice]
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cc20
> genunix:kmem_error+421 (6, d1024398, e52c74)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cc5c
> genunix:kmem_free+bf (e52c7400, 8)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cc78
> ncrs:glm_pkt_destroy_extern+60 (d7a77600, e9767388)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cc90
> ncrs:glm_scsi_destroy_pkt+42 (e97674a8, e97674a4)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cca8
> scsi:scsi_destroy_pkt+16 (e97674a4)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391ccc8
> sd:sd_destroypkt_for_uscsi+89 (d9365de0)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391ccf4
> sd:sd_return_command+124 (d4106a80, d9365de0)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cd28
> sd:sdintr+499 (e97674a4)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cd4c
> ncrs:glm_doneq_empty+3b (d7a77600)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cd60
> ncrs:glm_intr+75 (d7a77600, 0)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cdac
> unix:av_dispatch_autovect+69 (14)
> Jun 26 21:31:03 barcelona genunix: [ID 353471 kern.notice] d391cdcc
> unix:dispatch_hardint+1a (14, 0)
>
>
>
>
> [EMAIL PROTECTED]:/var/crash/barcelona# mdb -k unix.8 vmcore.8
> Loading modules: [ unix genunix specfs dtrace cpu.generic uppc pcplusmp
> scsi_vhci zfs mpt sd ip hook neti sctp arp usba fctl md lofs random sppp
> crypto ptm nfs fcip fcp cpc logindmux nsctl ii sdbc ufs rdc nsmb sv ]
>
> > ::status
>
> debugging crash dump vmcore.8 (32-bit) from barcelona
> operating system: 5.11 snv_91 (i86pc)
> panic message: kernel heap corruption detected
> dump content: kernel pages only
>
> > ::panicinfo
>
> cpu 0
> thread d391cde0
> message kernel heap corruption detected
> gs fec301b0
> fs fec30000
> es fec30160
> ds fec30160
> edi f
> esi e5279200
> ebp d391cbd4
> esp d391cbc4
> ebx e5279264
> edx 0
> ecx f
> eax d391cbe0
> trapno 0
> err 0
> eip fe838350
> cs fec30158
> eflags 282
> uesp 0
> ss fec30160
> gdt fe7fe00002cf
> idt fe7fd00007ff
> ldt 0
> task 150
> cr0 8005003b
> cr2 cfe23174
> cr3 24c0000
> cr4 6d8
>
> > $C
>
> d391cbd4 vpanic(fea67a08)
> d391cc20 kmem_error+0x421(6, d1024398, e52c7400)
> d391cc5c kmem_free+0xbf(e52c7400, 8)
> d391cc78 glm_pkt_destroy_extern+0x60(d7a77600, e9767388)
> d391cc90 glm_scsi_destroy_pkt+0x42(e97674a8, e97674a4)
> d391cca8 scsi_destroy_pkt+0x16(e97674a4)
> d391ccc8 sd_destroypkt_for_uscsi+0x89(d9365de0)
> d391ccf4 sd_return_command+0x124(d4106a80, d9365de0)
> d391cd28 sdintr+0x499(e97674a4)
> d391cd4c glm_doneq_empty+0x3b(d7a77600)
> d391cd60 glm_intr+0x75(d7a77600, 0)
> d391cdac av_dispatch_autovect+0x69(14)
> d391cdcc dispatch_hardint+0x1a(14, 0)
> d918bc6c switch_sp_and_call+0xf(d391cddc, fe8196c4, 14, 0)
> d918bca8 do_interrupt+0x7c(d918bcb8, f6c57c80)
> d918bcb8 _interrupt+0x59()
> d918bd38 bcopy+0x13(d42e8b68)
> d918bd60 zio_done+0x2a(d42e8b68)
> d918bd78 zio_execute+0x66()
> d918bdc8 taskq_thread+0x176(d547e388, 0)
> d918bdd8 thread_start+8()
>
> [EMAIL PROTECTED]:/var/crash/barcelona# modinfo | grep ncrs
> 163 f8c1c000 abb4 75 1 ncrs (NCRS SCSI HBA Driver 1.25)
>
>
> I've also booted off of the 2008.05 CD and tried to do I/O (mostly tars &
> copying large files around); it panics from there, too. So it's not some
> funny thing I've done to /etc/system or a /kernel/drv/*.conf file.
>
>
> Because this is affecting two different machines with two separate SCSI
> cards of the same model, I'm tempted to point the finger at the SCSI
> driver... but about two years ago, I put one of these SCSI cards in an older
> x86 box running Solaris 10 (01/06 I believe) as well as an Ultra 10 running
> 06/06 and it worked w/o panicking.
>
> Another tidbit: sometimes it panics when I run the 'format' command.
>
> Any suggestions?
>
> thanks,
> James
>
>
> This message posted from opensolaris.org
> _______________________________________________
> storage-discuss mailing list
> [email protected]
> http://mail.opensolaris.org/mailman/listinfo/storage-discuss