Re: BAD_SG_DMA panic in aha1542
Alan Cox wrote:
> The one I sent has a memory leak but it won't matter for basic testing.
> Or you can change the final bit to
>
>
> 	scsi_normalize_sense((char *)sense, sizeof(*sense), &sshdr);
>
> 	if (zebedee != cgc->buffer) {
> 		if (cgc->data_direction == DMA_FROM_DEVICE)
> 			memcpy(cgc->buffer, zebedee, cgc->buflen);
> 		kfree(zebedee);	/* Time for bed */
> 	}

I changed it, because I'll be living with this for a while I'd bet...
Works fine. No more BAD_SG_DMA() calls. Thanks!

--
---
Bob Tracy WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BAD_SG_DMA panic in aha1542
> That's as close as it gets without redoing everything from scratch.
> I'll give Alan's and James' patches a go within the next 13 hours.
>
> (Alan: what *else* would you name a variable associated with a bounce
> buffer besides Zebedee? Thanks for the occasion to smile...)

The one I sent has a memory leak but it won't matter for basic testing.
Or you can change the final bit to


	scsi_normalize_sense((char *)sense, sizeof(*sense), &sshdr);

	if (zebedee != cgc->buffer) {
		if (cgc->data_direction == DMA_FROM_DEVICE)
			memcpy(cgc->buffer, zebedee, cgc->buflen);
		kfree(zebedee);	/* Time for bed */
	}

Alan
--
	`I can hear you.`, said Florence. `It's not true. Noddy and I are
	just good friends.`
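
For anyone following along outside the kernel tree, the ordering in that
final bit (copy in before the transfer, copy out after it, then always free
the bounce buffer) can be sketched in user-space C. This is only an
illustration under stated assumptions: malloc() stands in for
kmalloc(..., GFP_KERNEL|GFP_DMA), so nothing here actually lands below
16 MB, and bounced_xfer(), xfer_fn, enum xfer_dir and fake_read() are
hypothetical names that do not exist in sr_ioctl.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's DMA direction values. */
enum xfer_dir { XFER_NONE, XFER_TO_DEVICE, XFER_FROM_DEVICE };

/* Hypothetical callback standing in for scsi_execute(): it only ever
 * touches the buffer it is handed. */
typedef int (*xfer_fn)(void *buf, size_t len);

/*
 * Run a transfer through a temporary bounce buffer when one is needed.
 * Returns -1 if the bounce buffer cannot be allocated, otherwise the
 * callback's result.
 */
static int bounced_xfer(void *caller_buf, size_t len, enum xfer_dir dir,
                        int need_bounce, xfer_fn do_xfer)
{
    void *bounce = caller_buf;
    int result;

    assert(do_xfer != NULL);

    if (len && need_bounce && dir != XFER_NONE) {
        bounce = malloc(len);              /* assumption: stands in for a GFP_DMA allocation */
        if (bounce == NULL)
            return -1;
        if (dir == XFER_TO_DEVICE)
            memcpy(bounce, caller_buf, len);   /* copy in before the transfer */
    }

    result = do_xfer(bounce, len);

    if (bounce != caller_buf) {
        if (dir == XFER_FROM_DEVICE)
            memcpy(caller_buf, bounce, len);   /* copy out after the transfer */
        free(bounce);                          /* released on every path: no leak */
    }
    return result;
}

/* Toy "device" that fills whatever buffer it is given with 0xAB. */
static int fake_read(void *buf, size_t len)
{
    memset(buf, 0xAB, len);
    return 0;
}
```

Calling bounced_xfer(data, sizeof data, XFER_FROM_DEVICE, 1, fake_read)
leaves the caller's buffer filled by the toy device even though the device
only ever saw the temporary copy, which is the whole point of the bounce.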
Re: BAD_SG_DMA panic in aha1542
Jens Axboe wrote:
> Try Alan's patch, it should fix it. As mentioned earlier in this thread,
> the real fix is to get rid of the cgc stuff and inject into the block
> layer from cdrom.c. But Alan's patch should work-around the issue for
> now.

Trying that patch is the next thing on my plate. For now, here's the
dmesg output James requested (aha1542 debug patch). Oddly enough, when
the smoke cleared, the mount succeeded and I could access the cd :-).

Initial state: SCSI cdrom drivers not loaded.
Command: "mount -t iso9660 /dev/scd0 /mnt/cdrom -r"

sr0: scsi3-mmc drive: 16x/16x writer cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 1:0:4:0: Attached scsi CD-ROM sr0
sgpnt[0:1] page c3489af0/0x3489af0 length 32
BUG: warning at drivers/scsi/aha1542.c:78/BAD_SG_DMA()
 [] aha1542_queuecommand+0x4a4/0x4ce [aha1542]
 [] scsi_done+0x0/0x16 [scsi_mod]
 [] scsi_dispatch_cmd+0x1b0/0x223 [scsi_mod]
 [] scsi_request_fn+0x22e/0x2ac [scsi_mod]
 [] __generic_unplug_device+0x1d/0x1f
 [] blk_execute_rq_nowait+0x64/0x6a
 [] blk_execute_rq+0x6e/0x8f
 [] blk_end_sync_rq+0x0/0x1d
 [] mempool_alloc+0x1c/0x97
 [] bio_phys_segments+0xe/0x14
 [] blk_rq_bio_prep+0x28/0x7c
 [] scsi_execute+0xc6/0xd9 [scsi_mod]
 [] sr_do_ioctl+0x80/0x1bd [sr_mod]
 [] scsi_set_medium_removal+0x43/0x67 [scsi_mod]
 [] sr_packet+0x1a/0x1f [sr_mod]
 [] cdrom_open+0x337/0x8ae [cdrom]
 [] wait_for_completion+0x5b/0x84
 [] default_wake_function+0x0/0xc
 [] call_usermodehelper_keys+0xa6/0xb2
 [] __call_usermodehelper+0x0/0x43
 [] request_module+0xc2/0xd0
 [] kobject_get+0xf/0x13
 [] sr_block_open+0x74/0x81 [sr_mod]
 [] do_open+0x8a/0x313
 [] scsi_request_fn+0x273/0x2ac [scsi_mod]
 [] io_schedule+0xe/0x16
 [] __wait_on_bit+0x50/0x58
 [] sync_buffer+0x0/0x2e
 [] sync_buffer+0x0/0x2e
 [] out_of_line_wait_on_bit+0x62/0x6a
 [] wake_bit_function+0x0/0x3c
 [] __wait_on_buffer+0x1c/0x1f
 [] __ext3_get_inode_loc+0x263/0x2b1 [ext3]
 [] d_splice_alias+0xa9/0xc3
 [] ext3_lookup+0x98/0xb8 [ext3]
 [] do_lookup+0x4f/0x135
 [] dput+0x1a/0x10b
 [] __link_path_walk+0xa5d/0xba8
 [] blkdev_get+0x55/0x60
 [] open_bdev_excl+0x32/0x6e
 [] get_sb_bdev+0x14/0x115
 [] isofs_get_sb+0x12/0x16 [isofs]
 [] isofs_fill_super+0x0/0x899 [isofs]
 [] vfs_kern_mount+0x88/0xfd
 [] do_kern_mount+0x26/0x36
 [] do_mount+0x589/0x5fb
 [] mntput_no_expire+0x11/0x59
 [] mntput_no_expire+0x11/0x59
 [] link_path_walk+0xaf/0xb9
 [] __handle_mm_fault+0x341/0x620
 [] do_path_lookup+0x195/0x1b5
 [] __handle_mm_fault+0x187/0x620
 [] get_page_from_freelist+0x6e/0x2bb
 [] __get_free_pages+0x25/0x3e
 [] copy_mount_options+0x27/0x10a
 [] sys_mount+0x6a/0xa2
 [] syscall_call+0x7/0xb
sgpnt[0:1] page c3489af0/0x3489af0 length 32
BUG: warning at drivers/scsi/aha1542.c:78/BAD_SG_DMA()
 [] aha1542_queuecommand+0x4a4/0x4ce [aha1542]
 [] scsi_done+0x0/0x16 [scsi_mod]
 [] scsi_dispatch_cmd+0x1b0/0x223 [scsi_mod]
 [] scsi_request_fn+0x22e/0x2ac [scsi_mod]
 [] blk_run_queue+0x2a/0x4b
 [] scsi_queue_insert+0x75/0x7d [scsi_mod]
 [] blk_done_softirq+0x4a/0x55
 [] __do_softirq+0x35/0x75
 [] do_softirq+0x22/0x26
 [] do_IRQ+0x48/0x50
 [] common_interrupt+0x1a/0x20
 [] scsi_dispatch_cmd+0x1b4/0x223 [scsi_mod]
 [] scsi_request_fn+0x22e/0x2ac [scsi_mod]
 [] __generic_unplug_device+0x1d/0x1f
 [] blk_execute_rq_nowait+0x64/0x6a
 [] blk_execute_rq+0x6e/0x8f
 [] blk_end_sync_rq+0x0/0x1d
 [] mempool_alloc+0x1c/0x97
 [] bio_phys_segments+0xe/0x14
 [] blk_rq_bio_prep+0x28/0x7c
 [] scsi_execute+0xc6/0xd9 [scsi_mod]
 [] sr_do_ioctl+0x80/0x1bd [sr_mod]
 [] scsi_set_medium_removal+0x43/0x67 [scsi_mod]
 [] sr_packet+0x1a/0x1f [sr_mod]
 [] cdrom_open+0x337/0x8ae [cdrom]
 [] wait_for_completion+0x5b/0x84
 [] default_wake_function+0x0/0xc
 [] call_usermodehelper_keys+0xa6/0xb2
 [] __call_usermodehelper+0x0/0x43
 [] request_module+0xc2/0xd0
 [] kobject_get+0xf/0x13
 [] sr_block_open+0x74/0x81 [sr_mod]
 [] do_open+0x8a/0x313
 [] scsi_request_fn+0x273/0x2ac [scsi_mod]
 [] io_schedule+0xe/0x16
 [] __wait_on_bit+0x50/0x58
 [] sync_buffer+0x0/0x2e
 [] sync_buffer+0x0/0x2e
 [] out_of_line_wait_on_bit+0x62/0x6a
 [] wake_bit_function+0x0/0x3c
 [] __wait_on_buffer+0x1c/0x1f
 [] __ext3_get_inode_loc+0x263/0x2b1 [ext3]
 [] d_splice_alias+0xa9/0xc3
 [] ext3_lookup+0x98/0xb8 [ext3]
 [] do_lookup+0x4f/0x135
 [] dput+0x1a/0x10b
 [] __link_path_walk+0xa5d/0xba8
 [] blkdev_get+0x55/0x60
 [] open_bdev_excl+0x32/0x6e
 [] get_sb_bdev+0x14/0x115
 [] isofs_get_sb+0x12/0x16 [isofs]
 [] isofs_fill_super+0x0/0x899 [isofs]
 [] vfs_kern_mount+0x88/0xfd
 [] do_kern_mount+0x26/0x36
 [] do_mount+0x589/0x5fb
 [] mntput_no_expire+0x11/0x59
 [] mntput_no_expire+0x11/0x59
 [] link_path_walk+0xaf/0xb9
 [] __handle_mm_fault+0x341/0x620
 [] do_path_lookup+0x195/0x1b5
 [] __handle_mm_fault+0x187/0x620
 [] get_page_from_freelist+0x6e/0x2bb
 [] __get_free_pages+0x25/0x3e
 [] copy_mount_options+0x27/0x10a
 [] sys_mount+0x6a/0xa2
 [] syscall_call+0x7/0xb
sgpnt[0:1] page c3489af0/0x3489af0 length 32
BUG: warning at drivers/scsi/aha1542.c:78/BAD_SG_DMA()
 [] aha1542_queuecom
Re: BAD_SG_DMA panic in aha1542
On Mon, Apr 30 2007, Bob Tracy wrote:
> rct wrote:
> > Apologies to all concerned for an unfortunate delay in resolving this.
> > (...)
> > I'll go retrieve a more conservatively-configured source tree (closer to
> > what DSL-N uses) and start over...
>
> Success with the Debian 2.6.18-4-486 build, which is known to work
> almost as well on the test platform as the 2.6.12 kernel that DSL-N
> comes with. I used an older compiler than Debian used for their
> production build, so I binary-patched the aha1542.ko and sr_mod.ko
> files so insmod wouldn't complain about different vermagic strings.
>
> That's as close as it gets without redoing everything from scratch.
> I'll give Alan's and James' patches a go within the next 13 hours.
>
> (Alan: what *else* would you name a variable associated with a bounce
> buffer besides Zebedee? Thanks for the occasion to smile...)

Try Alan's patch, it should fix it. As mentioned earlier in this thread,
the real fix is to get rid of the cgc stuff and inject into the block
layer from cdrom.c. But Alan's patch should work-around the issue for
now.

--
Jens Axboe
Re: BAD_SG_DMA panic in aha1542
rct wrote:
> Apologies to all concerned for an unfortunate delay in resolving this.
> (...)
> I'll go retrieve a more conservatively-configured source tree (closer to
> what DSL-N uses) and start over...

Success with the Debian 2.6.18-4-486 build, which is known to work
almost as well on the test platform as the 2.6.12 kernel that DSL-N
comes with. I used an older compiler than Debian used for their
production build, so I binary-patched the aha1542.ko and sr_mod.ko
files so insmod wouldn't complain about different vermagic strings.

That's as close as it gets without redoing everything from scratch.
I'll give Alan's and James' patches a go within the next 13 hours.

(Alan: what *else* would you name a variable associated with a bounce
buffer besides Zebedee? Thanks for the occasion to smile...)

--
---
Bob Tracy WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---
Re: BAD_SG_DMA panic in aha1542
On Mon, Apr 30 2007, Christoph Hellwig wrote:
> On Mon, Apr 30, 2007 at 07:32:45PM +0200, Jens Axboe wrote:
> > It's due to the crappy ->generic_packet() ioctl stuff, it bypasses the
> > block layer. So that needs to be converted to use block pc requests and
> > the block layer interface, then things will just work.
> >
> > Christoph had a sort-of ready patch for that some time ago. Christoph,
> > did that ever materialize into a full blown patch?
>
> I had a fullblown patch somewhere. The only drawback was that I didn't
> convert pcd because I neither understand enough of the hardware nor do
> I really understand what's going on there.

Cool, pass the patch and I'll try to see if I can convert pcd.

--
Jens Axboe
Re: BAD_SG_DMA panic in aha1542
On Mon, Apr 30, 2007 at 07:32:45PM +0200, Jens Axboe wrote:
> It's due to the crappy ->generic_packet() ioctl stuff, it bypasses the
> block layer. So that needs to be converted to use block pc requests and
> the block layer interface, then things will just work.
>
> Christoph had a sort-of ready patch for that some time ago. Christoph,
> did that ever materialize into a full blown patch?

I had a fullblown patch somewhere. The only drawback was that I didn't
convert pcd because I neither understand enough of the hardware nor do
I really understand what's going on there.
Re: BAD_SG_DMA panic in aha1542
On Fri, Apr 27 2007, James Bottomley wrote:
> > sgpnt[0:1] page c1ee5af0/0x1ee5af0 length 32
> > Kernel panic - not syncing: Buffer at physical address > 16 Mb used for
> > aha1542
> >
> > As before, no problems using the sda hard disk (which is the boot drive):
> > everything works reliably until I touch the cdrom drive.
> >
> > I'll be happy to assist with the debugging, but the system with the
> > aha1542 has no development facilities, i.e., I'll have to build test
> > kernels on a different system, and turnaround is going to be slow :-(.
>
> I'm interested.
>
> This is clearly a use_sg==1 path that has failed to bounce the buffer
> for some reason ... and I was contemplating eliminating the GFP_DMA from
> our sr driver because I thought the block bouncing had it covered.
>
> It might also be helpful to apply this patch. It should give a stack
> trace of the problem command and not immediately panic the box.

It's due to the crappy ->generic_packet() ioctl stuff, it bypasses the
block layer. So that needs to be converted to use block pc requests and
the block layer interface, then things will just work.

Christoph had a sort-of ready patch for that some time ago. Christoph,
did that ever materialize into a full blown patch?

--
Jens Axboe
Re: BAD_SG_DMA panic in aha1542
Apologies to all concerned for an unfortunate delay in resolving this.

I chose "unwisely" when I picked a popular experimental distro's 2.6.20
kernel source as a base for my troubleshooting efforts. The resulting
kernel panics when it tries to load the initial ramdisk, and I don't
have the patience to track down which of the turned-on-by-default
experimental configuration parameters might be causing the problem.

I'll go retrieve a more conservatively-configured source tree (closer to
what DSL-N uses) and start over...

--
---
Bob Tracy WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---
Re: BAD_SG_DMA panic in aha1542
Alan Cox wrote:
> > As before, no problems using the sda hard disk (which is the boot drive):
> > everything works reliably until I touch the cdrom drive.
>
> A little quiet contemplation and gnome number 387 suggests trying the
> following (and providing more detailed information such as the last
> message printed before the DMA message). Stuff a BUG() before the panic
> in BAD_DMA (aha1542.c) if needed to get a good trace.
>
> Please report success/failure/change.

Can do. I don't have access to the machine on weekends, so it will be
at least Monday before I can give this a whirl.

Thanks!

--
---
Bob Tracy WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---
Re: BAD_SG_DMA panic in aha1542
James Bottomley wrote:
> On Fri, 2007-04-27 at 16:47 -0500, Bob Tracy wrote:
> > I previously reported an ISA DMA issue for the 2.6.12 kernel. The issue
> > persists through at least 2.6.18. SCSI controller is an Adaptec
> > AHA-1542B (ISA).
> >
> > The action "mount -t iso9660 /dev/scd0 /mnt/cdrom -r"
> >
> > produces
> >
> > (cdrom detection messages as various modules autoload, then...)
>
> Knowing what these messages are would be helpful; it tells me what
> point in the initialisation it got to.

Sorry about that... I'm running the DSL-N distribution (based on
Knoppix), and having to transcribe the log messages by hand from the
console, i.e., there's no logfile to cut-and-paste from :-(. I don't
have access to the machine except on weekdays, but I'll repeat the
crash first thing Monday morning and copy everything that's there...

> I'm interested.
>
> This is clearly a use_sg==1 path that has failed to bounce the buffer
> for some reason ... and I was contemplating eliminating the GFP_DMA from
> our sr driver because I thought the block bouncing had it covered.
>
> It might also be helpful to apply this patch. It should give a stack
> trace of the problem command and not immediately panic the box.

I'll throw together a 2.6.21 kernel with this patch and give it a try.
Again, it will be at least Monday before you hear back from me on this.

Thanks!

--
---
Bob Tracy WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---
Re: BAD_SG_DMA panic in aha1542
On Fri, 2007-04-27 at 16:47 -0500, Bob Tracy wrote:
> I previously reported an ISA DMA issue for the 2.6.12 kernel. The issue
> persists through at least 2.6.18. SCSI controller is an Adaptec
> AHA-1542B (ISA).
>
> The action "mount -t iso9660 /dev/scd0 /mnt/cdrom -r"
>
> produces
>
> (cdrom detection messages as various modules autoload, then...)

Knowing what these messages are would be helpful; it tells me what
point in the initialisation it got to.

> sgpnt[0:1] page c1ee5af0/0x1ee5af0 length 32
> Kernel panic - not syncing: Buffer at physical address > 16 Mb used for
> aha1542
>
> As before, no problems using the sda hard disk (which is the boot drive):
> everything works reliably until I touch the cdrom drive.
>
> I'll be happy to assist with the debugging, but the system with the
> aha1542 has no development facilities, i.e., I'll have to build test
> kernels on a different system, and turnaround is going to be slow :-(.

I'm interested.

This is clearly a use_sg==1 path that has failed to bounce the buffer
for some reason ... and I was contemplating eliminating the GFP_DMA from
our sr driver because I thought the block bouncing had it covered.

It might also be helpful to apply this patch. It should give a stack
trace of the problem command and not immediately panic the box.

Thanks,

James

diff --git a/drivers/scsi/aha1542.c b/drivers/scsi/aha1542.c
index 1d239f6..4ee7d99 100644
--- a/drivers/scsi/aha1542.c
+++ b/drivers/scsi/aha1542.c
@@ -75,7 +75,7 @@ static void BAD_SG_DMA(Scsi_Cmnd * SCpnt,
 	/*
 	 * Not safe to continue.
 	 */
-	panic("Buffer at physical address > 16Mb used for aha1542");
+	WARN_ON(1);
 }
 
 #include
@@ -725,8 +725,12 @@ static int aha1542_queuecommand(Scsi_Cmnd * SCpnt, void (*done) (Scsi_Cmnd *))
 				panic("Fod fight!");
 			};
 			any2scsi(cptr[i].dataptr, SCSI_SG_PA(&sgpnt[i]));
-			if (SCSI_SG_PA(&sgpnt[i]) + sgpnt[i].length - 1 > ISA_DMA_THRESHOLD)
+			if (SCSI_SG_PA(&sgpnt[i]) + sgpnt[i].length - 1 > ISA_DMA_THRESHOLD) {
 				BAD_SG_DMA(SCpnt, sgpnt, SCpnt->use_sg, i);
+				SCpnt->result = DID_ERROR << 16;
+				done(SCpnt);
+				return 0;
+			}
 			any2scsi(cptr[i].datalen, sgpnt[i].length);
 		};
 		any2scsi(ccb[mbo].datalen, SCpnt->use_sg * sizeof(struct chain));
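
The test that this patch turns from a panic into a failed command is plain
address arithmetic: the last byte of each scatter-gather segment must not
lie above the 16 MB ISA limit. A rough user-space sketch follows, assuming
the i386 value 0x00ffffff for the threshold. struct sg_entry,
first_unreachable() and queue_command() are made up for illustration and
are not the kernel's structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 16 MB - 1, the highest physical address reachable with the
 * 24 address lines available to an ISA bus master. */
#define ISA_DMA_LIMIT 0x00ffffffu

/* Hypothetical, simplified scatter-gather entry (not the kernel struct). */
struct sg_entry {
    uint32_t phys_addr;
    uint32_t length;
};

/* Return the index of the first entry whose last byte lies above the
 * limit, or -1 if every entry is reachable. */
static int first_unreachable(const struct sg_entry *sg, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        assert(sg[i].length > 0);
        /* Widen to 64 bits so the sum cannot wrap around. */
        uint64_t last = (uint64_t)sg[i].phys_addr + sg[i].length - 1;
        if (last > ISA_DMA_LIMIT)
            return (int)i;
    }
    return -1;
}

/* Mirror the patched control flow: report an error instead of aborting. */
static int queue_command(const struct sg_entry *sg, size_t count)
{
    if (first_unreachable(sg, count) >= 0)
        return -1;   /* the caller sees a failed command; the machine stays up */
    return 0;
}
```

Fed the segment from the original report (physical address 0x1ee5af0,
length 32), first_unreachable() flags entry 0, since that address is
roughly 30.9 MB into physical memory.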
Re: BAD_SG_DMA panic in aha1542
> As before, no problems using the sda hard disk (which is the boot drive):
> everything works reliably until I touch the cdrom drive.

A little quiet contemplation and gnome number 387 suggests trying the
following (and providing more detailed information such as the last
message printed before the DMA message). Stuff a BUG() before the panic
in BAD_DMA (aha1542.c) if needed to get a good trace.

Please report success/failure/change.

--- drivers/scsi/sr_ioctl.c~	2007-04-27 22:53:33.885035256 +0100
+++ drivers/scsi/sr_ioctl.c	2007-04-27 22:53:33.885035256 +0100
@@ -187,9 +187,10 @@
 	struct scsi_sense_hdr sshdr;
 	int result, err = 0, retries = 0;
 	struct request_sense *sense = cgc->sense;
-
+	void *zebedee = cgc->buffer;
+
 	SDev = cd->device;
-
+
 	if (!sense) {
 		sense = kmalloc(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL);
 		if (!sense) {
@@ -197,7 +198,22 @@
 			goto out;
 		}
 	}
-
+
+	if (cgc->buflen && cd->device->host->unchecked_isa_dma) {
+		switch(cgc->data_direction) {
+		case DMA_NONE:
+			break;
+		case DMA_FROM_DEVICE:
+		case DMA_TO_DEVICE:	/* Boing said Zebedee */
+			zebedee = kmalloc(cgc->buflen, GFP_KERNEL|GFP_DMA);
+			if (zebedee == NULL) {
+				err = -ENOMEM;
+				goto out;
+			}
+		}
+		if (cgc->data_direction == DMA_TO_DEVICE)
+			memcpy(zebedee, cgc->buffer, cgc->buflen);
+	}
       retry:
 	if (!scsi_block_when_processing_errors(SDev)) {
 		err = -ENODEV;
@@ -206,10 +222,13 @@
 
 	memset(sense, 0, sizeof(*sense));
 	result = scsi_execute(SDev, cgc->cmd, cgc->data_direction,
-			      cgc->buffer, cgc->buflen, (char *)sense,
+			      zebedee, cgc->buflen, (char *)sense,
 			      cgc->timeout, IOCTL_RETRIES, 0);
 
 	scsi_normalize_sense((char *)sense, sizeof(*sense), &sshdr);
+
+	if (zebedee != cgc->buffer && cgc->data_direction == DMA_FROM_DEVICE)
+		memcpy(cgc->buffer, zebedee, cgc->buflen);
 
 	/* Minimal error checking. Ignore cases we know about, and report the rest. */
 	if (driver_byte(result) != 0) {
BAD_SG_DMA panic in aha1542
I previously reported an ISA DMA issue for the 2.6.12 kernel. The issue
persists through at least 2.6.18. SCSI controller is an Adaptec
AHA-1542B (ISA).

The action "mount -t iso9660 /dev/scd0 /mnt/cdrom -r"

produces

(cdrom detection messages as various modules autoload, then...)
sgpnt[0:1] page c1ee5af0/0x1ee5af0 length 32
Kernel panic - not syncing: Buffer at physical address > 16 Mb used for
aha1542

As before, no problems using the sda hard disk (which is the boot drive):
everything works reliably until I touch the cdrom drive.

I'll be happy to assist with the debugging, but the system with the
aha1542 has no development facilities, i.e., I'll have to build test
kernels on a different system, and turnaround is going to be slow :-(.

Thanks in advance for helping me get this old machine working again.
No issues with 2.4 kernels. I have no idea about 2.5 kernels and 2.6
kernels prior to 2.6.12. As for why I didn't report this before now,
the aha1542b was in my parts bin until I cobbled a system together
approx. two weeks ago, mostly to see if a useful system could still
be had using legacy hardware and modern GNU/Linux software. I'm happy
to report the answer is mostly "yes".

--
---
Bob Tracy WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---