Re: BUG: Null pointer dereference in fs/open.c

2007-04-26 Thread Jens Axboe
On Thu, Apr 26 2007, William Heimbigner wrote:
> On Wed, 25 Apr 2007, Andrew Morton wrote:
> >On Wed, 25 Apr 2007 22:53:00 + (GMT) William Heimbigner 
> ><[EMAIL PROTECTED]> wrote:
> >
> >>On Wed, 25 Apr 2007, Andrew Morton wrote:
> >>
> >>>OK.  I am able to use the pktcdvd driver OK in mainline with a piix/sata
> >>>drive.  It could be that something is going wrong at the IDE level for 
> >>>you.
> >>Perhaps; I'll try an external usb cd burner, and see where that goes.
> >>
> >>>Are you able to identify the most recent kernel which actually worked?
> >>No, because I haven't set packet writing up in Linux before - however, I
> >>do know that I've successfully set up packet writing (using 2 of the 3 cd
> >>burners I have) in another operating system before. I'll try 2.6.18 and
> >>see if that gets me anywhere different, though.
> >
> >OK.
> >
> >A quick summary: mainline's pktcdvd isn't working for William using IDE.
> >It is working for me using sata.
> >
> 
> >
> >So what has happened here is that this code, in ide-cd.c's
> >cdrom_decode_status() is now triggering:
> >
> >	} else if (blk_pc_request(rq) || rq->cmd_type == REQ_TYPE_ATA_PC) {
> >		/* All other functions, except for READ. */
> >		unsigned long flags;
> >
> >		/*
> >		 * if we have an error, pass back CHECK_CONDITION as the
> >		 * scsi status byte
> >		 */
> >		if (blk_pc_request(rq) && !rq->errors)
> >			rq->errors = SAM_STAT_CHECK_CONDITION;
> >
> >
> >I suspect this is a bug introduced by
> >406c9b605cbc45151c03ac9a3f95e9acf050808c (in which case it'll be the third
> >bug so far).
> >
> >Perhaps the IDE driver was previously not considering these requests to be
> >of type blk_pc_request(), and after
> >406c9b605cbc45151c03ac9a3f95e9acf050808c it _is_ treating them as
> >blk_pc_request() and is incorrectly reporting an error.  Or something like
> >that.
> >
> >Guys: help!
> >
> A follow-up: after looking around a bit, I have managed to get packet 
> writing to work properly on /dev/hdc (before, it was reporting only 1.8 MB 
> available or so; this was a formatting issue).
> I've also gotten the external cd-rw drive to work. However, I'm still at a 
> loss as to why /dev/hdd won't work. I tried formatting a dvd-rw for this 
> drive, however, it consistently gives me:
> [27342.503933] drivers/ide/ide-cd.c:729: setting error to 2
> [27342.509251]  [<c010521a>] show_trace_log_lvl+0x1a/0x30
> [27342.514411]  [<c0105952>] show_trace+0x12/0x20
> [27342.518864]  [<c0105a46>] dump_stack+0x16/0x20
> [27342.523317]  [<c033f6e4>] cdrom_decode_status+0x1f4/0x3b0
> [27342.528732]  [<c033fae8>] cdrom_newpc_intr+0x38/0x320
> [27342.533791]  [<c0331106>] ide_intr+0x96/0x200
> [27342.538157]  [<c0150cf8>] handle_IRQ_event+0x28/0x60
> [27342.543139]  [<c0151f96>] handle_edge_irq+0xa6/0x130
> [27342.548121]  [<c0106449>] do_IRQ+0x49/0xa0
> [27342.552228]  [<c0104c3a>] common_interrupt+0x2e/0x34
> [27342.557200]  [<c01022d2>] mwait_idle+0x12/0x20
> [27342.561653]  [<c01023ca>] cpu_idle+0x4a/0x80
> [27342.565934]  [<c0101147>] rest_init+0x37/0x40
> [27342.570300]  [<c068ac7b>] start_kernel+0x34b/0x420
> [27342.575109]  [<>] 0x0
> [27342.578089]  ===
> and doesn't work (the above output was generated by Andrew's patch to log 
> certain areas).
> 
> # dvd+rw-format /dev/hdd -force
> * BD/DVDRW/-RAM format utility by <[EMAIL PROTECTED]>, version 7.0.
> :-( failed to locate "Quick Format" descriptor.
> * 4.7GB DVD-RW media in Sequential mode detected.
> * formatting 0.0\:-[ READ TRACK INFORMATION failed with 
> SK=3h/ASC=11h/ACQ=05h]: Input/output error

That's an uncorrectable read error. Is the media good?

> I tried putting in a different dvd-rw, and this time I get:
> # dvd+rw-format /dev/hdd -force
> * BD/DVDRW/-RAM format utility by <[EMAIL PROTECTED]>, version 7.0.
> * 4.7GB DVD-RW media in Sequential mode detected.
> * formatting 0.0|:-[ FORMAT UNIT failed with SK=5h/ASC=26h/ACQ=00h]: 
> Input/output error

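That's the drive rejecting the command as an illegal request: ASC 26h/00h
means an invalid field in the parameter list sent with the command (an
invalid field in the command descriptor block itself would be 24h/00h).
That's usually a bug in the issuer.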
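
For readers following along, the two SK/ASC/ASCQ triples printed by
dvd+rw-format above can be looked up in the SPC additional sense code table.
A minimal sketch of such a lookup follows; the table holds only the two
triples seen in this thread, and describe_sense() is a hypothetical helper
name, not part of dvd+rw-tools or the kernel.

```c
#include <stddef.h>

/* One (sense key, ASC, ASCQ) triple and its SPC description. */
struct sense_entry {
	unsigned char sk, asc, ascq;
	const char *text;
};

/* Only the two triples reported in this thread; not a complete table. */
static const struct sense_entry sense_table[] = {
	{ 0x3, 0x11, 0x05, "MEDIUM ERROR: L-EC uncorrectable error" },
	{ 0x5, 0x26, 0x00, "ILLEGAL REQUEST: invalid field in parameter list" },
};

/* Hypothetical helper: returns a description, or NULL if the triple is not listed. */
static const char *describe_sense(unsigned char sk, unsigned char asc,
				  unsigned char ascq)
{
	size_t i;

	for (i = 0; i < sizeof(sense_table) / sizeof(sense_table[0]); i++) {
		const struct sense_entry *e = &sense_table[i];

		if (e->sk == sk && e->asc == asc && e->ascq == ascq)
			return e->text;
	}
	return NULL;
}
```

Calling describe_sense(0x3, 0x11, 0x05) covers the first disc's READ TRACK
INFORMATION failure, and describe_sense(0x5, 0x26, 0x00) the second disc's
FORMAT UNIT failure.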

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: Null pointer dereference in fs/open.c

2007-04-26 Thread Jens Axboe
On Wed, Apr 25 2007, Andrew Morton wrote:
> > # pktsetup 2 /dev/sr0
> > [19982.934793] drivers/scsi/scsi_lib.c:838: setting error to 134217730
> > [19982.941070]  [<c010521a>] show_trace_log_lvl+0x1a/0x30
> > [19982.946256]  [<c0105952>] show_trace+0x12/0x20
> > [19982.950744]  [<c0105a46>] dump_stack+0x16/0x20
> > [19982.955232]  [<c034543a>] scsi_io_completion+0x28a/0x3a0
> > [19982.960586]  [<c034556b>] scsi_blk_pc_done+0x1b/0x30
> > [19982.965594]  [<c0340d0c>] scsi_finish_command+0x4c/0x60
> > [19982.970861]  [<c0345c07>] scsi_softirq_done+0x77/0xe0
> > [19982.975955]  [<c0257f8b>] blk_done_softirq+0x6b/0x80
> > [19982.980962]  [<c01243a2>] __do_softirq+0x62/0xc0
> > [19982.985624]  [<c0124455>] do_softirq+0x55/0x60
> > [19982.990112]  [<c0124be5>] ksoftirqd+0x65/0x100
> > [19982.994599]  [<c0132963>] kthread+0xa3/0xd0
> > [19982.998827]  [<c0104e17>] kernel_thread_helper+0x7/0x10
> > [19983.004095]  ===
> > [19983.009065] cdrom: This disc doesn't have any tracks I recognize!
> 
> So what has happened here is that this code, in ide-cd.c's
> cdrom_decode_status() is now triggering:
> 
>	} else if (blk_pc_request(rq) || rq->cmd_type == REQ_TYPE_ATA_PC) {
>		/* All other functions, except for READ. */
>		unsigned long flags;
>
>		/*
>		 * if we have an error, pass back CHECK_CONDITION as the
>		 * scsi status byte
>		 */
>		if (blk_pc_request(rq) && !rq->errors)
>			rq->errors = SAM_STAT_CHECK_CONDITION;
> 
> 
> I suspect this is a bug introduced by
> 406c9b605cbc45151c03ac9a3f95e9acf050808c (in which case it'll be the third
> bug so far).
> 
> Perhaps the IDE driver was previously not considering these requests to be
> of type blk_pc_request(), and after
> 406c9b605cbc45151c03ac9a3f95e9acf050808c it _is_ treating them as
> blk_pc_request() and is incorrectly reporting an error.  Or something like
> that.

But it IS a block pc request. We've been setting the SAM status in
->errors for block pc requests for a long time.

> Guys: help!

I'm not sure what your question is. If someone can sum up what goes
wrong and what the expected result is, I can try to help.
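
As an aside on the numbers in the debug output: the "setting error to
134217730" logged from scsi_lib.c is 0x08000002, and the "setting error to 2"
logged from ide-cd.c is 0x02. A rough sketch of how a SCSI midlayer result
word splits into bytes is below. The struct and split_result() are
hypothetical names for illustration; the two constants carry the values the
kernel headers use for SAM_STAT_CHECK_CONDITION and DRIVER_SENSE.

```c
/* Byte layout of a Linux SCSI midlayer result word. */
struct scsi_result {
	unsigned int status;	/* SAM status byte, bits 0-7 */
	unsigned int msg;	/* message byte, bits 8-15 */
	unsigned int host;	/* host byte, bits 16-23 */
	unsigned int driver;	/* driver byte, bits 24-31 */
};

#define SAM_STAT_CHECK_CONDITION 0x02	/* same value as the kernel constant */
#define DRIVER_SENSE 0x08		/* same value as the kernel constant */

/* Hypothetical helper: split a result word into its four bytes. */
static struct scsi_result split_result(unsigned int result)
{
	struct scsi_result r;

	r.status = result & 0xff;
	r.msg = (result >> 8) & 0xff;
	r.host = (result >> 16) & 0xff;
	r.driver = (result >> 24) & 0xff;
	return r;
}
```

Under that layout, 134217730 carries CHECK CONDITION in the status byte and
DRIVER_SENSE in the driver byte, so both traces report the same SAM status.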

-- 
Jens Axboe



Re: BUG: Null pointer dereference in fs/open.c

2007-04-25 Thread William Heimbigner

On Wed, 25 Apr 2007, Andrew Morton wrote:
> On Wed, 25 Apr 2007 22:53:00 + (GMT) William Heimbigner <[EMAIL PROTECTED]>
> wrote:
>
>> On Wed, 25 Apr 2007, Andrew Morton wrote:
>> <snip>
>>> OK.  I am able to use the pktcdvd driver OK in mainline with a piix/sata
>>> drive.  It could be that something is going wrong at the IDE level for you.
>> Perhaps; I'll try an external usb cd burner, and see where that goes.
>>
>>> Are you able to identify the most recent kernel which actually worked?
>> No, because I haven't set packet writing up in Linux before - however, I do know
>> that I've successfully set up packet writing (using 2 of the 3 cd burners I
>> have) in another operating system before. I'll try 2.6.18 and see if that gets
>> me anywhere different, though.
>
> OK.
>
> A quick summary: mainline's pktcdvd isn't working for William using IDE.
> It is working for me using sata.
>
> <snip>
>
> So what has happened here is that this code, in ide-cd.c's
> cdrom_decode_status() is now triggering:
>
>	} else if (blk_pc_request(rq) || rq->cmd_type == REQ_TYPE_ATA_PC) {
>		/* All other functions, except for READ. */
>		unsigned long flags;
>
>		/*
>		 * if we have an error, pass back CHECK_CONDITION as the
>		 * scsi status byte
>		 */
>		if (blk_pc_request(rq) && !rq->errors)
>			rq->errors = SAM_STAT_CHECK_CONDITION;
>
>
> I suspect this is a bug introduced by
> 406c9b605cbc45151c03ac9a3f95e9acf050808c (in which case it'll be the third
> bug so far).
>
> Perhaps the IDE driver was previously not considering these requests to be
> of type blk_pc_request(), and after
> 406c9b605cbc45151c03ac9a3f95e9acf050808c it _is_ treating them as
> blk_pc_request() and is incorrectly reporting an error.  Or something like
> that.
>
> Guys: help!

A follow-up: after looking around a bit, I have managed to get packet writing to 
work properly on /dev/hdc (before, it was reporting only 1.8 MB available or so; 
this was a formatting issue).
I've also gotten the external cd-rw drive to work. However, I'm still at a loss 
as to why /dev/hdd won't work. I tried formatting a dvd-rw for this drive, 
however, it consistently gives me:

[27342.503933] drivers/ide/ide-cd.c:729: setting error to 2
[27342.509251]  [<c010521a>] show_trace_log_lvl+0x1a/0x30
[27342.514411]  [<c0105952>] show_trace+0x12/0x20
[27342.518864]  [<c0105a46>] dump_stack+0x16/0x20
[27342.523317]  [<c033f6e4>] cdrom_decode_status+0x1f4/0x3b0
[27342.528732]  [<c033fae8>] cdrom_newpc_intr+0x38/0x320
[27342.533791]  [<c0331106>] ide_intr+0x96/0x200
[27342.538157]  [<c0150cf8>] handle_IRQ_event+0x28/0x60
[27342.543139]  [<c0151f96>] handle_edge_irq+0xa6/0x130
[27342.548121]  [<c0106449>] do_IRQ+0x49/0xa0
[27342.552228]  [<c0104c3a>] common_interrupt+0x2e/0x34
[27342.557200]  [<c01022d2>] mwait_idle+0x12/0x20
[27342.561653]  [<c01023ca>] cpu_idle+0x4a/0x80
[27342.565934]  [<c0101147>] rest_init+0x37/0x40
[27342.570300]  [<c068ac7b>] start_kernel+0x34b/0x420
[27342.575109]  [<>] 0x0
[27342.578089]  ===
and doesn't work (the above output was generated by Andrew's patch to log 
certain areas).


# dvd+rw-format /dev/hdd -force
* BD/DVDRW/-RAM format utility by <[EMAIL PROTECTED]>, version 7.0.
:-( failed to locate "Quick Format" descriptor.
* 4.7GB DVD-RW media in Sequential mode detected.
* formatting 0.0\:-[ READ TRACK INFORMATION failed with SK=3h/ASC=11h/ACQ=05h]: 
Input/output error

I tried putting in a different dvd-rw, and this time I get:
# dvd+rw-format /dev/hdd -force
* BD/DVDRW/-RAM format utility by <[EMAIL PROTECTED]>, version 7.0.
* 4.7GB DVD-RW media in Sequential mode detected.
* formatting 0.0|:-[ FORMAT UNIT failed with SK=5h/ASC=26h/ACQ=00h]: 
Input/output error

William Heimbigner
[EMAIL PROTECTED]


Re: BUG: Null pointer dereference in fs/open.c

2007-04-25 Thread Andrew Morton
On Wed, 25 Apr 2007 22:53:00 + (GMT) William Heimbigner <[EMAIL PROTECTED]> 
wrote:

> On Wed, 25 Apr 2007, Andrew Morton wrote:
> 
> > OK.  I am able to use the pktcdvd driver OK in mainline with a piix/sata
> > drive.  It could be that something is going wrong at the IDE level for you.
> Perhaps; I'll try an external usb cd burner, and see where that goes.
> 
> > Are you able to identify the most recent kernel which actually worked?
> No, because I haven't set packet writing up in Linux before - however, I do
> know that I've successfully set up packet writing (using 2 of the 3 cd
> burners I have) in another operating system before. I'll try 2.6.18 and see
> if that gets me anywhere different, though.

OK.

A quick summary: mainline's pktcdvd isn't working for William using IDE. 
It is working for me using sata.

> dmesg.1.txt is the dmesg output from immediately after system finishes
> booting (the unusually large printk times are due to kexec)
> 
> # pktsetup 0 /dev/hdc
> [19861.831160] pktcdvd: writer pktcdvd0 mapped to hdc
> [19861.837138]
> [19861.837142] =
> [19861.844343] [ INFO: possible recursive locking detected ]
> [19861.849738] 2.6.21-rc7 #2
> [19861.852361] -
> [19861.857750] vol_id/4433 is trying to acquire lock:
> [19861.862533]  (&bdev->bd_mutex){--..}, at: [<c019bb8f>] do_open+0x4f/0x2c0
> [19861.869386]
> [19861.869387] but task is already holding lock:
> [19861.875225]  (&bdev->bd_mutex){--..}, at: [<c019bb8f>] do_open+0x4f/0x2c0
> [19861.882070]
> [19861.882071] other info that might help us debug this:
> [19861.888602] 2 locks held by vol_id/4433:
> [19861.892518]  #0:  (&bdev->bd_mutex){--..}, at: [<c019bb8f>] do_open+0x4f/0x2c0
> [19861.899813]  #1:  (&ctl_mutex#2){--..}, at: [<c04c615c>] mutex_lock+0x1c/0x20
> [19861.907046]
> [19861.907047] stack backtrace:
> [19861.911415]  [<c010521a>] show_trace_log_lvl+0x1a/0x30
> [19861.916569]  [<c0105952>] show_trace+0x12/0x20
> [19861.921021]  [<c0105a46>] dump_stack+0x16/0x20
> [19861.925475]  [<c013ede0>] __lock_acquire+0xbc0/0x1040
> [19861.930542]  [<c013f2d0>] lock_acquire+0x70/0x90
> [19861.935169]  [<c04c61de>] mutex_lock_nested+0x7e/0x2e0
> [19861.940315]  [<c019bb8f>] do_open+0x4f/0x2c0
> [19861.944595]  [<c019be79>] __blkdev_get+0x79/0x90
> [19861.949222]  [<c019bea5>] blkdev_get+0x15/0x20
> [19861.953674]  [<c032a987>] pkt_open+0xb7/0xd80
> [19861.958050]  [<c019bbc5>] do_open+0x85/0x2c0
> [19861.962330]  [<c019c023>] blkdev_open+0x33/0x70
> [19861.966870]  [<c0175084>] __dentry_open+0xf4/0x220
> [19861.971678]  [<c0175255>] nameidata_to_filp+0x35/0x40
> [19861.976738]  [<c01752a9>] do_filp_open+0x49/0x50
> [19861.981365]  [<c01752f7>] do_sys_open+0x47/0xd0
> [19861.985904]  [<c01753bc>] sys_open+0x1c/0x20
> [19861.990184]  [<c01041c6>] sysenter_past_esp+0x5f/0x99
> [19861.995243]  ===

Yeah, I'm not worrying about that one for now.

> # pktsetup 1 /dev/hdd
> [19909.635795] cdrom: This disc doesn't have any tracks I recognize!
> [19909.689394] pktcdvd: writer pktcdvd1 mapped to hdd
> [19909.820337] drivers/ide/ide-cd.c:729: setting error to 2
> [19909.825649]  [<c010521a>] show_trace_log_lvl+0x1a/0x30
> [19909.830810]  [<c0105952>] show_trace+0x12/0x20
> [19909.835263]  [<c0105a46>] dump_stack+0x16/0x20
> [19909.839716]  [<c033f6e4>] cdrom_decode_status+0x1f4/0x3b0
> [19909.845131]  [<c033fae8>] cdrom_newpc_intr+0x38/0x320
> [19909.850190]  [<c0331106>] ide_intr+0x96/0x200
> [19909.854557]  [<c0150cf8>] handle_IRQ_event+0x28/0x60
> [19909.859538]  [<c0151f96>] handle_edge_irq+0xa6/0x130
> [19909.864511]  [<c0106449>] do_IRQ+0x49/0xa0
> [19909.868618]  [<c0104c3a>] common_interrupt+0x2e/0x34
> [19909.873591]  [<c01022d2>] mwait_idle+0x12/0x20
> [19909.878044]  [<c01023ca>] cpu_idle+0x4a/0x80
> [19909.882324]  [<c0101147>] rest_init+0x37/0x40
> [19909.886690]  [<c068ac7b>] start_kernel+0x34b/0x420
> [19909.891499]  [<>] 0x0
> [19909.894488]  ===
> [19909.921518] pktcdvd: pkt_get_last_written failed
> 
> # pktsetup 2 /dev/sr0
> [19982.934793] drivers/scsi/scsi_lib.c:838: setting error to 134217730
> [19982.941070]  [<c010521a>] show_trace_log_lvl+0x1a/0x30
> [19982.946256]  [<c0105952>] show_trace+0x12/0x20
> [19982.950744]  [<c0105a46>] dump_stack+0x16/0x20
> [19982.955232]  [<c034543a>] scsi_io_completion+0x28a/0x3a0
> [19982.960586]  [<c034556b>] scsi_blk_pc_done+0x1b/0x30
> [19982.965594]  [<c0340d0c>] scsi_finish_command+0x4c/0x60
> [19982.970861]  [<c0345c07>] scsi_softirq_done+0x77/0xe0
> [19982.975955]  [<c0257f8b>] blk_done_softirq+0x6b/0x80
> [19982.980962]  [<c01243a2>] __do_softirq+0x62/0xc0
> [19982.985624]  [<c0124455>] do_softirq+0x55/0x60
> [19982.990112]  [<c0124be5>] ksoftirqd+0x65/0x100
> [19982.994599]  [<c0132963>] kthread+0xa3/0xd0
> [19982.998827]  [<c0104e17>] kernel_thread_helper+0x7/0x10
> [19983.004095]  ===
> [19983.009065] cdrom: This disc doesn't have any tracks I recognize!

So what has happened here is that this code, in ide-cd.c's
cdrom_decode_status() is now triggering:

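	} else if (blk_pc_request(rq) || rq->cmd_type == REQ_TYPE_ATA_PC) {
		/* All other functions, except for READ. */
		unsigned long flags;

		/*
		 * if we have an error, pass back CHECK_CONDITION as the
		 * scsi status byte
		 */
		if (blk_pc_request(rq) && !rq->errors)
			rq->errors = SAM_STAT_CHECK_CONDITION;


I suspect this is a bug introduced by
406c9b605cbc45151c03ac9a3f95e9acf050808c (in which case it'll be the third
bug so far).

Perhaps the IDE driver was previously not considering these requests to be
of type blk_pc_request(), and after
406c9b605cbc45151c03ac9a3f95e9acf050808c it _is_ treating them as
blk_pc_request() and is incorrectly reporting an error.  Or something like
that.

Guys: help!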

Re: BUG: Null pointer dereference in fs/open.c

2007-04-25 Thread Andrew Morton
On Tue, 24 Apr 2007 05:44:42 + (GMT) William Heimbigner <[EMAIL PROTECTED]> 
wrote:

> On Mon, 23 Apr 2007, Andrew Morton wrote:
> > On Tue, 24 Apr 2007 05:10:04 + (GMT) William Heimbigner
> > <[EMAIL PROTECTED]> wrote:
> >
> >>> --- a/drivers/block/pktcdvd.c~packet-fix-error-handling
> >>> +++ a/drivers/block/pktcdvd.c
> >>> @@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
> >>>   rq->cmd_flags |= REQ_QUIET;
> >>>
> >>>   blk_execute_rq(rq->q, pd->bdev->bd_disk, rq, 0);
> >>> - ret = rq->errors;
> >>> + if (rq->errors)
> >>> + ret = -EIO;
> >>> out:
> >>>   blk_put_request(rq);
> >>>   return ret;
> >>> _
> >>
> >> This patch fixes (or conceals?) the oops.
> >>
> >
> > Fixes.  But does the packet driver actually work OK for you?  Writes
> > files and stuff like that?
> >
> Short answer, no.
> 
> Long answer:
> # pktsetup 0 /dev/hdc
> 
> ...
>
> # mkudffs /dev/pktcdvd/0
> [11539.953560] pktcdvd: pkt_get_last_written failed
> trying to change type of multiple extents
> 
> I get the same error with /dev/hdd as well (hdc and hdd are both dvd 
> burners, hdd has a cd-rw and hdc had a dvd-rw)
> 

OK.  I am able to use the pktcdvd driver OK in mainline with a piix/sata
drive.  It could be that something is going wrong at the IDE level for you.

Are you able to identify the most recent kernel which actually worked?

Here's the bugfix which we already worked out:

--- a/drivers/block/pktcdvd.c~packet-fix-error-handling
+++ a/drivers/block/pktcdvd.c
@@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
 	rq->cmd_flags |= REQ_QUIET;
 
 	blk_execute_rq(rq->q, pd->bdev->bd_disk, rq, 0);
-	ret = rq->errors;
+	if (rq->errors)
+		ret = -EIO;
 out:
 	blk_put_request(rq);
 	return ret;
_
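
To spell out what the fix changes: instead of handing back the raw
rq->errors value, which per the traces elsewhere in this thread can be a
positive status such as 2 or 134217730, the function now returns -EIO for
any nonzero value. A standalone sketch of just that mapping is below;
status_to_errno() is a hypothetical name for illustration and is not kernel
code.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the patched tail of pkt_generic_packet():
 * any nonzero request status becomes -EIO, otherwise ret is left alone.
 */
static int status_to_errno(int ret, int rq_errors)
{
	if (rq_errors)
		ret = -EIO;
	return ret;
}
```

With this shape a caller only ever sees 0 or a negative errno, whatever
status the lower layers stored in the request.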


And here's a debug patch which will hopefully tell us where we're detecting
an error.  Please apply both patches to 2.6.21-rc7 and retest?

Also, please send the full boot-time dmesg output.



 drivers/ide/ide-cd.c|6 +-
 drivers/ide/ide-io.c|   24 +++-
 drivers/scsi/scsi_lib.c |2 ++
 include/linux/kernel.h  |9 +
 4 files changed, 35 insertions(+), 6 deletions(-)

diff -puN drivers/ide/ide-cd.c~block-debug drivers/ide/ide-cd.c
--- a/drivers/ide/ide-cd.c~block-debug
+++ a/drivers/ide/ide-cd.c
@@ -724,8 +724,10 @@ static int cdrom_decode_status(ide_drive
 		 * if we have an error, pass back CHECK_CONDITION as the
 		 * scsi status byte
 		 */
-		if (blk_pc_request(rq) && !rq->errors)
+		if (blk_pc_request(rq) && !rq->errors) {
 			rq->errors = SAM_STAT_CHECK_CONDITION;
+			M(rq->errors);
+		}
 
 		/* Check for tray open. */
 		if (sense_key == NOT_READY) {
@@ -791,6 +793,7 @@ static int cdrom_decode_status(ide_drive
 				if (!rq->errors)
 					info->write_timeout = jiffies + ATAPI_WAIT_WRITE_BUSY;
 				rq->errors = 1;
+				M(rq->errors);
 				if (time_after(jiffies, info->write_timeout))
 					do_end_request = 1;
 				else {
@@ -3127,6 +3130,7 @@ static int ide_cdrom_prep_pc(struct requ
 	 */
 	if (c[0] == MODE_SENSE || c[0] == MODE_SELECT) {
 		rq->errors = ILLEGAL_REQUEST;
+		M(rq->errors);
 		return BLKPREP_KILL;
 	}
 
diff -puN drivers/ide/ide-io.c~block-debug drivers/ide/ide-io.c
--- a/drivers/ide/ide-io.c~block-debug
+++ a/drivers/ide/ide-io.c
@@ -66,8 +66,10 @@ static int __ide_end_request(ide_drive_t
 	if (blk_noretry_request(rq) && end_io_error(uptodate))
 		nr_sectors = rq->hard_nr_sectors;
 
-	if (!blk_fs_request(rq) && end_io_error(uptodate) && !rq->errors)
+	if (!blk_fs_request(rq) && end_io_error(uptodate) && !rq->errors) {
 		rq->errors = -EIO;
+		M(rq->errors);
+	}
 
 	/*
 	 * decide whether to reenable DMA -- 3 is a random magic for now,
@@ -265,8 +267,10 @@ int ide_end_dequeued_request(ide_drive_t
 	if (blk_noretry_request(rq) && end_io_error(uptodate))
 		nr_sectors = rq->hard_nr_sectors;
 
-	if (!blk_fs_request(rq) && end_io_error(uptodate) && !rq->errors)
+	if (!blk_fs_request(rq) && end_io_error(uptodate) && !rq->errors) {
 		rq->errors = -EIO;
+		M(rq->errors);
+	}
 
 	/*
 	 * decide whether to reenable DMA -- 3 is a random magic for now,
@@ -380,8 +384,10 @@ void ide_end_drive_cmd (ide_drive_t *dri
 
 	if (rq->cmd_type == REQ_TYPE_ATA_CMD) {
 		u8 *args = (u8 *) rq->buffer;
-		if (rq->errors == 0)
+		if (rq->errors == 0) {
 			rq->errors = !OK_STAT(stat,READY_STAT,BAD_STAT);
+			M(rq->errors);
+		}
 
 		if (args) {
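
The definition of M() (the include/linux/kernel.h part of the debug patch)
is not shown in this message. Going by the log lines it produces, such as
"drivers/ide/ide-cd.c:729: setting error to 2" followed by a stack trace, it
presumably prints the file, line and value and then dumps the stack. A
userspace sketch under that assumption follows; format_marker() and this
M() are hypothetical reconstructions that omit the stack dump, not the
macro from the patch.

```c
#include <stdio.h>

/* Hypothetical helper: format the marker line the way the logs show it. */
static int format_marker(char *buf, size_t len, const char *file, int line,
			 int value)
{
	return snprintf(buf, len, "%s:%d: setting error to %d",
			file, line, value);
}

/*
 * Userspace stand-in for the M() debug macro. The kernel version would
 * presumably use printk() and dump_stack() instead of fprintf().
 */
#define M(x) do {							\
	char m_buf[128];						\
	format_marker(m_buf, sizeof(m_buf), __FILE__, __LINE__,	\
		      (int)(x));					\
	fprintf(stderr, "%s\n", m_buf);					\
} while (0)
```

Dropping M(rq->errors) right after each assignment, as the patch does, tags
every place that can set the request's error status with its source
location.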

 [19982.941070]  [c010521a] show_trace_log_lvl+0x1a/0x30
 [19982.946256]  [c0105952] show_trace+0x12/0x20
 [19982.950744]  [c0105a46] dump_stack+0x16/0x20
 [19982.955232]  [c034543a] scsi_io_completion+0x28a/0x3a0
 [19982.960586]  [c034556b] scsi_blk_pc_done+0x1b/0x30
 [19982.965594]  [c0340d0c] scsi_finish_command+0x4c/0x60
 [19982.970861]  [c0345c07] scsi_softirq_done+0x77/0xe0
 [19982.975955]  [c0257f8b] blk_done_softirq+0x6b/0x80
 [19982.980962]  [c01243a2] __do_softirq+0x62/0xc0
 [19982.985624]  [c0124455] do_softirq+0x55/0x60
 [19982.990112]  [c0124be5] ksoftirqd+0x65/0x100
 [19982.994599]  [c0132963] kthread+0xa3/0xd0
 [19982.998827]  [c0104e17] kernel_thread_helper+0x7/0x10
 [19983.004095]  ===
 [19983.009065] cdrom: This disc doesn't have any tracks I recognize!
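
For readers decoding the two logged values: the SCSI path reports 134217730 (0x08000002) and the IDE path reports 2. A minimal sketch of how such a value splits into bytes, assuming the usual SCSI midlayer packing of the result word (status in the low byte, then message, host and driver bytes). The struct and function names here are hypothetical and not taken from the kernel.

```c
#include <stdint.h>

/* Hypothetical container for the four bytes of a packed SCSI result word. */
struct scsi_result_parts {
	uint8_t status;	/* SAM status byte, 0x02 = CHECK CONDITION */
	uint8_t msg;	/* message byte */
	uint8_t host;	/* host (transport) byte */
	uint8_t driver;	/* driver byte, 0x08 = DRIVER_SENSE */
};

/* Split a 32-bit result word into its four bytes (assumed layout). */
static struct scsi_result_parts split_scsi_result(uint32_t result)
{
	struct scsi_result_parts p;

	p.status = result & 0xff;
	p.msg    = (result >> 8) & 0xff;
	p.host   = (result >> 16) & 0xff;
	p.driver = (result >> 24) & 0xff;
	return p;
}
```

Under that assumption, 134217730 decodes to driver byte 0x08 and status byte 0x02, so both drives are reporting CHECK CONDITION; the SCSI path additionally flags that sense data is available.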

So what has happened here is that this code, in ide-cd.c's
cdrom_decode_status() is now triggering:

} else if (blk_pc_request(rq) || 

Re: BUG: Null pointer dereference in fs/open.c

2007-04-25 Thread William Heimbigner

On Wed, 25 Apr 2007, Andrew Morton wrote:

On Wed, 25 Apr 2007 22:53:00 + (GMT) William Heimbigner [EMAIL PROTECTED] 
wrote:


On Wed, 25 Apr 2007, Andrew Morton wrote:
snip

OK.  I am able to use the pktcdvd driver OK in mainline with a piix/sata
drive.  It could be that something is going wrong at the IDE level for you.

Perhaps; I'll try an external usb cd burner, and see where that goes.


Are you able to identify the most recent kernel which actually worked?

No, because I haven't set packet writing up in Linux before - however, I do know
that I've successfully set up packet writing (using 2 of the 3 cd burners I
have) in another operating system before. I'll try 2.6.18 and see if that gets
me anywhere different, though.


OK.

A quick summary: mainline's pktcdvd isn't working for William using IDE.
It is working for me using sata.


snip


So what has happened here is that this code, in ide-cd.c's
cdrom_decode_status() is now triggering:

} else if (blk_pc_request(rq) || rq->cmd_type == REQ_TYPE_ATA_PC) {
	/* All other functions, except for READ. */
	unsigned long flags;

	/*
	 * if we have an error, pass back CHECK_CONDITION as the
	 * scsi status byte
	 */
	if (blk_pc_request(rq) && !rq->errors)
		rq->errors = SAM_STAT_CHECK_CONDITION;


I suspect this is a bug introduced by
406c9b605cbc45151c03ac9a3f95e9acf050808c (in which case it'll be the third
bug so far).

Perhaps the IDE driver was previously not considering these requests to be
of type blk_pc_request(), and after
406c9b605cbc45151c03ac9a3f95e9acf050808c it _is_ treating them as
blk_pc_request() and is incorrectly reporting an error.  Or something like
that.

Guys: help!

A follow-up: after looking around a bit, I have managed to get packet writing to 
work properly on /dev/hdc (before, it was reporting only 1.8 MB available or so; 
this was a formatting issue).
I've also gotten the external cd-rw drive to work. However, I'm still at a loss 
as to why /dev/hdd won't work. I tried formatting a dvd-rw for this drive, 
however, it consistently gives me:

[27342.503933] drivers/ide/ide-cd.c:729: setting error to 2
[27342.509251]  [c010521a] show_trace_log_lvl+0x1a/0x30
[27342.514411]  [c0105952] show_trace+0x12/0x20
[27342.518864]  [c0105a46] dump_stack+0x16/0x20
[27342.523317]  [c033f6e4] cdrom_decode_status+0x1f4/0x3b0
[27342.528732]  [c033fae8] cdrom_newpc_intr+0x38/0x320
[27342.533791]  [c0331106] ide_intr+0x96/0x200
[27342.538157]  [c0150cf8] handle_IRQ_event+0x28/0x60
[27342.543139]  [c0151f96] handle_edge_irq+0xa6/0x130
[27342.548121]  [c0106449] do_IRQ+0x49/0xa0
[27342.552228]  [c0104c3a] common_interrupt+0x2e/0x34
[27342.557200]  [c01022d2] mwait_idle+0x12/0x20
[27342.561653]  [c01023ca] cpu_idle+0x4a/0x80
[27342.565934]  [c0101147] rest_init+0x37/0x40
[27342.570300]  [c068ac7b] start_kernel+0x34b/0x420
[27342.575109]  [] 0x0
[27342.578089]  ===
and doesn't work (the above output was generated by Andrew's patch to log 
certain areas).


# dvd+rw-format /dev/hdd -force
* BD/DVDRW/-RAM format utility by [EMAIL PROTECTED], version 7.0.
:-( failed to locate Quick Format descriptor.
* 4.7GB DVD-RW media in Sequential mode detected.
* formatting 0.0\:-[ READ TRACK INFORMATION failed with SK=3h/ASC=11h/ACQ=05h]: 
Input/output error

I tried putting in a different dvd-rw, and this time I get:
# dvd+rw-format /dev/hdd -force
* BD/DVDRW/-RAM format utility by [EMAIL PROTECTED], version 7.0.
* 4.7GB DVD-RW media in Sequential mode detected.
* formatting 0.0|:-[ FORMAT UNIT failed with SK=5h/ASC=26h/ACQ=00h]: 
Input/output error
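
As a reading aid for the two failures above, here is a small lookup sketch covering only the two sense triples that appear in this message. The descriptions follow the standard SCSI sense key and additional sense code assignments as I understand them; the table and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One sense triple and a human-readable description. */
struct sense_entry {
	uint8_t sk, asc, ascq;
	const char *text;
};

/* Only the two triples reported by dvd+rw-format in this thread. */
static const struct sense_entry sense_table[] = {
	{ 0x3, 0x11, 0x05, "MEDIUM ERROR: L-EC uncorrectable error" },
	{ 0x5, 0x26, 0x00, "ILLEGAL REQUEST: invalid field in parameter list" },
};

/* Return a description for a known triple, or "unknown" otherwise. */
static const char *describe_sense(uint8_t sk, uint8_t asc, uint8_t ascq)
{
	size_t i;

	for (i = 0; i < sizeof(sense_table) / sizeof(sense_table[0]); i++) {
		const struct sense_entry *e = &sense_table[i];

		if (e->sk == sk && e->asc == asc && e->ascq == ascq)
			return e->text;
	}
	return "unknown";
}
```

Read this way, the first disc fails with a media read error during READ TRACK INFORMATION, while the second fails because the drive rejects a field in the FORMAT UNIT parameters.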

William Heimbigner
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: Null pointer dereference in fs/open.c

2007-04-24 Thread Peter Osterlund

On Mon, 23 Apr 2007, Andrew Morton wrote:


Try this:

--- a/drivers/block/pktcdvd.c~packet-fix-error-handling
+++ a/drivers/block/pktcdvd.c
@@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
rq->cmd_flags |= REQ_QUIET;

blk_execute_rq(rq->q, pd->bdev->bd_disk, rq, 0);
-   ret = rq->errors;
+   if (rq->errors)
+   ret = -EIO;
out:
blk_put_request(rq);
return ret;
_


The packet driver was assuming that request.errors is an errno, but it
isn't - it's some sort of diagnostic bitfield thing.  Now why would the
packet driver have thought that?  Let's go read the comments:

...

Well there's your root cause right there.

I don't know why this wasn't oopsing in earlier kernels.  Perhaps something
else is broken.  Please test this urgently.


The code used to return -EIO until commit 
406c9b605cbc45151c03ac9a3f95e9acf050808c, which was committed 2007-01-05, 
so that would explain why older kernels didn't crash.



What the heck _is_ in request.errors?


According to linux/Documentation/block/request.txt, it is an error 
counter. The info in that text file would probably do a lot more good as 
comments in the structure definition though.



Should the packet driver even be looking at it?


I think so. How else is it supposed to know if the request failed?
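
The approach in the patch under discussion can be sketched in isolation: treat any nonzero request status as a failure and hand callers a proper negative errno, never the raw value. The helper name below is hypothetical; it is only meant to mirror the two added lines of the patch.

```c
#include <errno.h>

/*
 * Hypothetical helper mirroring the patch: rq_errors stands in for
 * rq->errors after blk_execute_rq(). Any nonzero value means the
 * request failed, but the value itself is not an errno.
 */
static int request_status_to_errno(int rq_errors)
{
	return rq_errors ? -EIO : 0;
}
```

With this shape the driver still looks at the field to learn whether the request failed, but the status bits never leak out as a return code.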

--
Peter Osterlund - [EMAIL PROTECTED]
http://web.telia.com/~u89404340


Re: BUG: Null pointer dereference in fs/open.c

2007-04-23 Thread Andrew Morton
On Tue, 24 Apr 2007 05:44:42 + (GMT) William Heimbigner <[EMAIL PROTECTED]> 
wrote:

> On Mon, 23 Apr 2007, Andrew Morton wrote:
> > On Tue, 24 Apr 2007 05:10:04 + (GMT) William Heimbigner <[EMAIL 
> > PROTECTED]> wrote:
> >
> >>> --- a/drivers/block/pktcdvd.c~packet-fix-error-handling
> >>> +++ a/drivers/block/pktcdvd.c
> >>> @@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
> >>>   rq->cmd_flags |= REQ_QUIET;
> >>>
> >>>   blk_execute_rq(rq->q, pd->bdev->bd_disk, rq, 0);
> >>> - ret = rq->errors;
> >>> + if (rq->errors)
> >>> + ret = -EIO;
> >>> out:
> >>>   blk_put_request(rq);
> >>>   return ret;
> >>> _
> >>
> >> This patch fixes (or conceals?) the oops.
> >>
> >
> > Fixes.  But does the packet driver actually work OK for you?  Writes
> > files and stuff like that?
> >
> Short answer, no.
> 
> Long answer:
> # pktsetup 0 /dev/hdc
>
> ...
>
> [11508.520800] pktcdvd: pkt_get_last_written failed
> 
> # mkudffs /dev/pktcdvd/0
> [11539.953560] pktcdvd: pkt_get_last_written failed
> trying to change type of multiple extents
> 
> I get the same error with /dev/hdd as well (hdc and hdd are both dvd 
> burners, hdd has a cd-rw and hdc had a dvd-rw)

Yes, I get the same on a sata (piix) dvd burner.

We need to work out who is setting rq->errors and why - should be pretty
simple.  I'll take a look at that after I've nailed one of these other bugs
over here.



Re: BUG: Null pointer dereference in fs/open.c

2007-04-23 Thread William Heimbigner

On Mon, 23 Apr 2007, Andrew Morton wrote:

On Tue, 24 Apr 2007 05:10:04 + (GMT) William Heimbigner <[EMAIL PROTECTED]> 
wrote:


--- a/drivers/block/pktcdvd.c~packet-fix-error-handling
+++ a/drivers/block/pktcdvd.c
@@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
rq->cmd_flags |= REQ_QUIET;

blk_execute_rq(rq->q, pd->bdev->bd_disk, rq, 0);
-   ret = rq->errors;
+   if (rq->errors)
+   ret = -EIO;
out:
blk_put_request(rq);
return ret;
_


This patch fixes (or conceals?) the oops.



Fixes.  But does the packet driver actually work OK for you?  Writes
files and stuff like that?


Short answer, no.

Long answer:
# pktsetup 0 /dev/hdc
[11508.006818] =
[11508.028248] [ INFO: possible recursive locking detected ]
[11508.044413] 2.6.21-rc7-git5 #23
[11508.053818] -
[11508.069989] vol_id/4315 is trying to acquire lock:
[11508.084332]  (&bdev->bd_mutex){--..}, at: [<c019a82f>] do_open+0x4f/0x2c0
[11508.104867]
[11508.104868] but task is already holding lock:
[11508.122359]  (&bdev->bd_mutex){--..}, at: [<c019a82f>] do_open+0x4f/0x2c0
[11508.142862]
[11508.142863] other info that might help us debug this:
[11508.162460] 2 locks held by vol_id/4315:
[11508.174212]  #0:  (&bdev->bd_mutex){--..}, at: [<c019a82f>] do_open+0x4f/0x2c0
[11508.196066]  #1:  (ctl_mutex#2){--..}, at: [<c04c221c>] mutex_lock+0x1c/0x20
[11508.217720]
[11508.217721] stack backtrace:
[11508.230821]  [<c010521a>] show_trace_log_lvl+0x1a/0x30
[11508.246255]  [<c0105952>] show_trace+0x12/0x20
[11508.259619]  [<c0105a46>] dump_stack+0x16/0x20
[11508.272974]  [<c013e410>] __lock_acquire+0xbc0/0x1040
[11508.288157]  [<c013e900>] lock_acquire+0x70/0x90
[11508.302035]  [<c04c229e>] mutex_lock_nested+0x7e/0x2e0
[11508.317475]  [<c019a82f>] do_open+0x4f/0x2c0
[11508.330314]  [<c019ab19>] __blkdev_get+0x79/0x90
[11508.344189]  [<c019ab45>] blkdev_get+0x15/0x20
[11508.357554]  [<c03298f7>] pkt_open+0xb7/0xd80
[11508.370651]  [<c019a865>] do_open+0x85/0x2c0
[11508.383491]  [<c019acc3>] blkdev_open+0x33/0x70
[11508.397107]  [<c0173ce4>] __dentry_open+0xf4/0x220
[11508.411509]  [<c0173eb5>] nameidata_to_filp+0x35/0x40
[11508.426684]  [<c0173f09>] do_filp_open+0x49/0x50
[11508.440567]  [<c0173f57>] do_sys_open+0x47/0xd0
[11508.454188]  [<c017401c>] sys_open+0x1c/0x20
[11508.467023]  [<c01041c6>] sysenter_past_esp+0x5f/0x99
[11508.482202]  ===
[11508.520800] pktcdvd: pkt_get_last_written failed

# mkudffs /dev/pktcdvd/0
[11539.953560] pktcdvd: pkt_get_last_written failed
trying to change type of multiple extents

I get the same error with /dev/hdd as well (hdc and hdd are both dvd 
burners, hdd has a cd-rw and hdc had a dvd-rw)



William Heimbigner
[EMAIL PROTECTED]


Re: BUG: Null pointer dereference in fs/open.c

2007-04-23 Thread Andrew Morton
On Tue, 24 Apr 2007 05:10:04 + (GMT) William Heimbigner <[EMAIL PROTECTED]> 
wrote:

> > --- a/drivers/block/pktcdvd.c~packet-fix-error-handling
> > +++ a/drivers/block/pktcdvd.c
> > @@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
> > rq->cmd_flags |= REQ_QUIET;
> >
> > blk_execute_rq(rq->q, pd->bdev->bd_disk, rq, 0);
> > -   ret = rq->errors;
> > +   if (rq->errors)
> > +   ret = -EIO;
> > out:
> > blk_put_request(rq);
> > return ret;
> > _
> 
> This patch fixes (or conceals?) the oops.
> 

Fixes.  But does the packet driver actually work OK for you?  Writes
files and stuff like that?


Re: BUG: Null pointer dereference in fs/open.c

2007-04-23 Thread William Heimbigner

On Mon, 23 Apr 2007, Andrew Morton wrote:

On Tue, 24 Apr 2007 04:09:18 + (GMT) William Heimbigner <[EMAIL PROTECTED]> 
wrote:


This bug occurs in linux-2.6.20 and 2.6.21-rc7-git5, and does not occur in
linux-2.6.19-git22.

After running "pktsetup 0 /dev/hdd", I get (timestamps removed):

pktcdvd: pkt_get_last_written failed
BUG: unable to handle kernel NULL pointer dereference at virtual address 
000e
printing eip:
c0173f69
*pde = 
Oops:  [#1]
PREEMPT
Modules linked in: snd_ca0106 snd_ac97_codec ac97_bus 8139cp 8139too iTCO_wdt
CPU:0
EIP:0060:[<c0173f69>]Not tainted VLI
EFLAGS: 00010203   (2.6.21-rc7-git5 #22)
EIP is at do_sys_open+0x59/0xd0
eax: 0002   ebx: 4020   ecx: 0001   edx: 0002
esi: df1e3000   edi: 0003   ebp: de17bfa4   esp: de17bf84
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process vol_id (pid: 4273, ti=de17b000 task=df4143f0 task.ti=de17b000)
Stack:  c013d2a5 ff9c 0002 c059cea3 bfb6bf64 8000 b7f60ff4
de17bfb0 c017401c  de17b000 c01041c6 bfb6bf64 8000 
8000 b7f60ff4 bfb6a798 0005 007b 007b  0005
Call Trace:
  [<c010521a>] show_trace_log_lvl+0x1a/0x30
  [<c01052d9>] show_stack_log_lvl+0xa9/0xd0
  [<c010551c>] show_registers+0x21c/0x3a0
  [<c01057a4>] die+0x104/0x260
  [<c04c5947>] do_page_fault+0x277/0x610
  [<c04c408c>] error_code+0x74/0x7c
  [<c017401c>] sys_open+0x1c/0x20
  [<c01041c6>] sysenter_past_esp+0x5f/0x99
  ===
Code: ff 85 c0 89 c7 78 77 8b 45 08 89 d9 89 f2 89 04 24 8b 45 e8 e8 69 ff
ff ff 3d 00 f0 ff ff 89 45 ec 77 71 8b 55 ec bb 20 00 00 40 <8b> 42 0c 8b
48 30 89 4d f0 0f b7 51 66 81 e2 00 f0 00 00 81 fa
EIP: [<c0173f69>] do_sys_open+0x59/0xd0 SS:ESP 0068:de17bf84


Try this:

--- a/drivers/block/pktcdvd.c~packet-fix-error-handling
+++ a/drivers/block/pktcdvd.c
@@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
rq->cmd_flags |= REQ_QUIET;

blk_execute_rq(rq->q, pd->bdev->bd_disk, rq, 0);
-   ret = rq->errors;
+   if (rq->errors)
+   ret = -EIO;
out:
blk_put_request(rq);
return ret;
_


This patch fixes (or conceals?) the oops.




The packet driver was assuming that request.errors is an errno, but it
isn't - it's some sort of diagnostic bitfield thing.  Now why would the
packet driver have thought that?  Let's go read the comments:

unsigned short nr_hw_segments;

unsigned short ioprio;

void *special;
char *buffer;

int tag;
int errors;

int ref_count;


Well there's your root cause right there.


I don't know why this wasn't oopsing in earlier kernels.  Perhaps something
else is broken.  Please test this urgently.


There's a locking problem in there too.  `pktsetup 0 /dev/scd0' gives me

[   77.72] pktcdvd: writer pktcdvd0 mapped to sr0
[   77.86]
[   77.86] =
[   77.86] [ INFO: possible recursive locking detected ]
[   77.86] 2.6.21-rc7 #19
[   77.86] -
[   77.86] vol_id/2508 is trying to acquire lock:
[   77.86]  (&bdev->bd_mutex){--..}, at: [<c01815e2>] do_open+0x5a/0x267
[   77.86]
[   77.86] but task is already holding lock:
[   77.86]  (&bdev->bd_mutex){--..}, at: [<c01815e2>] do_open+0x5a/0x267
[   77.86]
[   77.86] other info that might help us debug this:
[   77.86] 2 locks held by vol_id/2508:
[   77.86]  #0:  (&bdev->bd_mutex){--..}, at: [<c01815e2>] do_open+0x5a/0x267
[   77.86]  #1:  (ctl_mutex#2){--..}, at: [<f8dc6986>] pkt_open+0x1a/0xcbc [pktcdvd]
[   77.86]
[   77.86] stack backtrace:
[   77.86]  [<c01323c1>] __lock_acquire+0x11e/0xb3b
[   77.86]  [<c02efe4e>] __mutex_unlock_slowpath+0x109/0x113
[   77.86]  [<c0132166>] trace_hardirqs_on+0x11e/0x141
[   77.86]  [<c0132e34>] lock_acquire+0x56/0x6e
[   77.86]  [<c01815e2>] do_open+0x5a/0x267
[   77.86]  [<c02f01a5>] mutex_lock_nested+0xf4/0x24f
[   77.86]  [<c01815e2>] do_open+0x5a/0x267
[   77.86]  [<c024020c>] kobj_lookup+0xda/0x104
[   77.86]  [<c01815e2>] do_open+0x5a/0x267
[   77.86]  [<c018184a>] __blkdev_get+0x5b/0x66
[   77.86]  [<c0181867>] blkdev_get+0x12/0x14
[   77.86]  [<f8dc69f9>] pkt_open+0x8d/0xcbc [pktcdvd]
[   77.86]  [<c0170949>] __d_lookup+0x66/0xed
[   77.86]  [<c0170949>] __d_lookup+0x66/0xed
[   77.86]  [<c01ce919>] _atomic_dec_and_lock+0xd/0x2c
[   77.86]  [<c01ce919>] _atomic_dec_and_lock+0xd/0x2c
[   77.86]  [<c01ce919>] _atomic_dec_and_lock+0xd/0x2c
[   77.86]  [<c015f655>] cache_alloc_refill+0x4a/0x444
[   77.86]  [<c0240165>] kobj_lookup+0x33/0x104
[   77.86]  [<c0132166>] trace_hardirqs_on+0x11e/0x141
[   77.86]  [<c01815e2>] do_open+0x5a/0x267
[   77.86]  [<c02f007f>] __mutex_lock_slowpath+0x222/0x235
[   77.86]  [<c02f02ed>] mutex_lock_nested+0x23c/0x24f
[   77.86]  [<c0131f85>] mark_held_locks+0x46/0x62
[   77.86]  [<c02f02ed>] mutex_lock_nested+0x23c/0x24f
[   77.86]  [] mutex_lock_nested+0x23c/0x24f
[   77.86]  [] trace_hardirqs_on+0x11e/0x141
[   77.86]  [] do_open+0x5a/0x267
[   77.86]  [] mutex_lock_nested+0x247/0x24f
[   77.86]  [] do_open+0x5a/0x267
[   77.86]  [] kobj_lookup+0xda/0x104
[   77.86]  [] 

Re: BUG: Null pointer dereference in fs/open.c

2007-04-23 Thread William Heimbigner
This bug occurs in linux-2.6.20 and 2.6.21-rc7-git5, and does not occur in 
linux-2.6.19-git22.


After running "pktsetup 0 /dev/hdd", I get (timestamps removed):

pktcdvd: pkt_get_last_written failed
BUG: unable to handle kernel NULL pointer dereference at virtual address 
000e
printing eip:
c0173f69
*pde = 
Oops:  [#1]
PREEMPT
Modules linked in: snd_ca0106 snd_ac97_codec ac97_bus 8139cp 8139too iTCO_wdt
CPU:0
EIP:0060:[<c0173f69>]Not tainted VLI
EFLAGS: 00010203   (2.6.21-rc7-git5 #22)
EIP is at do_sys_open+0x59/0xd0
eax: 0002   ebx: 4020   ecx: 0001   edx: 0002
esi: df1e3000   edi: 0003   ebp: de17bfa4   esp: de17bf84
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process vol_id (pid: 4273, ti=de17b000 task=df4143f0 task.ti=de17b000)
Stack:  c013d2a5 ff9c 0002 c059cea3 bfb6bf64 8000 b7f60ff4
   de17bfb0 c017401c  de17b000 c01041c6 bfb6bf64 8000 
   8000 b7f60ff4 bfb6a798 0005 007b 007b  0005
Call Trace:
 [<c010521a>] show_trace_log_lvl+0x1a/0x30
 [<c01052d9>] show_stack_log_lvl+0xa9/0xd0
 [<c010551c>] show_registers+0x21c/0x3a0
 [<c01057a4>] die+0x104/0x260
 [<c04c5947>] do_page_fault+0x277/0x610
 [<c04c408c>] error_code+0x74/0x7c
 [<c017401c>] sys_open+0x1c/0x20
 [<c01041c6>] sysenter_past_esp+0x5f/0x99
 ===
Code: ff 85 c0 89 c7 78 77 8b 45 08 89 d9 89 f2 89 04 24 8b 45 e8 e8 69 ff 
ff ff 3d 00 f0 ff ff 89 45 ec 77 71 8b 55 ec bb 20 00 00 40 <8b> 42 0c 8b 
48 30 89 4d f0 0f b7 51 66 81 e2 00 f0 00 00 81 fa

EIP: [<c0173f69>] do_sys_open+0x59/0xd0 SS:ESP 0068:de17bf84
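
The faulting address in the oops can be cross-checked against the Code: line. The marked bytes are `8b 42 0c`, which on 32-bit x86 is a `mov` load from a register plus an 8-bit displacement. The sketch below decodes just that one encoding (opcode 0x8b with ModRM mod=01, no SIB byte); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded fields of "8b /r" with a register base and disp8. */
struct mov_disp8 {
	unsigned reg;	/* destination register number, 0 = eax */
	unsigned rm;	/* base register number, 2 = edx */
	int8_t disp;	/* signed 8-bit displacement */
};

/* Decode only opcode 0x8b, mod=01, rm != 4. Returns 0 on success. */
static int decode_mov_disp8(const uint8_t *code, struct mov_disp8 *out)
{
	if (code[0] != 0x8b)
		return -1;
	if ((code[1] >> 6) != 0x1)	/* mod must be 01: base + disp8 */
		return -1;
	out->reg = (code[1] >> 3) & 7;
	out->rm = code[1] & 7;
	if (out->rm == 4)		/* a SIB byte would follow; not handled */
		return -1;
	out->disp = (int8_t)code[2];
	return 0;
}

/* Effective address of the load: base register value plus displacement. */
static uint32_t effective_address(uint32_t base, int8_t disp)
{
	return base + (int32_t)disp;
}
```

For `8b 42 0c` this gives a load into eax from edx + 0xc. With edx holding 2 in the register dump, the access lands at 0xe, which agrees with the reported fault address and with a "pointer" whose value is 2.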


from fs/open.c, comments added:
// do_sys_open is consistently called with dfd=0xff9c,
// filename="/dev/.tmp-254-0", flags=0x8000, mode=0)
long do_sys_open(int dfd, const char __user *filename, int flags, int mode)
{
char *tmp = getname(filename);
int fd = PTR_ERR(tmp);

if (!IS_ERR(tmp)) {
fd = get_unused_fd();
if (fd >= 0) {
// do_filp_open consistently returns 2, in this case
struct file *f = do_filp_open(dfd, tmp, flags, mode);
// IS_ERR always returns 0 for this command
if (IS_ERR(f)) {
put_unused_fd(fd);
fd = PTR_ERR(f);
} else {
// null pointer dereference occurs here
fsnotify_open(f->f_path.dentry);
fd_install(fd, f);
}
}
putname(tmp);
}
return fd;
}

I was able to work around this by testing whether do_filp_open was returning 
2, but obviously this is a very temporary solution to a very 
specific circumstance.
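
Why IS_ERR() does not catch a return value of 2 can be shown with a userspace imitation of the kernel's error-pointer convention. This is a sketch, assuming the usual definition where only the top 4095 addresses count as errors. The lowercase names are stand-ins, not the kernel macros.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Stand-in for ERR_PTR(): smuggle an error code through a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Stand-in for IS_ERR(): true only for the top MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

A negative errno such as -EIO maps to an address at the very top of the range and is detected. A positive status such as 2 maps to address 0x2, which passes the check and is later dereferenced like a real struct file pointer.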


If there is any more information I can provide, let me know.
William Heimbigner
[EMAIL PROTECTED]


Re: BUG: Null pointer dereference in fs/open.c

2007-04-23 Thread William Heimbigner

On Mon, 23 Apr 2007, Andrew Morton wrote:

On Tue, 24 Apr 2007 05:10:04 + (GMT) William Heimbigner [EMAIL PROTECTED] 
wrote:


--- a/drivers/block/pktcdvd.c~packet-fix-error-handling
+++ a/drivers/block/pktcdvd.c
@@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
rq-cmd_flags |= REQ_QUIET;

blk_execute_rq(rq-q, pd-bdev-bd_disk, rq, 0);
-   ret = rq-errors;
+   if (rq-errors)
+   ret = -EIO;
out:
blk_put_request(rq);
return ret;
_


This patch fixes (or conceals?) the oops.



Fixes.  But does the packet driver actually work OK for you?  Writes
files and stuff like that?


Short answer, no.

Long answer:
# pktsetup 0 /dev/hdc
[11508.006818] =
[11508.028248] [ INFO: possible recursive locking detected ]
[11508.044413] 2.6.21-rc7-git5 #23
[11508.053818] -
[11508.069989] vol_id/4315 is trying to acquire lock:
[11508.084332]  (bdev->bd_mutex){--..}, at: [c019a82f] do_open+0x4f/0x2c0
[11508.104867]
[11508.104868] but task is already holding lock:
[11508.122359]  (bdev->bd_mutex){--..}, at: [c019a82f] do_open+0x4f/0x2c0
[11508.142862]
[11508.142863] other info that might help us debug this:
[11508.162460] 2 locks held by vol_id/4315:
[11508.174212]  #0:  (bdev->bd_mutex){--..}, at: [c019a82f] do_open+0x4f/0x2c0
[11508.196066]  #1:  (ctl_mutex#2){--..}, at: [c04c221c] mutex_lock+0x1c/0x20

[11508.217720]
[11508.217721] stack backtrace:
[11508.230821]  [c010521a] show_trace_log_lvl+0x1a/0x30
[11508.246255]  [c0105952] show_trace+0x12/0x20
[11508.259619]  [c0105a46] dump_stack+0x16/0x20
[11508.272974]  [c013e410] __lock_acquire+0xbc0/0x1040
[11508.288157]  [c013e900] lock_acquire+0x70/0x90
[11508.302035]  [c04c229e] mutex_lock_nested+0x7e/0x2e0
[11508.317475]  [c019a82f] do_open+0x4f/0x2c0
[11508.330314]  [c019ab19] __blkdev_get+0x79/0x90
[11508.344189]  [c019ab45] blkdev_get+0x15/0x20
[11508.357554]  [c03298f7] pkt_open+0xb7/0xd80
[11508.370651]  [c019a865] do_open+0x85/0x2c0
[11508.383491]  [c019acc3] blkdev_open+0x33/0x70
[11508.397107]  [c0173ce4] __dentry_open+0xf4/0x220
[11508.411509]  [c0173eb5] nameidata_to_filp+0x35/0x40
[11508.426684]  [c0173f09] do_filp_open+0x49/0x50
[11508.440567]  [c0173f57] do_sys_open+0x47/0xd0
[11508.454188]  [c017401c] sys_open+0x1c/0x20
[11508.467023]  [c01041c6] sysenter_past_esp+0x5f/0x99
[11508.482202]  ===
[11508.520800] pktcdvd: pkt_get_last_written failed

# mkudffs /dev/pktcdvd/0
[11539.953560] pktcdvd: pkt_get_last_written failed
trying to change type of multiple extents

I get the same error with /dev/hdd as well (hdc and hdd are both dvd 
burners, hdd has a cd-rw and hdc had a dvd-rw)



William Heimbigner
[EMAIL PROTECTED]
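
Editorial note on the lockdep report above. The backtrace shows do_open() reaching pkt_open(), which calls blkdev_get() and so enters do_open() a second time. Both acquisitions are reported as bd_mutex, taken at the same code address. The following is a minimal user-space sketch, not kernel code, of why a validator that tracks lock classes rather than individual lock instances would flag this. It assumes the two mutexes belong to different block devices (the pktcdvd device and the underlying /dev/hdc), which the log itself does not state. The names struct task and acquire() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 8

/* One entry per lock currently held by the single simulated task. */
struct held_lock {
	const char *class_name;	/* lock class, shared by all instances */
	int subclass;		/* nesting annotation, 0 by default */
};

struct task {
	struct held_lock held[MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition. Returns 1 if a lock with the same class and
 * subclass is already held (the situation reported as "possible
 * recursive locking"), 0 otherwise. The lock is recorded either way.
 */
static int acquire(struct task *t, const char *class_name, int subclass)
{
	int recursive = 0;
	size_t i;

	for (i = 0; i < t->depth; i++) {
		if (strcmp(t->held[i].class_name, class_name) == 0 &&
		    t->held[i].subclass == subclass)
			recursive = 1;
	}
	if (t->depth < MAX_HELD) {
		t->held[t->depth].class_name = class_name;
		t->held[t->depth].subclass = subclass;
		t->depth++;
	}
	return recursive;
}
```

In this model, taking "bdev->bd_mutex" twice with subclass 0 is flagged even if the two instances differ, while a second acquisition annotated with a different subclass is not. The kernel offers mutex_lock_nested() for such annotations; whether that is the appropriate change here is not settled in this thread.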


Re: BUG: Null pointer dereference in fs/open.c

2007-04-23 Thread Andrew Morton
On Tue, 24 Apr 2007 05:44:42 + (GMT) William Heimbigner [EMAIL PROTECTED] 
wrote:

> On Mon, 23 Apr 2007, Andrew Morton wrote:
> > On Tue, 24 Apr 2007 05:10:04 + (GMT) William Heimbigner [EMAIL PROTECTED] wrote:
> >
> > --- a/drivers/block/pktcdvd.c~packet-fix-error-handling
> > +++ a/drivers/block/pktcdvd.c
> > @@ -777,7 +777,8 @@ static int pkt_generic_packet(struct pkt
> >  		rq->cmd_flags |= REQ_QUIET;
> >  
> >  	blk_execute_rq(rq->q, pd->bdev->bd_disk, rq, 0);
> > -	ret = rq->errors;
> > +	if (rq->errors)
> > +		ret = -EIO;
> >  out:
> >  	blk_put_request(rq);
> >  	return ret;
> > _
> >
> > This patch fixes (or conceals?) the oops.
> >
> > Fixes.  But does the packet driver actually work OK for you?  Writes
> > files and stuff like that?
>
> Short answer, no.
>
> Long answer:
> # pktsetup 0 /dev/hdc

> ...

> [11508.520800] pktcdvd: pkt_get_last_written failed
>
> # mkudffs /dev/pktcdvd/0
> [11539.953560] pktcdvd: pkt_get_last_written failed
> trying to change type of multiple extents
>
> I get the same error with /dev/hdd as well (hdc and hdd are both dvd
> burners, hdd has a cd-rw and hdc had a dvd-rw)

Yes, I get the same on a sata (piix) dvd burner.

We need to work out who is setting rq->errors and why - should be pretty
simple.  I'll take a look at that after I've nailed one of these other bugs
over here.
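
Editorial note on the patch discussed in this thread. The change stops pkt_generic_packet() from returning the raw request status and returns -EIO instead. The following is a minimal user-space sketch, not kernel code, of why that distinction can matter. It assumes that ret starts at 0, that some caller only treats negative values as failure, and that the value may later be encoded with the kernel's ERR_PTR()/IS_ERR() convention. None of these assumptions is confirmed in the thread as the cause of the oops. The helpers err_ptr(), is_err(), ret_before_patch(), ret_after_patch() and caller_sees_failure() are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* SCSI status value for CHECK CONDITION. */
#define SAM_STAT_CHECK_CONDITION 0x02
/* Largest errno reserved by the kernel's pointer-encoding convention. */
#define MAX_ERRNO 4095

/* User-space stand-ins for the kernel's ERR_PTR() and IS_ERR() helpers. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Before the patch: the raw request status is handed back unchanged. */
static int ret_before_patch(int rq_errors)
{
	return rq_errors;
}

/* After the patch: any non-zero status becomes -EIO (ret assumed to start at 0). */
static int ret_after_patch(int rq_errors)
{
	int ret = 0;

	if (rq_errors)
		ret = -EIO;
	return ret;
}

/* Hypothetical caller that only recognises negative return values as errors. */
static int caller_sees_failure(int ret)
{
	return ret < 0;
}
```

With a status of SAM_STAT_CHECK_CONDITION, the pre-patch value is a small positive number: a caller testing for ret < 0 misses it, and encoding it as a pointer yields something is_err() does not recognise as an error. The post-patch value, -EIO, is caught by both checks.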
