Re: Nbd problem now oopses.

2007-05-14 Thread Chuck Ebbert
Rogier Wolff wrote:
> ...
> [ 5628.608000] Code: d2 89 d5 74 26 83 be 80 01 00 00 00 0f 85 7b 03 00 00 c7 86 88 01 00 00 00 00 00 00 8b 5c 24 1c 89 9e 80 01 00 00 e9 62 03 00 00
> [ 5628.608000] EIP: [<c0293210>] tcp_sendmsg+0x726/0xab3 SS:ESP 0068:c3f8fc5c
> 

A big chunk of the machine code is missing here, so it's impossible to tell 
what really happened.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Nbd problem now oopses.

2007-05-13 Thread Rogier Wolff

Hi,


After turning on the debugging for allocations and locks, I 
now get a kernel oops. 


[ 5628.608000] BUG: unable to handle kernel NULL pointer dereference at virtual address
[ 5628.608000]  printing eip:
[ 5628.608000] c0293210
[ 5628.608000] *pde =
[ 5628.608000] Oops: 0002 [#1]
[ 5628.608000] Modules linked in: nbd
[ 5628.608000] CPU:    0
[ 5628.608000] EIP:    0060:[<c0293210>]    Not tainted VLI
[ 5628.608000] EFLAGS: 00010246   (2.6.21 #8)
[ 5628.608000] EIP is at tcp_sendmsg+0x726/0xab3
[ 5628.608000] eax:    ebx: c24576b8   ecx:    edx:
[ 5628.608000] esi: c30a006c   edi: 0840   ebp:    esp: c3f8fc5c
[ 5628.608000] ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
[ 5628.608000] Process kblockd/0 (pid: 34, ti=c3f8e000 task=c3f89550 task.ti=c3f8e000)
[ 5628.608000] Stack: 0002 012b 0001 0046 000a c011ad49  c100dea0
[ 5628.608000]        0001   c2b2c7c0 0840 07c0 baa8 05a8
[ 5628.608000]        c000  c3f8fdf8 c3f8e000 1000 7fff c03a3820 c30a006c
[ 5628.608000] Call Trace:
[ 5628.608000]  [<c011ad49>] __do_softirq+0x35/0x73
[ 5628.608000]  [<c02ab61c>] inet_sendmsg+0x39/0x43
[ 5628.608000]  [<c026c744>] sock_sendmsg+0xbc/0xd4
...
[ 5628.608000] Code: d2 89 d5 74 26 83 be 80 01 00 00 00 0f 85 7b 03 00 00 c7 86 88 01 00 00 00 00 00 00 8b 5c 24 1c 89 9e 80 01 00 00 e9 62 03 00 00
[ 5628.608000] EIP: [<c0293210>] tcp_sendmsg+0x726/0xab3 SS:ESP 0068:c3f8fc5c


which seems to be: 

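0xc02931f1 <tcp_sendmsg+1799>:  jne    0xc0293572 <tcp_sendmsg+2696>
0xc02931f7 <tcp_sendmsg+1805>:  movl   $0x0,0x188(%esi)
0xc0293201 <tcp_sendmsg+1815>:  mov    0x1c(%esp),%ebx
0xc0293205 <tcp_sendmsg+1819>:  mov    %ebx,0x180(%esi)
0xc029320b <tcp_sendmsg+1825>:  jmp    0xc0293572 <tcp_sendmsg+2696>

EIP points here: 
0xc0293210 <tcp_sendmsg+1830>:  cmpl   $0x0,0x24(%esp)
0xc0293215 <tcp_sendmsg+1835>:  mov    0x98(%ebx),%edx
0xc029321b <tcp_sendmsg+1841>:  je     0xc0293232 <tcp_sendmsg+1864>
0xc029321d <tcp_sendmsg+1843>:  mov    0x20(%esp),%ecx
0xc0293221 <tcp_sendmsg+1847>:  movzwl 0x12(%edx,%ecx,8),%eax
0xc0293226 <tcp_sendmsg+1852>:  add    %edi,%eax
0xc0293228 <tcp_sendmsg+1854>:  mov    %ax,0x12(%edx,%ecx,8)
0xc029322d <tcp_sendmsg+1859>:  jmp    0xc02932b4 <tcp_sendmsg+1994>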


which is 

        if (err) {
            /* If this page was new, give it to the
             * socket so it does not get leaked.
             */
            if (!TCP_PAGE(sk)) {
                TCP_PAGE(sk) = page;
                TCP_OFF(sk) = 0;
            }
            goto do_error;
        }

        /* Update the skb. */

EIP points here. 
        if (merge) {
            skb_shinfo(skb)->frags[i - 1].size += copy;


and now the question is: How can the 
cmpl   $0x0,0x24(%esp)
trap at address 0? 

How can "if (merge)" cause a segmentation fault?

If EIP is a bit off, it could be a line earlier or later. So, could it
crash on the jmp tcp_sendmsg+2696? I don't think so. 

How about "mov 0x98(%ebx),%edx"? If ebx is invalid, this should 
crash. (ebx apparently holds skb if I understand things correctly). 
But from the dump, ebx holds c24576b8, and if that's invalid it would
not say 
  BUG: unable to handle kernel NULL pointer dereference at virtual address 

right?

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ


Re: nbd problem.

2007-05-09 Thread Jens Axboe
On Wed, May 09 2007, Rogier Wolff wrote:
> On Tue, May 08, 2007 at 01:33:52PM -0700, Satyam Sharma wrote:
> > On 5/8/07, Rogier Wolff <[EMAIL PROTECTED]> wrote:
> > >
> > >Hi,
> > >
> > >The nbd client still reliably hangs when I use it.
> 
> Someone suggested to use 
> 
> http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=summary
> 
> and that fixed it.  (i.e. there is something in there that should
> be merged)

Hmm, which branch? Most of my stuff is merged up with Linus at this
point.

> Jens, thanks for pointing out that there were different locks 
> involved.

You're welcome.

-- 
Jens Axboe



Re: nbd problem.

2007-05-09 Thread Jan Engelhardt

On May 9 2007 14:38, Rogier Wolff wrote:
>
>ozon:~> ps auxww | grep D
>USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>root       110  0.4  0.0      0     0 ?        D    10:28   0:31 [pdflush]
>root       112  0.0  0.0      0     0 ?        D<   10:28   0:05 [kswapd0]
>root      1649  0.0  0.1   1604   108 pts/0    D+   11:17   0:03 nbd-client petisuix 1234 /dev/nd0
>root      1654  0.9  4.5   4648  2816 pts/0    D+   11:17   0:44 rsync /usr/src/linux-2.6.21.ozon /mnt/test1 -av --progress
>wolff     1716  0.0  0.9   1648   560 pts/1    R+   12:33   0:00 grep D
>ozon:~> 
>
>Can anybody help me figure out what these processes are waiting for?

echo t >/proc/sysrq-trigger

dumps a ton to /var/log/messages.



Jan


Re: nbd problem.

2007-05-09 Thread Rogier Wolff
On Wed, May 09, 2007 at 01:10:49PM +0200, Rogier Wolff wrote:
> On Tue, May 08, 2007 at 01:33:52PM -0700, Satyam Sharma wrote:
> > On 5/8/07, Rogier Wolff <[EMAIL PROTECTED]> wrote:
> > >
> > >Hi,
> > >
> > >The nbd client still reliably hangs when I use it.
> 
> Someone suggested to use 
> 
> http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=summary
> 
> and that fixed it.  (i.e. there is something in there that should
> be merged)

Cancel the party! It got MUCH further than before, but crashed
eventually. 

ozon:~> ps auxww | grep D
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       110  0.4  0.0      0     0 ?        D    10:28   0:31 [pdflush]
root       112  0.0  0.0      0     0 ?        D<   10:28   0:05 [kswapd0]
root      1649  0.0  0.1   1604   108 pts/0    D+   11:17   0:03 nbd-client petisuix 1234 /dev/nd0
root      1654  0.9  4.5   4648  2816 pts/0    D+   11:17   0:44 rsync /usr/src/linux-2.6.21.ozon /mnt/test1 -av --progress
wolff     1716  0.0  0.9   1648   560 pts/1    R+   12:33   0:00 grep D
ozon:~> 

Can anybody help me figure out what these processes are waiting for?

Roger. 



Re: nbd problem.

2007-05-09 Thread Rogier Wolff
On Tue, May 08, 2007 at 01:33:52PM -0700, Satyam Sharma wrote:
> On 5/8/07, Rogier Wolff <[EMAIL PROTECTED]> wrote:
> >
> >Hi,
> >
> >The nbd client still reliably hangs when I use it.

Someone suggested to use 

http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=summary

and that fixed it.  (i.e. there is something in there that should
be merged)

Jens, thanks for pointing out that there were different locks 
involved.

Roger. 

(I seem to have lost all other emails in this thread. Apparently
my delete-old-list-emails is too aggressive today...)



Re: nbd problem.

2007-05-08 Thread Jens Axboe
On Tue, May 08 2007, Rogier Wolff wrote:
> 
> Hi,
> 
> The nbd client still reliably hangs when I use it. 
> 
> While looking into this, I found:
> 
> 
>         req->errors = 0;
>         spin_unlock_irq(q->queue_lock);
>
>         mutex_lock(&lo->tx_lock);
>         if (unlikely(!lo->sock)) {
>             mutex_unlock(&lo->tx_lock);
>             printk(KERN_ERR "%s: Attempted send on closed socket\n",
>                    lo->disk->disk_name);
>             req->errors++;
>             nbd_end_request(req);
>             spin_lock_irq(q->queue_lock);
>             continue;
>         }
>
>         lo->active_req = req;
>
>         if (nbd_send_req(lo, req) != 0) {
>             printk(KERN_ERR "%s: Request send failed\n",
>                    lo->disk->disk_name);
>             req->errors++;
>             nbd_end_request(req);
>         } else {
>             spin_lock(&lo->queue_lock);
>             ^^
>             list_add(&req->queuelist, &lo->queue_head);
>             spin_unlock(&lo->queue_lock);
>         }
>
>         lo->active_req = NULL;
> 
> 
> As far as I read things, the function is called with the lock
> held and interrupts disabled; the lock can then be released and 
> retaken without disabling interrupts again. 
> 
> Should this be fixed?
> 
> (it doesn't fix my hang though)

Note lo->queue_lock vs q->queue_lock.

-- 
Jens Axboe



Re: nbd problem.

2007-05-08 Thread Satyam Sharma

On 5/8/07, Rogier Wolff <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> The nbd client still reliably hangs when I use it.
>
> While looking into this, I found:
>
>         req->errors = 0;
>         spin_unlock_irq(q->queue_lock);


BTW (this could be unrelated to the original issue here), can
anybody ever have a _genuine_ excuse to use spin_lock_irq /
spin_unlock_irq and not spin_lock_irqsave / spin_unlock_irqrestore? I
find the latter primitives more tasteful even when I *know* something
is being called with interrupts enabled / disabled -- you never know
when some code is re-used somewhere else and/or ripped out of
one place and put inside another ... the former API only invites
trouble, if anything.


nbd problem.

2007-05-08 Thread Rogier Wolff

Hi,

The nbd client still reliably hangs when I use it. 

While looking into this, I found:


        req->errors = 0;
        spin_unlock_irq(q->queue_lock);

        mutex_lock(&lo->tx_lock);
        if (unlikely(!lo->sock)) {
            mutex_unlock(&lo->tx_lock);
            printk(KERN_ERR "%s: Attempted send on closed socket\n",
                   lo->disk->disk_name);
            req->errors++;
            nbd_end_request(req);
            spin_lock_irq(q->queue_lock);
            continue;
        }

        lo->active_req = req;

        if (nbd_send_req(lo, req) != 0) {
            printk(KERN_ERR "%s: Request send failed\n",
                   lo->disk->disk_name);
            req->errors++;
            nbd_end_request(req);
        } else {
            spin_lock(&lo->queue_lock);
            ^^
            list_add(&req->queuelist, &lo->queue_head);
            spin_unlock(&lo->queue_lock);
        }

        lo->active_req = NULL;


As far as I read things, the function is called with the lock
held and interrupts disabled; the lock can then be released and 
retaken without disabling interrupts again. 

Should this be fixed?

(it doesn't fix my hang though)

Roger. 



Re: 2.4.2,3 nbd problem, works OK in 2.4.2-ac20,28

2001-04-03 Thread Russell King - ARM Linux

On Tue, Apr 03, 2001 at 04:16:04PM +0400, Vladimir Serov wrote:
> Unfortunately the details of handling these requests aren't clear to me,
> and it's not simple to use Alan Cox's patches on ARM because they're not
> supported by Russell King and other people in the ARM community (I mean
> there are no patches against -acxx kernels), and I'm already overloaded
> by various beta and alpha software.

I'll look into the possibility of rooting out the fix in the -ac tree (if
any) tomorrow and dropping it into the next ARM tree.
   _
  |_| - ---+---+-
  |   |Russell King   [EMAIL PROTECTED]  --- ---
  | | | |http://www.arm.linux.org.uk//  /  |
  | +-+-+ --- -+-
  /   |   THE developer of ARM Linux  |+| /|\
 /  | | | ---  |
+-+-+ -  /\\\  |



2.4.2,3 nbd problem, works OK in 2.4.2-ac20,28

2001-04-03 Thread Vladimir Serov

Hello everybody,

I'm working on remote disks and swap for a StrongARM-based board similar
to the Brutus eval board (using the usbnet Ethernet-over-USB driver), and
I've got problems with the network block device (nbd) I'm using to mount
devices exported from the host computer.  Every program that tries to
access /dev/nbd0 after it has been connected by nbd-client stops in D state.

My first thought was that it's ARM-specific, but later I found that the
problem persists when using my host PC as the client too.  I've compiled
nbd.o and nbd-server (latest from cvs) with debug options turned on and used
"strace cat /dev/nd0" to see where it gets stuck. It looks like nbd actually
gets the first page (4k) of data in several packets, but user space doesn't
get this data and the read system call never returns (this is the case
for 2.4.2-rmk1-np3 at least). I got this problem on 2.4.2-rmk1-np1,
2.4.2-rmk1-np3, and 2.4.3 with Russell King's patch for 2.4.3-pre7 kernels
on ARM, and on vanilla 2.4.2 and 2.4.3 kernels on ix86.

BUT 2.4.2-ac20 and 2.4.2-ac28 work fine on ix86.  Possibly the main branch
hasn't been updated.
Unfortunately the details of handling these requests aren't clear to me,
and it's not simple to use Alan Cox's patches on ARM because they're not
supported by Russell King and other people in the ARM community (I mean
there are no patches against -acxx kernels), and I'm already overloaded
by various beta and alpha software.

Any help will be appreciated!
Thanks in advance.

Vladimir.

PS. Sorry for my bad English; it's my second language.
