Re: panic on 2.6.24rc5

2008-01-01 Thread Arnaldo Carvalho de Melo
On Sun, Dec 30, 2007 at 04:18:36PM +0100, Tomasz Grobelny wrote:
 On Friday 28 December 2007, I wrote:
  On Wednesday 26 December 2007, you wrote:
   What are the panics you are getting? It might be worth posting them to
   the list.
 
  Here is the screenshot I captured a few days ago. Details:
   - kernel-vanilla 2.6.24rc5,
 Now I'm using the kernel described in Arnaldo's mail (davem/net-2.6.25 +
 patches 0001 to 0051).

dccp_hdlr_ack_ratio is not in net-2.6.25, which means it comes from one of
the 0001 to 0051 patches from Gerrit. So, to help us understand where the
problem is, you could try building a kernel without applying any of the
0001 to 0051 patches.

Could you do this and report the results?

I'm also assuming you are using CCID2, either by explicitly using
feature negotiation setsockopt calls or by using the default, which is
CCID2. If this is the case it would also be interesting, before
rebuilding the kernel, to try using CCID3, as the problem you're
experiencing when using netem is exactly in the interface between the
core DCCP code and the CCID being used.
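
For illustration only, here is a minimal sketch of requesting a CCID from
user space. It is not taken from this thread: it assumes the
DCCP_SOCKOPT_CCID option found in later Linux kernel headers, which may
differ from the feature negotiation calls available on the kernels discussed
here. The numeric constants are copied by hand from the kernel headers
because Python's socket module is not assumed to export them, and the helper
names are hypothetical.

```python
import socket

# Values as found in the Linux headers (assumption: not guaranteed to be
# exported by Python's socket module, so they are spelled out here).
SOCK_DCCP = 6
IPPROTO_DCCP = 33
SOL_DCCP = 269
DCCP_SOCKOPT_CCID = 13  # assumption: present only in later kernels


def ccid_preference(*ccids):
    """Encode CCID numbers as one byte each, most preferred first."""
    if not ccids:
        raise ValueError("at least one CCID is required")
    return bytes(ccids)  # raises ValueError for values outside 0..255


def open_dccp_socket(ccids=(3,)):
    """Create a DCCP socket and request the given CCIDs (hypothetical helper).

    Raises OSError on hosts without DCCP support or without this option.
    """
    sock = socket.socket(socket.AF_INET, SOCK_DCCP, IPPROTO_DCCP)
    try:
        sock.setsockopt(SOL_DCCP, DCCP_SOCKOPT_CCID, ccid_preference(*ccids))
    except OSError:
        sock.close()
        raise
    return sock
```

Calling open_dccp_socket((3,)) would ask for CCID3 and
open_dccp_socket((2,)) for CCID2, so the same test can be run against both.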

- Arnaldo
-
To unsubscribe from this list: send the line unsubscribe dccp in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: panic on 2.6.24rc5

2008-01-01 Thread Arnaldo Carvalho de Melo
On Tue, Jan 01, 2008 at 10:30:56PM +0100, Tomasz Grobelny wrote:
 On Tuesday 01 January 2008, Arnaldo Carvalho de Melo wrote:
  On Sun, Dec 30, 2007 at 04:18:36PM +0100, Tomasz Grobelny wrote:
   On Friday 28 December 2007, I wrote:
    On Wednesday 26 December 2007, you wrote:
     What are the panics you are getting? It might be worth posting them
     to the list.
    
    Here is the screenshot I captured a few days ago. Details:
     - kernel-vanilla 2.6.24rc5,
  
   Now I'm using the kernel described in Arnaldo's mail (davem/net-2.6.25 +
   patches 0001 to 0051).
 
  dccp_hdlr_ack_ratio is not in net-2.6.25, which means it comes from one of
  the 0001 to 0051 patches from Gerrit. So, to help us understand where the
  problem is, you could try building a kernel without applying any of the
  0001 to 0051 patches.
 
  Could you do this and report the results?
 
 But what exactly should I test? Just whether the delays are gone, or something
 more? I'll try when I have some time (hopefully over the weekend).

Whether the kernel oopses, and whether the results are the same or the
problem was introduced by Gerrit's patches. I.e. you would help us narrow
down the problem by doing a binary search over kernels built at different
points in the changeset history.

Please take a look at Documentation/BUG-HUNTING in the kernel sources.
The process is somewhat time-consuming and it's understandable if you
can't perform it; your reports are already of great help. But if you can
help us narrow down exactly when the bugs you notice appeared, or whether
they were always present after some kernel build, we'd be really
grateful :-)
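
The binary search described above can be sketched as follows. This is only
an illustration under stated assumptions: the base tree without any patch is
good, the tree with all patches applied shows the bug, and once a patch
introduces the bug every later build shows it too. The function name and the
is_bad callback (build a kernel with patches 1..n, boot it, run the test)
are hypothetical.

```python
from typing import Callable


def first_bad_patch(num_patches: int, is_bad: Callable[[int], bool]) -> int:
    """Return the smallest n such that a kernel built with patches 1..n
    applied shows the bug.

    Assumes 0 patches is good, num_patches patches is bad, and the bug
    persists once introduced.
    """
    good, bad = 0, num_patches  # invariant: `good` is good, `bad` is bad
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):   # build with patches 1..mid and test it
            bad = mid     # bug already present: look earlier
        else:
            good = mid    # still fine: look later
    return bad
```

For a series of 51 patches this needs at most 6 build-and-test rounds,
instead of up to 51 when applying the patches one at a time.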
 
  I'm also assuming you are using CCID2, either by explicitly using
  feature negotiation setsockopt calls or by using the default, which is

 In fact I was using ccid3. When I switched to ccid2 it started to work more
 or less OK. It seems that for whatever reason ccid_hc_tx_send_packet is
 returning values that are too large (up to 64000).

That is an excellent data point: the ccid3 code is far more complex than
ccid2, so trying both is always valuable.
 
  CCID2. If this is the case it would also be interesting, before
  rebuilding the kernel, to try using CCID3, as the problem you're
  experiencing when using netem is exactly in the interface between the
  core DCCP code and the CCID being used.

 The problem with netem exists with both ccid2 and ccid3. I suspect that when
 all three elements of the connection (server, client and netem) are on one
 host, netem is able to signal packet loss by returning an error. If netem
 were on a different host, the packet would be sent correctly (no BUG: err=1
 after ccid_hc_tx_packet_sent) but dropped on the other host. I think that in
 this situation dccp should behave as if the packet was simply dropped.

I can't work on this right now, will look at it tomorrow, but thanks for
the data points!

- Arnaldo


dccp send

2008-01-01 Thread Tomasz Grobelny
When I use DCCP, does the sendmsg function block (until it sends the packet)? If
so, should it? In either case, how can I make it just queue the packet and
return?
-- 
Regards,
Tomasz Grobelny


Re: dccp send

2008-01-01 Thread Arnaldo Carvalho de Melo
On Wed, Jan 02, 2008 at 01:41:16AM +0100, Tomasz Grobelny wrote:
 When I use DCCP, does the sendmsg function block (until it sends the packet)? If
 so, should it? In either case, how can I make it just queue the packet and
 return?

The interface is the same as for other AF_INET transports: use
O_NONBLOCK (open, fcntl) if you want it to be non-blocking.

It queues the packet in the write routine and tries to send it right away,
but doesn't wait for the packet to actually be sent, i.e. it only checks
whether there is write space available. If you set O_NONBLOCK and there is
no space it returns ENOBUFS; if O_NONBLOCK is not set it will sleep waiting
for write space to become available, at which point the process is woken up.

Use setsockopt(SO_SNDTIMEO) to change the default send timeout, etc.
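
A minimal sketch of the two options above, written against the generic
socket interface. The helper names are hypothetical and nothing here is
DCCP-specific: the helpers accept any socket object. Packing the timeout as
two native longs assumes a Linux struct timeval with that layout. Depending
on the kernel, "no write space" may be reported as EAGAIN rather than
ENOBUFS, so the sketch treats both as "try again later".

```python
import errno
import fcntl
import os
import socket
import struct


def set_nonblocking(sock):
    """Set O_NONBLOCK with fcntl, as suggested in the mail.

    sock.setblocking(False) is the usual Python shortcut for the same thing.
    """
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
    fcntl.fcntl(sock.fileno(), fcntl.F_SETFL, flags | os.O_NONBLOCK)


def set_send_timeout(sock, seconds):
    """Limit how long a blocking send may sleep waiting for write space."""
    sec = int(seconds)
    usec = int((seconds - sec) * 1_000_000)
    # Assumption: struct timeval is two native longs on this platform.
    timeval = struct.pack("ll", sec, usec)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, timeval)


def try_send(sock, payload):
    """Return True if the payload was accepted, False if there is currently
    no write space (EAGAIN/EWOULDBLOCK or ENOBUFS)."""
    try:
        sock.send(payload)
        return True
    except BlockingIOError:        # EAGAIN / EWOULDBLOCK
        return False
    except OSError as exc:
        if exc.errno == errno.ENOBUFS:
            return False
        raise
```

A sender that must never sleep would call set_nonblocking() once and then
drop or retry the packet whenever try_send() returns False.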

- Arnaldo


Re: panic on 2.6.24rc5

2008-01-01 Thread Arnaldo Carvalho de Melo
On Wed, Jan 02, 2008 at 01:57:14AM +0100, Tomasz Grobelny wrote:
 On Wednesday 02 January 2008, Arnaldo Carvalho de Melo wrote:
  Whether the kernel oopses, and whether the results are the same or the
  problem was introduced by Gerrit's patches. I.e. you would help us narrow
  down the problem by doing a binary search over kernels built at different
  points in the changeset history.

 Oh, and by the way: is there any set of automated tests for dccp? It
 would be nice to have one, wouldn't it? Otherwise accepting any patch is
 quite risky...

There are test programs, documented in the wiki, and there is peer
review too :-)

And DCCP on Linux was written in such a way that a large part of its
core engine is actually shared with TCP, benefiting from a much bigger
set of developers and testers.

But please feel free to add more automated tests, it'll benefit us all.

- Arnaldo