Re: weird network problems on current since 10/28/2012

2012-11-05 Thread Andre Oppermann

On 05.11.2012 02:39, Manfred Antar wrote:

At 01:57 PM 11/4/2012, you wrote:

On 04.11.2012 21:15, Andreas Tobler wrote:

On 04.11.12 14:57, Andre Oppermann wrote:

On 04.11.2012 13:11, Kim Culhan wrote:

On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:

On 2012-11-04 02:13, Manfred Antar wrote:

At 03:29 PM 11/3/2012, Adrian Chadd wrote:

After the commit, there was a small discussion thread on svn-src-head@
about the possible problems with the approach.  Maybe you are
experiencing those?

As the commit message says, you should be able to turn the feature off
using:

 sysctl net.inet.tcp.experimental.initcwnd10=0

Can you please try that, and see if the problems go away?


FWIW this did not make the problem go away on 2 machines.


Yes, this very much looks like the same problem as in PR/173309.

Please try the attached patch.  It fixes the connection hang issue.
There may be a second issue that I am currently debugging, based on the
feedback from Fabian Keil.

I jump into this thread since I have a similar network issue.

My scenario:

'make installkernel DESTDIR=/netboot/test' to a nfs mounted drive.
The nfs drive on the server is an ufs fs. No zfs.

Up to r242261 I can install the kernel (or world) in a fluent way to the
nfs destination.

From r242262 it doesn't work smooth. I have stalls, sometimes my
patience is not enough and I kill the process.

I tried 242266 with the above mentioned patch. No real success.

How can I help/test?


Please try the attached patch instead of the above-mentioned one.

--
Andre

Index: netinet/tcp_output.c
===
--- netinet/tcp_output.c        (revision 242577)
+++ netinet/tcp_output.c        (working copy)
@@ -228,7 +228,7 @@
        tso = 0;
        mtu = 0;
        off = tp->snd_nxt - tp->snd_una;
-       sendwin = min(tp->snd_wnd, tp->snd_cwnd);
+       sendwin = ulmax(ulmin(tp->snd_wnd - off, tp->snd_cwnd), 0);

        flags = tcp_outflags[tp->t_state];
        /*
@@ -249,7 +249,7 @@
            (p = tcp_sack_output(tp, &sack_bytes_rxmt))) {
                long cwin;

-               cwin = min(tp->snd_wnd, tp->snd_cwnd) - sack_bytes_rxmt;
+               cwin = ulmin(tp->snd_wnd - off, tp->snd_cwnd) - sack_bytes_rxmt;
                if (cwin < 0)
                        cwin = 0;
                /* Do not retransmit SACK segments beyond snd_recover */
@@ -355,7 +355,7 @@
                         * sending new data, having retransmitted all the
                         * data possible in the scoreboard.
                         */
-                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd)
+                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd - off)
                               - off);
                        /*
                         * Don't remove this (len > 0) check !


This doesn't seem to make a difference.
I have an ssh window that's been trying to connect for the past 5 minutes.
This is on a local network: 192.168.0.4 <===SSH==> 192.168.0.5
POP from the same machines is also endlessly trying to connect.
Hopefully this mail will get through; otherwise I will need to reboot to the
old kernel.


I've backed out the change with r242601 as it still exhibits too
many problems.  I'll fix these problems in the next few days, but in
the meantime HEAD should be in a working state.

I'm sorry for the trouble.

--
Andre

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Dimitry Andric

On 2012-11-04 02:13, Manfred Antar wrote:

At 03:29 PM 11/3/2012, Adrian Chadd wrote:

On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:

i have problem connecting to freebsd box on local network since last sunday.
the last kernel that works:
  FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
anything after that, sometimes i can connect, other times just hangs.
any network connection hangs = pop httpd ssh etc etc.
anyone have any ideas ?
i can checkout different sources and see if i can locate the changes that cause 
this.


Please do!

...

Here is what I found doing :
setenv CVSROOT /usr/home/ncvs

cvs co -D "October 28, 2012 12:14:38 PDT" sys

A kernel from that time works fine.

doing:

cvs up -D "October 28, 2012 13:14:38 PDT" sys    (1 hour later)
the following files were changed:
sys/netinet/tcp_input.c
sys/netinet/tcp_timer.c
sys/netinet/tcp_var.h

Building a kernel from these new files is when the problem starts.


So, your problems seem to have been introduced by this commit by Andre:

  http://svn.freebsd.org/changeset/base/242266

  Increase the initial CWND to 10 segments as defined in IETF TCPM
  draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
  window improves the overall performance of many web services without
  risking congestion collapse.
  
  As long as it remains a draft it is placed under a sysctl marking it
  as experimental:
   net.inet.tcp.experimental.initcwnd10 = 1
  When it becomes an official RFC soon the sysctl will be changed to
  the RFC number and moved to net.inet.tcp.

  This implementation differs from the RFC draft in that it is a bit
  more conservative in the case of packet loss on SYN or SYN|ACK because
  we haven't reduced the default RTO to 1 second yet.  Also the restart
  window isn't yet increased as allowed.  Both will be adjusted with
  upcoming changes.

  It is enabled by default.  In Linux it is enabled since kernel 3.0.
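
For orientation, the window rule from the draft can be written out in a few
lines of C.  This is a minimal sketch, not the code from r242266: the function
and helper names are made up for illustration.  The only figures used are the
draft's upper bound of min(10*MSS, max(2*MSS, 14600)) and, for comparison, the
older RFC 3390 bound of min(4*MSS, max(2*MSS, 4380)).

```c
#include <stdint.h>

/* Hypothetical helpers; these names are not taken from the FreeBSD sources. */
static uint32_t u32_min(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t u32_max(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Initial congestion window in bytes under the IW10 rule:
 * min(10*MSS, max(2*MSS, 14600)). */
uint32_t initcwnd10_bytes(uint32_t mss)
{
    return u32_min(10 * mss, u32_max(2 * mss, 14600));
}

/* Older RFC 3390 rule, shown for comparison:
 * min(4*MSS, max(2*MSS, 4380)). */
uint32_t initcwnd_rfc3390_bytes(uint32_t mss)
{
    return u32_min(4 * mss, u32_max(2 * mss, 4380));
}
```

With an MSS of 1460 bytes the first function yields 14600 bytes (10 segments),
where the RFC 3390 rule yields 4380 bytes (3 segments).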


After the commit, there was a small discussion thread on svn-src-head@
about the possible problems with the approach.  Maybe you are
experiencing those?

As the commit message says, you should be able to turn the feature off
using:

  sysctl net.inet.tcp.experimental.initcwnd10=0

Can you please try that, and see if the problems go away?


Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Andre Oppermann

On 04.11.2012 02:13, Manfred Antar wrote:

At 03:29 PM 11/3/2012, Adrian Chadd wrote:

On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:

i have problem connecting to freebsd box on local network since last sunday.
the last kernel that works:
  FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
anything after that, sometimes i can connect, other times just hangs.
any network connection hangs = pop httpd ssh etc etc.
anyone have any ideas ?
i can checkout different sources and see if i can locate the changes that cause 
this.


Please do!



adrian


OK
Here is what I found doing :
setenv CVSROOT /usr/home/ncvs

cvs co -D "October 28, 2012 12:14:38 PDT" sys

A kernel from that time works fine.

doing:

cvs up -D "October 28, 2012 13:14:38 PDT" sys    (1 hour later)
the following files were changed:
sys/netinet/tcp_input.c
sys/netinet/tcp_timer.c
sys/netinet/tcp_var.h

Building a kernel from these new files is when the problem starts.


Can you please provide one or more tcpdump captures from a failing kernel?
Also please enable sysctl net.inet.tcp.log_debug=1 and capture LOG_DEBUG
output from syslogd.  That may give some important information as well.

--
Andre



Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Alexander Yerenkow
Could this be the same problem as PR/173309?

-- 
Regards,
Alexander Yerenkow


Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Kim Culhan
On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
 On 2012-11-04 02:13, Manfred Antar wrote:
 At 03:29 PM 11/3/2012, Adrian Chadd wrote:
 On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:
 i have problem connecting to freebsd box on local network since last
sunday.
 the last kernel that works:
   FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
 anything after that, sometimes i can connect, other times just hangs.
 any network connection hangs = pop httpd ssh etc etc.
 anyone have any ideas ?
 i can checkout different sources and see if i can locate the changes
that cause
 this.

 Please do!
 ...
 Here is what I found doing :
 setenv CVSROOT /usr/home/ncvs

 cvs co -D "October 28, 2012 12:14:38 PDT" sys

 A kernel from that time works fine.

 doing:

 cvs up -D "October 28, 2012 13:14:38 PDT" sys    (1 hour later)
 the following files were changed:
 sys/netinet/tcp_input.c
 sys/netinet/tcp_timer.c
 sys/netinet/tcp_var.h

 Building a kernel from these new files is when the problem starts.

 So, your problems seem to have been introduced by this commit by Andre:

http://svn.freebsd.org/changeset/base/242266

Increase the initial CWND to 10 segments as defined in IETF TCPM
draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
window improves the overall performance of many web services without
risking congestion collapse.

As long as it remains a draft it is placed under a sysctl marking it
as experimental:
 net.inet.tcp.experimental.initcwnd10 = 1
When it becomes an official RFC soon the sysctl will be changed to
the RFC number and moved to net.inet.tcp.

This implementation differs from the RFC draft in that it is a bit
more conservative in the case of packet loss on SYN or SYN|ACK because
we haven't reduced the default RTO to 1 second yet.  Also the restart
window isn't yet increased as allowed.  Both will be adjusted with
upcoming changes.

It is enabled by default.  In Linux it is enabled since kernel 3.0.

 After the commit, there was a small discussion thread on svn-src-head@
 about the possible problems with the approach.  Maybe you are
 experiencing those?

 As the commit message says, you should be able to turn the feature off
 using:

sysctl net.inet.tcp.experimental.initcwnd10=0

 Can you please try that, and see if the problems go away?

FWIW this did not make the problem go away here.

thanks
-kim



Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Andre Oppermann

On 04.11.2012 13:11, Kim Culhan wrote:

On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:

On 2012-11-04 02:13, Manfred Antar wrote:

At 03:29 PM 11/3/2012, Adrian Chadd wrote:

On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:

i have problem connecting to freebsd box on local network since last sunday.
the last kernel that works:
   FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
anything after that, sometimes i can connect, other times just hangs.
any network connection hangs = pop httpd ssh etc etc.
anyone have any ideas ?
i can checkout different sources and see if i can locate the changes that cause
this.


Please do!

...

Here is what I found doing :
setenv CVSROOT /usr/home/ncvs

cvs co -D "October 28, 2012 12:14:38 PDT" sys

A kernel from that time works fine.

doing:

cvs up -D "October 28, 2012 13:14:38 PDT" sys    (1 hour later)
the following files were changed:
sys/netinet/tcp_input.c
sys/netinet/tcp_timer.c
sys/netinet/tcp_var.h

Building a kernel from these new files is when the problem starts.


So, your problems seem to have been introduced by this commit by Andre:

http://svn.freebsd.org/changeset/base/242266

Increase the initial CWND to 10 segments as defined in IETF TCPM
draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
window improves the overall performance of many web services without
risking congestion collapse.

As long as it remains a draft it is placed under a sysctl marking it
as experimental:
 net.inet.tcp.experimental.initcwnd10 = 1
When it becomes an official RFC soon the sysctl will be changed to
the RFC number and moved to net.inet.tcp.

This implementation differs from the RFC draft in that it is a bit
more conservative in the case of packet loss on SYN or SYN|ACK because
we haven't reduced the default RTO to 1 second yet.  Also the restart
window isn't yet increased as allowed.  Both will be adjusted with
upcoming changes.

It is enabled by default.  In Linux it is enabled since kernel 3.0.

After the commit, there was a small discussion thread on svn-src-head@
about the possible problems with the approach.  Maybe you are
experiencing those?

As the commit message says, you should be able to turn the feature off
using:

sysctl net.inet.tcp.experimental.initcwnd10=0

Can you please try that, and see if the problems go away?


FWIW this did not make the problem go away on 2 machines.


Yes, this very much looks like the same problem as in PR/173309.

Please try the attached patch.  It fixes the connection hang issue.
There may be a second issue that I am currently debugging, based on the
feedback from Fabian Keil.

--
Andre

Index: tcp_input.c
===
--- tcp_input.c (revision 242494)
+++ tcp_input.c (working copy)
@@ -2650,10 +2652,12 @@

                SOCKBUF_LOCK(&so->so_snd);
                if (acked > so->so_snd.sb_cc) {
+                       tp->snd_wnd -= so->so_snd.sb_cc;
                        sbdrop_locked(&so->so_snd, (int)so->so_snd.sb_cc);
                        ourfinisacked = 1;
                } else {
                        sbdrop_locked(&so->so_snd, acked);
+                       tp->snd_wnd -= acked;
                        ourfinisacked = 0;
                }
                /* NB: sowwakeup_locked() does an implicit unlock. */
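
The two branches touched by this hunk can be modelled outside the kernel as
shown below.  This is a sketch under assumptions: struct send_state and
ack_drop() are hypothetical stand-ins for the tcpcb and socket-buffer fields,
and the clamp at zero is an addition that is not part of the patch, which
subtracts unconditionally.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields the hunk touches. */
struct send_state {
    uint32_t snd_wnd;   /* send window as tracked by the sender */
    uint32_t sb_cc;     /* bytes currently queued in the send buffer */
};

/*
 * Account for `acked` newly acknowledged bytes.
 * Returns 1 when more was acked than was buffered (the case in which the
 * kernel code sets ourfinisacked), otherwise 0.
 */
int ack_drop(struct send_state *s, uint32_t acked)
{
    int ourfinisacked = acked > s->sb_cc;
    uint32_t drop = ourfinisacked ? s->sb_cc : acked;

    /* Assumption, not in the patch: never let the window wrap below zero. */
    s->snd_wnd = s->snd_wnd > drop ? s->snd_wnd - drop : 0;
    /* Equivalent of sbdrop_locked(): remove the acked bytes from the buffer. */
    s->sb_cc -= drop;
    return ourfinisacked;
}
```

Starting from a window of 1000 and 300 buffered bytes, acking 100 bytes leaves
a window of 900 and 200 buffered bytes.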


Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Manfred Antar
At 03:21 AM 11/4/2012, Dimitry Andric wrote:
On 2012-11-04 02:13, Manfred Antar wrote:
At 03:29 PM 11/3/2012, Adrian Chadd wrote:
On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:
i have problem connecting to freebsd box on local network since last sunday.
the last kernel that works:
  FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
anything after that, sometimes i can connect, other times just hangs.
any network connection hangs = pop httpd ssh etc etc.
anyone have any ideas ?
i can checkout different sources and see if i can locate the changes that 
cause this.

Please do!
...
Here is what I found doing :
setenv CVSROOT /usr/home/ncvs

cvs co -D "October 28, 2012 12:14:38 PDT" sys

A kernel from that time works fine.

doing:

cvs up -D "October 28, 2012 13:14:38 PDT" sys    (1 hour later)
the following files were changed:
sys/netinet/tcp_input.c
sys/netinet/tcp_timer.c
sys/netinet/tcp_var.h

Building a kernel from these new files is when the problem starts.

So, your problems seem to have been introduced by this commit by Andre:

  http://svn.freebsd.org/changeset/base/242266

  Increase the initial CWND to 10 segments as defined in IETF TCPM
  draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
  window improves the overall performance of many web services without
  risking congestion collapse.
  
  As long as it remains a draft it is placed under a sysctl marking it
  as experimental:
   net.inet.tcp.experimental.initcwnd10 = 1
  When it becomes an official RFC soon the sysctl will be changed to
  the RFC number and moved to net.inet.tcp.
  
  This implementation differs from the RFC draft in that it is a bit
  more conservative in the case of packet loss on SYN or SYN|ACK because
  we haven't reduced the default RTO to 1 second yet.  Also the restart
  window isn't yet increased as allowed.  Both will be adjusted with
  upcoming changes.
  
  It is enabled by default.  In Linux it is enabled since kernel 3.0.

After the commit, there was a small discussion thread on svn-src-head@
about the possible problems with the approach.  Maybe you are
experiencing those?

As the commit message says, you should be able to turn the feature off
using:

  sysctl net.inet.tcp.experimental.initcwnd10=0

Can you please try that, and see if the problems go away?

I read the commit log and tried that. It didn't change.
I will try the patch from Andre and enable the debug log.
Manfred




Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Manfred Antar
At 05:57 AM 11/4/2012, you wrote:
On 04.11.2012 13:11, Kim Culhan wrote:
On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
On 2012-11-04 02:13, Manfred Antar wrote:
At 03:29 PM 11/3/2012, Adrian Chadd wrote:
On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:
i have problem connecting to freebsd box on local network since last 
sunday.
the last kernel that works:
   FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
anything after that, sometimes i can connect, other times just hangs.
any network connection hangs = pop httpd ssh etc etc.
anyone have any ideas ?
i can checkout different sources and see if i can locate the changes that 
cause
this.

Please do!
...
Here is what I found doing :
setenv CVSROOT /usr/home/ncvs

cvs co -D "October 28, 2012 12:14:38 PDT" sys

A kernel from that time works fine.

doing:

cvs up -D "October 28, 2012 13:14:38 PDT" sys    (1 hour later)
the following files were changed:
sys/netinet/tcp_input.c
sys/netinet/tcp_timer.c
sys/netinet/tcp_var.h

Building a kernel from these new files is when the problem starts.

So, your problems seem to have been introduced by this commit by Andre:

http://svn.freebsd.org/changeset/base/242266

Increase the initial CWND to 10 segments as defined in IETF TCPM
draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
window improves the overall performance of many web services without
risking congestion collapse.

As long as it remains a draft it is placed under a sysctl marking it
as experimental:
 net.inet.tcp.experimental.initcwnd10 = 1
When it becomes an official RFC soon the sysctl will be changed to
the RFC number and moved to net.inet.tcp.

This implementation differs from the RFC draft in that it is a bit
more conservative in the case of packet loss on SYN or SYN|ACK because
we haven't reduced the default RTO to 1 second yet.  Also the restart
window isn't yet increased as allowed.  Both will be adjusted with
upcoming changes.

It is enabled by default.  In Linux it is enabled since kernel 3.0.

After the commit, there was a small discussion thread on svn-src-head@
about the possible problems with the approach.  Maybe you are
experiencing those?

As the commit message says, you should be able to turn the feature off
using:

sysctl net.inet.tcp.experimental.initcwnd10=0

Can you please try that, and see if the problems go away?

FWIW this did not make the problem go away on 2 machines.

Yes, this very much looks like the same problem as in PR/173309.

Please try the attached patch.  It fixes the connection hang issue.
There may be a second issue that I am currently debugging, based on the
feedback from Fabian Keil.

-- 
Andre

Index: tcp_input.c
===
--- tcp_input.c (revision 242494)
+++ tcp_input.c (working copy)
@@ -2650,10 +2652,12 @@

                SOCKBUF_LOCK(&so->so_snd);
                if (acked > so->so_snd.sb_cc) {
+                       tp->snd_wnd -= so->so_snd.sb_cc;
                        sbdrop_locked(&so->so_snd, (int)so->so_snd.sb_cc);
                        ourfinisacked = 1;
                } else {
                        sbdrop_locked(&so->so_snd, acked);
+                       tp->snd_wnd -= acked;
                        ourfinisacked = 0;
                }
                /* NB: sowwakeup_locked() does an implicit unlock. */

This patch improves the connection issue: connections (ssh, pop) no longer
hang while trying to connect.
It still seems to take longer to connect, though, but in the end the
connection goes through.
I can capture a tcpdump and put it at http://pozo.com/tcpdump/tpdump.txt if
that will help.
I'll let it run for about half an hour.
Manfred




Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Andreas Tobler
On 04.11.12 14:57, Andre Oppermann wrote:
 On 04.11.2012 13:11, Kim Culhan wrote:
 On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
 On 2012-11-04 02:13, Manfred Antar wrote:
 At 03:29 PM 11/3/2012, Adrian Chadd wrote:
 On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:
 i have problem connecting to freebsd box on local network since last 
 sunday.
 the last kernel that works:
FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
 anything after that, sometimes i can connect, other times just hangs.
 any network connection hangs = pop httpd ssh etc etc.
 anyone have any ideas ?
 i can checkout different sources and see if i can locate the changes 
 that cause
 this.

 Please do!
 ...
 Here is what I found doing :
 setenv CVSROOT /usr/home/ncvs

 cvs co -D "October 28, 2012 12:14:38 PDT" sys

 A kernel from that time works fine.

 doing:

 cvs up -D "October 28, 2012 13:14:38 PDT" sys    (1 hour later)
 the following files were changed:
 sys/netinet/tcp_input.c
 sys/netinet/tcp_timer.c
 sys/netinet/tcp_var.h

 Building a kernel from these new files is when the problem starts.

 So, your problems seem to have been introduced by this commit by Andre:

 http://svn.freebsd.org/changeset/base/242266

 Increase the initial CWND to 10 segments as defined in IETF TCPM
 draft-ietf-tcpm-initcwnd-05. It explains why the increased initial
 window improves the overall performance of many web services without
 risking congestion collapse.

 As long as it remains a draft it is placed under a sysctl marking it
 as experimental:
  net.inet.tcp.experimental.initcwnd10 = 1
 When it becomes an official RFC soon the sysctl will be changed to
 the RFC number and moved to net.inet.tcp.

 This implementation differs from the RFC draft in that it is a bit
 more conservative in the case of packet loss on SYN or SYN|ACK because
 we haven't reduced the default RTO to 1 second yet.  Also the restart
 window isn't yet increased as allowed.  Both will be adjusted with
 upcoming changes.

 It is enabled by default.  In Linux it is enabled since kernel 3.0.

 After the commit, there was a small discussion thread on svn-src-head@
 about the possible problems with the approach.  Maybe you are
 experiencing those?

 As the commit message says, you should be able to turn the feature off
 using:

 sysctl net.inet.tcp.experimental.initcwnd10=0

 Can you please try that, and see if the problems go away?

 FWIW this did not make the problem go away on 2 machines.
 
 Yes, this very much looks like the same problem as in PR/173309.
 
 Please try the attached patch.  It fixes the connection hang issue.
 There may be a second issue that I am currently debugging, based on the
 feedback from Fabian Keil.

I jump into this thread since I have a similar network issue.

My scenario:

'make installkernel DESTDIR=/netboot/test' to an NFS-mounted drive.
The NFS drive on the server is a UFS file system. No ZFS.

Up to r242261 I can install the kernel (or world) smoothly to the
NFS destination.

From r242262 on it doesn't work smoothly. I get stalls; sometimes I run out
of patience and kill the process.

I tried r242266 with the above-mentioned patch. No real success.

How can I help/test?

TIA,
Andreas



Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Andre Oppermann

On 04.11.2012 21:15, Andreas Tobler wrote:

On 04.11.12 14:57, Andre Oppermann wrote:

On 04.11.2012 13:11, Kim Culhan wrote:

On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:

On 2012-11-04 02:13, Manfred Antar wrote:

At 03:29 PM 11/3/2012, Adrian Chadd wrote:

After the commit, there was a small discussion thread on svn-src-head@
about the possible problems with the approach.  Maybe you are
experiencing those?

As the commit message says, you should be able to turn the feature off
using:

 sysctl net.inet.tcp.experimental.initcwnd10=0

Can you please try that, and see if the problems go away?


FWIW this did not make the problem go away on 2 machines.


Yes, this very much looks like the same problem as in PR/173309.

Please try the attached patch.  It fixes the connection hang issue.
There may be a second issue that I am currently debugging, based on the
feedback from Fabian Keil.


I jump into this thread since I have a similar network issue.

My scenario:

'make installkernel DESTDIR=/netboot/test' to a nfs mounted drive.
The nfs drive on the server is an ufs fs. No zfs.

Up to r242261 I can install the kernel (or world) in a fluent way to the
nfs destination.


From r242262 it doesn't work smooth. I have stalls, sometimes my
patience is not enough and I kill the process.

I tried 242266 with the above mentioned patch. No real success.

How can I help/test?


Please try the attached patch instead of the above-mentioned one.

--
Andre

Index: netinet/tcp_output.c
===
--- netinet/tcp_output.c        (revision 242577)
+++ netinet/tcp_output.c        (working copy)
@@ -228,7 +228,7 @@
        tso = 0;
        mtu = 0;
        off = tp->snd_nxt - tp->snd_una;
-       sendwin = min(tp->snd_wnd, tp->snd_cwnd);
+       sendwin = ulmax(ulmin(tp->snd_wnd - off, tp->snd_cwnd), 0);

        flags = tcp_outflags[tp->t_state];
        /*
@@ -249,7 +249,7 @@
            (p = tcp_sack_output(tp, &sack_bytes_rxmt))) {
                long cwin;

-               cwin = min(tp->snd_wnd, tp->snd_cwnd) - sack_bytes_rxmt;
+               cwin = ulmin(tp->snd_wnd - off, tp->snd_cwnd) - sack_bytes_rxmt;
                if (cwin < 0)
                        cwin = 0;
                /* Do not retransmit SACK segments beyond snd_recover */
@@ -355,7 +355,7 @@
                         * sending new data, having retransmitted all the
                         * data possible in the scoreboard.
                         */
-                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd)
+                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd - off)
                               - off);
                        /*
                         * Don't remove this (len > 0) check !
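
One thing to keep in mind when reading the patched sendwin line is how
unsigned arithmetic behaves.  The sketch below assumes that all operands are
unsigned long, as the ulmin()/ulmax() names suggest.  The helper and function
names are hypothetical, and whether this effect is related to the remaining
stalls reported in this thread is not established here.  It only shows that
a maximum against 0 cannot clamp an unsigned value, and that "wnd - off"
wraps around when off is larger than wnd.

```c
/* Local stand-ins with the same semantics as min/max on unsigned long. */
static unsigned long my_ulmin(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

static unsigned long my_ulmax(unsigned long a, unsigned long b)
{
    return a > b ? a : b;
}

/* Same shape as the patched line: max(min(wnd - off, cwnd), 0), all unsigned.
 * If off > wnd the subtraction wraps to a huge value, so the result is cwnd. */
unsigned long sendwin_unsigned(unsigned long wnd, unsigned long off,
    unsigned long cwnd)
{
    return my_ulmax(my_ulmin(wnd - off, cwnd), 0);
}

/* Hypothetical variant that compares before subtracting, so it can reach 0. */
unsigned long sendwin_clamped(unsigned long wnd, unsigned long off,
    unsigned long cwnd)
{
    unsigned long avail = wnd > off ? wnd - off : 0;

    return my_ulmin(avail, cwnd);
}
```

For wnd = 1000, off = 1500 and cwnd = 5000 the first function returns 5000,
while the second returns 0.  For off = 400 both return 600.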


Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Andreas Tobler
On 04.11.12 22:57, Andre Oppermann wrote:
 On 04.11.2012 21:15, Andreas Tobler wrote:
 On 04.11.12 14:57, Andre Oppermann wrote:
 On 04.11.2012 13:11, Kim Culhan wrote:
 On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
 On 2012-11-04 02:13, Manfred Antar wrote:
 At 03:29 PM 11/3/2012, Adrian Chadd wrote:
 After the commit, there was a small discussion thread on svn-src-head@
 about the possible problems with the approach.  Maybe you are
 experiencing those?

 As the commit message says, you should be able to turn the feature off
 using:

  sysctl net.inet.tcp.experimental.initcwnd10=0

 Can you please try that, and see if the problems go away?

 FWIW this did not make the problem go away on 2 machines.

 Yes, this very much looks like the same problem as in PR/173309.

 Please try the attached patch.  It fixes the connection hang issue.
 There may be a second issue that I am currently debugging, based on the
 feedback from Fabian Keil.

 I jump into this thread since I have a similar network issue.

 My scenario:

 'make installkernel DESTDIR=/netboot/test' to a nfs mounted drive.
 The nfs drive on the server is an ufs fs. No zfs.

 Up to r242261 I can install the kernel (or world) in a fluent way to the
 nfs destination.

 From r242262 it doesn't work smooth. I have stalls, sometimes my
 patience is not enough and I kill the process.

 I tried 242266 with the above mentioned patch. No real success.

 How can I help/test?
 
 Please try the attached patch instead of the above-mentioned one.

Test run based on r242266. It starts much more smoothly, but it stalls later
on: it continues, stalls for several seconds, then continues again.

thx so far.

Andreas

1391  0  D+   0:00.00 install -o root -g wheel -m 555 crypto.ko /netboot/test_install

procstat -kk 1391
  PID    TID COMM             TDNAME           KSTACK
 1391 100099 install          -                mi_switch+0x186
sleepq_timedwait+0x42 _sleep+0x1c9 clnt_vc_call+0x763
clnt_reconnect_call+0xfb newnfs_request+0xadb nfscl_request+0x72
nfsrpc_setattr+0x28f nfs_setattr+0x2b0 VOP_SETATTR_APV+0x31
setfmode+0x101 vn_chmod+0x8a sys_fchmod+0x8b amd64_syscall+0x55f
Xfast_syscall+0xf7



Re: weird network problems on current since 10/28/2012

2012-11-04 Thread Manfred Antar
At 01:57 PM 11/4/2012, you wrote:
On 04.11.2012 21:15, Andreas Tobler wrote:
On 04.11.12 14:57, Andre Oppermann wrote:
On 04.11.2012 13:11, Kim Culhan wrote:
On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
On 2012-11-04 02:13, Manfred Antar wrote:
At 03:29 PM 11/3/2012, Adrian Chadd wrote:
After the commit, there was a small discussion thread on svn-src-head@
about the possible problems with the approach.  Maybe you are
experiencing those?

As the commit message says, you should be able to turn the feature off
using:

 sysctl net.inet.tcp.experimental.initcwnd10=0

Can you please try that, and see if the problems go away?

FWIW this did not make the problem go away on 2 machines.

Yes, this very much looks like the same problem as in PR/173309.

Please try the attached patch.  It fixes the connection hang issue.
There may be a second issue that I am currently debugging, based on the
feedback from Fabian Keil.

I jump into this thread since I have a similar network issue.

My scenario:

'make installkernel DESTDIR=/netboot/test' to a nfs mounted drive.
The nfs drive on the server is an ufs fs. No zfs.

Up to r242261 I can install the kernel (or world) in a fluent way to the
nfs destination.

From r242262 it doesn't work smooth. I have stalls, sometimes my
patience is not enough and I kill the process.

I tried 242266 with the above mentioned patch. No real success.

How can I help/test?

Please try the attached patch instead of the above-mentioned one.

-- 
Andre

Index: netinet/tcp_output.c
===
--- netinet/tcp_output.c        (revision 242577)
+++ netinet/tcp_output.c        (working copy)
@@ -228,7 +228,7 @@
        tso = 0;
        mtu = 0;
        off = tp->snd_nxt - tp->snd_una;
-       sendwin = min(tp->snd_wnd, tp->snd_cwnd);
+       sendwin = ulmax(ulmin(tp->snd_wnd - off, tp->snd_cwnd), 0);

        flags = tcp_outflags[tp->t_state];
        /*
@@ -249,7 +249,7 @@
            (p = tcp_sack_output(tp, &sack_bytes_rxmt))) {
                long cwin;

-               cwin = min(tp->snd_wnd, tp->snd_cwnd) - sack_bytes_rxmt;
+               cwin = ulmin(tp->snd_wnd - off, tp->snd_cwnd) - sack_bytes_rxmt;
                if (cwin < 0)
                        cwin = 0;
                /* Do not retransmit SACK segments beyond snd_recover */
@@ -355,7 +355,7 @@
                         * sending new data, having retransmitted all the
                         * data possible in the scoreboard.
                         */
-                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd)
+                       len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd - off)
                               - off);
                        /*
                         * Don't remove this (len > 0) check !

This doesn't seem to make a difference.
I have an ssh window that's been trying to connect for the past 5 minutes.
This is on a local network: 192.168.0.4 <===SSH==> 192.168.0.5
POP from the same machines is also endlessly trying to connect.
Hopefully this mail will get through; otherwise I will need to reboot to the
old kernel.
Manfred





weird network problems on current since 10/28/2012

2012-11-03 Thread Manfred Antar
I have had problems connecting to a FreeBSD box on my local network since last Sunday.
The last kernel that works:
 FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
With anything after that, sometimes I can connect, other times it just hangs.
Any network connection hangs: pop, httpd, ssh, etc.
Does anyone have any ideas?
I can check out different sources and see if I can locate the changes that cause
this.
thanks
manfred






Re: weird network problems on current since 10/28/2012

2012-11-03 Thread Adrian Chadd
On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:
 i have problem connecting to freebsd box on local network since last sunday.
 the last kernel that works:
  FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
 anything after that, sometimes i can connect, other times just hangs.
 any network connection hangs = pop httpd ssh etc etc.
 anyone have any ideas ?
 i can checkout different sources and see if i can locate the changes that 
 cause this.

Please do!



adrian


Re: weird network problems on current since 10/28/2012

2012-11-03 Thread Manfred Antar
At 03:29 PM 11/3/2012, Adrian Chadd wrote:
On 3 November 2012 10:40, Manfred Antar n...@pozo.com wrote:
 i have problem connecting to freebsd box on local network since last sunday.
 the last kernel that works:
  FreeBSD 10.0-CURRENT #0: Sun Oct 28 12:14:38 PDT 2012
 anything after that, sometimes i can connect, other times just hangs.
 any network connection hangs = pop httpd ssh etc etc.
 anyone have any ideas ?
 i can checkout different sources and see if i can locate the changes that 
 cause this.

Please do!



adrian

OK
Here is what I found doing :
setenv CVSROOT /usr/home/ncvs

cvs co -D "October 28, 2012 12:14:38 PDT" sys

A kernel from that time works fine.

doing:

cvs up -D "October 28, 2012 13:14:38 PDT" sys    (1 hour later)
the following files were changed:
sys/netinet/tcp_input.c
sys/netinet/tcp_timer.c
sys/netinet/tcp_var.h

Building a kernel from these new files is when the problem starts.

