Re: Refactor TCP partial ACK handling

2017-10-24 Thread Job Snijders
On Tue, Oct 24, 2017 at 03:21:08PM +0200, Mike Belopuhov wrote:
> I didn't do it because tcp_var.h is where tcp keeps all of it's prototypes
> but I don't mind moving them into tcp_input.c.  Any objections?  Otherwise
> I'll check in the diff below.

ok job@



Remove TCP_FACK

2017-10-24 Thread Job Snijders
Dear all,

This patch builds upon the work shared in the following email. Mike's
patch is a prerequisite to apply this patch.

Date: Tue, 24 Oct 2017 15:21:08 +0200
From: Mike Belopuhov 
Subject: Re: Refactor TCP partial ACK handling

TCP_FACK was disabled by provos@ in June 1999. This patch removes
the TCP_FACK option and associated #if{,n}def code.

TCP_FACK is an algorithm that decides that when something is lost, all
not SACKed packets until the most forward SACK are lost. It may be a
correct estimate, if network does not reorder packets. 

The algorithm described in RFC 6675 may be a better replacement. This
culling patch can provide guidance how and where to implement 6675.

Kind regards,

Job

 share/man/man4/options.4 |   5 ---
 sys/conf/GENERIC |   1 -
 sys/netinet/tcp_input.c  | 111 +--
 sys/netinet/tcp_output.c |  42 --
 sys/netinet/tcp_timer.c  |   5 ---
 sys/netinet/tcp_usrreq.c |   5 ---
 sys/netinet/tcp_var.h|   6 ---
 usr.bin/netstat/inet.c   |   3 --
 8 files changed, 1 insertion(+), 177 deletions(-)

diff --git share/man/man4/options.4 share/man/man4/options.4
index c28d4e27896..737dc29efea 100644
--- share/man/man4/options.4
+++ share/man/man4/options.4
@@ -445,11 +445,6 @@ TCP to adjust the transmission rate using this signal.
 Both communication endpoints negotiate enabling
 .Em ECN
 functionality at the TCP connection establishment.
-.It Cd option TCP_FACK
-Turns on forward acknowledgements allowing a more precise estimate of
-outstanding data during the fast recovery phase by using
-.Em SACK
-information.
 .It Cd option TCP_SIGNATURE
 Turns on support for the TCP MD5 Signature option (RFC 2385).
 This is used by
diff --git sys/conf/GENERIC sys/conf/GENERIC
index 6df800175ed..e385b45785c 100644
--- sys/conf/GENERIC
+++ sys/conf/GENERIC
@@ -47,7 +47,6 @@ optionFUSE# FUSE
 option SOCKET_SPLICE   # Socket Splicing for TCP and UDP
 option TCP_ECN # Explicit Congestion Notification for TCP
 option TCP_SIGNATURE   # TCP MD5 Signatures, for BGP routing sessions
-#optionTCP_FACK# Forward Acknowledgements for TCP
 
 option INET6   # IPv6
 option IPSEC   # IPsec
diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
index 8d172e2905c..4321d85854c 100644
--- sys/netinet/tcp_input.c
+++ sys/netinet/tcp_input.c
@@ -974,10 +974,6 @@ findpcb:
if (SEQ_GT(tp->snd_una, tp->snd_last))
 #endif
tp->snd_last = tp->snd_una;
-#ifdef TCP_FACK
-   tp->snd_fack = tp->snd_una;
-   tp->retran_data = 0;
-#endif
m_freem(m);
 
/*
@@ -1566,18 +1562,7 @@ trimthenstep6:
 */
if (TCP_TIMER_ISARMED(tp, TCPT_REXMT) == 0)
tp->t_dupacks = 0;
-#ifdef TCP_FACK
-   /*
-* In FACK, can enter fast rec. if the receiver
-* reports a reass. queue longer than 3 segs.
-*/
-   else if (++tp->t_dupacks == tcprexmtthresh ||
-   ((SEQ_GT(tp->snd_fack, tcprexmtthresh *
-   tp->t_maxseg + tp->snd_una)) &&
-   SEQ_GT(tp->snd_una, tp->snd_last))) {
-#else
else if (++tp->t_dupacks == tcprexmtthresh) {
-#endif /* TCP_FACK */
tcp_seq onxt = tp->snd_nxt;
u_long win =
ulmin(tp->snd_wnd, tp->snd_cwnd) /
@@ -1603,15 +1588,6 @@ trimthenstep6:
 #endif
tcpstat_inc(tcps_cwr_frecovery);

tcpstat_inc(tcps_sack_recovery_episode);
-#ifdef TCP_FACK
-   tp->t_dupacks = tcprexmtthresh;
-   (void) tcp_output(tp);
-   /*
-* During FR, snd_cwnd is held
-* constant for FACK.
-*/
-   tp->snd_cwnd = tp->snd_ssthresh;
-#else
/*
 * tcp_output() will send
 * oldest SACK-eligible rtx.
@@ -1619,7 +1595,6 @@ trimthenstep6:
(void) tcp_output(tp);
 

Re: patch: syndaemon hangs on SIGINT

2017-10-24 Thread Base Pr1me
This has fixed the hanging processes on my laptop.

On Tue, Oct 24, 2017 at 11:55 AM, Luca Castagnini <
ilfigliodellaparolaavu...@gmail.com> wrote:

> Hi guys,
>
> syndaemon goes into an infinite loop whenever it receives a SIGINT signal.
> An explanation and a patch are proposed below.
>
> In the file:
> xenocara/driver/xf86-input-synaptics/tools/syndaemon.c
> the program uses a signal handling function which in turn calls
> kill(getpid(), signum), thus resulting in a loop.
> This actually happens for all the signals: SIGHUP, SIGINT, SIGQUIT, SIGILL,
> ...
> as they are all caught with the same signal_handler() function.
>
> Under Linux the following flag would be set:
> act.sa_flags = SA_ONESHOT;
> and the handling function would run only once.
>
> SA_ONESHOT is a (linux only?) obsolete equivalent to SA_RESETHAND.
> The patch below replaces SA_ONESHOT with SA_RESETHAND and removes the
> problem.
>
> Cheers,
> Luca
>
>
> Index: driver/xf86-input-synaptics/tools/syndaemon.c
> ===
> RCS file: /cvs/xenocara/driver/xf86-input-synaptics/tools/syndaemon.c,v
> retrieving revision 1.5
> diff -u -p -r1.5 syndaemon.c
> --- driver/xf86-input-synaptics/tools/syndaemon.c   22 Jan 2017
> 09:54:53 -  1.5
> +++ driver/xf86-input-synaptics/tools/syndaemon.c   24 Oct 2017
> 12:16:40 -
> @@ -173,8 +173,8 @@ install_signal_handler(void)
>  sigemptyset(&set);
>  act.sa_handler = signal_handler;
>  act.sa_mask = set;
> -#ifdef SA_ONESHOT
> -act.sa_flags = SA_ONESHOT;
> +#ifdef SA_RESETHAND
> +act.sa_flags = SA_RESETHAND;
>  #else
>  act.sa_flags = 0;
>  #endif
>


Re: tftpd(8): diff for ip path rewrite

2017-10-24 Thread Jeremie Courreges-Anglas
On Mon, Oct 23 2017, Jan Klemkow  wrote:
> On Sun, Oct 22, 2017 at 09:32:54PM +, Jeremie Courreges-Anglas wrote:
>> On Sat, Oct 21 2017, Jan Klemkow  wrote:
>> > On Fri, Oct 20, 2017 at 12:04:41PM +, Jeremie Courreges-Anglas wrote:
>> >> On Fri, Oct 20 2017, Sebastien Marie  wrote:
>> >> > On Thu, Oct 19, 2017 at 08:58:12PM +0200, Jan Klemkow wrote:
>> >> >> +  char nfilename[PATH_MAX];
>> >> >> +
>> >> >> +  snprintf(nfilename, sizeof nfilename, "%s/%s",
>> >> >> +  getip(&client->ss), filename);
>> >> >
>> >> > - filename has PATH_MAX length
>> >> > - getip(&client->ss) could have NI_MAXHOST length
>> >> 
>> >> INET6_ADDRSTRLEN since getip() calls getnameinfo(NI_NUMERICHOST), but
>> >> your point stands.
>> >> 
>> >> > so nfilename could be larger than PATH_MAX (sizeof nfilename).
>> >> >
>> >> > I assume the return of snprintf() need to be checked. if truncation
>> >> > occured, a warning should be issued and nfilename discarded (just
>> >> > calling tftp_open(client, filename)) ?
>> >> 
>> >> I think we should refuse the request if truncated.
>> >
>> > done
>> >  
>> >> >> +  if (access(nfilename, R_OK) == 0)
>> >> >> +  tftp_open(client, nfilename);
>> >> >> +  else
>> >> >> +  tftp_open(client, filename);
>> >> >> +  }
>> >> 
>> >> Here we look up the same file in both the client-specific subdirectory
>> >> and the default directory.  Should we instead look only in the
>> >> client-specific directory if the latter exist?
>> >
>> > Common files should be found in the default directory.  But, host
>> > specific files could be overwritten if they exist in the subdirectory.
>> 
>> I think it would be better to perform those access tests in
>> validate_access(); logic in a single place, and a less quirky handling
>> of SEEDPATH.  Also the test done should probably depend on the type
>> (read, write) of the request.  Retrying with the default directory may
>> make sense in read mode, but maybe not in write (and -c, create) mode?
>> 
>> The updated diff below implements such semantics, but in
>> validate_access().  While here,
>> - improve diagnostic if both -i and -r are given; usage() doesn't show
>>   the conflict
>> - also test snprintf return value against -1, as spotted by semarie@
>> 
>> Maybe we should add a mention in the manpage that the client can
>> "escape" its client-specific directory?  (ie /../192.0.2.1/file)
>> 
>> The logic is more complicated but I hope it's for good.
>
> I successfully testes jca's diff in my setup and add two lines about
> directory escaping to the manpage.

I don't think there is a need to expand on security and machines
changing their IP address, especially when you're using TFTP, an
insecure protocol.  I just wanted to stress that no enforcement was
done.

Here's an alternate take at documenting -i, addressing a few issues. It
moves the "no path enforcement" sentence to CAVEATS.  I hope you agree
with this move.

While here:
- kill .Tn
- the content of the previous BUGS section doesn't look like a TFTP bug,
  so CAVEATS looks more appropriate to me

Feedback & oks welcome.


Index: tftpd.8
===
RCS file: /d/cvs/src/usr.sbin/tftpd/tftpd.8,v
retrieving revision 1.5
diff -u -p -r1.5 tftpd.8
--- tftpd.8 18 Jul 2015 05:32:56 -  1.5
+++ tftpd.8 24 Oct 2017 18:39:15 -
@@ -37,16 +37,14 @@
 .Nd DARPA Trivial File Transfer Protocol daemon
 .Sh SYNOPSIS
 .Nm tftpd
-.Op Fl 46cdv
+.Op Fl 46cdiv
 .Op Fl l Ar address
 .Op Fl p Ar port
 .Op Fl r Ar socket
 .Ar directory
 .Sh DESCRIPTION
 .Nm
-is a server which implements the
-.Tn DARPA
-Trivial File Transfer Protocol.
+is a server which implements the DARPA Trivial File Transfer Protocol.
 .Pp
 The use of
 .Xr tftp 1
@@ -100,6 +98,19 @@ If this option is specified,
 .Nm
 will run in the foreground and log
 the client IP, type of request, and filename to stderr.
+.It Fl i
+.Nm
+If specified,
+.Nm
+prefixes looks up the requested path in the subdirectory named after the
+client's IP address.
+For read requests, if the file is not found,
+.Nm
+falls back on the requested path.
+This option cannot be combined with
+.Fl r .
+See also
+.Sx CAVEATS
 .It Fl l Ar address
 Listen on the specified address.
 By default
@@ -126,6 +137,8 @@ before the TFTP request will continue.
 By default
 .Nm
 does not use filename rewriting.
+This option cannot be combined with
+.Fl i .
 .It Fl v
 Log the client IP, type of request, and filename.
 .It Ar directory
@@ -151,6 +164,10 @@ and appeared in
 It was rewritten for
 .Ox 5.2
 as a persistent non-blocking daemon.
-.Sh BUGS
+.Sh CAVEATS
 Many TFTP clients will not transfer files over 1678 octets
 .Pq 32767 blocks .
+.Pp
+In
+.Fl i
+mode, no attempt is made to limit the client to its subdirectory.
Index: tftpd.c
===
RCS file: /d/cvs/src/usr.sbi

Add Sparkfun ATtiny programmer in to usbdevs

2017-10-24 Thread Ville Valkonen
Hi,

as the topic states.

On Linux lsusb's iProduct shows:
  iProduct2 FabISP
And on OpenBSD:
  iProduct2 (error)

So in that sense the naming might be a bit off.

Adding the lines below in to uftdi.c doesn't make it work yet, so more
love is needed:

RCS file: /cvs/src/sys/dev/usb/uftdi.c,v
retrieving revision 1.75
diff -u -p -r1.75 uftdi.c
--- uftdi.c12 Dec 2016 04:35:04 -1.75
+++ uftdi.c23 Oct 2017 19:38:37 -
@@ -635,6 +635,7 @@ static const struct usb_devno uftdi_devs
 { USB_VENDOR_MATRIXORB, USB_PRODUCT_MATRIXORB_LCD_01FE },
 { USB_VENDOR_MATRIXORB, USB_PRODUCT_MATRIXORB_LCD_01FF },
 { USB_VENDOR_MECANIQUE, USB_PRODUCT_MECANIQUE_TELLSTICK },
+{ USB_VENDOR_MECANIQUE, USB_PRODUCT_MECANIQUE_AVRPROGRAMMER },
 { USB_VENDOR_MELCO, USB_PRODUCT_MELCO_PCOPRS1 },
 { USB_VENDOR_MOBILITY, USB_PRODUCT_MOBILITY_ED200H },
 { USB_VENDOR_OCT, USB_PRODUCT_OCT_US2308 },


And when the device is attached:

uftdi0 at uhub0 port 1 configuration 1 interface 0 "Mecanique Sparkfun
Tiny AVR programmer" rev 1.10/1.04 addr 2
uftdi0: Could not find data bulk in


Anyway, here's the usbdevs part:

Index: usbdevs_data.h
===
RCS file: /cvs/src/sys/dev/usb/usbdevs_data.h,v
retrieving revision 1.684
diff -u -p -r1.684 usbdevs_data.h
--- usbdevs_data.h11 Oct 2017 17:20:36 -1.684
+++ usbdevs_data.h24 Oct 2017 17:26:22 -
@@ -1,4 +1,4 @@
-/*$OpenBSD: usbdevs_data.h,v 1.684 2017/10/11 17:20:36 patrick Exp $*/
+/*$OpenBSD$*/

 /*
  * THIS FILE IS AUTOMATICALLY GENERATED.  DO NOT EDIT.
@@ -7088,6 +7088,10 @@ const struct usb_known_product usb_known
 {
 USB_VENDOR_MECANIQUE, USB_PRODUCT_MECANIQUE_TELLSTICK,
 "Telldus Tellstick",
+},
+{
+USB_VENDOR_MECANIQUE, USB_PRODUCT_MECANIQUE_AVRPROGRAMMER,
+"Sparkfun Tiny AVR programmer",
 },
 {
 USB_VENDOR_METAGEEK, USB_PRODUCT_METAGEEK_WISPY24I,
Index: usbdevs
===
RCS file: /cvs/src/sys/dev/usb/usbdevs,v
retrieving revision 1.678
diff -u -p -r1.678 usbdevs
--- usbdevs11 Oct 2017 17:19:50 -1.678
+++ usbdevs24 Oct 2017 17:26:22 -
@@ -2944,6 +2944,7 @@ product MELCO WLIUCGNM20x01eeWLI-UC-G
 /* MetaGeek tagged products */
 product MECANIQUE WISPY0x083eMetaGeek Wi-Spy
 product MECANIQUE TELLSTICK0x0c30Telldus Tellstick
+product MECANIQUE AVRPROGRAMMER0x0c9fSparkfun Tiny AVR programmer

 /* MetaGeek products */
 product METAGEEK WISPY24I0x2400Wi-Spy 2.4i


--
Kind regards,
Ville Valkonen



patch: syndaemon hangs on SIGINT

2017-10-24 Thread Luca Castagnini
Hi guys,

syndaemon goes into an infinite loop whenever it receives a SIGINT signal.
An explanation and a patch are proposed below.

In the file:
xenocara/driver/xf86-input-synaptics/tools/syndaemon.c
the program uses a signal handling function which in turn calls
kill(getpid(), signum), thus resulting in a loop.
This actually happens for all the signals: SIGHUP, SIGINT, SIGQUIT, SIGILL,
...
as they are all caught with the same signal_handler() function.

Under Linux the following flag would be set:
act.sa_flags = SA_ONESHOT;
and the handling function would run only once.

SA_ONESHOT is a (linux only?) obsolete equivalent to SA_RESETHAND.
The patch below replaces SA_ONESHOT with SA_RESETHAND and removes the
problem.

Cheers,
Luca


Index: driver/xf86-input-synaptics/tools/syndaemon.c
===
RCS file: /cvs/xenocara/driver/xf86-input-synaptics/tools/syndaemon.c,v
retrieving revision 1.5
diff -u -p -r1.5 syndaemon.c
--- driver/xf86-input-synaptics/tools/syndaemon.c   22 Jan 2017
09:54:53 -  1.5
+++ driver/xf86-input-synaptics/tools/syndaemon.c   24 Oct 2017
12:16:40 -
@@ -173,8 +173,8 @@ install_signal_handler(void)
 sigemptyset(&set);
 act.sa_handler = signal_handler;
 act.sa_mask = set;
-#ifdef SA_ONESHOT
-act.sa_flags = SA_ONESHOT;
+#ifdef SA_RESETHAND
+act.sa_flags = SA_RESETHAND;
 #else
 act.sa_flags = 0;
 #endif


sparc64 kernel inline

2017-10-24 Thread Mark Kettenis
Some of the sparc64 use "extern inline".  That doesn't work for
compilers that implement proper C99 inline semantics as you may end up
with undefined references if you don't provide an actual
implementation.  Use "static inline" instead, just like we do in
similar cases on other architectures.

ok?


Index: arch/sparc64/include/ctlreg.h
===
RCS file: /cvs/src/sys/arch/sparc64/include/ctlreg.h,v
retrieving revision 1.28
diff -u -p -r1.28 ctlreg.h
--- arch/sparc64/include/ctlreg.h   25 May 2017 03:19:39 -  1.28
+++ arch/sparc64/include/ctlreg.h   24 Oct 2017 15:14:43 -
@@ -528,8 +528,8 @@ do {
\
 
 #define sparc_rd(name) sparc_rd_ ## name()
 #define GEN_RD(name)   \
-extern __inline u_int64_t sparc_rd_ ## name(void); \
-extern __inline u_int64_t  \
+static inline u_int64_t sparc_rd_ ## name(void);   \
+static inline u_int64_t\
 sparc_rd_ ## name()\
 {  \
u_int64_t r;\
@@ -540,8 +540,8 @@ sparc_rd_ ## name() 
\
 
 #define sparc_rdpr(name) sparc_rdpr_ ## name()
 #define GEN_RDPR(name) \
-extern __inline u_int64_t sparc_rdpr_ ## name(void);   \
-extern __inline u_int64_t  \
+static inline u_int64_t sparc_rdpr_ ## name(void); \
+static inline u_int64_t\
 sparc_rdpr_ ## name()  \
 {  \
u_int64_t r;\
@@ -573,8 +573,8 @@ GEN_RDPR(ver);
 
 /* Generate ld*a/st*a functions for non-constant ASI's. */
 #define LDNC_GEN(tp, o)
\
-   extern __inline tp o ## _asi(paddr_t);  \
-   extern __inline tp  \
+   static inline tp o ## _asi(paddr_t);\
+   static inline tp\
o ## _asi(paddr_t va)   \
{   \
tp r;   \
@@ -585,8 +585,8 @@ GEN_RDPR(ver);
: "%g0");   \
return (r); \
}   \
-   extern __inline tp o ## _nc(paddr_t, int);  \
-   extern __inline tp  \
+   static inline tp o ## _nc(paddr_t, int);\
+   static inline tp\
o ## _nc(paddr_t va, int asi)   \
{   \
sparc_wr(asi, asi, 0);  \
@@ -626,8 +626,8 @@ LDNC_GEN(int, lda);
 #define ldxa(va, asi)  LD_GENERIC(va, asi, ldx, u_int64_t)
 
 #define STNC_GEN(tp, o)
\
-   extern __inline void o ## _asi(paddr_t, tp);\
-   extern __inline void\
+   static inline void o ## _asi(paddr_t, tp);  \
+   static inline void  \
o ## _asi(paddr_t va, tp val)   \
{   \
__asm volatile( \
@@ -636,8 +636,8 @@ LDNC_GEN(int, lda);
: "r" (val), "r" ((volatile tp *)va)\
: "memory");\
}   \
-   extern __inline void o ## _nc(paddr_t, int, tp);\
-   extern __inline void\
+   static inline void o ## _nc(paddr_t, int, tp);  \
+   static inline void  \
o ## _nc(paddr_t va, int asi, tp val)   \
{  

Re: Dumping IPsec flows w/ radix-tree

2017-10-24 Thread Todd C. Miller
On Tue, 24 Oct 2017 15:46:43 +0200, Martin Pieuchot wrote:

> Since we're now iterating over a tree, this diff changes the output
> of ipsecctl(8) a bit.  But I'd say it's a feature, it allows us to
> see how flows are ordered, which is useful for debugging ;)
> It is also useful to compare configurations since you can now diff
> the output in order to figure if some flow are missing.

This seems like an improvement so OK millert@.

 - todd



isakmpd: DH grp19-21 & 25-30 plumbing

2017-10-24 Thread Martin Pieuchot
Since 2010 isakmpd(8) and iked(8) share the same Diffie-Hellman
implementation based on libcrypto.  In 2014 reyk@ synced these
two implementations bringing support for DH groups 27-30 using
Brainpool curves.

Sadly the necessary plumbing and documentation was missing.  So
here's a diff to make isakmpd(8) users happy.

ok?

Index: ipsecctl/ike.c
===
RCS file: /cvs/src/sbin/ipsecctl/ike.c,v
retrieving revision 1.81
diff -u -p -r1.81 ike.c
--- ipsecctl/ike.c  9 Dec 2015 21:41:50 -   1.81
+++ ipsecctl/ike.c  24 Oct 2017 14:44:38 -
@@ -330,30 +330,57 @@ ike_section_p2(struct ipsec_rule *r, FIL
switch (r->p2xfs->groupxf->id) {
case GROUPXF_NONE:
break;
-   case GROUPXF_768:
+   case GROUPXF_1:
group_desc = "MODP_768";
break;
-   case GROUPXF_1024:
+   case GROUPXF_2:
group_desc = "MODP_1024";
break;
-   case GROUPXF_1536:
+   case GROUPXF_5:
group_desc = "MODP_1536";
break;
-   case GROUPXF_2048:
+   case GROUPXF_14:
group_desc = "MODP_2048";
break;
-   case GROUPXF_3072:
+   case GROUPXF_15:
group_desc = "MODP_3072";
break;
-   case GROUPXF_4096:
+   case GROUPXF_16:
group_desc = "MODP_4096";
break;
-   case GROUPXF_6144:
+   case GROUPXF_17:
group_desc = "MODP_6144";
break;
-   case GROUPXF_8192:
+   case GROUPXF_18:
group_desc = "MODP_8192";
break;
+   case GROUPXF_19:
+   group_desc = "ECP_256";
+   break;
+   case GROUPXF_20:
+   group_desc = "ECP_384";
+   break;
+   case GROUPXF_21:
+   group_desc = "ECP_521";
+   break;
+   case GROUPXF_25:
+   group_desc = "ECP_192";
+   break;
+   case GROUPXF_26:
+   group_desc = "ECP_224";
+   break;
+   case GROUPXF_27:
+   group_desc = "BP_224";
+   break;
+   case GROUPXF_28:
+   group_desc = "BP_256";
+   break;
+   case GROUPXF_29:
+   group_desc = "BP_384";
+   break;
+   case GROUPXF_30:
+   group_desc = "BP_512";
+   break;
default:
warnx("illegal group %s", r->p2xfs->groupxf->name);
return (-1);
@@ -496,34 +523,61 @@ ike_section_p1(struct ipsec_rule *r, FIL
 
if (r->p1xfs && r->p1xfs->groupxf) {
switch (r->p1xfs->groupxf->id) {
-   case GROUPXF_768:
+   case GROUPXF_1:
group_desc = "MODP_768";
break;
-   case GROUPXF_1024:
+   case GROUPXF_2:
group_desc = "MODP_1024";
break;
-   case GROUPXF_1536:
+   case GROUPXF_5:
group_desc = "MODP_1536";
break;
-   case GROUPXF_2048:
+   case GROUPXF_14:
group_desc = "MODP_2048";
break;
-   case GROUPXF_3072:
+   case GROUPXF_15:
group_desc = "MODP_3072";
break;
-   case GROUPXF_4096:
+   case GROUPXF_16:
group_desc = "MODP_4096";
break;
-   case GROUPXF_6144:
+   case GROUPXF_17:
group_desc = "MODP_6144";
break;
-   case GROUPXF_8192:
+   case GROUPXF_18:
group_desc = "MODP_8192";
break;
+   case GROUPXF_19:
+   group_desc = "ECP_256";
+   break;
+   case GROUPXF_20:
+   group_desc = "ECP_384";
+   break;
+   case GROUPXF_21:
+   group_desc = "ECP_521";
+   break;
+   case GROUPXF_25:
+   group_desc = "ECP_192";
+   break;
+   case GROUPXF_26:
+   group_desc = "ECP_224";

Re: refill msk(4) rx ring from a timeout when there's no mbufs

2017-10-24 Thread Mark Kettenis
> Date: Tue, 24 Oct 2017 09:37:38 +1000
> From: David Gwynne 
> 
> theo pointed out the timeout should be cancelled when the interface
> goes down. this adds a timeout_del in msk_stop.

Hmm, you didn't attach a new diff...

Commenting on the previous diff; do we really need to have the "tick"
parameter?  Currently 0 and 1 are treated the same by timeout_add(9),
so it seems a bit pointless.  Could just hardcode 1.

Make sure phessler tests this!

> some other drivers suffer this problem, ill have a look at them
> once this is in.
> 
> On Mon, Oct 23, 2017 at 11:35:09AM +1000, David Gwynne wrote:
> > if msk runs out of mbufs, the rx ring remains empty and there's
> > nothing except an ifconfig down and up to get it going again.
> > 
> > this adds a timeout to refill the ring. it's largely copied from
> > other drivers (vr in this case).
> > 
> > tests? ok?
> > 
> > Index: if_mskvar.h
> > ===
> > RCS file: /cvs/src/sys/dev/pci/if_mskvar.h,v
> > retrieving revision 1.13
> > diff -u -p -r1.13 if_mskvar.h
> > --- if_mskvar.h 2 Jun 2017 01:47:36 -   1.13
> > +++ if_mskvar.h 23 Oct 2017 01:32:04 -
> > @@ -212,6 +212,7 @@ struct sk_if_softc {
> > int sk_pktlen;
> > int sk_link;
> > struct timeout  sk_tick_ch;
> > +   struct timeout  sk_tick_rx;
> > struct msk_chain_data   sk_cdata;
> > struct msk_ring_data*sk_rdata;
> > bus_dmamap_tsk_ring_map;
> > Index: if_msk.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/if_msk.c,v
> > retrieving revision 1.130
> > diff -u -p -r1.130 if_msk.c
> > --- if_msk.c3 Oct 2017 15:37:58 -   1.130
> > +++ if_msk.c23 Oct 2017 01:32:04 -
> > @@ -148,7 +148,7 @@ void msk_ifmedia_sts(struct ifnet *, str
> >  int msk_newbuf(struct sk_if_softc *);
> >  int msk_init_rx_ring(struct sk_if_softc *);
> >  int msk_init_tx_ring(struct sk_if_softc *);
> > -void msk_fill_rx_ring(struct sk_if_softc *);
> > +void msk_fill_rx_ring(struct sk_if_softc *, int);
> >  
> >  int msk_miibus_readreg(struct device *, int, int);
> >  void msk_miibus_writereg(struct device *, int, int, int);
> > @@ -156,6 +156,7 @@ void msk_miibus_statchg(struct device *)
> >  
> >  void msk_iff(struct sk_if_softc *);
> >  void msk_tick(void *);
> > +void msk_fill_rx_tick(void *);
> >  
> >  #ifdef MSK_DEBUG
> >  #define DPRINTF(x) if (mskdebug) printf x
> > @@ -428,7 +429,7 @@ msk_init_rx_ring(struct sk_if_softc *sc_
> > /* two ring entries per packet, so the effective ring size is halved */
> > if_rxr_init(&sc_if->sk_cdata.sk_rx_ring, 2, MSK_RX_RING_CNT/2);
> >  
> > -   msk_fill_rx_ring(sc_if);
> > +   msk_fill_rx_ring(sc_if, 0);
> > return (0);
> >  }
> >  
> > @@ -1027,6 +1028,7 @@ msk_attach(struct device *parent, struct
> > ifmedia_set(&sc_if->sk_mii.mii_media, IFM_ETHER|IFM_AUTO);
> >  
> > timeout_set(&sc_if->sk_tick_ch, msk_tick, sc_if);
> > +   timeout_set(&sc_if->sk_tick_rx, msk_fill_rx_tick, sc_if);
> >  
> > /*
> >  * Call MI attach routines.
> > @@ -1779,7 +1781,7 @@ msk_txeof(struct sk_if_softc *sc_if)
> >  }
> >  
> >  void
> > -msk_fill_rx_ring(struct sk_if_softc *sc_if)
> > +msk_fill_rx_ring(struct sk_if_softc *sc_if, int ticks)
> >  {
> > u_int slots, used;
> >  
> > @@ -1792,6 +1794,20 @@ msk_fill_rx_ring(struct sk_if_softc *sc_
> > slots -= used;
> > }
> > if_rxr_put(&sc_if->sk_cdata.sk_rx_ring, slots);
> > +   if (if_rxr_inuse(&sc_if->sk_cdata.sk_rx_ring) == 0)
> > +   timeout_add(&sc_if->sk_tick_rx, ticks);
> > +}
> > +
> > +void
> > +msk_fill_rx_tick(void *xsc_if)
> > +{
> > +   struct sk_if_softc *sc_if = xsc_if;
> > +   int s;
> > +
> > +   s = splnet();
> > +   if (if_rxr_inuse(&sc_if->sk_cdata.sk_rx_ring) == 0)
> > +   msk_fill_rx_ring(sc_if, 1);
> > +   splx(s);
> >  }
> >  
> >  void
> > @@ -1902,12 +1918,12 @@ msk_intr(void *xsc)
> > CSR_WRITE_4(sc, SK_Y2_ICR, 2);
> >  
> > if (rx[0]) {
> > -   msk_fill_rx_ring(sc_if0);
> > +   msk_fill_rx_ring(sc_if0, 0);
> > SK_IF_WRITE_2(sc_if0, 0, SK_RXQ1_Y2_PREF_PUTIDX,
> > sc_if0->sk_cdata.sk_rx_prod);
> > }
> > if (rx[1]) {
> > -   msk_fill_rx_ring(sc_if1);
> > +   msk_fill_rx_ring(sc_if1, 0);
> > SK_IF_WRITE_2(sc_if1, 0, SK_RXQ1_Y2_PREF_PUTIDX,
> > sc_if1->sk_cdata.sk_rx_prod);
> > }
> 
> 



ikev2: follow rfc5903 correctly (ECP Groups)

2017-10-24 Thread Patrick Wildt
Hi,

in the final RFC 5903 the computation for the DH shared secret changed.
Instead of the full point, only the X point is included.  Unfortunately
this is a backwards incompatible change, so older ikeds won't be com-
patible with this change is committed.  Of course only if you use ECP.
Anyway, this change makes us follow the RFC correctly.

Source: https://tools.ietf.org/html/rfc5903 - 9.  Changes from RFC 4753

ok?

Patrick

diff --git a/sbin/iked/dh.c b/sbin/iked/dh.c
index a8308eec596..a3ef5f80906 100644
--- a/sbin/iked/dh.c
+++ b/sbin/iked/dh.c
@@ -38,10 +38,13 @@ int modp_create_shared(struct group *, uint8_t *, uint8_t 
*);
 /* EC2N/ECP */
 intec_init(struct group *);
 intec_getlen(struct group *);
+intec_secretlen(struct group *);
 intec_create_exchange(struct group *, uint8_t *);
 intec_create_shared(struct group *, uint8_t *, uint8_t *);
 
-intec_point2raw(struct group *, const EC_POINT *, uint8_t *, size_t);
+#define EC_POINT2RAW_FULL  0
+#define EC_POINT2RAW_XONLY 1
+intec_point2raw(struct group *, const EC_POINT *, uint8_t *, size_t, int);
 EC_POINT *
ec_raw2point(struct group *, uint8_t *, size_t);
 
@@ -293,6 +296,7 @@ group_get(uint32_t id)
case GROUP_ECP:
group->init = ec_init;
group->getlen = ec_getlen;
+   group->secretlen = ec_secretlen;
group->exchange = ec_create_exchange;
group->shared = ec_create_shared;
break;
@@ -343,6 +347,15 @@ dh_getlen(struct group *group)
return (group->getlen(group));
 }
 
+int
+dh_secretlen(struct group *group)
+{
+   if (group->secretlen)
+   return (group->secretlen(group));
+   else
+   return (group->getlen(group));
+}
+
 int
 dh_create_exchange(struct group *group, uint8_t *buf)
 {
@@ -450,6 +463,20 @@ ec_getlen(struct group *group)
return ((roundup(group->spec->bits, 8) * 2) / 8);
 }
 
+/*
+ * Note that the shared secret only includes the x value:
+ *
+ * See RFC 5903, 7. ECP Key Exchange Data Formats:
+ *   The Diffie-Hellman shared secret value consists of the x value of the
+ *   Diffie-Hellman common value.
+ * See also RFC 5903, 9. Changes from RFC 4753.
+ */
+int
+ec_secretlen(struct group *group)
+{
+   return (ec_getlen(group) / 2);
+}
+
 int
 ec_create_exchange(struct group *group, uint8_t *buf)
 {
@@ -459,7 +486,7 @@ ec_create_exchange(struct group *group, uint8_t *buf)
bzero(buf, len);
 
return (ec_point2raw(group, EC_KEY_get0_public_key(group->ec),
-   buf, len));
+   buf, len, EC_POINT2RAW_FULL));
 }
 
 int
@@ -496,7 +523,8 @@ ec_create_shared(struct group *group, uint8_t *secret, 
uint8_t *exchange)
if (!EC_POINT_mul(ecgroup, secretp, NULL, exchangep, privkey, NULL))
goto done;
 
-   ret = ec_point2raw(group, secretp, secret, ec_getlen(group));
+   ret = ec_point2raw(group, secretp, secret, ec_secretlen(group),
+   EC_POINT2RAW_XONLY);
 
  done:
if (exkey != NULL)
@@ -511,7 +539,7 @@ ec_create_shared(struct group *group, uint8_t *secret, 
uint8_t *exchange)
 
 int
 ec_point2raw(struct group *group, const EC_POINT *point,
-uint8_t *buf, size_t len)
+uint8_t *buf, size_t len, int mode)
 {
const EC_GROUP  *ecgroup = NULL;
BN_CTX  *bnctx = NULL;
@@ -528,9 +556,19 @@ ec_point2raw(struct group *group, const EC_POINT *point,
goto done;
 
eclen = ec_getlen(group);
-   if (len < eclen)
+   switch (mode) {
+   case EC_POINT2RAW_XONLY:
+   xlen = eclen / 2;
+   ylen = 0;
+   break;
+   case EC_POINT2RAW_FULL:
+   xlen = ylen = eclen / 2;
+   break;
+   default:
+   goto done;
+   }
+   if (len < xlen + ylen)
goto done;
-   xlen = ylen = eclen / 2;
 
if ((ecgroup = EC_KEY_get0_group(group->ec)) == NULL)
goto done;
@@ -551,10 +589,12 @@ ec_point2raw(struct group *group, const EC_POINT *point,
if (!BN_bn2bin(x, buf + xoff))
goto done;
 
-   yoff = (ylen - BN_num_bytes(y)) + xlen;
-   bzero(buf + xlen, yoff - xlen);
-   if (!BN_bn2bin(y, buf + yoff))
-   goto done;
+   if (ylen > 0) {
+   yoff = (ylen - BN_num_bytes(y)) + xlen;
+   bzero(buf + xlen, yoff - xlen);
+   if (!BN_bn2bin(y, buf + yoff))
+   goto done;
+   }
 
ret = 0;
  done:
diff --git a/sbin/iked/dh.h b/sbin/iked/dh.h
index 77bb4b5ef16..7e24d4d6746 100644
--- a/sbin/iked/dh.h
+++ b/sbin/iked/dh.h
@@ -46,6 +46,7 @@ struct group {
 
int (*init)(struct group *);
int (*getlen)(struct group *);
+   int (*secretlen)(struct group *);
int (*exchange)(struct group *, uint8_t *);
int (*shared)(struct group *, uint8_t *

Re: Refactor TCP partial ACK handling

2017-10-24 Thread Alexander Bluhm
On Tue, Oct 24, 2017 at 03:21:08PM +0200, Mike Belopuhov wrote:
> I didn't do it because tcp_var.h is where tcp keeps all of it's prototypes
> but I don't mind moving them into tcp_input.c.  Any objections?  Otherwise
> I'll check in the diff below.

Regression tests pass.
OK bluhm@

> diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
> index 790e163975e..8d172e2905c 100644
> --- sys/netinet/tcp_input.c
> +++ sys/netinet/tcp_input.c
> @@ -183,10 +183,13 @@ do { \
>   else \
>   TCP_SET_DELACK(tp); \
>   if_put(ifp); \
>  } while (0)
>  
> +void  tcp_sack_partialack(struct tcpcb *, struct tcphdr *);
> +void  tcp_newreno_partialack(struct tcpcb *, struct tcphdr *);
> +
>  void  syn_cache_put(struct syn_cache *);
>  void  syn_cache_rm(struct syn_cache *);
>  int   syn_cache_respond(struct syn_cache *, struct mbuf *);
>  void  syn_cache_timer(void *);
>  void  syn_cache_reaper(void *);
> @@ -1664,52 +1667,39 @@ trimthenstep6:
>   }
>   /*
>* If the congestion window was inflated to account
>* for the other side's cached packets, retract it.
>*/
> - if (tp->sack_enable) {
> - if (tp->t_dupacks >= tcprexmtthresh) {
> - /* Check for a partial ACK */
> - if (tcp_sack_partialack(tp, th)) {
> -#ifdef TCP_FACK
> - /* Force call to tcp_output */
> - if (tp->snd_awnd < tp->snd_cwnd)
> - tp->t_flags |= TF_NEEDOUTPUT;
> -#else
> - tp->snd_cwnd += tp->t_maxseg;
> - tp->t_flags |= TF_NEEDOUTPUT;
> -#endif /* TCP_FACK */
> - } else {
> - /* Out of fast recovery */
> - tp->snd_cwnd = tp->snd_ssthresh;
> - if (tcp_seq_subtract(tp->snd_max,
> - th->th_ack) < tp->snd_ssthresh)
> - tp->snd_cwnd =
> -tcp_seq_subtract(tp->snd_max,
> -th->th_ack);
> - tp->t_dupacks = 0;
> -#ifdef TCP_FACK
> - if (SEQ_GT(th->th_ack, tp->snd_fack))
> - tp->snd_fack = th->th_ack;
> -#endif
> - }
> - }
> - } else {
> - if (tp->t_dupacks >= tcprexmtthresh &&
> - !tcp_newreno(tp, th)) {
> + if (tp->t_dupacks >= tcprexmtthresh) {
> + /* Check for a partial ACK */
> + if (SEQ_LT(th->th_ack, tp->snd_last)) {
> + if (tp->sack_enable)
> + tcp_sack_partialack(tp, th);
> + else
> + tcp_newreno_partialack(tp, th);
> + } else {
>   /* Out of fast recovery */
>   tp->snd_cwnd = tp->snd_ssthresh;
>   if (tcp_seq_subtract(tp->snd_max, th->th_ack) <
>   tp->snd_ssthresh)
>   tp->snd_cwnd =
>   tcp_seq_subtract(tp->snd_max,
>   th->th_ack);
>   tp->t_dupacks = 0;
> +#ifdef TCP_FACK
> + if (tp->sack_enable &&
> + SEQ_GT(th->th_ack, tp->snd_fack))
> + tp->snd_fack = th->th_ack;
> +#endif
>   }
> - }
> - if (tp->t_dupacks < tcprexmtthresh)
> + } else {
> + /*
> +  * Reset the duplicate ACK counter if we
> +  * were not in fast recovery.
> +  */
>   tp->t_dupacks = 0;
> + }
>   if (SEQ_GT(th->th_ack, tp->snd_max)) {
>   tcpstat_inc(tcps_rcvacktoomuch);
>   goto dropafterack_ratelim;
>   }
>   acked = th->th_ack - tp->snd_una;
> @@ -2703,36 +2693,38 @@ tcp_clean_sackreport(struct tcpcb *tp)
>   tp->sackblks[i].start = tp->sackblks[i].end=0;
>  
>  }
>  
>  /*
> - * Checks for partial ack.  If partial ack arrives, turn off retransmission
> - * timer, deflate the window, do not clear tp->t_dupacks, and return 1.
> - * If the ack advances at least to tp->snd_last, return 0.
> + * Partial ack handling within a sack recovery episode.  When a partial ack
> + * arrives, turn off retransmi

fuselib: patch to check for NULL in fuse_parse_cmdline()

2017-10-24 Thread Helg Bredow
Arguments to fuse_parse_cmdline() are not checked for NULL before assignment. 
This patch performs the check.

Index: fuse.c
===
RCS file: /cvs/src/lib/libfuse/fuse.c,v
retrieving revision 1.30
diff -u -p -u -p -r1.30 fuse.c
--- fuse.c  24 Oct 2017 09:01:05 -  1.30
+++ fuse.c  24 Oct 2017 13:12:51 -
@@ -426,10 +426,14 @@ fuse_parse_cmdline(struct fuse_args *arg
return (-1);
}
 
-   *mp = strdup(opt.mp);
-   if (*mp == NULL)
-   return (-1);
-   *mt = 0;
+   if (mp != NULL) {
+   *mp = strdup(opt.mp);
+   if (*mp == NULL)
+   return (-1);
+   }
+
+   if (mt != NULL)
+   *mt = 0;
 
return (0);
 }



Dumping IPsec flows w/ radix-tree

2017-10-24 Thread Martin Pieuchot
Diff below switches the logic of the NET_KEY_SPD_DUMP sysctl(2) to
iterate over the SPD table instead of the global list of policies
`ipsec_policy_head'.

This will allow us to get rid of the global list, which makes future
MP work easier.

Since we're now iterating over a tree, this diff changes the output
of ipsecctl(8) a bit.  But I'd say it's a feature, it allows us to
see how flows are ordered, which is useful for debugging ;)
It is also useful to compare configurations since you can now diff
the output in order to figure if some flow are missing.


Before
 flow esp in from 192.168.200.0/24 to 10.10.10.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type use
 flow esp out from 10.10.10.0/24 to 192.168.200.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type require
 flow esp in from 192.168.200.0/24 to 10.10.11.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type use
 flow esp out from 10.10.11.0/24 to 192.168.200.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type require
 flow esp in from 192.168.100.0/24 to 10.10.10.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type use
 flow esp out from 10.10.10.0/24 to 192.168.100.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type require
 flow esp in from 192.168.100.0/24 to 10.10.11.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type use
 flow esp out from 10.10.11.0/24 to 192.168.100.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type require


After:
 flow esp in from 192.168.100.0/24 to 10.10.10.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type use
 flow esp in from 192.168.100.0/24 to 10.10.11.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type use
 flow esp in from 192.168.200.0/24 to 10.10.10.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type use
 flow esp in from 192.168.200.0/24 to 10.10.11.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type use
 flow esp out from 10.10.10.0/24 to 192.168.100.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type require
 flow esp out from 10.10.10.0/24 to 192.168.200.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type require
 flow esp out from 10.10.11.0/24 to 192.168.100.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type require
 flow esp out from 10.10.11.0/24 to 192.168.200.0/24 peer 192.168.1.80 srcid 
192.168.1.102/32 dstid 192.168.1.80/32 type require


ok?


Index: net/pfkeyv2.c
===
RCS file: /cvs/src/sys/net/pfkeyv2.c,v
retrieving revision 1.168
diff -u -p -r1.168 pfkeyv2.c
--- net/pfkeyv2.c   16 Oct 2017 08:22:25 -  1.168
+++ net/pfkeyv2.c   24 Oct 2017 13:31:56 -
@@ -165,6 +165,7 @@ int pfkeyv2_usrreq(struct socket *, int,
 int pfkeyv2_output(struct mbuf *, struct socket *, struct sockaddr *,
 struct mbuf *);
 int pfkey_sendup(struct keycb *, struct mbuf *, int);
+int pfkeyv2_sysctl_policydumper(struct ipsec_policy *, void *, unsigned int);
 
 /*
  * Wrapper around m_devget(); copy data from contiguous buffer to mbuf
@@ -2324,24 +2325,8 @@ ret:
 }
 
 int
-pfkeyv2_ipo_walk(u_int rdomain, int (*walker)(struct ipsec_policy *, void *),
-void *arg)
-{
-   int rval = 0;
-   struct ipsec_policy *ipo;
-
-   NET_ASSERT_LOCKED();
-
-   TAILQ_FOREACH(ipo, &ipsec_policy_head, ipo_list) {
-   if (ipo->ipo_rdomain != rdomain)
-   continue;
-   rval = walker(ipo, (void *)arg);
-   }
-   return (rval);
-}
-
-int
-pfkeyv2_sysctl_policydumper(struct ipsec_policy *ipo, void *arg)
+pfkeyv2_sysctl_policydumper(struct ipsec_policy *ipo, void *arg,
+unsigned int tableid)
 {
struct pfkeyv2_sysctl_walk *w = (struct pfkeyv2_sysctl_walk *)arg;
void *buffer = 0;
@@ -2433,7 +2418,7 @@ pfkeyv2_sysctl(int *name, u_int namelen,
 
case NET_KEY_SPD_DUMP:
NET_LOCK();
-   error = pfkeyv2_ipo_walk(rdomain,
+   error = spd_table_walk(rdomain,
pfkeyv2_sysctl_policydumper, &w);
NET_UNLOCK();
if (oldp)
Index: net/pfkeyv2.h
===
RCS file: /cvs/src/sys/net/pfkeyv2.h,v
retrieving revision 1.77
diff -u -p -r1.77 pfkeyv2.h
--- net/pfkeyv2.h   29 May 2017 14:28:01 -  1.77
+++ net/pfkeyv2.h   24 Oct 2017 13:32:06 -
@@ -382,9 +382,7 @@ int pfkeyv2_flush_walker(struct tdb *, v
 int pfkeyv2_get_proto_alg(u_int8_t, u_int8_t *, int *);
 int pfkeyv2_sysctl(int *, u_int, void *, size_t *, void *, size_t);
 int pfkeyv2_sysctl_walker(struct tdb *, void *, int);
-int pfkeyv2_ipo_walk(u_int, int (*)(struct ipsec_policy *, void *), void *);
 int pfkeyv2_sysctl_dump(void *);
-int pfkeyv2

Re: [patch]Use BUFSIZE instead of hard-code in netcat.c

2017-10-24 Thread Nan Xiao
Ouch! I misunderstood the patch by @bluhm, please ignore my
previous mail! I am very sorry for disrupting!
Best Regards
Nan Xiao


On Tue, Oct 24, 2017 at 9:30 PM, Nan Xiao  wrote:
> Actually, I think "plen = sizeof(buf);" may be better.
> Best Regards
> Nan Xiao
>
>
> On Tue, Oct 24, 2017 at 8:52 PM, Alexander Bluhm
>  wrote:
>> On Tue, Oct 24, 2017 at 07:44:02PM +0800, Nan Xiao wrote:
>>> Use BUFSIZE instead of hard-code in netcat.c, FYI. Thanks!
>>
>> As this buffer is used with MSG_PEEK and its content is discarded,
>> the size does not really matter.  The complicated logic seems to
>> be a leftover from the -j jumbo option.
>>
>> I think this is simpler.
>>
>> bluhm
>>
>> Index: usr.bin/nc/netcat.c
>> ===
>> RCS file: /data/mirror/openbsd/cvs/src/usr.bin/nc/netcat.c,v
>> retrieving revision 1.187
>> diff -u -p -r1.187 netcat.c
>> --- usr.bin/nc/netcat.c 15 Jul 2017 17:27:39 -  1.187
>> +++ usr.bin/nc/netcat.c 24 Oct 2017 12:41:38 -
>> @@ -563,13 +563,12 @@ main(int argc, char *argv[])
>>  * initially to wait for a caller, then use
>>  * the regular functions to talk to the 
>> caller.
>>  */
>> -   int rv, plen;
>> -   char buf[16384];
>> +   int rv;
>> +   char buf[2048];
>> struct sockaddr_storage z;
>>
>> len = sizeof(z);
>> -   plen = 2048;
>> -   rv = recvfrom(s, buf, plen, MSG_PEEK,
>> +   rv = recvfrom(s, buf, sizeof(buf), MSG_PEEK,
>> (struct sockaddr *)&z, &len);
>> if (rv < 0)
>> err(1, "recvfrom");



Re: [patch]Use BUFSIZE instead of hard-code in netcat.c

2017-10-24 Thread Nan Xiao
Actually, I think "plen = sizeof(buf);" may be better.
Best Regards
Nan Xiao


On Tue, Oct 24, 2017 at 8:52 PM, Alexander Bluhm
 wrote:
> On Tue, Oct 24, 2017 at 07:44:02PM +0800, Nan Xiao wrote:
>> Use BUFSIZE instead of hard-code in netcat.c, FYI. Thanks!
>
> As this buffer is used with MSG_PEEK and its content is discarded,
> the size does not really matter.  The complicated logic seems to
> be a leftover from the -j jumbo option.
>
> I think this is simpler.
>
> bluhm
>
> Index: usr.bin/nc/netcat.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.bin/nc/netcat.c,v
> retrieving revision 1.187
> diff -u -p -r1.187 netcat.c
> --- usr.bin/nc/netcat.c 15 Jul 2017 17:27:39 -  1.187
> +++ usr.bin/nc/netcat.c 24 Oct 2017 12:41:38 -
> @@ -563,13 +563,12 @@ main(int argc, char *argv[])
>  * initially to wait for a caller, then use
>  * the regular functions to talk to the 
> caller.
>  */
> -   int rv, plen;
> -   char buf[16384];
> +   int rv;
> +   char buf[2048];
> struct sockaddr_storage z;
>
> len = sizeof(z);
> -   plen = 2048;
> -   rv = recvfrom(s, buf, plen, MSG_PEEK,
> +   rv = recvfrom(s, buf, sizeof(buf), MSG_PEEK,
> (struct sockaddr *)&z, &len);
> if (rv < 0)
> err(1, "recvfrom");



Re: Refactor TCP partial ACK handling

2017-10-24 Thread Mike Belopuhov
On Tue, Oct 24, 2017 at 13:37 +0200, Martin Pieuchot wrote:
> On 24/10/17(Tue) 12:27, Mike Belopuhov wrote:
> > On Tue, Oct 24, 2017 at 12:05 +0200, Martin Pieuchot wrote:
> > > On 21/10/17(Sat) 15:17, Mike Belopuhov wrote:
> > > > On Fri, Oct 20, 2017 at 22:59 +0200, Klemens Nanni wrote:
> > > > > The comments for both void tcp_{sack,newreno}_partialack() still 
> > > > > mention
> > > > > tp->snd_last and return value bits.
> > > > > 
> > > > 
> > > > Good eyes!  It made me spot a mistake I made by folding two lines
> > > > into an incorrect ifdef in tcp_sack_partialack.  I expected it to
> > > > say "ifdef TCP_FACK" while it says "ifNdef".  The adjusted comment
> > > > didn't make sense and I found the bug.
> > > 
> > > Could you send the full/fixed diff?
> > > 
> > 
> > Sure.
> 
> Diff is correct.  I have two suggestions, but it's ok mpi@ either way.
> 
> > > And what about TCP_FACK?  It is disabled by default, is there a
> > > point in keeping it?
> > 
> > Job has pointed out that RFC 6675 might be a better alternative
> > so it might be a good idea to ditch it while we're at it.  I'm
> > not certain which parts need to be preserved (if any) however.
> 
> I'd say remove it.  One can always look in the Attic if necessary.
> 
> > diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
> > index 790e163975e..3809a2371f2 100644
> > --- sys/netinet/tcp_input.c
> > +++ sys/netinet/tcp_input.c
> > @@ -1664,52 +1664,38 @@ trimthenstep6:
> > }
> > /*
> >  * If the congestion window was inflated to account
> >  * for the other side's cached packets, retract it.
> >  */
> > -   if (tp->sack_enable) {
> > -   if (tp->t_dupacks >= tcprexmtthresh) {
> > -   /* Check for a partial ACK */
> > -   if (tcp_sack_partialack(tp, th)) {
> > -#ifdef TCP_FACK
> > -   /* Force call to tcp_output */
> > -   if (tp->snd_awnd < tp->snd_cwnd)
> > -   tp->t_flags |= TF_NEEDOUTPUT;
> > -#else
> > -   tp->snd_cwnd += tp->t_maxseg;
> > -   tp->t_flags |= TF_NEEDOUTPUT;
> > -#endif /* TCP_FACK */
> > -   } else {
> > -   /* Out of fast recovery */
> > -   tp->snd_cwnd = tp->snd_ssthresh;
> > -   if (tcp_seq_subtract(tp->snd_max,
> > -   th->th_ack) < tp->snd_ssthresh)
> > -   tp->snd_cwnd =
> > -  tcp_seq_subtract(tp->snd_max,
> > -  th->th_ack);
> > -   tp->t_dupacks = 0;
> > -#ifdef TCP_FACK
> > -   if (SEQ_GT(th->th_ack, tp->snd_fack))
> > -   tp->snd_fack = th->th_ack;
> > -#endif
> > -   }
> > -   }
> > -   } else {
> > -   if (tp->t_dupacks >= tcprexmtthresh &&
> > -   !tcp_newreno(tp, th)) {
> > +   if (tp->t_dupacks >= tcprexmtthresh) {
> 
> I'd keep the comment:
> 
>   /* Check for a partial ACK */

Sure.

> > diff --git sys/netinet/tcp_var.h sys/netinet/tcp_var.h
> > index 6b797fd48e7..97b04884879 100644
> > --- sys/netinet/tcp_var.h
> > +++ sys/netinet/tcp_var.h
> > @@ -764,15 +764,15 @@ void   tcp_update_sack_list(struct tcpcb *tp, 
> > tcp_seq, tcp_seq);
> >  voidtcp_del_sackholes(struct tcpcb *, struct tcphdr *);
> >  voidtcp_clean_sackreport(struct tcpcb *tp);
> >  voidtcp_sack_adjust(struct tcpcb *tp);
> >  struct sackhole *
> >  tcp_sack_output(struct tcpcb *tp);
> > -int tcp_sack_partialack(struct tcpcb *, struct tcphdr *);
> > +voidtcp_sack_partialack(struct tcpcb *, struct tcphdr *);
> > +voidtcp_newreno_partialack(struct tcpcb *, struct tcphdr *);
> >  #ifdef DEBUG
> >  voidtcp_print_holes(struct tcpcb *tp);
> >  #endif
> > -int tcp_newreno(struct tcpcb *, struct tcphdr *);
> 
> I'd love to see these definitions moved to netinet/tcp_input.c.  It
> makes code audit much more simpler as you know that their are local.
> 

I didn't do it because tcp_var.h is where tcp keeps all of it's prototypes
but I don't mind moving them into tcp_input.c.  Any objections?  Otherwise
I'll check in the diff below.

diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
index 790e163975e..8d172e2905c 100644
--- sys/netinet/tcp_input.c
+++ sys/netinet/tcp_input.c
@@ -183,10 +183,13 @@ do { \
else \
TCP_SET_DELACK(tp); \
if_put(ifp); \
 } while (0)
 
+voidtcp_sack_partialack(struct tcpcb *, struct tcphdr *);
+voidtcp_newreno_partialack(st

Re: dd: exit nonzero on receipt of SIGINT

2017-10-24 Thread Theo Buehler
> So when calling _exit() explicitly, we have to pass a status
> of 128 + signo indeed.

I agree with your analysis. ok



Re: [patch]Use BUFSIZE instead of hard-code in netcat.c

2017-10-24 Thread Alexander Bluhm
On Tue, Oct 24, 2017 at 07:44:02PM +0800, Nan Xiao wrote:
> Use BUFSIZE instead of hard-code in netcat.c, FYI. Thanks!

As this buffer is used with MSG_PEEK and its content is discarded,
the size does not really matter.  The complicated logic seems to
be a leftover from the -j jumbo option.

I think this is simpler.

bluhm

Index: usr.bin/nc/netcat.c
===
RCS file: /data/mirror/openbsd/cvs/src/usr.bin/nc/netcat.c,v
retrieving revision 1.187
diff -u -p -r1.187 netcat.c
--- usr.bin/nc/netcat.c 15 Jul 2017 17:27:39 -  1.187
+++ usr.bin/nc/netcat.c 24 Oct 2017 12:41:38 -
@@ -563,13 +563,12 @@ main(int argc, char *argv[])
 * initially to wait for a caller, then use
 * the regular functions to talk to the caller.
 */
-   int rv, plen;
-   char buf[16384];
+   int rv;
+   char buf[2048];
struct sockaddr_storage z;
 
len = sizeof(z);
-   plen = 2048;
-   rv = recvfrom(s, buf, plen, MSG_PEEK,
+   rv = recvfrom(s, buf, sizeof(buf), MSG_PEEK,
(struct sockaddr *)&z, &len);
if (rv < 0)
err(1, "recvfrom");



Re: dd: exit nonzero on receipt of SIGINT

2017-10-24 Thread Ingo Schwarze
Hi,

Scott Cheloha wrote on Mon, Oct 23, 2017 at 09:01:04PM -0500:

> Per this bit from POSIX on dd(1):

>> For SIGINT, the dd utility shall interrupt its current processing,
>> write status information to standard error, and exit as though
>> terminated by SIGINT.  It shall take the standard action for all
>> other signals; see [...].

> I think we ought to exit nonzero when SIGINT'd.

I think that is the correct interpretation, because

  http://pubs.opengroup.org/onlinepubs/9699919799/utilities/dd.html

does not describe SIGINT in more detail,

  http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html

clearly says that SIGINT causes *abnormal* termination by default, and

  
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03

says

  If the default action is to terminate the process abnormally, the
  process is terminated as if by a call to _exit(), except that the
  status made available to wait(), waitid(), and waitpid() indicates
  abnormal termination by the signal.

So when calling _exit() explicitly, we have to pass a status
of 128 + signo indeed.


Of course, this can cause a change of behaviour in scripts, but
it seems more likely to me to fix bugs than introduce any.
A script that relied on dd(1) to succeed on SIGINT would seem
quite badly broken to me.

OK to commit?
  Ingo


> Index: bin/dd/misc.c
> ===
> RCS file: /cvs/src/bin/dd/misc.c,v
> retrieving revision 1.21
> diff -u -p -r1.21 misc.c
> --- bin/dd/misc.c 13 Aug 2017 02:06:42 -  1.21
> +++ bin/dd/misc.c 24 Oct 2017 01:38:10 -
> @@ -111,9 +111,8 @@ summaryx(int notused)
>  }
>  
>  void
> -terminate(int notused)
> +terminate(int signo)
>  {
> -
>   summary();
> - _exit(0);
> + _exit(128 + signo);
>  }



Re: kqueue_scan() in parallel

2017-10-24 Thread Alexander Bluhm
On Tue, Oct 24, 2017 at 11:14:22AM +0200, Martin Pieuchot wrote:
> This is not a problem.  The way knote_acquire() is designed is to sleep
> for a small amount of time, 1 * hz for the moment.

Ah yes, I see the hz now.

So there cannot be starvation.  This is a best effort algorithm,
we do not need reliable wakeups.  In the few cases they are missing,
we continue after 1 hz.  I think that is fine.

> And I think
> we should continue to sync with their work.

This is a valid reason.

OK bluhm@



[patch]Use BUFSIZE instead of hard-code in netcat.c

2017-10-24 Thread Nan Xiao
Hi tech@,

Use BUFSIZE instead of hard-code in netcat.c, FYI. Thanks!

Best Regards
Nan Xiao

Index: netcat.c
===
RCS file: /cvs/src/usr.bin/nc/netcat.c,v
retrieving revision 1.187
diff -u -p -r1.187 netcat.c
--- netcat.c15 Jul 2017 17:27:39 -  1.187
+++ netcat.c24 Oct 2017 09:03:28 -
@@ -564,7 +564,7 @@ main(int argc, char *argv[])
 * the regular functions to talk to the caller.
 */
int rv, plen;
-   char buf[16384];
+   char buf[BUFSIZE];
struct sockaddr_storage z;

len = sizeof(z);



Re: Refactor TCP partial ACK handling

2017-10-24 Thread Martin Pieuchot
On 24/10/17(Tue) 12:27, Mike Belopuhov wrote:
> On Tue, Oct 24, 2017 at 12:05 +0200, Martin Pieuchot wrote:
> > On 21/10/17(Sat) 15:17, Mike Belopuhov wrote:
> > > On Fri, Oct 20, 2017 at 22:59 +0200, Klemens Nanni wrote:
> > > > The comments for both void tcp_{sack,newreno}_partialack() still mention
> > > > tp->snd_last and return value bits.
> > > > 
> > > 
> > > Good eyes!  It made me spot a mistake I made by folding two lines
> > > into an incorrect ifdef in tcp_sack_partialack.  I expected it to
> > > say "ifdef TCP_FACK" while it says "ifNdef".  The adjusted comment
> > > didn't make sense and I found the bug.
> > 
> > Could you send the full/fixed diff?
> > 
> 
> Sure.

Diff is correct.  I have two suggestions, but it's ok mpi@ either way.

> > And what about TCP_FACK?  It is disabled by default, is there a
> > point in keeping it?
> 
> Job has pointed out that RFC 6675 might be a better alternative
> so it might be a good idea to ditch it while we're at it.  I'm
> not certain which parts need to be preserved (if any) however.

I'd say remove it.  One can always look in the Attic if necessary.

> diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
> index 790e163975e..3809a2371f2 100644
> --- sys/netinet/tcp_input.c
> +++ sys/netinet/tcp_input.c
> @@ -1664,52 +1664,38 @@ trimthenstep6:
>   }
>   /*
>* If the congestion window was inflated to account
>* for the other side's cached packets, retract it.
>*/
> - if (tp->sack_enable) {
> - if (tp->t_dupacks >= tcprexmtthresh) {
> - /* Check for a partial ACK */
> - if (tcp_sack_partialack(tp, th)) {
> -#ifdef TCP_FACK
> - /* Force call to tcp_output */
> - if (tp->snd_awnd < tp->snd_cwnd)
> - tp->t_flags |= TF_NEEDOUTPUT;
> -#else
> - tp->snd_cwnd += tp->t_maxseg;
> - tp->t_flags |= TF_NEEDOUTPUT;
> -#endif /* TCP_FACK */
> - } else {
> - /* Out of fast recovery */
> - tp->snd_cwnd = tp->snd_ssthresh;
> - if (tcp_seq_subtract(tp->snd_max,
> - th->th_ack) < tp->snd_ssthresh)
> - tp->snd_cwnd =
> -tcp_seq_subtract(tp->snd_max,
> -th->th_ack);
> - tp->t_dupacks = 0;
> -#ifdef TCP_FACK
> - if (SEQ_GT(th->th_ack, tp->snd_fack))
> - tp->snd_fack = th->th_ack;
> -#endif
> - }
> - }
> - } else {
> - if (tp->t_dupacks >= tcprexmtthresh &&
> - !tcp_newreno(tp, th)) {
> + if (tp->t_dupacks >= tcprexmtthresh) {

I'd keep the comment:

/* Check for a partial ACK */
> + if (SEQ_LT(th->th_ack, tp->snd_last)) {
> + if (tp->sack_enable)
> + tcp_sack_partialack(tp, th);
> + else
> + tcp_newreno_partialack(tp, th);
> + } else {
>   /* Out of fast recovery */
>   tp->snd_cwnd = tp->snd_ssthresh;
>   if (tcp_seq_subtract(tp->snd_max, th->th_ack) <
>   tp->snd_ssthresh)
>   tp->snd_cwnd =
>   tcp_seq_subtract(tp->snd_max,
>   th->th_ack);
>   tp->t_dupacks = 0;
> +#ifdef TCP_FACK
> + if (tp->sack_enable &&
> + SEQ_GT(th->th_ack, tp->snd_fack))
> + tp->snd_fack = th->th_ack;
> +#endif
>   }
> - }
> - if (tp->t_dupacks < tcprexmtthresh)
> + } else {
> + /*
> +  * Reset the duplicate ACK counter if we
> +  * were not in fast recovery.
> +  */
>   tp->t_dupacks = 0;
> + }
>   if (SEQ_GT(th->th_ack, tp->snd_max)) {
>   tcpstat_inc(tcps_rcvacktoomuch);
>   goto dropafterack_ratelim;
>   }
>   acked = th->th_ack - tp->snd_una;
> @@ -2703,36 +2689,38 @@ tcp_clean_sackreport(struct tcpcb *tp)
>   tp->sackblks[

Re: Refactor TCP partial ACK handling

2017-10-24 Thread Mike Belopuhov
On Tue, Oct 24, 2017 at 12:05 +0200, Martin Pieuchot wrote:
> On 21/10/17(Sat) 15:17, Mike Belopuhov wrote:
> > On Fri, Oct 20, 2017 at 22:59 +0200, Klemens Nanni wrote:
> > > The comments for both void tcp_{sack,newreno}_partialack() still mention
> > > tp->snd_last and return value bits.
> > > 
> > 
> > Good eyes!  It made me spot a mistake I made by folding two lines
> > into an incorrect ifdef in tcp_sack_partialack.  I expected it to
> > say "ifdef TCP_FACK" while it says "ifNdef".  The adjusted comment
> > didn't make sense and I found the bug.
> 
> Could you send the full/fixed diff?
> 

Sure.

> And what about TCP_FACK?  It is disabled by default, is there a
> point in keeping it?

Job has pointed out that RFC 6675 might be a better alternative
so it might be a good idea to ditch it while we're at it.  I'm
not certain which parts need to be preserved (if any) however.

diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
index 790e163975e..3809a2371f2 100644
--- sys/netinet/tcp_input.c
+++ sys/netinet/tcp_input.c
@@ -1664,52 +1664,38 @@ trimthenstep6:
}
/*
 * If the congestion window was inflated to account
 * for the other side's cached packets, retract it.
 */
-   if (tp->sack_enable) {
-   if (tp->t_dupacks >= tcprexmtthresh) {
-   /* Check for a partial ACK */
-   if (tcp_sack_partialack(tp, th)) {
-#ifdef TCP_FACK
-   /* Force call to tcp_output */
-   if (tp->snd_awnd < tp->snd_cwnd)
-   tp->t_flags |= TF_NEEDOUTPUT;
-#else
-   tp->snd_cwnd += tp->t_maxseg;
-   tp->t_flags |= TF_NEEDOUTPUT;
-#endif /* TCP_FACK */
-   } else {
-   /* Out of fast recovery */
-   tp->snd_cwnd = tp->snd_ssthresh;
-   if (tcp_seq_subtract(tp->snd_max,
-   th->th_ack) < tp->snd_ssthresh)
-   tp->snd_cwnd =
-  tcp_seq_subtract(tp->snd_max,
-  th->th_ack);
-   tp->t_dupacks = 0;
-#ifdef TCP_FACK
-   if (SEQ_GT(th->th_ack, tp->snd_fack))
-   tp->snd_fack = th->th_ack;
-#endif
-   }
-   }
-   } else {
-   if (tp->t_dupacks >= tcprexmtthresh &&
-   !tcp_newreno(tp, th)) {
+   if (tp->t_dupacks >= tcprexmtthresh) {
+   if (SEQ_LT(th->th_ack, tp->snd_last)) {
+   if (tp->sack_enable)
+   tcp_sack_partialack(tp, th);
+   else
+   tcp_newreno_partialack(tp, th);
+   } else {
/* Out of fast recovery */
tp->snd_cwnd = tp->snd_ssthresh;
if (tcp_seq_subtract(tp->snd_max, th->th_ack) <
tp->snd_ssthresh)
tp->snd_cwnd =
tcp_seq_subtract(tp->snd_max,
th->th_ack);
tp->t_dupacks = 0;
+#ifdef TCP_FACK
+   if (tp->sack_enable &&
+   SEQ_GT(th->th_ack, tp->snd_fack))
+   tp->snd_fack = th->th_ack;
+#endif
}
-   }
-   if (tp->t_dupacks < tcprexmtthresh)
+   } else {
+   /*
+* Reset the duplicate ACK counter if we
+* were not in fast recovery.
+*/
tp->t_dupacks = 0;
+   }
if (SEQ_GT(th->th_ack, tp->snd_max)) {
tcpstat_inc(tcps_rcvacktoomuch);
goto dropafterack_ratelim;
}
acked = th->th_ack - tp->snd_una;
@@ -2703,36 +2689,38 @@ tcp_clean_sackreport(struct tcpcb *tp)
tp->sackblks[i].start = tp->sackblks[i].end=0;
 
 }
 
 /*
- * Checks for partial ack.  If partial ack arrives, turn off retransmission
- * timer, deflate the window, do not clear tp->t_dupacks, and return 1.
- * If the ack advances at least to tp->snd_last, return 0.
+ * Partial ack handling within a sack recovery episode.  When a partial ack
+ 

Re: Refactor TCP partial ACK handling

2017-10-24 Thread Martin Pieuchot
On 21/10/17(Sat) 15:17, Mike Belopuhov wrote:
> On Fri, Oct 20, 2017 at 22:59 +0200, Klemens Nanni wrote:
> > The comments for both void tcp_{sack,newreno}_partialack() still mention
> > tp->snd_last and return value bits.
> > 
> 
> Good eyes!  It made me spot a mistake I made by folding two lines
> into an incorrect ifdef in tcp_sack_partialack.  I expected it to
> say "ifdef TCP_FACK" while it says "ifNdef".  The adjusted comment
> didn't make sense and I found the bug.

Could you send the full/fixed diff?  And what about TCP_FACK?  It is
disabled by default, is there a point in keeping it?

> diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
> index 45aafee0d05..d5de9cb2407 100644
> --- sys/netinet/tcp_input.c
> +++ sys/netinet/tcp_input.c
> @@ -2690,13 +2690,13 @@ tcp_clean_sackreport(struct tcpcb *tp)
>   tp->sackblks[i].start = tp->sackblks[i].end=0;
>  
>  }
>  
>  /*
> - * Checks for partial ack.  If partial ack arrives, turn off retransmission
> - * timer, deflate the window, do not clear tp->t_dupacks, and return 1.
> - * If the ack advances at least to tp->snd_last, return 0.
> + * Partial ack handling within a sack recovery episode.  When a partial ack
> + * arrives, turn off retransmission timer, deflate the window, do not clear
> + * tp->t_dupacks.
>   */
>  void
>  tcp_sack_partialack(struct tcpcb *tp, struct tcphdr *th)
>  {
>   /* Turn off retx. timer (will start again next segment) */
> @@ -2711,16 +2711,16 @@ tcp_sack_partialack(struct tcpcb *tp, struct tcphdr 
> *th)
>   if (tp->snd_cwnd > (th->th_ack - tp->snd_una)) {
>   tp->snd_cwnd -= th->th_ack - tp->snd_una;
>   tp->snd_cwnd += tp->t_maxseg;
>   } else
>   tp->snd_cwnd = tp->t_maxseg;
> + tp->snd_cwnd += tp->t_maxseg;
> + tp->t_flags |= TF_NEEDOUTPUT;
> +#else
>   /* Force call to tcp_output */
>   if (tp->snd_awnd < tp->snd_cwnd)
>   tp->t_flags |= TF_NEEDOUTPUT;
> -#else
> - tp->snd_cwnd += tp->t_maxseg;
> - tp->t_flags |= TF_NEEDOUTPUT;
>  #endif
>  }
>  
>  /*
>   * Pull out of band byte out of a segment so
> @@ -3078,14 +3078,14 @@ tcp_mss_update(struct tcpcb *tp)
>   }
>  
>  }
>  
>  /*
> - * Checks for partial ack.  If partial ack arrives, force the retransmission
> - * of the next unacknowledged segment, do not clear tp->t_dupacks, and return
> - * 1.  By setting snd_nxt to ti_ack, this forces retransmission timer to
> - * be started again.  If the ack advances at least to tp->snd_last, return 0.
> + * When a partial ack arrives, force the retransmission of the
> + * next unacknowledged segment.  Do not clear tp->t_dupacks.
> + * By setting snd_nxt to ti_ack, this forces retransmission timer
> + * to be started again.
>   */
>  void
>  tcp_newreno_partialack(struct tcpcb *tp, struct tcphdr *th)
>  {
>   /*
> 



Re: kqueue_scan() in parallel

2017-10-24 Thread Martin Pieuchot
On 23/10/17(Mon) 21:58, Alexander Bluhm wrote:
> On Wed, Oct 18, 2017 at 11:22:01AM +0200, Martin Pieuchot wrote:
> > Diff below is the last version of my kqueue diff to allow grabbing the
> > solock() and possibly sleeping, inside kqueue_scan().
> 
> I wonder why you don't call knote_release() when calling knote_drop().
> Another thread could sleep in knote_acquire() and never get a
> wakeup.

This is not a problem.  The way knote_acquire() is designed is to sleep
for a small amount of time, 1 * hz for the moment.  Here's what the
comment, coming from DragonflyBSD says:

  * If we cannot acquire the knote we sleep and return 0.  The knote
  * may be stale on return in this case and the caller must restart
  * whatever loop they are in.


>  I think there should be a call to knote_release() from
> knote_drop() just before freeing it.  Then other threads wake up
> and can continue.

This is an optimization.  You're saying that "waiting" for 1 * hz is
too long.  I doubt it makes any difference in practise. First because
it is a small interval.  But also because the top part of the kernel
is still serialized by the KERNEL_LOCK().  So if one thread wakes up
its sibling, the sibling will have to spin until it exits the kernel.

> Why is knote_release() not a void function?

Because it's a stripped down version of DragonflyBSD's one.  And I think
we should continue to sync with their work.  But I can change that for
now.

> I did run all regress tests with your diff, everything passes.
> Btw, I have added a link to my regress page where you can see the
> diffs in /usr/src when I did not run the tests with a snapshot.
> http://bluhm.genua.de/regress/results/regress.html



Re: libfuse: patch to prevent fuse-zip from dumping core

2017-10-24 Thread Martin Pieuchot
On 18/10/17(Wed) 14:33, Helg Bredow wrote:
> On Wed, 18 Oct 2017 15:03:21 +0200
> Martin Pieuchot  wrote:
> 
> > On 18/10/17(Wed) 12:51, Helg Bredow wrote:
> > > On Wed, 18 Oct 2017 10:04:07 +0200
> > > Martin Pieuchot  wrote:
> > > 
> > > > On 17/10/17(Tue) 15:30, Helg Bredow wrote:
> > > > > If you execute "fuse-zip -V" it prints the version and then dumps 
> > > > > core. This is because fuse-zip does not initialise the mount point 
> > > > > pointer to NULL. This patch ensures that it's always initialised to 
> > > > > NULL.
> > > > 
> > > > It's hard to understand your fix if you don't explain what "dumps core".
> > > > 
> > > > I had to install the package and look at the trace myself.  You could
> > > > save me these tasks by either posting the backtrace, saying that free(3)
> > > > is call on an uninitialized memory or both.
> > > > 
> > > > That said, I'd suggest different fix.  Initializing `mp' in fuse_setup()
> > > > is very fragile.  Instead I'd declare a local variable and don't use
> > > > `mp' at all in these function.
> > > > In case of sucsses, just before returning the "struct fuse" pointer I'd
> > > > assign *mp, if not NULL, to the local variable.
> > > > 
> > > > By the way, what does "mp" stand for?  I find the name confusing.
> > > > 
> > > 
> > > Sorry, I've been looking at this for so long I assumed the diff was 
> > > self-explanatory. Thanks for your patience. I like your suggestion and 
> > > have modified the patch accordingly. A NULL test before the assignment is 
> > > superfluous since the code won't be reached if dir is NULL.
> > 
> > But what if mp is NULL?
> 
> Good point; I hadn't considered that. I've added the check.

Committed thanks.  It would be nice if you could audit the others
functions of the library and fix any similar bug.  Fuzzing the
library might also help.



Re: [PATCH 1/2] VMD: Prevent Disappearing VM from vmctl status

2017-10-24 Thread Mike Larkin
On Tue, Oct 10, 2017 at 03:46:56PM -0700, Carlos Cardenas wrote:
> Remove terminate_vm/vm_remove logic from vmm_dispatch_parent. This
> logic is present in vmm_sighdlr when a VM process has signaled
> SIGCHLD for proper cleanup.
> 
> diff --git usr.sbin/vmd/vmm.c usr.sbin/vmd/vmm.c
> index ccd7680b479..8cc1c15157a 100644
> --- usr.sbin/vmd/vmm.c
> +++ usr.sbin/vmd/vmm.c
> @@ -170,15 +170,13 @@ vmm_dispatch_parent(int fd, struct privsep_proc *p, 
> struct imsg *imsg)
>   else
>   res = 0;
>   } else {
> - /* in the process of shutting down... */
> - log_debug("%s: performing a forced shutdown",
> - __func__);
> + /*
> +  * VM is currently being shutdown.
> +  * Check to see if the VM process is still
> +  * active.  If not, return VMD_VM_STOP_INVALID.
> +  */
>   vtp.vtp_vm_id = vm_vmid2id(vm->vm_vmid, vm);
> - /* ensure vm_id isn't 0 */
> - if (vtp.vtp_vm_id != 0) {
> - res = terminate_vm(&vtp);
> - vm_remove(vm);
> - } else {
> + if (vtp.vtp_vm_id == 0) {
>   log_debug("%s: no vm running anymore",
>   __func__);
>   res = VMD_VM_STOP_INVALID;
> -- 
> 2.14.2
>

committed thanks! 



Re: [PATCH 2/2] VMD: Handle vm-induced powerdown better

2017-10-24 Thread Mike Larkin
On Tue, Oct 10, 2017 at 03:46:57PM -0700, Carlos Cardenas wrote:
> The VMD parent process didn't handle the case of a VM exiting
> with a non 0 return properly (i.e. EIO).
> 
> diff --git usr.sbin/vmd/vmd.c usr.sbin/vmd/vmd.c
> index f1abc54d9a3..dcff6de0c0e 100644
> --- usr.sbin/vmd/vmd.c
> +++ usr.sbin/vmd/vmd.c
> @@ -394,11 +394,14 @@ vmd_dispatch_vmm(int fd, struct privsep_proc *p, struct 
> imsg *imsg)
>   case IMSG_VMDOP_TERMINATE_VM_EVENT:
>   IMSG_SIZE_CHECK(imsg, &vmr);
>   memcpy(&vmr, imsg->data, sizeof(vmr));
> - log_debug("%s: handling TERMINATE_EVENT for vm id %d",
> - __func__, vmr.vmr_id);
> - if ((vm = vm_getbyvmid(vmr.vmr_id)) == NULL)
> + log_debug("%s: handling TERMINATE_EVENT for vm id %d ret %d",
> + __func__, vmr.vmr_id, vmr.vmr_result);
> + if ((vm = vm_getbyvmid(vmr.vmr_id)) == NULL) {
> + log_debug("%s: vm %d is no longer available",
> + __func__, vmr.vmr_id);
>   break;
> - if (vmr.vmr_result == 0) {
> + }
> + if (vmr.vmr_result != EAGAIN) {
>   if (vm->vm_from_config) {
>   log_debug("%s: about to stop vm id %d",
>   __func__, vm->vm_vmid);
> @@ -408,7 +411,7 @@ vmd_dispatch_vmm(int fd, struct privsep_proc *p, struct 
> imsg *imsg)
>   __func__, vm->vm_vmid);
>   vm_remove(vm);
>   }
> - } else if (vmr.vmr_result == EAGAIN) {
> + } else {
>   /* Stop VM instance but keep the tty open */
>   log_debug("%s: about to stop vm id %d with tty open",
>   __func__, vm->vm_vmid);
> -- 
> 2.14.2
>

It occurred to me that I may have not committed a couple of these. This one's
in. Sorry for the delay, this got buried under a pile of other email.

-ml