Re: pvclock(4)

2018-12-03 Thread johnw

Chris Cappuccio 於 2018-12-04 08:56 寫到:

Reyk Floeter [r...@openbsd.org] wrote:


Yes, KVM???s stable bit is not a reliable indication as it is seems to 
depend on the capabilities of the KVM version and not the actual 
availability of the feature on the particular hardware. How annoying.


As mentioned before: I???d like to disable pvclock for now and I can 
do that in the morning CET if nobody beats me to it.


I have an idea how to deal with old platforms afterwards but this 
needs some more tests and thoughts.




Perhaps the solution is as "simple" as checking the status of the bit
after the presence of the bit is established ?

Index: pvclock.c
===
RCS file: /cvs/src/sys/dev/pv/pvclock.c,v
retrieving revision 1.2
diff -u -p -u -r1.2 pvclock.c
--- pvclock.c   24 Nov 2018 13:12:29 -  1.2
+++ pvclock.c   4 Dec 2018 00:53:56 -
@@ -127,8 +127,10 @@ pvclock_match(struct device *parent, voi
 void
 pvclock_attach(struct device *parent, struct device *self, void *aux)
 {
-   struct pvclock_softc*sc = (struct pvclock_softc *)self;
-   paddr_t  pa;
+   struct pvclock_softc*sc = (struct pvclock_softc *)self;
+   struct pvclock_time_info*ti;
+   paddr_t  pa;
+   uint8_t  flags;

if ((sc->sc_time = km_alloc(PAGE_SIZE,
_any, _zero, _nowait)) == NULL) {
@@ -151,6 +153,13 @@ pvclock_attach(struct device *parent, st

/* Better than HPET but below TSC */
sc->sc_tc->tc_quality = 1500;
+
+   ti = sc->sc_time;
+   flags = ti->ti_flags;
+   if ((flags & PVCLOCK_FLAG_TSC_STABLE) == 0) {
+   printf(": unstable timestamp counter\n");
+   return;
+   }

tc_init(sc->sc_tc);


Hi, your patch work for me, the system boot without manually disable 
pvclock, thanks.


Attached dmesg.

--
Key fingerprint: CDB3 6C62 254B C088 1E5D DD32 182C 97DB CF2C 80AC
OpenBSD 6.4-current (GENERIC.MP) #490: Thu Nov 29 16:00:20 MST 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 788389888 (751MB)
avail mem = 755261440 (720MB)
User Kernel Config
UKC> disable 59
 59 pvclock0 disabled
UKC> quit
Continuing...
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf5b80 (10 entries)
bios0: vendor SeaBIOS version "1.11.1-1" date 04/01/2014
bios0: QEMU Standard PC (i440FX + PIIX, 1996)
acpi0 at bios0: rev 0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP APIC HPET
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz, 118.79 MHz, 06-17-0a
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,SSSE3,CX16,SSE4.1,x2APIC,DEADLINE,XSAVE,HV,NXE,LONG,LAHF,PERF,ARAT,MELTDOWN
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 
16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1039MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz, 322.28 MHz, 06-17-0a
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,SSSE3,CX16,SSE4.1,x2APIC,DEADLINE,XSAVE,HV,NXE,LONG,LAHF,PERF,ARAT,MELTDOWN
cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 
16-way L2 cache
cpu1: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu1: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu1: smt 0, core 0, package 1
ioapic0 at mainbus0: apid 0 pa 0xfec0, version 20, 24 pins
acpihpet0 at acpi0: 1 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
"ACPI0006" at acpi0 not configured
acpipci0 at acpi0 PCI0: _OSC failed
acpicmos0 at acpi0
"PNP0A06" at acpi0 not configured
"PNP0A06" at acpi0 not configured
"PNP0A06" at acpi0 not configured
"QEMU0002" at acpi0 not configured
"ACPI0010" at acpi0 not configured
pvbus0 at mainbus0: KVM
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0 
wired to compatibility, channel 1 wired to compatibility
pciide0: channel 0 disabled (no drives)
atapiscsi0 at pciide0 channel 1 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0:  ATAPI 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
piixpm0 at pci0 dev 1 function 3 "Intel 82371AB 

Re: sed(1): disallow invocation without script

2018-12-03 Thread Ingo Schwarze
Hi Scott,

Scott Cheloha wrote on Mon, Dec 03, 2018 at 08:47:23PM -0600:

> This diff makes it an error to invoke sed(1) without any command
> or script.

I'm not convinced this makes anything better for anybody.
For people who do not currently call sed(1) without giving
a script, it makes no difference.
People who currently do call sed(1) without a script (for
example in some obscure corner of some shell script) may
get into trouble.

> In POSIX.1-2008 the sed(1) synopsis changed from this:
> 
>   sed [-n] script[file...]
>   sed [-n][-e script]...[-f script_file]...[file...]
> 
> to this:
> 
>   sed [-n] script [file...]
>   sed [-n] -e script [-e script]... [-f script_file]... [file...]
>   sed [-n] [-e script]... -f script_file [-f script_file]... [file]
> 
> which makes the empty invocation, i.e.
> 
>   $ sed
> or
> 
>   $ ... | sed | ...
> 
> unambiguously an unrecognized form; you have to specify some sort of
> script, somehow, or the behavior is undefined.

Right.  And undefined behaviour (in POSIX) means we are allowed to
define what we want to do, as an extension.  If some traditional
behaviour exists, it may be reasonable to stick to that, unless
there are real reasons to change it.  Currently, our sed(1) manual
page implicitly already defines our extension: it implicitly says
that sed(1) is a cat(1) by default:

 The sed utility reads the specified files, or the standard input if no
 files are specified, modifying the input as specified by a list of
 commands.  The input is then written to the standard output.
 [...]
 Normally, sed cyclically copies a line of input, not including
 its terminating newline character, into a pattern space, (unless
 there is something left after a D function), applies all of
 the commands with addresses that select that pattern space,
 copies the pattern space to the standard output, appending a
 newline, and deletes the pattern space.

A list can be empty.  When there are no commands, the pattern space
simply gets copied unchanged.

> Yes, the standard was silent regarding the behavior in this case in
> earlier versions, but now that the synopsis doesn't even allow for it
> *and* we claim conformance to the newer standard I think we ought to
> document how our implementation behaves in this case (probably note it
> in STANDARDS) or drop it.  I'd like to drop it.
> 
> What do other implementations do?  FreeBSD/NetBSD/Solaris/illumos sed(1)
> all (like us) descend from the Spinellis rewrite for 4.4BSD and all of
> them have preserved this behavior and left it undocumented.

Seems like a real reason to keep the current behaviour.

> GNU sed and AIX sed error out in this case.

So the GNU looks like the odd wildebeest here.  AIX is not terribly
important, i think.

> Aside from it being undocumented, I'd like to drop this because:
> 
>   1. The empty invocation is not useful.  If you want sed to do
>  nothing you can very easily do so explicitly, i.e.
> 
>   $ sed ""
> 
>  Allowing the empty invocation just makes more room for user
>  error without enhancing the tool.

I consider it logical that, if you can build a list of cammands by
specifying an arbitrary number of -e and an arbitrary number of -f
options, and both numbers can be zero, that both can also be zero
at the same time.

[...]
>   2. Requiring a command is consistent, usage-wise, if you think of
>  sed(1) as a successor to grep(1).

I don't think that is a very good analogy.  It is clear what it means
to apply an empty list of commands: nothing happens.  Is is not
clear what it means to pick lines that match an empty list of patterns.
If anything, "nothing is selected" would seem more logical than
"everything is selected" because how can something match when there
isn't even a pattern?  So the analogy doesn't really work.

>  grep(1) supports matching the null pattern, i.e.
> 
>   $ grep ""

The null pattern is not the same as an empty list of patterns.

It is logical what an empty list of commands does, and it reasonable
that an empty command does the same.

Yours,
  Ingo



sed(1): disallow invocation without script

2018-12-03 Thread Scott Cheloha
Hi,

This diff makes it an error to invoke sed(1) without any command or
script.

In POSIX.1-2008 the sed(1) synopsis changed from this:

  sed [-n] script[file...]
  sed [-n][-e script]...[-f script_file]...[file...]

to this:

  sed [-n] script [file...]
  sed [-n] -e script [-e script]... [-f script_file]... [file...]
  sed [-n] [-e script]... -f script_file [-f script_file]... [file]

which makes the empty invocation, i.e.

$ sed
or

$ ... | sed | ...

unambiguously an unrecognized form; you have to specify some sort of
script, somehow, or the behavior is undefined.

Yes, the standard was silent regarding the behavior in this case in
earlier versions, but now that the synopsis doesn't even allow for it
*and* we claim conformance to the newer standard I think we ought to
document how our implementation behaves in this case (probably note it
in STANDARDS) or drop it.  I'd like to drop it.

What do other implementations do?  FreeBSD/NetBSD/Solaris/illumos sed(1)
all (like us) descend from the Spinellis rewrite for 4.4BSD and all of
them have preserved this behavior and left it undocumented.  GNU sed and
AIX sed error out in this case.

Aside from it being undocumented, I'd like to drop this because:

  1. The empty invocation is not useful.  If you want sed to do
 nothing you can very easily do so explicitly, i.e.

$ sed ""

 Allowing the empty invocation just makes more room for user
 error without enhancing the tool.  I appreciate that there
 *could* be a shell script out there with a variable sed script
 like this:

... | sed $script | ...

 where the author intended for sed(1) to go ahead and do nothing
 in certain cases and that this change would then break that script,
 but on average I imagine that this is more likely to be a "gotcha"
 than a feature.

  2. Requiring a command is consistent, usage-wise, if you think of
 sed(1) as a successor to grep(1).

 grep(1) supports matching the null pattern, i.e.

$ grep ""

 but you have to do so explicitly.  There is no default pattern.

 Both utilities are inspired by ed(1).   grep(1) for g/re/p and sed(1)
 for all the others.

 If grep(1) fails without a pattern I think sed(1) should similarly
 fail without a command.  Just as the omission of a pattern is not
 equivalent to the null pattern, the omission of a command is not
 equivalent to the explicit instruction to do nothing, however similar
 they may appear.

--

Thoughts?  Preferences?

-Scott

Aside: I did my best here with modifying the synopsis in sed.1.  Is
there a better/preferred way to indicate that an optional flag with an
argument can be given one or more times?  That is, to produce sth like:

[-f command_file]...

Index: POSIX
===
RCS file: /cvs/src/usr.bin/sed/POSIX,v
retrieving revision 1.2
diff -u -p -r1.2 POSIX
--- POSIX   26 Jun 1996 05:39:04 -  1.2
+++ POSIX   4 Dec 2018 02:47:15 -
@@ -128,13 +128,7 @@ All uses of "POSIX" refer to section 4.5
does not specify this.  This implementation follows historical
practice.
 
-14.POSIX does not explicitly specify how sed behaves if no script is
-   specified.  Since the sed Synopsis permits this form of the command,
-   and the language in the Description section states that the input
-   is output, it seems reasonable that it behave like the cat(1)
-   command.  Historic sed implementations behave differently for "ls |
-   sed", where they produce no output, and "ls | sed -e#", where they
-   behave like cat.  This implementation behaves like cat in both cases.
+14.Deleted.
 
 15.The POSIX requirement to open all w files at the beginning makes
sed behave nonintuitively when the w commands are preceded by
Index: main.c
===
RCS file: /cvs/src/usr.bin/sed/main.c,v
retrieving revision 1.38
diff -u -p -r1.38 main.c
--- main.c  14 Nov 2018 10:59:33 -  1.38
+++ main.c  4 Dec 2018 02:47:15 -
@@ -102,6 +102,7 @@ u_long linenum;
 static void add_compunit(enum e_cut, char *);
 static void add_file(char *);
 static int next_files_have_lines(void);
+static __dead void usage(void);
 
 int termwidth;
 
@@ -143,11 +144,7 @@ main(int argc, char *argv[])
setvbuf(stdout, NULL, _IOLBF, 0);
break;
default:
-   case '?':
-   (void)fprintf(stderr,
-   "usage: sed [-aEnru] [-i[extension]] command [file 
...]\n"
-   "   sed [-aEnru] [-e command] [-f command_file] 
[-i[extension]] [file ...]\n");
-   exit(1);
+   usage();
}
argc -= optind;
argv += optind;
@@ -174,7 +171,9 @@ main(int argc, char *argv[])

Re: pvclock(4)

2018-12-03 Thread Chris Cappuccio
Reyk Floeter [r...@openbsd.org] wrote:
>
> Yes, KVM???s stable bit is not a reliable indication as it is seems to depend 
> on the capabilities of the KVM version and not the actual availability of the 
> feature on the particular hardware. How annoying.
>
> As mentioned before: I???d like to disable pvclock for now and I can do that 
> in the morning CET if nobody beats me to it.
>
> I have an idea how to deal with old platforms afterwards but this needs some 
> more tests and thoughts.
>

Perhaps the solution is as "simple" as checking the status of the bit
after the presence of the bit is established ?

Index: pvclock.c
===
RCS file: /cvs/src/sys/dev/pv/pvclock.c,v
retrieving revision 1.2
diff -u -p -u -r1.2 pvclock.c
--- pvclock.c   24 Nov 2018 13:12:29 -  1.2
+++ pvclock.c   4 Dec 2018 00:53:56 -
@@ -127,8 +127,10 @@ pvclock_match(struct device *parent, voi
 void
 pvclock_attach(struct device *parent, struct device *self, void *aux)
 {
-   struct pvclock_softc*sc = (struct pvclock_softc *)self;
-   paddr_t  pa;
+   struct pvclock_softc*sc = (struct pvclock_softc *)self;
+   struct pvclock_time_info*ti;
+   paddr_t  pa;
+   uint8_t  flags;
 
if ((sc->sc_time = km_alloc(PAGE_SIZE,
_any, _zero, _nowait)) == NULL) {
@@ -151,6 +153,13 @@ pvclock_attach(struct device *parent, st
 
/* Better than HPET but below TSC */
sc->sc_tc->tc_quality = 1500;
+
+   ti = sc->sc_time;
+   flags = ti->ti_flags;
+   if ((flags & PVCLOCK_FLAG_TSC_STABLE) == 0) {
+   printf(": unstable timestamp counter\n");
+   return;
+   }
 
tc_init(sc->sc_tc);
 



Re: pvclock(4)

2018-12-03 Thread Reyk Floeter


> Am 04.12.2018 um 00:52 schrieb Chris Cappuccio :
> 
> johnw [johnw.m...@gmail.com] wrote:
>> 
>> Hi, after disable pvclock, it can boot with new kernel again, thanks.
> ...
>> cpu0: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz, 105.29 MHz, 06-17-0a
>> cpu0: 
>> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,SSSE3,CX16,SSE4.1,x2APIC,DEADLINE,XSAVE,HV,NXE,LONG,LAHF,PERF,ARAT,MELTDOWN
> ...
> 
> This CPU clearly doesn't have the invariant TSC (it is required according
> to Reyk Floeter's tech@ posting from Nov 19th.) So, pvclock does not handle
> this situation properly, apparently checking for KVM_FEATURE_CLOCKSOURCE2
> and KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is not enough. 

Yes, KVM’s stable bit is not a reliable indication as it is seems to depend on 
the capabilities of the KVM version and not the actual availability of the 
feature on the particular hardware. How annoying.

As mentioned before: I‘d like to disable pvclock for now and I can do that in 
the morning CET if nobody beats me to it.

I have an idea how to deal with old platforms afterwards but this needs some 
more tests and thoughts.

Reyk



Re: pvclock(4)

2018-12-03 Thread Chris Cappuccio
johnw [johnw.m...@gmail.com] wrote:
> 
> Hi, after disable pvclock, it can boot with new kernel again, thanks.
...
> cpu0: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz, 105.29 MHz, 06-17-0a
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,SSSE3,CX16,SSE4.1,x2APIC,DEADLINE,XSAVE,HV,NXE,LONG,LAHF,PERF,ARAT,MELTDOWN
...

This CPU clearly doesn't have the invariant TSC (it is required according
to Reyk Floeter's tech@ posting from Nov 19th.) So, pvclock does not handle
this situation properly, apparently checking for KVM_FEATURE_CLOCKSOURCE2
and KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is not enough. 

Chris



Re: bgpd, network add broken with rdomains ?

2018-12-03 Thread Denis Fondras
On Mon, Dec 03, 2018 at 05:59:26PM +0100, Julien Dhaille wrote:
> Hi. I am using bgpd within a rdomain (1).
> After the upgrade to 6.4 stable, I can’t announce prefixes anymore via
> bgpctl :
> 
> router# ps aux -o rtable|grep bgp
> 
> root  4039  0.0  0.1   300  1292 p0  S+p    5:12PM    0:00.00 grep
> bgp  0
> root 68170  0.0  0.2  1056  2060 p2  I+ 4:52PM    0:00.01 bgpd
> -dvv 1
> _bgpd    80238  0.0  0.4  4160  4264 p2  I+p    4:52PM    0:00.01 bgpd:
> route deci        1
> _bgpd    26255  0.0  0.2  1456  2164 p2  S+p    4:52PM    0:00.04 bgpd:
> session en  1
> 
> router# route -T1 exec bgpctl network add 10.0.0.1/32 nexthop 10.0.0.20
> prepend-self 11
> or
> router# route -T1 exec bgpctl -s /var/run/bgpd.sock.1 network add
> 10.0.0.1/32 nexthop 10.0.0.20 prepend-self 11
> 
> results in :
> 
> network_add: prefix 10.0.0.1/32 in non-existing rdomain 0
> 
> Am I missing a change or something ?
> 

rde.c,v1.389 from Jul 10, 2018 introduced this "regression".

Can you try this diff :

Index: bgpctl.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.c,v
retrieving revision 1.224
diff -u -p -r1.224 bgpctl.c
--- bgpctl.c28 Nov 2018 08:33:59 -  1.224
+++ bgpctl.c3 Dec 2018 20:16:45 -
@@ -345,6 +345,7 @@ main(int argc, char *argv[])
bzero(, sizeof(net));
net.prefix = res->addr;
net.prefixlen = res->prefixlen;
+   net.rtableid = r;
/* attribute sets are not supported */
if (res->action == NETWORK_ADD) {
imsg_compose(ibuf, IMSG_NETWORK_ADD, 0, 0, -1,



Re: bgpd, network add broken with rdomains ?

2018-12-03 Thread Denis Fondras
On Mon, Dec 03, 2018 at 09:19:10PM +0100, Denis Fondras wrote:
> On Mon, Dec 03, 2018 at 05:59:26PM +0100, Julien Dhaille wrote:
> > Hi. I am using bgpd within a rdomain (1).
> > After the upgrade to 6.4 stable, I can’t announce prefixes anymore via
> > bgpctl :
> > 
> > router# ps aux -o rtable|grep bgp
> > 
> > root  4039  0.0  0.1   300  1292 p0  S+p    5:12PM    0:00.00 grep
> > bgp  0
> > root 68170  0.0  0.2  1056  2060 p2  I+ 4:52PM    0:00.01 bgpd
> > -dvv 1
> > _bgpd    80238  0.0  0.4  4160  4264 p2  I+p    4:52PM    0:00.01 bgpd:
> > route deci        1
> > _bgpd    26255  0.0  0.2  1456  2164 p2  S+p    4:52PM    0:00.04 bgpd:
> > session en  1
> > 
> > router# route -T1 exec bgpctl network add 10.0.0.1/32 nexthop 10.0.0.20
> > prepend-self 11
> > or
> > router# route -T1 exec bgpctl -s /var/run/bgpd.sock.1 network add
> > 10.0.0.1/32 nexthop 10.0.0.20 prepend-self 11
> > 
> > results in :
> > 
> > network_add: prefix 10.0.0.1/32 in non-existing rdomain 0
> > 
> > Am I missing a change or something ?
> > 
> 
> rde.c,v1.389 from Jul 10, 2018 introduced this "regression".
> 
> Can you try this diff :
> 

Well, a bit too fast...

Index: bgpctl.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.c,v
retrieving revision 1.224
diff -u -p -r1.224 bgpctl.c
--- bgpctl.c28 Nov 2018 08:33:59 -  1.224
+++ bgpctl.c3 Dec 2018 20:24:41 -
@@ -101,6 +101,7 @@ const char  *print_auth_method(enum auth_
 struct imsgbuf *ibuf;
 struct mrt_parser show_mrt = { show_mrt_dump, show_mrt_state, show_mrt_msg };
 struct mrt_parser net_mrt = { network_mrt_dump, NULL, NULL };
+int tableid;
 
 __dead void
 usage(void)
@@ -116,7 +117,7 @@ int
 main(int argc, char *argv[])
 {
struct sockaddr_un   sun;
-   int  fd, n, done, ch, nodescr = 0, verbose = 0, r;
+   int  fd, n, done, ch, nodescr = 0, verbose = 0;
struct imsg  imsg;
struct network_confignet;
struct parse_result *res;
@@ -128,8 +129,8 @@ main(int argc, char *argv[])
if (pledge("stdio rpath wpath cpath unix inet dns", NULL) == -1)
err(1, "pledge");
 
-   r = getrtable();
-   if (asprintf(, "%s.%d", SOCKET_NAME, r) == -1)
+   tableid = getrtable();
+   if (asprintf(, "%s.%d", SOCKET_NAME, tableid) == -1)
err(1, "asprintf");
 
while ((ch = getopt(argc, argv, "ns:")) != -1) {
@@ -345,6 +346,7 @@ main(int argc, char *argv[])
bzero(, sizeof(net));
net.prefix = res->addr;
net.prefixlen = res->prefixlen;
+   net.rtableid = tableid;
/* attribute sets are not supported */
if (res->action == NETWORK_ADD) {
imsg_compose(ibuf, IMSG_NETWORK_ADD, 0, 0, -1,
@@ -1981,6 +1983,7 @@ network_bulk(struct parse_result *res)
errx(1, "bad prefix: %s", b);
net.prefix = h;
net.prefixlen = len;
+   net.rtableid = tableid;
 
if (res->action == NETWORK_BULK_ADD) {
imsg_compose(ibuf, IMSG_NETWORK_ADD,





Re: Patch to support AR816x/AR817x chipsets in alc(4)

2018-12-03 Thread Genadijus Paleckis
Ping?

Could anyone give more feedback on this one?

Im running this patch for around 2 minths without issues on my laptop
with E2400 chipset.

I understand that this is quite big change and also it might be that
no one has supported card for this driver. Or perhaps because of the
driver being for unawanted vendor (Atheros) Is there any other reason
for this not to be added to the driver?

I'd also appreciate any feedback if I messed that up.

Thanks

On Wed, Nov 14, 2018 at 09:20:38PM +, Genadijus Paleckis wrote:
> Hi all,
> 
> I'm piggybacking on post from Jason Hunt
> https://marc.info/?l=openbsd-tech=146881042926430=2
> 
> Please find attached diff for alc(4) synced with FreeBSD alc(4).
> 
> In addition to original diff this one adds support to E2200, E2400
> and E2500 cards made by Attansic. I am running it for more than a month
> with E2400 model I have in my laptop without any issues.
> 
> I sent patch to kevlo@ who was very helpful with the feedback. I made all
> requested modifications.
> 
> If that's OK would be great to have it merged.
> 
> 
> $ dmesg | grep alc
> alc0 at pci5 dev 0 function 0 "Attansic Technology Killer E2400 Gigabit
> Ethernet" rev 0x10: msi, address d4:81:d7:8a:01:27
> atphy0 at alc0 phy 0: AR8035 10/100/1000 PHY, rev. 9
> 
> 
> Index: sys/dev/pci/if_alc.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_alc.c,v
> retrieving revision 1.42
> diff -u -p -r1.42 if_alc.c
> --- sys/dev/pci/if_alc.c  8 Sep 2017 05:36:52 -   1.42
> +++ sys/dev/pci/if_alc.c  14 Nov 2018 21:00:34 -
> @@ -26,7 +26,7 @@
>   * SUCH DAMAGE.
>   */
>  
> -/* Driver for Atheros AR8131/AR8132 PCIe Ethernet. */
> +/* Driver for Atheros AR813x/AR815x/AR816x/AR817x PCIe Ethernet. */
>  
>  #include "bpfilter.h"
>  #include "vlan.h"
> @@ -76,12 +76,17 @@ void  alc_watchdog(struct ifnet *);
>  int  alc_mediachange(struct ifnet *);
>  void alc_mediastatus(struct ifnet *, struct ifmediareq *);
>  
> -void alc_aspm(struct alc_softc *, uint64_t);
> +void alc_aspm(struct alc_softc *, int, uint64_t);
> +void alc_aspm_813x(struct alc_softc *, uint64_t);
> +void alc_aspm_816x(struct alc_softc *, int);
>  void alc_disable_l0s_l1(struct alc_softc *);
>  int  alc_dma_alloc(struct alc_softc *);
>  void alc_dma_free(struct alc_softc *);
>  int  alc_encap(struct alc_softc *, struct mbuf *);
>  void alc_get_macaddr(struct alc_softc *);
> +void alc_get_macaddr_813x(struct alc_softc *);
> +void alc_get_macaddr_816x(struct alc_softc *);
> +void alc_get_macaddr_par(struct alc_softc *);
>  void alc_init_cmb(struct alc_softc *);
>  void alc_init_rr_ring(struct alc_softc *);
>  int  alc_init_rx_ring(struct alc_softc *);
> @@ -89,9 +94,23 @@ void   alc_init_smb(struct alc_softc *);
>  void alc_init_tx_ring(struct alc_softc *);
>  int  alc_intr(void *);
>  void alc_mac_config(struct alc_softc *);
> +uint32_t alc_mii_readreg_813x(struct alc_softc *, int, int);
> +uint32_t alc_mii_readreg_816x(struct alc_softc *, int, int);
> +uint32_t alc_mii_writereg_813x(struct alc_softc *, int, int, int);
> +uint32_t alc_mii_writereg_816x(struct alc_softc *, int, int, int);
> +void alc_dsp_fixup(struct alc_softc *, int);
>  int  alc_miibus_readreg(struct device *, int, int);
>  void alc_miibus_statchg(struct device *);
> +int  alc_miibus_writeregr(struct device *, int, int, int);
>  void alc_miibus_writereg(struct device *, int, int, int);
> +uint32_t alc_miidbg_readreg(struct alc_softc *, int);
> +uint32_t alc_miidbg_writereg(struct alc_softc *, int, int);
> +uint32_t alc_miiext_readreg(struct alc_softc *, int, int);
> +uint32_t alc_miiext_writereg(struct alc_softc *, int, int, int);
> +void alc_phy_reset_813x(struct alc_softc *);
> +void alc_phy_reset_816x(struct alc_softc *);
> +void alc_setwol_813x(struct alc_softc *);
> +void alc_setwol_816x(struct alc_softc *);
>  int  alc_newbuf(struct alc_softc *, struct alc_rxdesc *);
>  void alc_phy_down(struct alc_softc *);
>  void alc_phy_reset(struct alc_softc *);
> @@ -108,6 +127,12 @@ void alc_stop_mac(struct alc_softc *);
>  void alc_stop_queue(struct alc_softc *);
>  void alc_tick(void *);
>  void alc_txeof(struct alc_softc *);
> +void alc_init_pcie(struct alc_softc *, int);
> +void alc_config_msi(struct alc_softc *);
> +int  alc_dma_alloc(struct alc_softc *);
> +void alc_dma_free(struct alc_softc *);
> +int  alc_encap(struct alc_softc *, struct mbuf *);
> +void alc_osc_reset(struct alc_softc *);
>  
>  uint32_t alc_dma_burst[] = { 128, 256, 512, 1024, 2048, 4096, 0 };
>  
> @@ -117,11 +142,18 @@ const struct pci_matchid alc_devices[] =
>   { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_L1D },
>   { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_L1D_1 },
>   { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_L2C_1 },
> - { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_L2C_2 }
> + { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_L2C_2 },
> + { PCI_VENDOR_ATTANSIC, 

Re: vmctl wait

2018-12-03 Thread Carlos Cardenas
On Mon, Dec 03, 2018 at 06:22:14PM +0100, Claudio Jeker wrote:
> This adds a feature to vmctl/vmd to wait for a VM to stop.
> It is a feature usable in many situation where you wait for a VM to halt
> after work is done. This is more or less vmctl stop  -w without
> sending the termination to the VM.
> 
> There is only one vmctl that can wait so if a second one comes in the
> previous one will terminate with an error. This is why I modified the
> error path a bit in vmctl.
> 
> -- 
> :wq Claudio

Ok ccardenas@

+--+
Carlos



Re: bgpd optimize filter rules

2018-12-03 Thread Job Snijders
On Mon, Dec 03, 2018 at 12:14:13PM +0100, Claudio Jeker wrote:
> There is a trivial optimization that bgpd can do when loading the filter
> ruleset. If the rule is the same as the previous rule than the filterset
> can be merged.  e.g.
> 
> match from ebgp set community delete $myAS:*
> match from ebgp set community $myAS:15
> match from ebgp set med 100
> 
> Will be optimized into:
> 
> match from ebgp set { metric 100 community delete $myAS:* community 
> $myAS:15 }
> 
> The following diff is doing this and saves around 5% of the rules in
> arouteserver configs and probably similar amount in other peoples config.

OK job@



vmctl wait

2018-12-03 Thread Claudio Jeker
This adds a feature to vmctl/vmd to wait for a VM to stop.
It is a feature usable in many situation where you wait for a VM to halt
after work is done. This is more or less vmctl stop  -w without
sending the termination to the VM.

There is only one vmctl that can wait so if a second one comes in the
previous one will terminate with an error. This is why I modified the
error path a bit in vmctl.

-- 
:wq Claudio


Index: vmctl/main.c
===
RCS file: /cvs/src/usr.sbin/vmctl/main.c,v
retrieving revision 1.48
diff -u -p -r1.48 main.c
--- vmctl/main.c26 Nov 2018 10:39:30 -  1.48
+++ vmctl/main.c3 Dec 2018 17:11:08 -
@@ -63,6 +63,7 @@ intctl_reset(struct parse_result *, i
 int ctl_start(struct parse_result *, int, char *[]);
 int ctl_status(struct parse_result *, int, char *[]);
 int ctl_stop(struct parse_result *, int, char *[]);
+int ctl_waitfor(struct parse_result *, int, char *[]);
 int ctl_pause(struct parse_result *, int, char *[]);
 int ctl_unpause(struct parse_result *, int, char *[]);
 int ctl_send(struct parse_result *, int, char *[]);
@@ -82,6 +83,7 @@ struct ctl_command ctl_commands[] = {
"\t\t[-n switch] [-i count] [-d disk]* [-t name]" },
{ "status", CMD_STATUS, ctl_status, "[id]" },
{ "stop",   CMD_STOP,   ctl_stop,   "[id|-a] [-fw]" },
+   { "wait",   CMD_WAITFOR,ctl_waitfor,"id" },
{ "pause",  CMD_PAUSE,  ctl_pause,  "id" },
{ "unpause",CMD_UNPAUSE,ctl_unpause,"id" },
{ "send",   CMD_SEND,   ctl_send,   "id",   1},
@@ -178,7 +180,7 @@ parse(int argc, char *argv[])
err(1, "pledge");
}
if (ctl->main(, argc, argv) != 0)
-   err(1, "failed");
+   exit(1);
 
if (ctl_sock != -1) {
close(ibuf->fd);
@@ -251,6 +253,9 @@ vmmaction(struct parse_result *res)
imsg_compose(ibuf, IMSG_CTL_RESET, 0, 0, -1,
>mode, sizeof(res->mode));
break;
+   case CMD_WAITFOR:
+   waitfor_vm(res->id, res->name);
+   break;
case CMD_PAUSE:
pause_vm(res->id, res->name);
break;
@@ -310,6 +315,9 @@ vmmaction(struct parse_result *res)
done = vm_start_complete(, ,
tty_autoconnect);
break;
+   case CMD_WAITFOR:
+   flags = VMOP_WAIT;
+   /* FALLTHROUGH */
case CMD_STOP:
done = terminate_vm_complete(, ,
flags);
@@ -337,7 +345,10 @@ vmmaction(struct parse_result *res)
}
}
 
-   return (0);
+   if (ret)
+   return (1);
+   else
+   return (0);
 }
 
 void
@@ -941,6 +952,18 @@ ctl_stop(struct parse_result *res, int a
 
 int
 ctl_console(struct parse_result *res, int argc, char *argv[])
+{
+   if (argc == 2) {
+   if (parse_vmid(res, argv[1], 0) == -1)
+   errx(1, "invalid id: %s", argv[1]);
+   } else if (argc != 2)
+   ctl_usage(res->ctl);
+
+   return (vmmaction(res));
+}
+
+int
+ctl_waitfor(struct parse_result *res, int argc, char *argv[])
 {
if (argc == 2) {
if (parse_vmid(res, argv[1], 0) == -1)
Index: vmctl/vmctl.8
===
RCS file: /cvs/src/usr.sbin/vmctl/vmctl.8,v
retrieving revision 1.54
diff -u -p -r1.54 vmctl.8
--- vmctl/vmctl.8   20 Nov 2018 12:48:16 -  1.54
+++ vmctl/vmctl.8   3 Dec 2018 17:11:08 -
@@ -234,6 +234,8 @@ Stop all running VMs.
 .It Cm unpause Ar id
 Unpause (resume from a paused state) a VM with the specified
 .Ar id .
+.It Cm wait Ar id
+Wait until the specified VM has stopped.
 .El
 .Pp
 If the
Index: vmctl/vmctl.c
===
RCS file: /cvs/src/usr.sbin/vmctl/vmctl.c,v
retrieving revision 1.63
diff -u -p -r1.63 vmctl.c
--- vmctl/vmctl.c   26 Nov 2018 10:39:30 -  1.63
+++ vmctl/vmctl.c   3 Dec 2018 17:11:08 -
@@ -496,6 +496,10 @@ terminate_vm_complete(struct imsg *imsg,
fprintf(stderr, "vm not found\n");
*ret = EIO;
break;
+   case EINTR:
+   fprintf(stderr, "interrupted call\n");
+   *ret = EIO;
+   break;
default:
errno = res;
fprintf(stderr, "failed: %s\n",

Re: More MH_ALIGN conversions

2018-12-03 Thread Alexander Bluhm
On Mon, Dec 03, 2018 at 11:57:34AM +0100, Claudio Jeker wrote:
> Next round of conversions. Additionally to converting MH_ALIGN() to
> m_align() this also switches m_gethdr/M_GETHDR calls to m_get/M_GET calls
> because M_MOVE_PKTHDR() is initialising the pkthdr and so there is no need
> to do that when allocating the mbuf.
> 
> OK?

OK bluhm@

> Index: net/if_gre.c
> ===
> RCS file: /cvs/src/sys/net/if_gre.c,v
> retrieving revision 1.140
> diff -u -p -r1.140 if_gre.c
> --- net/if_gre.c  29 Nov 2018 00:14:29 -  1.140
> +++ net/if_gre.c  30 Nov 2018 09:48:19 -
> @@ -1939,7 +1939,8 @@ egre_start(struct ifnet *ifp)
>   bpf_mtap_ether(if_bpf, m0, BPF_DIRECTION_OUT);
>  #endif
>  
> - m = m_gethdr(M_DONTWAIT, m0->m_type);
> + /* force prepend mbuf because of alignment problems */
> + m = m_get(M_DONTWAIT, m0->m_type);
>   if (m == NULL) {
>   m_freem(m0);
>   continue;
> @@ -1948,7 +1949,7 @@ egre_start(struct ifnet *ifp)
>   M_MOVE_PKTHDR(m, m0);
>   m->m_next = m0;
>  
> - MH_ALIGN(m, 0);
> + m_align(m, 0);
>   m->m_len = 0;
>  
>   m = gre_encap(>sc_tunnel, m, htons(ETHERTYPE_TRANSETHER),
> @@ -3757,7 +3758,8 @@ nvgre_start(struct ifnet *ifp)
>   rw_exit_read(>sc_ether_lock);
>   }
>  
> - m = m_gethdr(M_DONTWAIT, m0->m_type);
> + /* force prepend mbuf because of alignment problems */
> + m = m_get(M_DONTWAIT, m0->m_type);
>   if (m == NULL) {
>   m_freem(m0);
>   continue;
> @@ -3766,7 +3768,7 @@ nvgre_start(struct ifnet *ifp)
>   M_MOVE_PKTHDR(m, m0);
>   m->m_next = m0;
>  
> - MH_ALIGN(m, 0);
> + m_align(m, 0);
>   m->m_len = 0;
>  
>   m = gre_encap_dst(tunnel, , m,
> @@ -3932,7 +3934,8 @@ eoip_start(struct ifnet *ifp)
>   bpf_mtap_ether(if_bpf, m0, BPF_DIRECTION_OUT);
>  #endif
>  
> - m = m_gethdr(M_DONTWAIT, m0->m_type);
> + /* force prepend mbuf because of alignment problems */
> + m = m_get(M_DONTWAIT, m0->m_type);
>   if (m == NULL) {
>   m_freem(m0);
>   continue;
> @@ -3941,7 +3944,7 @@ eoip_start(struct ifnet *ifp)
>   M_MOVE_PKTHDR(m, m0);
>   m->m_next = m0;
>  
> - MH_ALIGN(m, 0);
> + m_align(m, 0);
>   m->m_len = 0;
>  
>   m = eoip_encap(sc, m, gre_l2_tos(>sc_tunnel, m));
> Index: net/if_vxlan.c
> ===
> RCS file: /cvs/src/sys/net/if_vxlan.c,v
> retrieving revision 1.69
> diff -u -p -r1.69 if_vxlan.c
> --- net/if_vxlan.c15 Nov 2018 22:22:03 -  1.69
> +++ net/if_vxlan.c30 Nov 2018 09:48:19 -
> @@ -867,8 +867,8 @@ vxlan_output(struct ifnet *ifp, struct m
>   uint32_t tag;
>   struct mbuf *m0;
>  
> - /* VXLAN header */
> - MGETHDR(m0, M_DONTWAIT, m->m_type);
> + /* VXLAN header, needs new mbuf because of alignment issues */
> + MGET(m0, M_DONTWAIT, m->m_type);
>   if (m0 == NULL) {
>   ifp->if_oerrors++;
>   return (ENOBUFS);
> @@ -876,7 +876,7 @@ vxlan_output(struct ifnet *ifp, struct m
>   M_MOVE_PKTHDR(m0, m);
>   m0->m_next = m;
>   m = m0;
> - MH_ALIGN(m, sizeof(*vu));
> + m_align(m, sizeof(*vu));
>   m->m_len = sizeof(*vu);
>   m->m_pkthdr.len += sizeof(*vu);
>  
> Index: netinet6/ip6_output.c
> ===
> RCS file: /cvs/src/sys/netinet6/ip6_output.c,v
> retrieving revision 1.240
> diff -u -p -r1.240 ip6_output.c
> --- netinet6/ip6_output.c 9 Nov 2018 14:14:32 -   1.240
> +++ netinet6/ip6_output.c 30 Nov 2018 09:48:19 -
> @@ -2571,13 +2571,13 @@ ip6_splithdr(struct mbuf *m, struct ip6_
>  
>   ip6 = mtod(m, struct ip6_hdr *);
>   if (m->m_len > sizeof(*ip6)) {
> - MGETHDR(mh, M_DONTWAIT, MT_HEADER);
> + MGET(mh, M_DONTWAIT, MT_HEADER);
>   if (mh == NULL) {
>   m_freem(m);
>   return ENOBUFS;
>   }
>   M_MOVE_PKTHDR(mh, m);
> - MH_ALIGN(mh, sizeof(*ip6));
> + m_align(mh, sizeof(*ip6));
>   m->m_len -= sizeof(*ip6);
>   m->m_data += sizeof(*ip6);
>   mh->m_next = m;



bgpd, network add broken with rdomains ?

2018-12-03 Thread Julien Dhaille
Hi. I am using bgpd within a rdomain (1).
After the upgrade to 6.4 stable, I can’t announce prefixes anymore via
bgpctl :

router# ps aux -o rtable|grep bgp

root  4039  0.0  0.1   300  1292 p0  S+p    5:12PM    0:00.00 grep
bgp  0
root 68170  0.0  0.2  1056  2060 p2  I+ 4:52PM    0:00.01 bgpd
-dvv 1
_bgpd    80238  0.0  0.4  4160  4264 p2  I+p    4:52PM    0:00.01 bgpd:
route deci        1
_bgpd    26255  0.0  0.2  1456  2164 p2  S+p    4:52PM    0:00.04 bgpd:
session en  1

router# route -T1 exec bgpctl network add 10.0.0.1/32 nexthop 10.0.0.20
prepend-self 11
or
router# route -T1 exec bgpctl -s /var/run/bgpd.sock.1 network add
10.0.0.1/32 nexthop 10.0.0.20 prepend-self 11

results in :

network_add: prefix 10.0.0.1/32 in non-existing rdomain 0

Am I missing a change or something ?

thanks



Re: usbd_delay_ms() and splusb(), in axen(4)

2018-12-03 Thread Nils Frohberg
On Mon, Dec 03, 2018 at 01:44:38PM +0100, Mark Kettenis wrote:
> > Date: Mon, 3 Dec 2018 12:51:32 +0100
> > From: Nils Frohberg 
> >
> > Hi,
> >
> > is it safe to call usbd_delay_ms() from under splusb()? In axen(4),
> > axen_rxeof() is called from usb_transfer_complete() which runs at
> > splusb() (called via usbd_transfer()).
>
> It is perfectly fine to call usbd_delay_ms() from within code
> protected by splusb().  It is not ok to call this code from interrupt
> context though.
>

Thanks for the reply. Maybe I didn't dig deep enough yet.

> > The following patch prevents a panic on my system.
>
> What panic?  And what is the ddb backtrace you see?

Unfortunately the box locks up and I have to cycle its power to
resume. The following is OCRed/hand copied.

axen0: axen_rxeof: hdr_offset (13518) > total_len (12288)
splassert: assertwaitok: want 0 have 5
Starting stack trace...
assertwaitok() at assertwaitok+0x40
mi_switch() at mi_switch+0x4c
sleep_finish(7f46a7a3c4f58676,800031797fb0) at sleep_finish+0x7f
sleep_finish_all(35d5abeefe2b03b6,800031797fb0) at sleep_finish_all+0x1f
tsleep(966025df8a2b3822,34ce,80a8c000,ff0079fed000) at tsleep+0xcd
axen_rxeof(a32d84d5ed79e3d7,ff027e480878,80e33000) at 
axen_rxeof+0x368
usb_transfer_complete(7f61ed570abf387f) at usb_transfer_complete+0x1ff
xhci_event_dequeue(1) at xhci_event_dequeue+0xfd
xhci_softintr(27dca8a8be38a839) at xhci_softintr+0x30
softintr_dispatch(e606342f3f0cdd) at softintr_dispatch+0xfc
Xsoftnet(4,8193c2b0,4,18041969,0,d) at Xsoftnet+0x1f
Xspllower(1aae5abe56c575fd,81d35ab8,814cb23f,81d35aa8,81d35aa0,826bd2fe661a86e9)
 at XSpllower+0x19
softintr_dispatch(e606342f3f002d) at softintr_dispatch+0xed
Xsoftclock(1aae5abe56b35319,8109cff0,814cb23f,0,8000f9c8,826bd2fe661a86e9)
 at XSoftclock+0x1f
sched_idle(0) at sched_idle+0x1b9
end trace frame: 0x0, count: 242
End of stack trace.
panic: kernel diagnostic assertion ”p->p_wchan == NULL" failed: file 
"/usr/src/sys/kern/kern_sched.c", line 338
Stopped at  db_enter+0x12:  popq%r11
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
db_enter() at db_enter+0x12
panic() at panic+0x120
assert(816da0d4,800031797ea0,8109cff0,8000f9c8) at 
__assert+0x24
sched_chooseproc() at sched_chooseproc+0x241
mi_switch() at mi_switch+0x1b4
sleep_finish(7f46a7a3c4f58676,800031797fb0) at sleep_finish+0x7f
sleep_finish_all(35d5abeefe2b03b6,800031797fb0) at sleep_finish_all+0x1f
tsleep(966025df8a2b3822,34ce,80a8,ff0079fed000) at tsleep+0xcd
axen_rxeof(a32d84d5ed79e3d7,ff027e480878,80e33000) at axen_rxeof+0x 
368
usb_transfer_complete(7f61ed570abf387f) at usb_transfer_complete+0x1ff
xhci_event_dequeue(1) at xhci_event_dequeue+0xfd
xhci_softintr(27dca8a8be38a839) at xhci_softintr+0x30
softintr_dispatch(e606342f3f00dd) at softintr_dispatch+0xfc
Xsoftnet(4,8193c2b0,4,18041969,0,d) at Xsoftnet+0x1f
end trace frame: 0x800031798280, count: 0
https://www.openbsd.org/ddb.htm1 describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb{0}>

>
> > Index: sys/dev/usb/if_axen.c
> > ===
> > RCS file: /cvs/src/sys/dev/usb/if_axen.c,v
> > retrieving revision 1.25
> > diff -u -r1.25 if_axen.c
> > --- sys/dev/usb/if_axen.c   12 Jun 2018 06:59:27 -  1.25
> > +++ sys/dev/usb/if_axen.c   30 Nov 2018 14:47:49 -
> > @@ -937,7 +937,6 @@
> > /* sanity check */
> > if (hdr_offset > total_len) {
> > ifp->if_ierrors++;
> > -   usbd_delay_ms(sc->axen_udev, 100);
> > goto done;
> > }
> >
> >
> >
>



Re: usbd_delay_ms() and splusb(), in axen(4)

2018-12-03 Thread Mark Kettenis
> Date: Mon, 3 Dec 2018 12:51:32 +0100
> From: Nils Frohberg 
> 
> Hi,
> 
> is it safe to call usbd_delay_ms() from under splusb()? In axen(4),
> axen_rxeof() is called from usb_transfer_complete() which runs at
> splusb() (called via usbd_transfer()).

It is perfectly fine to call usbd_delay_ms() from within code
protected by splusb().  It is not ok to call this code from interrupt
context though.

> The following patch prevents a panic on my system.

What panic?  And what is the ddb backtrace you see?

> Index: sys/dev/usb/if_axen.c
> ===
> RCS file: /cvs/src/sys/dev/usb/if_axen.c,v
> retrieving revision 1.25
> diff -u -r1.25 if_axen.c
> --- sys/dev/usb/if_axen.c 12 Jun 2018 06:59:27 -  1.25
> +++ sys/dev/usb/if_axen.c 30 Nov 2018 14:47:49 -
> @@ -937,7 +937,6 @@
>   /* sanity check */
>   if (hdr_offset > total_len) {
>   ifp->if_ierrors++;
> - usbd_delay_ms(sc->axen_udev, 100);
>   goto done;
>   }
>  
> 
> 



fix #define in axen(4)

2018-12-03 Thread Nils Frohberg
Hi,

I'm currently investigating crashes that occur when using an ASIX
88179 USB 3.0 adapter. I'm comparing axen(4) with FreeBSD and Linux
drivers. There is a bigger patch to clean up axen(4) in the pipeline,
but there's still a bit of work to be done.

The following patch was sent by sc.dying to tech@ in 2017, but was
never comitted.

It fixes missing spaces in the diagram to align bits in receive
header and sets the L3 type mask to the correct value.

AXEN_RXHDR_L3_TYPE_MASK is currently not used, but I will want to
check it in the future.

- source: sc.dying's patch
  https://marc.info/?l=openbsd-tech=150012119607284=4

- This matches FreeBSD's #define, the diagram (L3_type, bits 5 and
  6 of receive header), and Linux
  
https://github.com/freebsd/freebsd/blob/5385d2813017f54e4091cfc28d89974abc69beeb/sys/dev/usb/net/if_axgereg.h#L184
  
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/usb/ax88179_178a.c

- L3/4_err must have slipped by 1 bit in diagram on yuo@' commit
  when fixing drop/mii/crc_err (FreeBSD's driver shows the same bit
  masks)
  https://github.com/openbsd/src/commit/29ef1ec31c0315fbf760dd223695935daf23

- L3 type mask now also matches the diagram and L3 type offset
  value

Index: sys/dev/usb/if_axenreg.h
===
RCS file: /cvs/src/sys/dev/usb/if_axenreg.h,v
retrieving revision 1.6
diff -u -r1.6 if_axenreg.h
--- sys/dev/usb/if_axenreg.h14 Sep 2016 12:41:09 -  1.6
+++ sys/dev/usb/if_axenreg.h27 Nov 2018 13:11:54 -
@@ -26,8 +26,8 @@
  * || ++-L3_type (1:ipv4, 0/2:ipv6)
  *pkt_len(13)  || ||+ ++-L4_type(0: icmp, 1: UDP, 4: TCP)
  * |765|43210 76543210|7654 3210 7654 3210|
- *  ||+-crc_err  |+-L4_err |+-L4_CSUM_ERR
- *  |+-mii_err   +--L3_err +--L3_CSUM_ERR
+ *  ||+-crc_err   |+-L4_err |+-L4_CSUM_ERR
+ *  |+-mii_err+--L3_err +--L3_CSUM_ERR
  *  +-drop_err
  *
  * ex) pkt_hdr 0x00680820
@@ -70,7 +70,7 @@
 #define   AXEN_RXHDR_L4_TYPE_TCP   0x4
 
 /* L3 packet type (2bit) */
-#define AXEN_RXHDR_L3_TYPE_MASK0x0600
+#define AXEN_RXHDR_L3_TYPE_MASK0x0060
 #define AXEN_RXHDR_L3_TYPE_OFFSET  5
 #define   AXEN_RXHDR_L3_TYPE_UNDEF 0x0
 #define   AXEN_RXHDR_L3_TYPE_IPV4  0x1



usbd_delay_ms() and splusb(), in axen(4)

2018-12-03 Thread Nils Frohberg
Hi,

is it safe to call usbd_delay_ms() from under splusb()? In axen(4),
axen_rxeof() is called from usb_transfer_complete() which runs at
splusb() (called via usbd_transfer()).

The following patch prevents a panic on my system.

Index: sys/dev/usb/if_axen.c
===
RCS file: /cvs/src/sys/dev/usb/if_axen.c,v
retrieving revision 1.25
diff -u -r1.25 if_axen.c
--- sys/dev/usb/if_axen.c   12 Jun 2018 06:59:27 -  1.25
+++ sys/dev/usb/if_axen.c   30 Nov 2018 14:47:49 -
@@ -937,7 +937,6 @@
/* sanity check */
if (hdr_offset > total_len) {
ifp->if_ierrors++;
-   usbd_delay_ms(sc->axen_udev, 100);
goto done;
}
 



bgpd optimize filter rules

2018-12-03 Thread Claudio Jeker
There is a trivial optimization that bgpd can do when loading the filter
ruleset. If the rule is the same as the previous rule than the filterset
can be merged.  e.g.

match from ebgp set community delete $myAS:*
match from ebgp set community $myAS:15
match from ebgp set med 100

Will be optimized into:

match from ebgp set { metric 100 community delete $myAS:* community 
$myAS:15 }

The following diff is doing this and saves around 5% of the rules in
arouteserver configs and probably similar amount in other peoples config.
-- 
:wq Claudio

Index: bgpd.h
===
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.h,v
retrieving revision 1.355
diff -u -p -r1.355 bgpd.h
--- bgpd.h  28 Nov 2018 08:32:26 -  1.355
+++ bgpd.h  28 Nov 2018 10:07:50 -
@@ -966,10 +966,10 @@ enum action_types {
ACTION_SET_NEXTHOP_BLACKHOLE,
ACTION_SET_NEXTHOP_NOMODIFY,
ACTION_SET_NEXTHOP_SELF,
-   ACTION_SET_COMMUNITY,
ACTION_DEL_COMMUNITY,
-   ACTION_SET_EXT_COMMUNITY,
+   ACTION_SET_COMMUNITY,
ACTION_DEL_EXT_COMMUNITY,
+   ACTION_SET_EXT_COMMUNITY,
ACTION_PFTABLE,
ACTION_PFTABLE_ID,
ACTION_RTLABEL,
Index: parse.y
===
RCS file: /cvs/src/usr.sbin/bgpd/parse.y,v
retrieving revision 1.364
diff -u -p -r1.364 parse.y
--- parse.y 28 Nov 2018 08:32:27 -  1.364
+++ parse.y 28 Nov 2018 10:09:34 -
@@ -150,7 +150,7 @@ int  expand_rule(struct filter_rule *, 
 int str2key(char *, char *, size_t);
 int neighbor_consistent(struct peer *);
 int merge_filterset(struct filter_set_head *, struct filter_set *);
-voidmerge_filter_lists(struct filter_head *, struct filter_head *);
+voidoptimize_filters(struct filter_head *);
 struct filter_rule *get_rule(enum action_types);
 
 int parsecommunity(struct filter_community *, int, char *);
@@ -3345,13 +3345,15 @@ parse_config(char *filename, struct bgpd
free_config(conf);
} else {
/*
-* Move filter list and static group and peer filtersets
+* Concatenate filter list and static group and peer filtersets
 * together. Static group sets come first then peer sets
 * last normal filter rules.
 */
-   merge_filter_lists(conf->filters, groupfilter_l);
-   merge_filter_lists(conf->filters, peerfilter_l);
-   merge_filter_lists(conf->filters, filter_l);
+   TAILQ_CONCAT(conf->filters, groupfilter_l, entry);
+   TAILQ_CONCAT(conf->filters, peerfilter_l, entry);
+   TAILQ_CONCAT(conf->filters, filter_l, entry);
+
+   optimize_filters(conf->filters);
 
errors += mrt_mergeconfig(xconf->mrt, conf->mrt);
errors += merge_config(xconf, conf, peer_l);
@@ -4180,39 +4182,17 @@ neighbor_consistent(struct peer *p)
return (0);
 }
 
-int
-merge_filterset(struct filter_set_head *sh, struct filter_set *s)
+static void
+filterset_add(struct filter_set_head *sh, struct filter_set *s)
 {
struct filter_set   *t;
 
TAILQ_FOREACH(t, sh, entry) {
-   /*
-* need to cycle across the full list because even
-* if types are not equal filterset_cmp() may return 0.
-*/
-   if (filterset_cmp(s, t) == 0) {
-   if (s->type == ACTION_SET_COMMUNITY)
-   yyerror("community is already set");
-   else if (s->type == ACTION_DEL_COMMUNITY)
-   yyerror("community will already be deleted");
-   else if (s->type == ACTION_SET_EXT_COMMUNITY)
-   yyerror("ext-community is already set");
-   else if (s->type == ACTION_DEL_EXT_COMMUNITY)
-   yyerror(
-   "ext-community will already be deleted");
-   else
-   yyerror("redefining set parameter %s",
-   filterset_name(s->type));
-   return (-1);
-   }
-   }
-
-   TAILQ_FOREACH(t, sh, entry) {
if (s->type < t->type) {
TAILQ_INSERT_BEFORE(t, s, entry);
-   return (0);
+   return;
}
-   if (s->type == t->type)
+   if (s->type == t->type) {
switch (s->type) {
case ACTION_SET_COMMUNITY:
case ACTION_DEL_COMMUNITY:
@@ -4220,42 +4200,145 @@ merge_filterset(struct filter_set_head *
>action.community,
   

More MH_ALIGN conversions

2018-12-03 Thread Claudio Jeker
Next round of conversions. Additionally to converting MH_ALIGN() to
m_align() this also switches m_gethdr/M_GETHDR calls to m_get/M_GET calls
because M_MOVE_PKTHDR() is initialising the pkthdr and so there is no need
to do that when allocating the mbuf.

OK?
-- 
:wq Claudio

 
Index: net/if_gre.c
===
RCS file: /cvs/src/sys/net/if_gre.c,v
retrieving revision 1.140
diff -u -p -r1.140 if_gre.c
--- net/if_gre.c29 Nov 2018 00:14:29 -  1.140
+++ net/if_gre.c30 Nov 2018 09:48:19 -
@@ -1939,7 +1939,8 @@ egre_start(struct ifnet *ifp)
bpf_mtap_ether(if_bpf, m0, BPF_DIRECTION_OUT);
 #endif
 
-   m = m_gethdr(M_DONTWAIT, m0->m_type);
+   /* force prepend mbuf because of alignment problems */
+   m = m_get(M_DONTWAIT, m0->m_type);
if (m == NULL) {
m_freem(m0);
continue;
@@ -1948,7 +1949,7 @@ egre_start(struct ifnet *ifp)
M_MOVE_PKTHDR(m, m0);
m->m_next = m0;
 
-   MH_ALIGN(m, 0);
+   m_align(m, 0);
m->m_len = 0;
 
m = gre_encap(>sc_tunnel, m, htons(ETHERTYPE_TRANSETHER),
@@ -3757,7 +3758,8 @@ nvgre_start(struct ifnet *ifp)
rw_exit_read(>sc_ether_lock);
}
 
-   m = m_gethdr(M_DONTWAIT, m0->m_type);
+   /* force prepend mbuf because of alignment problems */
+   m = m_get(M_DONTWAIT, m0->m_type);
if (m == NULL) {
m_freem(m0);
continue;
@@ -3766,7 +3768,7 @@ nvgre_start(struct ifnet *ifp)
M_MOVE_PKTHDR(m, m0);
m->m_next = m0;
 
-   MH_ALIGN(m, 0);
+   m_align(m, 0);
m->m_len = 0;
 
m = gre_encap_dst(tunnel, , m,
@@ -3932,7 +3934,8 @@ eoip_start(struct ifnet *ifp)
bpf_mtap_ether(if_bpf, m0, BPF_DIRECTION_OUT);
 #endif
 
-   m = m_gethdr(M_DONTWAIT, m0->m_type);
+   /* force prepend mbuf because of alignment problems */
+   m = m_get(M_DONTWAIT, m0->m_type);
if (m == NULL) {
m_freem(m0);
continue;
@@ -3941,7 +3944,7 @@ eoip_start(struct ifnet *ifp)
M_MOVE_PKTHDR(m, m0);
m->m_next = m0;
 
-   MH_ALIGN(m, 0);
+   m_align(m, 0);
m->m_len = 0;
 
m = eoip_encap(sc, m, gre_l2_tos(>sc_tunnel, m));
Index: net/if_vxlan.c
===
RCS file: /cvs/src/sys/net/if_vxlan.c,v
retrieving revision 1.69
diff -u -p -r1.69 if_vxlan.c
--- net/if_vxlan.c  15 Nov 2018 22:22:03 -  1.69
+++ net/if_vxlan.c  30 Nov 2018 09:48:19 -
@@ -867,8 +867,8 @@ vxlan_output(struct ifnet *ifp, struct m
uint32_t tag;
struct mbuf *m0;
 
-   /* VXLAN header */
-   MGETHDR(m0, M_DONTWAIT, m->m_type);
+   /* VXLAN header, needs new mbuf because of alignment issues */
+   MGET(m0, M_DONTWAIT, m->m_type);
if (m0 == NULL) {
ifp->if_oerrors++;
return (ENOBUFS);
@@ -876,7 +876,7 @@ vxlan_output(struct ifnet *ifp, struct m
M_MOVE_PKTHDR(m0, m);
m0->m_next = m;
m = m0;
-   MH_ALIGN(m, sizeof(*vu));
+   m_align(m, sizeof(*vu));
m->m_len = sizeof(*vu);
m->m_pkthdr.len += sizeof(*vu);
 
Index: netinet6/ip6_output.c
===
RCS file: /cvs/src/sys/netinet6/ip6_output.c,v
retrieving revision 1.240
diff -u -p -r1.240 ip6_output.c
--- netinet6/ip6_output.c   9 Nov 2018 14:14:32 -   1.240
+++ netinet6/ip6_output.c   30 Nov 2018 09:48:19 -
@@ -2571,13 +2571,13 @@ ip6_splithdr(struct mbuf *m, struct ip6_
 
ip6 = mtod(m, struct ip6_hdr *);
if (m->m_len > sizeof(*ip6)) {
-   MGETHDR(mh, M_DONTWAIT, MT_HEADER);
+   MGET(mh, M_DONTWAIT, MT_HEADER);
if (mh == NULL) {
m_freem(m);
return ENOBUFS;
}
M_MOVE_PKTHDR(mh, m);
-   MH_ALIGN(mh, sizeof(*ip6));
+   m_align(mh, sizeof(*ip6));
m->m_len -= sizeof(*ip6);
m->m_data += sizeof(*ip6);
mh->m_next = m;



Re: uvm_fault: ip_ctloutput

2018-12-03 Thread Claudio Jeker
On Sun, Dec 02, 2018 at 11:15:03AM +0100, Claudio Jeker wrote:
> On Sun, Dec 02, 2018 at 09:29:23AM +0100, Claudio Jeker wrote:
> > On Sat, Dec 01, 2018 at 06:44:31PM -0800, Greg Steuck wrote:
> > > This thwarts the reproducer. Again, I don't know if the invariants are
> > > getting violated somewhere else and the patch below is simply papering 
> > > over
> > > the symptoms.
> > 
> > I would like to better understand how we get so far with a socket where
> > so_pcb is not initiallized. This and also the other bug are baisically the
> > same. The stack assumes that after a successful socket() operation both
> > socket and pcb exist and are a connected. Since this seems to not be
> > the case it is important to catch those errors further up in uipc_socket.c
> > before passing down into protocol specific functions.
> >  
> 
> So the issue is the double connect() call on the SOCk_RAW socket.
> The second connect is calling PRU_DISCONNECT which in the end does a
> FALLTHROUGH into PRU_ABORT which removes the inp by calling
> in_pcbdetach().
> 
> I think the proper fix is to not have this FALLTHROUGH and just call
> soisdisconnected(). Maybe inp->inp_faddr should also be reset to 0.
> 
> This will fix also other double connect() SOCk_RAW crashes you spotted.

The version I will commit will also reset inp->inp_faddr since this is
what other protos are doing and it is more correct anyway.

-- 
:wq Claudio

Index: raw_ip.c
===
RCS file: /cvs/src/sys/netinet/raw_ip.c,v
retrieving revision 1.115
diff -u -p -r1.115 raw_ip.c
--- raw_ip.c10 Nov 2018 18:40:34 -  1.115
+++ raw_ip.c3 Dec 2018 08:13:24 -
@@ -385,7 +385,9 @@ rip_usrreq(struct socket *so, int req, s
error = ENOTCONN;
break;
}
-   /* FALLTHROUGH */
+   soisdisconnected(so);
+   inp->inp_faddr.s_addr = INADDR_ANY;
+   break;
case PRU_ABORT:
soisdisconnected(so);
if (inp == NULL)