timeout_barrier(9)

2017-11-16 Thread David Gwynne
this adds a timeout_barrier function, which works like all the other
barriers we have (intr_barrier, ifq_barrier, etc).

my use case for this is a network driver im working on, which uses
a timeout to refill the rx ring if no mbufs are available in the
system, but needs to wait for a timeout to finish using the rx ring
structures before they're freed when the interface is being shut
down.

so in abstract, if you want to free something that a timeout refers
to, you need to wait for the timeout to stop using it before you
free it. if you're freeing without the kernel lock and splsoftnet
raised, you can't know if the timeout is currently running or not.
if you're using a timeout that gets handled in the softclock thread,
and it is sleeping, you can't free it yet either.

does that make sense?

Index: share/man/man9/timeout.9
===
RCS file: /cvs/src/share/man/man9/timeout.9,v
retrieving revision 1.44
diff -u -p -r1.44 timeout.9
--- share/man/man9/timeout.922 Sep 2016 12:55:24 -  1.44
+++ share/man/man9/timeout.917 Nov 2017 04:19:31 -
@@ -38,6 +38,7 @@
 .Nm timeout_add_ts ,
 .Nm timeout_add_bt ,
 .Nm timeout_del ,
+.Nm timeout_barrier ,
 .Nm timeout_pending ,
 .Nm timeout_initialized ,
 .Nm timeout_triggered ,
@@ -54,6 +55,8 @@
 .Fn timeout_add "struct timeout *to" "int ticks"
 .Ft int
 .Fn timeout_del "struct timeout *to"
+.Ft void
+.Fn timeout_barrier "struct timeout *to"
 .Ft int
 .Fn timeout_pending "struct timeout *to"
 .Ft int
@@ -153,6 +156,11 @@ will cancel the timeout in the argument
 If the timeout has already executed or has never been added
 the call will have no effect.
 .Pp
+.Fn timeout_barrier
+ensures that any current execution of the timeout in the argument
+.Fa to
+has completed before returning.
+.Pp
 The
 .Fn timeout_pending
 macro can be used to check if a timeout is scheduled to run.
@@ -217,6 +225,9 @@ context.
 can be called during autoconf, from process context, or from any
 interrupt context at or below
 .Dv IPL_CLOCK .
+.Pp
+.Fn timeout_barrier
+can be called from process context.
 .Pp
 When the timeout runs, the
 .Fa fn
Index: sys/sys/timeout.h
===
RCS file: /cvs/src/sys/sys/timeout.h,v
retrieving revision 1.26
diff -u -p -r1.26 timeout.h
--- sys/sys/timeout.h   22 Sep 2016 12:55:24 -  1.26
+++ sys/sys/timeout.h   17 Nov 2017 04:19:31 -
@@ -99,6 +99,7 @@ int timeout_add_msec(struct timeout *, i
 int timeout_add_usec(struct timeout *, int);
 int timeout_add_nsec(struct timeout *, int);
 int timeout_del(struct timeout *);
+void timeout_barrier(struct timeout *);
 
 void timeout_startup(void);
 void timeout_adjust_ticks(int);
Index: sys/kern/kern_timeout.c
===
RCS file: /cvs/src/sys/kern/kern_timeout.c,v
retrieving revision 1.50
diff -u -p -r1.50 kern_timeout.c
--- sys/kern/kern_timeout.c 3 Oct 2016 11:54:29 -   1.50
+++ sys/kern/kern_timeout.c 17 Nov 2017 04:19:31 -
@@ -324,6 +324,45 @@ timeout_del(struct timeout *to)
return (ret);
 }
 
+void   timeout_proc_barrier(void *);
+
+void
+timeout_barrier(struct timeout *to)
+{
+   if (!ISSET(to->to_flags, TIMEOUT_NEEDPROCCTX)) {
+   KERNEL_LOCK();
+   splx(splsoftclock());
+   KERNEL_UNLOCK();
+   } else {
+   int wait = 1;
+   struct timeout barrier;
+   struct sleep_state sls;
+
+   timeout_set(, timeout_proc_barrier, );
+
+   mtx_enter(_mutex);
+   CIRCQ_INSERT(>to_list, _proc);
+   mtx_leave(_mutex);
+
+   wakeup_one(_proc);
+
+   while (wait) {
+   sleep_setup(, , PSWP, "tmobar");
+   sleep_finish(, wait);
+   }
+   }
+}
+
+void
+timeout_proc_barrier(void *arg)
+{
+   int *wait = arg;
+
+   *wait = 0;
+
+   wakeup_one(wait);
+}
+
 /*
  * This is called from hardclock() once every tick.
  * We return !0 if we need to schedule a softclock.



Re: add an iqdrops view to systat to show interface queue drops

2017-11-16 Thread David Gwynne

> On 17 Nov 2017, at 05:39, Claudio Jeker  wrote:
> 
> On Thu, Nov 16, 2017 at 03:21:20PM +0100, Alexander Bluhm wrote:
>> On Wed, Nov 15, 2017 at 01:29:42PM +1000, David Gwynne wrote:
>>> im adding numbers to input and output qdrops in the kernel, so im
>>> aware that they exist now. however, i don't really see these values
>>> in userland. it seems netstat and systat think errors are more
>>> important.
>>> 
>>> i tried adding qdrops to the ifstat view, but it got too cluttered.
>>> so i made a new view called iqdrops that shows qdrops instead of
>>> errors. i used iqdrops instead of ifqdrops so "if" on its own is
>>> still not ambiguous.
>> 
>> Now ifstat and iqdrops show almost the same information.  Just two
>> commns are different, eight culumns are redundant.  I think one
>> page would be better.  Do we need DESC?  It takes a lot of space
>> that could be used for output of dynamic counters.

can we have modified displays within a view? kind of like how some views change 
their ordering.

i agree that everything should be on the ifstat page, but there really isnt 
enough space. if we can have tweaked displays, id like the main one to 
aggregate the qdrops and errs values into a single column, and then have 
separate displays within that view to show errors and drops as separate values.

> Would it make sense to extend the mbuf page instead? It has the ring size
> and so it may make sense to show drops there.

i dont think that would make sense for interfaces like vlan.




Re: add an iqdrops view to systat to show interface queue drops

2017-11-16 Thread David Gwynne
On Thu, Nov 16, 2017 at 07:22:48AM -0700, Theo de Raadt wrote:
> > Now ifstat and iqdrops show almost the same information.  Just two
> > commns are different, eight culumns are redundant.  I think one
> > page would be better.  Do we need DESC?  It takes a lot of space
> > that could be used for output of dynamic counters.
> 
> I use DESC every single day.

me too.



Re: mbuf statistics, tracking of drops

2017-11-16 Thread David Gwynne
On Thu, Nov 16, 2017 at 08:13:39PM +0100, Gregor Best wrote:
> On Thu, Nov 16, 2017 at 11:13:04AM +1000, David Gwynne wrote:
> > 
> > > On 16 Nov 2017, at 7:23 am, Gregor Best  wrote:
> > > 
> > > Hi,
> > > 
> > > On Mon, Nov 13, 2017 at 01:47:01PM +1000, David Gwynne wrote:
> > >> [...]
> > >> pools maintain count of how many times they failed to provide an
> > >> allocation. you can watch this with vmstat -m or systat pool.
> > >> however, we could use that to populate mb_drops too.
> > >> [...]
> > > 
> > > That's certainly smarter than my idea.
> > 
> > does it work though?
> > 
> 
> Now that I think about it, not quite, or at least not without reaching
> around into the pools innards to lock them while reading the statistics
> to get a consistent view.

currently pr_nfails is an unsigned long, which can be atomically
read on all our machines. it is a consistent view of the number.
however, i would like to make the stats uint64_t one day, so i agree
that it is better to lock to read them.

the diff below factors our the reading of the pool stats into a
struct kinfo_pool, which is used internally by pool and now by the
sysctl_mbstat.

> 
> The problem is that the MBSTAT_DROPS part of the `counters` data
> structure never gets modified, so we'd still need to loop over all pools
> in sysctl_mbstat, since this is not just about mbpool but also about the
> pools for payload data.

ok. the updated diff below loops over the cluster pools too now.

> 
> On the other hand, going back to my initial proposal would mean
> essentially duplicating functionality the pools already provide, so
> that's at least a bit unelegant. Especially since it adds at least one
> splnet/splx dance.
> 
> Another option would be to just say "screw it" and count the failed
> allocations without acquiring any locks on the pools, maybe via atomic
> operations. Sounds a bit too complicated though.
> 
> At the moment, and after a night's sleep, I think my initial proposal is
> the most straightforward way to do this. That, or getting rid of the
> confusing counters in `netstat -m` that stay at 0 all the time...

getting rid of the stats does appeal to me too...

Index: sys/pool.h
===
RCS file: /cvs/src/sys/sys/pool.h,v
retrieving revision 1.74
diff -u -p -r1.74 pool.h
--- sys/pool.h  13 Aug 2017 20:26:33 -  1.74
+++ sys/pool.h  17 Nov 2017 00:58:13 -
@@ -259,6 +259,7 @@ voidpool_init(struct pool *, size_t, u
const char *, struct pool_allocator *);
 void   pool_cache_init(struct pool *);
 void   pool_destroy(struct pool *);
+void   pool_info(struct pool *, struct kinfo_pool *);
 void   pool_setlowat(struct pool *, int);
 void   pool_sethiwat(struct pool *, int);
 intpool_sethardlimit(struct pool *, u_int, const char *, int);
Index: sys/sysctl.h
===
RCS file: /cvs/src/sys/sys/sysctl.h,v
retrieving revision 1.175
diff -u -p -r1.175 sysctl.h
--- sys/sysctl.h12 Oct 2017 09:14:16 -  1.175
+++ sys/sysctl.h17 Nov 2017 00:58:13 -
@@ -936,6 +936,7 @@ int sysctl_rdstruct(void *, size_t *, vo
 int sysctl_struct(void *, size_t *, void *, size_t, void *, size_t);
 int sysctl_file(int *, u_int, char *, size_t *, struct proc *);
 int sysctl_doproc(int *, u_int, char *, size_t *);
+int sysctl_mbstat(int *, u_int, void *, size_t *, void *, size_t);
 struct mbuf_queue;
 int sysctl_mq(int *, u_int, void *, size_t *, void *, size_t,
 struct mbuf_queue *);
Index: kern/kern_sysctl.c
===
RCS file: /cvs/src/sys/kern/kern_sysctl.c,v
retrieving revision 1.330
diff -u -p -r1.330 kern_sysctl.c
--- kern/kern_sysctl.c  11 Aug 2017 21:24:19 -  1.330
+++ kern/kern_sysctl.c  17 Nov 2017 00:58:13 -
@@ -392,24 +392,9 @@ kern_sysctl(int *name, u_int namelen, vo
case KERN_FILE:
return (sysctl_file(name + 1, namelen - 1, oldp, oldlenp, p));
 #endif
-   case KERN_MBSTAT: {
-   extern struct cpumem *mbstat;
-   uint64_t counters[MBSTAT_COUNT];
-   struct mbstat mbs;
-   unsigned int i;
-
-   memset(, 0, sizeof(mbs));
-   counters_read(mbstat, counters, MBSTAT_COUNT);
-   for (i = 0; i < MBSTAT_TYPES; i++)
-   mbs.m_mtypes[i] = counters[i];
-
-   mbs.m_drops = counters[MBSTAT_DROPS];
-   mbs.m_wait = counters[MBSTAT_WAIT];
-   mbs.m_drain = counters[MBSTAT_DRAIN];
-
-   return (sysctl_rdstruct(oldp, oldlenp, newp,
-   , sizeof(mbs)));
-   }
+   case KERN_MBSTAT:
+   return (sysctl_mbstat(name + 1, namelen -1, oldp, oldlenp,
+   newp, newlen));
 #if defined(GPROF) || defined(DDBPROF)
case 

athn(4) USB open firmware support

2017-11-16 Thread Stefan Sperling
This diff switches athn(4) USB devices to open source firmware.

I only have an AR9271 device which I can test with:
athn0 at uhub1 port 2 configuration 1 interface 0 "ATHEROS USB2.0 WLAN" rev 
2.00/1.08 addr 3
athn0: AR9271 rev 1 (1T1R), ROM rev 13, address xx:xx:xx:xx:xx:xx

The diff switches AR7010 devices over as well because this new code
will *not* support the old binary-only firmware anyway.
But it is possible that AR7010 devices don't work yet with this diff.
Can anybody help and test such a device? And if anyone would like to
donate an AR7010 device for me to develop with, that would be appreciated.

The new firmware package 'athn-firmware-1.1p3', which contains the open
firmware files, is required. Specifically, the diff needs these files
from the new firmware package:

  /etc/firmware/athn-open-ar7010
  /etc/firmware/athn-open-ar9271

The firmware mirrors currently still ship version 1.1p2 which lacks the
open firmware files. But the pre-built firmware images are already part
of package mirrors, so a new firmware package can be built without
having to first cross-compile the firmware, like this:

  pkg_delete athn-firmware  # else conflict during install, no idea why
  cd /usr/ports/sysutils/firmware/athn
  make FETCH_PACKAGES=Yes install

Thanks again to bentley@ for porting both the required cross toolchain
and the open ath9k firmware during p2k17!

Index: if_athn_usb.c
===
RCS file: /cvs/src/sys/dev/usb/if_athn_usb.c,v
retrieving revision 1.48
diff -u -p -r1.48 if_athn_usb.c
--- if_athn_usb.c   26 Oct 2017 15:00:28 -  1.48
+++ if_athn_usb.c   16 Nov 2017 23:24:55 -
@@ -629,12 +629,9 @@ athn_usb_load_firmware(struct athn_usb_s
/* Determine which firmware image to load. */
if (usc->flags & ATHN_USB_FLAG_AR7010) {
dd = usbd_get_device_descriptor(usc->sc_udev);
-   if (UGETW(dd->bcdDevice) == 0x0202)
-   name = "athn-ar7010-11";
-   else
-   name = "athn-ar7010";
+   name = "athn-open-ar7010";
} else
-   name = "athn-ar9271";
+   name = "athn-open-ar9271";
/* Read firmware image from the filesystem. */
if ((error = loadfirmware(name, , )) != 0) {
printf("%s: failed loadfirmware of file %s (error %d)\n",
@@ -1033,7 +1030,9 @@ athn_usb_newstate_cb(struct athn_usb_sof
 
s = splnet();
ostate = ic->ic_state;
-   DPRINTF(("newstate %d -> %d\n", ostate, cmd->state));
+   DPRINTF(("newstate %s -> %s\n",
+   ieee80211_state_name[ostate],
+   ieee80211_state_name[cmd->state]));
 
if (ostate == IEEE80211_S_RUN) {
sta_index = ((struct athn_node *)ic->ic_bss)->sta_index;
@@ -1228,8 +1227,6 @@ athn_usb_create_node(struct athn_usb_sof
memset(, 0, sizeof(sta));
IEEE80211_ADDR_COPY(sta.macaddr, ni->ni_macaddr);
IEEE80211_ADDR_COPY(sta.bssid, ni->ni_bssid);
-   sta.associd = htobe16(ni->ni_associd);
-   sta.valid = 1;
sta.sta_index = an->sta_index;
sta.maxampdu = 0x;
if (ni->ni_flags & IEEE80211_NODE_HT)
@@ -1522,7 +1519,6 @@ void
 athn_usb_rx_wmi_ctrl(struct athn_usb_softc *usc, uint8_t *buf, int len)
 {
struct ar_wmi_cmd_hdr *wmi;
-   struct ar_wmi_evt_txrate *txrate;
uint16_t cmd_id;
 
if (__predict_false(len < sizeof(*wmi)))
@@ -1547,10 +1543,16 @@ athn_usb_rx_wmi_ctrl(struct athn_usb_sof
athn_usb_swba(usc);
break;
 #endif
-   case AR_WMI_EVT_TXRATE:
-   txrate = (struct ar_wmi_evt_txrate *)[1];
-   DPRINTF(("txrate=%d\n", betoh32(txrate->txrate)));
+   case AR_WMI_EVT_TXSTATUS: {
+#ifdef ATH_DEBUG
+   struct ar_wmi_evt_txstatus *ts;
+   int i;
+   ts = (struct ar_wmi_evt_txstatus *)[1];
+   for (i = 0; i < ts->count && i < AR_HTC_MAX_TX_STATUS; i++)
+   DPRINTF(("ts[%d]=%d\n", i, ts->ts[i].rate));
+#endif
break;
+   }
case AR_WMI_EVT_FATAL:
printf("%s: fatal firmware error\n", usc->usb_dev.dv_xname);
break;
@@ -2286,13 +2288,9 @@ athn_usb_init(struct ifnet *ifp)
 
/* Update target capabilities. */
memset(, 0, sizeof(hic));
-   hic.flags = htobe32(0x400c2400);
-   hic.flags_ext = htobe32(0x00106080);
hic.ampdu_limit = htobe32(0x);
hic.ampdu_subframes = 20;
-   hic.protmode = 1;   /* XXX */
-   hic.lg_txchainmask = sc->txchainmask;
-   hic.ht_txchainmask = sc->txchainmask;
+   hic.txchainmask = sc->txchainmask;
DPRINTF(("updating target configuration\n"));
error = athn_usb_wmi_xcmd(usc, AR_WMI_CMD_TARGET_IC_UPDATE,
, sizeof(hic), NULL);
Index: if_athn_usb.h
===
RCS 

Tweak OF_getnodebyname()

2017-11-16 Thread Mark Kettenis
The current FDT implementation is fairly useless since it doesn't
actually look at the child nodes.  The macppc implementation walks the
entire tree.  But all current use cases of this function only look at
children of the passed node.  Diff below makes OF_getnodebyname()
behave like that.

ok?


Index: ofw/fdt.c
===
RCS file: /cvs/src/sys/dev/ofw/fdt.c,v
retrieving revision 1.20
diff -u -p -r1.20 fdt.c
--- ofw/fdt.c   12 Mar 2017 11:44:42 -  1.20
+++ ofw/fdt.c   16 Nov 2017 20:28:27 -
@@ -773,11 +773,9 @@ OF_getnodebyname(int handle, const char 
if (handle == 0)
node = fdt_find_node("/");
 
-   while (node) {
+   for (node = fdt_child_node(node); node; node = fdt_next_node(node)) {
if (strcmp(name, fdt_node_name(node)) == 0)
break;
-
-   node = fdt_next_node(node);
}
 
return node ? ((char *)node - (char *)tree.header) : 0;



Re: clang: Avoid EBX/RBX

2017-11-16 Thread Mark Kettenis
> Date: Wed, 15 Nov 2017 19:45:26 -0500
> From: Todd Mortimer 
> 
> Hi tech@,
> 
> This is an updated diff that shuffles the allocation order for registers
> on i386/amd64. The last one exposed a subtle bug with the way chromium
> and libexecinfo interact when creating backtraces. With this diff, make
> release seems fine, I didn't see any differences in regress, and
> chromium builds. We are still removing about 4500 unique gadgets from
> the kernel this way (about 6% of the total unique gadgets).
> 
> The problem with the previous diff was that it put RBP before RBX, which
> broke some assumptions between chromium and libexecinfo. When building
> chromium, the C++ code ended up allocating RBP as a general purpose
> register, which broke backtraces from the devel/libexecinfo port, which
> is expecting RBP to hold a frame pointer. In the original order, RBP
> points to a stack address that holds 0, so it 'worked' by terminating
> the backtrace immediately, but this seems like a bit of a fluke. I am
> not sure if / how we want to deal with this. Basically, libexecinfo
> is using the __builtin_frame_address and __builtin_return_address
> compiler builtins, which expect RBP to hold a frame pointer. Any port
> that calls backtrace() from code that is using RBP as a general purpose
> register will be incorrect / unstable. This seems like an uncommon
> problem.

Hi Todd,

Let me start by saying thatnothing what I say below should keep you
from going ahead with this slightly castrated version.

I would like to ask you to make some noise about this on the
appropriate clang mailing lists if you didn't do so already.  Be sure
to mention OpenBSD ;).

In case you haven't found it yet; gnu/llvm/doc/LangRef.rst doucments
the llvm.returnaddress and llvm.frameaddress that back these builtins.
>From their description it is obvious that what libexecinfo is doing is
simply broken on many architectures.  It'll work for those
architectures that have a genuine frame pointer.  It will probably
work correctly on i386 as long as none of the code being backtraced
was compiled with -fomit-frame-pointer.  But on amd64 that's the
default and compiling everything with -fno-omit-frame-pointer (like we
do for the kernel) really isn't an option.  Instead the amd64 ABI
requires DWARF2 EH info to be present.  The glibc backtrace
implementation uses this information to provide backtraces instead of
just hoping that %rbp hasn't been clobbered.

Fixing this shouldn't be too difficult.  The clang libunwind has
_Unwind_Backtrace() which does all the heavy lifting.  That of course
brings us back to the discussion whether we should provide libunwind
as a separate library (it's currently integrated in libc++abi).

But maybe robert@ should chime in and tell us what the various options
here are.  Personally I'd like to see devel/libexecinfo to be removed
from ports if possible.


> Index: lib/Target/X86/X86RegisterInfo.td
> ===
> RCS file: /cvs/src/gnu/llvm/lib/Target/X86/X86RegisterInfo.td,v
> retrieving revision 1.1.1.4
> diff -u -p -u -p -r1.1.1.4 X86RegisterInfo.td
> --- lib/Target/X86/X86RegisterInfo.td 4 Oct 2017 20:28:02 -   1.1.1.4
> +++ lib/Target/X86/X86RegisterInfo.td 11 Nov 2017 14:04:23 -
> @@ -339,8 +339,8 @@ def GR16 : RegisterClass<"X86", [i16], 1
>R8W, R9W, R10W, R11W, R14W, R15W, R12W, R13W)>;
> 
>  def GR32 : RegisterClass<"X86", [i32], 32,
> - (add EAX, ECX, EDX, ESI, EDI, EBX, EBP, ESP,
> -  R8D, R9D, R10D, R11D, R14D, R15D, R12D, R13D)>;
> + (add EAX, ECX, EDX, ESI, EDI,
> +  R8D, R9D, R10D, R11D, R14D, R15D, R12D, R13D, 
> EBX, EBP, ESP)>;
> 
>  // GR64 - 64-bit GPRs. This oddly includes RIP, which isn't accurate, since
>  // RIP isn't really a register and it can't be used anywhere except in an
> @@ -349,7 +349,7 @@ def GR32 : RegisterClass<"X86", [i32], 3
>  // tests because of the inclusion of RIP in this register class.
>  def GR64 : RegisterClass<"X86", [i64], 64,
>   (add RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11,
> -  RBX, R14, R15, R12, R13, RBP, RSP, RIP)>;
> +  R14, R15, R12, R13, RBX, RBP, RSP, RIP)>;
> 
>  // Segment registers for use by MOV instructions (and others) that have a
>  //   segment register as one operand.  Always contain a 16-bit segment
> 
> - End forwarded message -
> 
> 



[PATCH] update struct vm_create_params

2017-11-16 Thread Carlos Cardenas
Add cdrom entry to vm_create_params in preparation for cdrom
support in vmd.

Ok?

diff --git sys/arch/amd64/include/vmmvar.h sys/arch/amd64/include/vmmvar.h
index 4847fa3defa..0e067f3f49d 100644
--- sys/arch/amd64/include/vmmvar.h
+++ sys/arch/amd64/include/vmmvar.h
@@ -26,6 +26,7 @@
 #define VMM_MAX_MEM_RANGES 16
 #define VMM_MAX_DISKS_PER_VM   4
 #define VMM_MAX_PATH_DISK  128
+#define VMM_MAX_PATH_CDROM 128
 #define VMM_MAX_NAME_LEN   32
 #define VMM_MAX_KERNEL_PATH128
 #define VMM_MAX_VCPUS_PER_VM   64
@@ -419,6 +420,7 @@ struct vm_create_params {
size_t  vcp_nnics;
struct vm_mem_range vcp_memranges[VMM_MAX_MEM_RANGES];
char
vcp_disks[VMM_MAX_DISKS_PER_VM][VMM_MAX_PATH_DISK];
+   charvcp_cdrom[VMM_MAX_PATH_CDROM];
charvcp_name[VMM_MAX_NAME_LEN];
charvcp_kernel[VMM_MAX_KERNEL_PATH];
uint8_t vcp_macs[VMM_MAX_NICS_PER_VM][6];
diff --git sys/arch/i386/include/vmmvar.h sys/arch/i386/include/vmmvar.h
index 32ba1234800..640120a7ab4 100644
--- sys/arch/i386/include/vmmvar.h
+++ sys/arch/i386/include/vmmvar.h
@@ -25,6 +25,7 @@
 #define VMM_MAX_MEM_RANGES 16
 #define VMM_MAX_DISKS_PER_VM   4
 #define VMM_MAX_PATH_DISK  128
+#define VMM_MAX_PATH_CDROM 128
 #define VMM_MAX_NAME_LEN   32
 #define VMM_MAX_KERNEL_PATH128
 #define VMM_MAX_VCPUS_PER_VM   64
@@ -357,6 +358,7 @@ struct vm_create_params {
size_t  vcp_nnics;
struct vm_mem_range vcp_memranges[VMM_MAX_MEM_RANGES];
char
vcp_disks[VMM_MAX_DISKS_PER_VM][VMM_MAX_PATH_DISK];
+   charvcp_cdrom[VMM_MAX_PATH_CDROM];
charvcp_name[VMM_MAX_NAME_LEN];
charvcp_kernel[VMM_MAX_KERNEL_PATH];
uint8_t vcp_macs[VMM_MAX_NICS_PER_VM][6];
-- 
2.15.0



Re: add an iqdrops view to systat to show interface queue drops

2017-11-16 Thread Claudio Jeker
On Thu, Nov 16, 2017 at 03:21:20PM +0100, Alexander Bluhm wrote:
> On Wed, Nov 15, 2017 at 01:29:42PM +1000, David Gwynne wrote:
> > im adding numbers to input and output qdrops in the kernel, so im
> > aware that they exist now. however, i don't really see these values
> > in userland. it seems netstat and systat think errors are more
> > important.
> > 
> > i tried adding qdrops to the ifstat view, but it got too cluttered.
> > so i made a new view called iqdrops that shows qdrops instead of
> > errors. i used iqdrops instead of ifqdrops so "if" on its own is
> > still not ambiguous.
> 
> Now ifstat and iqdrops show almost the same information.  Just two
> commns are different, eight culumns are redundant.  I think one
> page would be better.  Do we need DESC?  It takes a lot of space
> that could be used for output of dynamic counters.
> 

Would it make sense to extend the mbuf page instead? It has the ring size
and so it may make sense to show drops there.

-- 
:wq Claudio



Re: mbuf statistics, tracking of drops

2017-11-16 Thread Gregor Best
On Thu, Nov 16, 2017 at 11:13:04AM +1000, David Gwynne wrote:
> 
> > On 16 Nov 2017, at 7:23 am, Gregor Best  wrote:
> > 
> > Hi,
> > 
> > On Mon, Nov 13, 2017 at 01:47:01PM +1000, David Gwynne wrote:
> >> [...]
> >> pools maintain count of how many times they failed to provide an
> >> allocation. you can watch this with vmstat -m or systat pool.
> >> however, we could use that to populate mb_drops too.
> >> [...]
> > 
> > That's certainly smarter than my idea.
> 
> does it work though?
> 

Now that I think about it, not quite, or at least not without reaching
around into the pools innards to lock them while reading the statistics
to get a consistent view.

The problem is that the MBSTAT_DROPS part of the `counters` data
structure never gets modified, so we'd still need to loop over all pools
in sysctl_mbstat, since this is not just about mbpool but also about the
pools for payload data.

On the other hand, going back to my initial proposal would mean
essentially duplicating functionality the pools already provide, so
that's at least a bit unelegant. Especially since it adds at least one
splnet/splx dance.

Another option would be to just say "screw it" and count the failed
allocations without acquiring any locks on the pools, maybe via atomic
operations. Sounds a bit too complicated though.

At the moment, and after a night's sleep, I think my initial proposal is
the most straightforward way to do this. That, or getting rid of the
confusing counters in `netstat -m` that stay at 0 all the time...

-- 
Gregor



Re: ihidev(4) polling

2017-11-16 Thread joshua stein

On Thu, 16 Nov 2017 at 13:42:03 -0500, James Turner wrote:

So I have to ask since when I hit you up on icb the above terms where
all mentioned. With this diff and what you just committed to src add
touchpad support to the Xiaomi Mi Air 12.5"?


The Xiaomi Mi Air needs a Sunrise Point GPIO driver to do interrupts 
for the touchpad, so this workaround should help there too but the 
proper fix for that machine is to write a new driver.




Re: ihidev(4) polling

2017-11-16 Thread James Turner
On Thu, Nov 16, 2017 at 12:32:44PM -0600, joshua stein wrote:
> Now that the dwiic(4) PCI support is in, this implements polling 
> mode for ihidev to work around the problem of ioapic interrupts not 
> arriving for them after setup.  I've spent far too much time trying 
> to debug that problem (including much of my time at t2k17), so I 
> made this as a workaround until someone else fixes that issue.
> 
> The new PCI attachment of dwiic sets sc_poll_ihidev, which 
> dwiic_acpi_found_ihidev() checks for and does not set ia_intr, and
> this diff implements the other half which sets up a polling mode in 
> response to that.
> 
> It does slow polling by default to conserve power, and once movement 
> is detected on a device under ihidev (like imt(4)), it starts 
> polling quickly for a while.
> 
> It's not perfect but it's enough to make the touchpad work on these 
> machines and will hopefully be a temporary fix.
> 
> 

So I have to ask since when I hit you up on icb the above terms where
all mentioned. With this diff and what you just committed to src add
touchpad support to the Xiaomi Mi Air 12.5"?

Even if not, thanks for all your hard work on adding touchpad support to
these new devices.

-- 
James Turner



ihidev(4) polling

2017-11-16 Thread joshua stein
Now that the dwiic(4) PCI support is in, this implements polling 
mode for ihidev to work around the problem of ioapic interrupts not 
arriving for them after setup.  I've spent far too much time trying 
to debug that problem (including much of my time at t2k17), so I 
made this as a workaround until someone else fixes that issue.

The new PCI attachment of dwiic sets sc_poll_ihidev, which 
dwiic_acpi_found_ihidev() checks for and does not set ia_intr, and
this diff implements the other half which sets up a polling mode in 
response to that.

It does slow polling by default to conserve power, and once movement 
is detected on a device under ihidev (like imt(4)), it starts 
polling quickly for a while.

It's not perfect but it's enough to make the touchpad work on these 
machines and will hopefully be a temporary fix.


Index: sys/dev/i2c/ihidev.c
===
RCS file: /cvs/src/sys/dev/i2c/ihidev.c,v
retrieving revision 1.13
diff -u -p -u -p -r1.13 ihidev.c
--- sys/dev/i2c/ihidev.c8 Apr 2017 02:57:23 -   1.13
+++ sys/dev/i2c/ihidev.c16 Nov 2017 18:18:33 -
@@ -38,6 +38,9 @@
 #define DPRINTF(x)
 #endif
 
+#define SLOW_POLL_MS   200
+#define FAST_POLL_MS   10
+
 /* 7.2 */
 enum {
I2C_HID_CMD_DESCR   = 0x0,
@@ -70,6 +73,8 @@ int   ihidev_maxrepid(void *buf, int len);
 intihidev_print(void *aux, const char *pnp);
 intihidev_submatch(struct device *parent, void *cf, void *aux);
 
+extern int hz;
+
 struct cfattach ihidev_ca = {
sizeof(struct ihidev_softc),
ihidev_match,
@@ -108,15 +113,27 @@ ihidev_attach(struct device *parent, str
sc->sc_addr = ia->ia_addr;
sc->sc_hid_desc_addr = ia->ia_size;
 
-   if (ia->ia_intr)
-   printf(" %s", iic_intr_string(sc->sc_tag, ia->ia_intr));
-
if (ihidev_hid_command(sc, I2C_HID_CMD_DESCR, NULL) ||
ihidev_hid_desc_parse(sc)) {
printf(", failed fetching initial HID descriptor\n");
return;
}
 
+   if (ia->ia_intr) {
+   printf(" %s", iic_intr_string(sc->sc_tag, ia->ia_intr));
+
+   sc->sc_ih = iic_intr_establish(sc->sc_tag, ia->ia_intr,
+   IPL_TTY, ihidev_intr, sc, sc->sc_dev.dv_xname);
+   if (sc->sc_ih == NULL)
+   printf(", can't establish interrupt");
+   }
+
+   if (sc->sc_ih == NULL) {
+   printf(" (polling)");
+   sc->sc_poll = 1;
+   sc->sc_fastpoll = 1;
+   }
+
printf(", vendor 0x%x product 0x%x, %s\n",
letoh16(sc->hid_desc.wVendorID), letoh16(sc->hid_desc.wProductID),
(char *)ia->ia_cookie);
@@ -148,21 +165,12 @@ ihidev_attach(struct device *parent, str
if (isize > sc->sc_isize)
sc->sc_isize = isize;
 
-   DPRINTF(("%s: repid %d size %d\n", sc->sc_dev.dv_xname, repid,
-   repsz));
+   if (repsz != 0)
+   DPRINTF(("%s: repid %d size %d\n", sc->sc_dev.dv_xname,
+   repid, repsz));
}
sc->sc_ibuf = malloc(sc->sc_isize, M_DEVBUF, M_NOWAIT | M_ZERO);
 
-   /* register interrupt with system */
-   if (ia->ia_intr) {
-   sc->sc_ih = iic_intr_establish(sc->sc_tag, ia->ia_intr,
-   IPL_TTY, ihidev_intr, sc, sc->sc_dev.dv_xname);
-   if (sc->sc_ih == NULL) {
-   printf(", can't establish interrupt\n");
-   return;
-   }
-   }
-
iha.iaa = ia;
iha.parent = sc;
 
@@ -219,6 +227,20 @@ ihidev_detach(struct device *self, int f
return (0);
 }
 
+void
+ihidev_sleep(struct ihidev_softc *sc, int ms)
+{
+   int to = ms * hz / 1000;
+
+   if (cold)
+   delay(ms * 1000);
+   else {
+   if (to <= 0)
+   to = 1;
+   tsleep(, PWAIT, "ihidev", to);
+   }
+}
+
 int
 ihidev_hid_command(struct ihidev_softc *sc, int hidcmd, void *arg)
 {
@@ -363,7 +385,7 @@ ihidev_hid_command(struct ihidev_softc *
I2C_HID_CMD_SET_REPORT,
0, 0, 0, 0, 0, 0,
};
-   int cmdlen = 10;
+   int cmdlen = sizeof(cmd);
int report_id = rreq->id;
int report_len = 2 + (report_id ? 1 : 0) + rreq->len;
int dataoff;
@@ -377,7 +399,7 @@ ihidev_hid_command(struct ihidev_softc *
DPRINTF(("\n"));
 
/*
-* 7.2.2.4 - "The protocol is optimized for Report < 15.  If a
+* 7.2.3.4 - "The protocol is optimized for Report < 15.  If a
 * report ID >= 15 is necessary, then the Report ID in the Low
 * Byte must be set to  and a Third Byte is appended to the
 * protocol.  This Third Byte contains the 

Re: faster printf

2017-11-16 Thread Theo de Raadt
> Quick answer, more later:
> 
> Theo de Raadt wrote on Thu, Nov 16, 2017 at 09:52:39AM -0700:
> > Todd Miller wrote:
> 
> >> Also, POSIX isn't explicit as to whether that restriction applies
> >> to the format string or just the arguments to %lc and %ls conversions.
> >> 
> >> What it does say is:
> >> 
> >> The format is composed of zero or more directives: ordinary
> >> characters, which are simply copied to the output stream, and
> >> conversion specifications, each of which shall result in the
> >> fetching of zero or more arguments.
> 
> > Well that says the format string is a string, not a wide string.
> 
> There are three kinds of strings, not two.  You are confusing wide
> strings and multibyte strings.  It is certainly not a wide string.
> It is a multibyte string, that is what the use of the word "character"
> indicates.  If it were a byte string, it would talk about bytes
> instead, see for example how POSIX describes %s:
> 
>   The argument shall be a pointer to an array of char.  Bytes from
>   the array shall be written up to (but not including) any terminating
>   null byte.

Doesn't make sense to me.  It says

   which are simply copied to the output stream

What part of "simply copied" involves calling a function which makes
a decision and decides to error instead?


Have you found a single example which needs to do the check?  Surely
if this is important, there will be at least one in ports.

So I'm positive they should be copied out byte-for-byte.  Without a
check.  Simply can mean without additional labour. I also feel adding
-1 handling would only serve one purpose: Increasing fragility.

Let us recall that the source tree used to contain no checks for
sprintf/snprintf returning -1.  Only checks for < size.  Solaris was
the first system to do this -1 stuff, and it took 15 years to complete
the work of adding the clumsy checks to our tree.

15 years of nearly wasted work, I suspect.

> > I think EILSEQ and -1 are intended to apply entirely to failed
> > conversions,
> 
> That is not true.  For example, the function mblen(3) is specified
> to return EILSEQ, and it does so.  So EILSEQ is also used for
> validation even without conversion, even elsewhere.

The suggestion is that it should not be called.  If it is not called,
then it is irrelevant.

> > and these checks were mistakenly added to printf a while
> > ago.
> 
> The *printf() functions set EILSEQ in these cases since revision 1.1
> in 1995.

Yes, I already proposed that someone made a mistake a while ago.




Re: faster printf

2017-11-16 Thread Ingo Schwarze
Hi Theo,

Quick answer, more later:

Theo de Raadt wrote on Thu, Nov 16, 2017 at 09:52:39AM -0700:
> Todd Miller wrote:

>> Also, POSIX isn't explicit as to whether that restriction applies
>> to the format string or just the arguments to %lc and %ls conversions.
>> 
>> What it does say is:
>> 
>> The format is composed of zero or more directives: ordinary
>> characters, which are simply copied to the output stream, and
>> conversion specifications, each of which shall result in the
>> fetching of zero or more arguments.

> Well that says the format string is a string, not a wide string.

There are three kinds of strings, not two.  You are confusing wide
strings and multibyte strings.  It is certainly not a wide string.
It is a multibyte string, that is what the use of the word "character"
indicates.  If it were a byte string, it would talk about bytes
instead, see for example how POSIX describes %s:

  The argument shall be a pointer to an array of char.  Bytes from
  the array shall be written up to (but not including) any terminating
  null byte.


There isn't the slightest doubt that passing a format containing
non-UTF-8 bytes under a UTF-8 locale is invalid.  The only questions
are whether the standard says that is merely undefined, or whether
it requires failure.  If it requires failure, some think we should
ignore the standard; i say that isn't safe in this case (i'll explain
in more detail later why it isn't).  If it is undefined behaviour,
some seem to say it doesn't matter; i say failing closed is safer
even then.

> I think EILSEQ and -1 are intended to apply entirely to failed
> conversions,

That is not true.  For example, the function mblen(3) is specified
to return EILSEQ, and it does so.  So EILSEQ is also used for
validation even without conversion, even elsewhere.

> and these checks were mistakenly added to printf a while
> ago.

The *printf() functions set EILSEQ in these cases since revision 1.1
in 1995.

Yours,
  Ingo



Re: faster printf

2017-11-16 Thread Todd C. Miller
On Thu, 16 Nov 2017 09:52:39 -0700, "Theo de Raadt" wrote:

> > Also, POSIX isn't explicit as to whether that restriction applies
> > to the format string or just the arguments to %lc and %ls conversions.
> > 
> > What it does say is:
> > 
> > The format is composed of zero or more directives: ordinary
> > characters, which are simply copied to the output stream, and
> > conversion specifications, each of which shall result in the
> > fetching of zero or more arguments.
> 
> Well that says the format string is a string, not a wide string.
> 
> I think EILSEQ and -1 are intended to apply entirely to failed
> conversions, and these checks were mistakenly added to printf a while
> ago.

Yes, that's what I was trying to get across.

 - todd



Re: faster printf

2017-11-16 Thread Theo de Raadt
> Also, POSIX isn't explicit as to whether that restriction applies
> to the format string or just the arguments to %lc and %ls conversions.
> 
> What it does say is:
> 
> The format is composed of zero or more directives: ordinary
> characters, which are simply copied to the output stream, and
> conversion specifications, each of which shall result in the
> fetching of zero or more arguments.

Well that says the format string is a string, not a wide string.

I think EILSEQ and -1 are intended to apply entirely to failed
conversions, and these checks were mistakenly added to printf a while
ago.



Re: dwiic: add pci attachment

2017-11-16 Thread Mark Kettenis
> Date: Sat, 11 Nov 2017 08:19:28 -0600
> From: joshua stein 
> 
> Here is a new version of the dwiic patch that restores the 
> acpi_attach_deps call, confirmed working by Cesare Gargano.
> 
> Any other testers?

One nit, see below.  ok with that fixed

> Index: sys/dev/pci/dwiic_pci.c
> ===
> RCS file: sys/dev/pci/dwiic_pci.c
> diff -N sys/dev/pci/dwiic_pci.c
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ sys/dev/pci/dwiic_pci.c   10 Nov 2017 15:56:34 -
> @@ -0,0 +1,204 @@
> +/* $OpenBSD$ */
> +/*
> + * Synopsys DesignWare I2C controller
> + * PCI attachment
> + *
> + * Copyright (c) 2015-2017 joshua stein 
> + *
> + * Permission to use, copy, modify, and/or distribute this software for any
> + * purpose with or without fee is hereby granted, provided that the above
> + * copyright notice and this permission notice appear in all copies.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
> + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +/* 13.3: I2C Additional Registers Summary */
> +#define LPSS_RESETS  0x204
> +#define  LPSS_RESETS_I2C (1 << 0) | (1 << 1)
> +#define  LPSS_RESETS_IDMA(1 << 2)
> +#define LPSS_ACTIVELTR   0x210
> +#define LPSS_IDLELTR 0x214
> +#define LPSS_CAPS0x2fc
> +#define  LPSS_CAPS_NO_IDMA   (1 << 8)
> +#define  LPSS_CAPS_TYPE_SHIFT4
> +#define  LPSS_CAPS_TYPE_MASK (0xf << LPSS_CAPS_TYPE_SHIFT)
> +
> +int  dwiic_pci_match(struct device *, void *, void *);
> +void dwiic_pci_attach(struct device *, struct device *, void *);
> +int  dwiic_pci_activate(struct device *, int);
> +void dwiic_pci_bus_scan(struct device *,
> + struct i2cbus_attach_args *, void *);
> +
> +#include "acpi.h"
> +#if NACPI > 0
> +struct aml_node *acpi_pci_match(struct device *dev, struct pci_attach_args 
> *pa);
> +#endif
> +
> +struct cfattach dwiic_pci_ca = {
> + sizeof(struct dwiic_softc),
> + dwiic_pci_match,
> + dwiic_pci_attach,
> + NULL,
> + dwiic_pci_activate,
> +};
> +
> +const struct pci_matchid dwiic_pci_ids[] = {
> + { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_100SERIES_LP_I2C_1 },
> + { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_100SERIES_LP_I2C_2 },
> +};
> +
> +int
> +dwiic_pci_match(struct device *parent, void *match, void *aux)
> +{
> + return (pci_matchbyid(aux, dwiic_pci_ids, nitems(dwiic_pci_ids)));
> +}
> +
> +void
> +dwiic_pci_attach(struct device *parent, struct device *self, void *aux)
> +{
> + struct dwiic_softc *sc = (struct dwiic_softc *)self;
> + struct pci_attach_args *pa = aux;
> + bus_size_t iosize;
> + pci_intr_handle_t ih;
> + const char *intrstr = NULL;
> + uint8_t type;
> +
> + memcpy(>sc_paa, pa, sizeof(sc->sc_paa));
> +
> + pci_set_powerstate(pa->pa_pc, pa->pa_tag, PCI_PMCSR_STATE_D0);
> +
> + if (pci_mapreg_map(pa, PCI_MAPREG_START, PCI_MAPREG_MEM_TYPE_64BIT, 0,
> + >sc_iot, >sc_ioh, NULL, , 0)) {
> + printf(": can't map mem space\n");
> + return;
> + }
> +
> + sc->sc_caps = bus_space_read_4(sc->sc_iot, sc->sc_ioh, LPSS_CAPS);
> + type = sc->sc_caps & LPSS_CAPS_TYPE_MASK;
> + type >>= LPSS_CAPS_TYPE_SHIFT;
> + if (type != 0) {
> + printf(": type %d not supported\n", type);
> + return;
> + }
> +
> + /* un-reset - page 958 */
> + bus_space_write_4(sc->sc_iot, sc->sc_ioh, LPSS_RESETS,
> + (LPSS_RESETS_I2C | LPSS_RESETS_IDMA));
> +
> + /* fetch timing parameters */
> + sc->ss_hcnt = dwiic_read(sc, DW_IC_SS_SCL_HCNT);
> + sc->ss_lcnt = dwiic_read(sc, DW_IC_SS_SCL_LCNT);
> + sc->fs_hcnt = dwiic_read(sc, DW_IC_FS_SCL_HCNT);
> + sc->fs_lcnt = dwiic_read(sc, DW_IC_FS_SCL_LCNT);
> + sc->sda_hold_time = dwiic_read(sc, DW_IC_SDA_HOLD);
> +
> + if (dwiic_init(sc)) {
> + printf(": failed initializing\n");
> + return;
> + }
> +
> + /* leave the controller disabled */
> + dwiic_write(sc, DW_IC_INTR_MASK, 0);
> + dwiic_enable(sc, 0);
> + dwiic_read(sc, DW_IC_CLR_INTR);
> +
> + /* install interrupt handler */
> + sc->sc_poll = 1;
> + if (pci_intr_map(>sc_paa, ) == 0) {
> + intrstr = pci_intr_string(sc->sc_paa.pa_pc, ih);
> + sc->sc_ih = pci_intr_establish(sc->sc_paa.pa_pc, ih, 

Re: faster printf

2017-11-16 Thread Todd C. Miller
On Thu, 16 Nov 2017 16:19:52 +0100, Stefan Sperling wrote:

> I would expect EILSEQ during %lc and %ls conversions which explicitly
> expect wide characters as arguments, but not for arbitrary data that
> happens to be part of the format string.

It is worth noting that this restriction is a POSIX extension not
present in the C standard.

Also, POSIX isn't explicit as to whether that restriction applies
to the format string or just the arguments to %lc and %ls conversions.

What it does say is:

The format is composed of zero or more directives: ordinary
characters, which are simply copied to the output stream, and
conversion specifications, each of which shall result in the
fetching of zero or more arguments.

I don't think the proposed change makes us non-compliant.

 - todd



Re: faster printf

2017-11-16 Thread Stefan Sperling
On Thu, Nov 16, 2017 at 09:57:06AM -0500, Ted Unangst wrote:
> Ingo Schwarze wrote:
> > [EILSEQ]
> >   A wide-character code that does not correspond to a valid
> >   character has been detected.
> > 
> > That means that the functions are *required* to fail ("shall fail")
> > if encoding errors can be detected, that -1 must be returned, and
> > that errno must be set.
> 
> I understand what you're saying, but I think strictly following the standard
> is a bit silly here. This is a very poor way for the fastidious programmer to
> check for valid byte sequences. Also, nobody is actually that careful. I'm
> actually kind of surprised that such a check is performed. Usually the rule is
> that the caller is responsible for checking. strdup doesn't return an error
> for a NULL pointer. I would expect if I printf "\xc0\x80" that I will get the
> two bytes c0 80 as output.
> 
> There's some danger of slippery slope here, which parts of the standard are
> safe to ignore and which aren't. It's reasonable to argue it's best not to
> ignore anything, but I feel pretty good about ignoring this rule as an
> exception.
> 

I agree with Ted.

I would expect EILSEQ during %lc and %ls conversions which explicitly
expect wide characters as arguments, but not for arbitrary data that
happens to be part of the format string.



Re: faster printf

2017-11-16 Thread Ted Unangst
Ingo Schwarze wrote:
> [EILSEQ]
>   A wide-character code that does not correspond to a valid
>   character has been detected.
> 
> That means that the functions are *required* to fail ("shall fail")
> if encoding errors can be detected, that -1 must be returned, and
> that errno must be set.

I understand what you're saying, but I think strictly following the standard
is a bit silly here. This is a very poor way for the fastidious programmer to
check for valid byte sequences. Also, nobody is actually that careful. I'm
actually kind of surprised that such a check is performed. Usually the rule is
that the caller is responsible for checking. strdup doesn't return an error
for a NULL pointer. I would expect if I printf "\xc0\x80" that I will get the
two bytes c0 80 as output.

There's some danger of slippery slope here, which parts of the standard are
safe to ignore and which aren't. It's reasonable to argue it's best not to
ignore anything, but I feel pretty good about ignoring this rule as an
exception.



carp: address update hook gone after interface detach

2017-11-16 Thread Patrick Wildt
Hi,

when we detach the interface from the carp the hook that is called
when an address is updated is disestablished.  But it never again
gets re-established, which makes the carp interface unusable.  This
happens for instance if you run carp on trunk and destroy the trunk
interface.

This diff seems to fix it for me, but I'm not sure what kind of re-
purcussions there are if we keep the hook without an interface.

What do you think?

Patrick

diff --git a/sys/netinet/ip_carp.c b/sys/netinet/ip_carp.c
index 649c7501798..074307ae79c 100644
--- a/sys/netinet/ip_carp.c
+++ b/sys/netinet/ip_carp.c
@@ -878,6 +878,8 @@ carp_clone_destroy(struct ifnet *ifp)
 
NET_LOCK();
carpdetach(sc);
+   if (sc->ah_cookie != NULL)
+   hook_disestablish(sc->sc_if.if_addrhooks, sc->ah_cookie);
NET_UNLOCK();
 
ether_ifdetach(ifp);
@@ -920,9 +922,6 @@ carpdetach(struct carp_softc *sc)
carp_setrun_all(sc, 0);
carp_multicast_cleanup(sc);
 
-   if (sc->ah_cookie != NULL)
-   hook_disestablish(sc->sc_if.if_addrhooks, sc->ah_cookie);
-
ifp0 = sc->sc_carpdev;
if (ifp0 == NULL)
return;



Re: add an iqdrops view to systat to show interface queue drops

2017-11-16 Thread Theo de Raadt
> Now ifstat and iqdrops show almost the same information.  Just two
> commns are different, eight culumns are redundant.  I think one
> page would be better.  Do we need DESC?  It takes a lot of space
> that could be used for output of dynamic counters.

I use DESC every single day.



Re: add an iqdrops view to systat to show interface queue drops

2017-11-16 Thread Alexander Bluhm
On Wed, Nov 15, 2017 at 01:29:42PM +1000, David Gwynne wrote:
> im adding numbers to input and output qdrops in the kernel, so im
> aware that they exist now. however, i don't really see these values
> in userland. it seems netstat and systat think errors are more
> important.
> 
> i tried adding qdrops to the ifstat view, but it got too cluttered.
> so i made a new view called iqdrops that shows qdrops instead of
> errors. i used iqdrops instead of ifqdrops so "if" on its own is
> still not ambiguous.

Now ifstat and iqdrops show almost the same information.  Just two
commns are different, eight culumns are redundant.  I think one
page would be better.  Do we need DESC?  It takes a lot of space
that could be used for output of dynamic counters.

bluhm



Re: netintro.4: sync struct ifreq with if.h

2017-11-16 Thread Sebastian Benoit
ok

Theo Buehler(t...@theobuehler.org) on 2017.11.16 12:12:55 +0100:
> ifr_vnetid is now a proper member of struct ifreq and is no longer
> overloaded with ifr_metric.  Moreover, ifr_index and ifr_llprio were
> missing and mandoc -Tlint complained about a "useless macro: Tn".
> 
> ok?
> 
> Index: share/man/man4/netintro.4
> ===
> RCS file: /var/cvs/src/share/man/man4/netintro.4,v
> retrieving revision 1.49
> diff -u -p -r1.49 netintro.4
> --- share/man/man4/netintro.4 11 Sep 2015 13:04:05 -  1.49
> +++ share/man/man4/netintro.4 16 Nov 2017 11:04:12 -
> @@ -125,9 +125,7 @@ The system currently supports the
>  Internet protocols (IPv4 and IPv6),
>  MPLS,
>  and a few others.
> -Raw socket interfaces are provided to the
> -.Tn IP
> -protocol
> +Raw socket interfaces are provided to the IP protocol
>  layer of the
>  Internet.
>  Consult the appropriate manual pages in this section for more
> @@ -215,8 +213,10 @@ struct   ifreq {
>   struct  sockaddrifru_broadaddr;
>   short   ifru_flags;
>   int ifru_metric;
> + int64_t ifru_vnetid;
>   uint64_tifru_media;
>   caddr_t ifru_data;
> + unsigned intifru_index;
>   } ifr_ifru;
>  #define  ifr_addrifr_ifru.ifru_addr  /* address */
>  #define  ifr_dstaddr ifr_ifru.ifru_dstaddr   /* other end of p-to-p 
> link */
> @@ -227,9 +227,11 @@ struct   ifreq {
>  #define  ifr_hardmtu ifr_ifru.ifru_metric/* hardmtu (overload) */
>  #define  ifr_media   ifr_ifru.ifru_media /* media options */
>  #define  ifr_rdomainid   ifr_ifru.ifru_metric/* VRF instance 
> (overload) */
> -#define  ifr_vnetid  ifr_ifru.ifru_metric/* Virtual Net Id 
> (overload) */
> +#define  ifr_vnetid  ifr_ifru.ifru_vnetid/* Virtual Net Id */
>  #define  ifr_ttl ifr_ifru.ifru_metric/* tunnel TTL 
> (overload) */
>  #define  ifr_dataifr_ifru.ifru_data  /* for use by interface 
> */
> +#define  ifr_index   ifr_ifru.ifru_index /* interface index */
> +#define  ifr_llprio  ifr_ifru.ifru_metric/* link layer priority 
> */
>  };
>  .Ed
>  .Pp
> 



Re: maximum size of http headers in relayd

2017-11-16 Thread Sebastian Benoit
Alexander Bluhm(alexander.bl...@gmx.net) on 2017.11.16 14:05:43 +0100:
> On Wed, Nov 15, 2017 at 08:15:15PM +0100, Sebastian Benoit wrote:
> > Thanks, here is a diff on top of the last to check that.
> > 
> > If you manage to set the default headerlen on non http protocols, it does
> > not catch that, but i dont want to add another variable for that corner
> > case.
> 
> I think it is simpler to check when we parse the http options.

yes you are right, the type is already set at that point.

ok benno
 
> bluhm
> 
> Index: parse.y
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/relayd/parse.y,v
> retrieving revision 1.217
> diff -u -p -r1.217 parse.y
> --- parse.y   15 Nov 2017 19:03:26 -  1.217
> +++ parse.y   16 Nov 2017 12:57:46 -
> @@ -1055,6 +1055,11 @@ httpflags_l: httpflags comma httpflags_
>   ;
>  
>  httpflags: HEADERLEN NUMBER  {
> + if (proto->type != RELAY_PROTO_HTTP) {
> + yyerror("can set http options only for "
> + "http protocol");
> + YYERROR;
> + }
>   if ($2 < 0 || $2 > RELAY_MAXHEADERLENGTH) {
>   yyerror("invalid headerlen: %d", $2);
>   YYERROR;
> 



Re: maximum size of http headers in relayd

2017-11-16 Thread Alexander Bluhm
On Wed, Nov 15, 2017 at 08:15:15PM +0100, Sebastian Benoit wrote:
> Thanks, here is a diff on top of the last to check that.
> 
> If you manage to set the default headerlen on non http protocols, it does
> not catch that, but i dont want to add another variable for that corner
> case.

I think it is simpler to check when we parse the http options.

bluhm

Index: parse.y
===
RCS file: /data/mirror/openbsd/cvs/src/usr.sbin/relayd/parse.y,v
retrieving revision 1.217
diff -u -p -r1.217 parse.y
--- parse.y 15 Nov 2017 19:03:26 -  1.217
+++ parse.y 16 Nov 2017 12:57:46 -
@@ -1055,6 +1055,11 @@ httpflags_l  : httpflags comma httpflags_
;
 
 httpflags  : HEADERLEN NUMBER  {
+   if (proto->type != RELAY_PROTO_HTTP) {
+   yyerror("can set http options only for "
+   "http protocol");
+   YYERROR;
+   }
if ($2 < 0 || $2 > RELAY_MAXHEADERLENGTH) {
yyerror("invalid headerlen: %d", $2);
YYERROR;



netintro.4: sync struct ifreq with if.h

2017-11-16 Thread Theo Buehler
ifr_vnetid is now a proper member of struct ifreq and is no longer
overloaded with ifr_metric.  Moreover, ifr_index and ifr_llprio were
missing and mandoc -Tlint complained about a "useless macro: Tn".

ok?

Index: share/man/man4/netintro.4
===
RCS file: /var/cvs/src/share/man/man4/netintro.4,v
retrieving revision 1.49
diff -u -p -r1.49 netintro.4
--- share/man/man4/netintro.4   11 Sep 2015 13:04:05 -  1.49
+++ share/man/man4/netintro.4   16 Nov 2017 11:04:12 -
@@ -125,9 +125,7 @@ The system currently supports the
 Internet protocols (IPv4 and IPv6),
 MPLS,
 and a few others.
-Raw socket interfaces are provided to the
-.Tn IP
-protocol
+Raw socket interfaces are provided to the IP protocol
 layer of the
 Internet.
 Consult the appropriate manual pages in this section for more
@@ -215,8 +213,10 @@ struct ifreq {
struct  sockaddrifru_broadaddr;
short   ifru_flags;
int ifru_metric;
+   int64_t ifru_vnetid;
uint64_tifru_media;
caddr_t ifru_data;
+   unsigned intifru_index;
} ifr_ifru;
 #defineifr_addrifr_ifru.ifru_addr  /* address */
 #defineifr_dstaddr ifr_ifru.ifru_dstaddr   /* other end of p-to-p 
link */
@@ -227,9 +227,11 @@ struct ifreq {
 #defineifr_hardmtu ifr_ifru.ifru_metric/* hardmtu (overload) */
 #defineifr_media   ifr_ifru.ifru_media /* media options */
 #defineifr_rdomainid   ifr_ifru.ifru_metric/* VRF instance 
(overload) */
-#defineifr_vnetid  ifr_ifru.ifru_metric/* Virtual Net Id 
(overload) */
+#defineifr_vnetid  ifr_ifru.ifru_vnetid/* Virtual Net Id */
 #defineifr_ttl ifr_ifru.ifru_metric/* tunnel TTL 
(overload) */
 #defineifr_dataifr_ifru.ifru_data  /* for use by interface 
*/
+#defineifr_index   ifr_ifru.ifru_index /* interface index */
+#defineifr_llprio  ifr_ifru.ifru_metric/* link layer priority 
*/
 };
 .Ed
 .Pp