Re: systat(1) counter overflow

2021-07-13 Thread Anindya Mukherjee
On Sat, Jul 03, 2021 at 11:20:42AM +0100, Stuart Henderson wrote:
> On 2021/07/03 01:09, Anindya Mukherjee wrote:
> > Thanks for the discussion. This has been very illuminating. I have been 
> > digging
> > around in /usr/src/ and ignoring the atomic architectures (where I got 
> > stuck) it
> > looks like it should be possible to use uint64_t everywhere. I'm playing 
> > with
> > some changes on my machine to see if I can get at least systat(1) and 
> > vmstat(8)
> > to work with uint64_t. The ddb printer (uvmexp_print)is another consumer.
> > 
> > If it works in the base system then ideally every relevant port should be
> > updated to be consistent. That is indeed quite a big change; more than I
> > realised so thanks for setting me straight on that.
> 
> We have coped with bigger changes in structs like this before,
> it didn't used to be too difficult, but that was before go...
> 

Hi,

I have been running for a week with the following diff. This is just a POC and
hence there are a few ugly hacks. So far top(1), systat(1), and vmstat(8) seem
to be happy. I haven't hit the 32-bit overflow point for any counters yet but
the counts look right. I have completely ignored ports, but it looks like the
base system can run with this change. This was mostly to satisfy my curiosity.

Regards,
Anindya

? usr.bin/systat/vim_session
Index: sys/arch/amd64/include/atomic.h
===
RCS file: /cvs/src/sys/arch/amd64/include/atomic.h,v
retrieving revision 1.21
diff -u -p -r1.21 atomic.h
--- sys/arch/amd64/include/atomic.h 11 Mar 2021 11:16:55 -  1.21
+++ sys/arch/amd64/include/atomic.h 13 Jul 2021 07:42:51 -
@@ -150,6 +150,14 @@ _atomic_inc_long(volatile unsigned long 
 #define atomic_inc_long(_p) _atomic_inc_long(_p)
 
 static inline void
+_atomic_inc_uint64(volatile uint64_t *p)
+{
+   __asm volatile(_LOCK " incq %0"
+   : "+m" (*p));
+}
+#define atomic_inc_uint64(_p) _atomic_inc_uint64(_p)
+
+static inline void
 _atomic_dec_int(volatile unsigned int *p)
 {
__asm volatile(_LOCK " decl %0"
@@ -166,6 +174,14 @@ _atomic_dec_long(volatile unsigned long 
 #define atomic_dec_long(_p) _atomic_dec_long(_p)
 
 static inline void
+_atomic_dec_uint64(volatile uint64_t *p)
+{
+   __asm volatile(_LOCK " decq %0"
+   : "+m" (*p));
+}
+#define atomic_dec_uint64(_p) _atomic_dec_uint64(_p)
+
+static inline void
 _atomic_add_int(volatile unsigned int *p, unsigned int v)
 {
__asm volatile(_LOCK " addl %1,%0"
@@ -182,6 +198,15 @@ _atomic_add_long(volatile unsigned long 
: "a" (v));
 }
 #define atomic_add_long(_p, _v) _atomic_add_long(_p, _v)
+
+static inline void
+_atomic_add_uint64(volatile uint64_t *p, uint64_t v)
+{
+   __asm volatile(_LOCK " addq %1,%0"
+   : "+m" (*p)
+   : "a" (v));
+}
+#define atomic_add_uint64(_p, _v) _atomic_add_uint64(_p, _v)
 
 static inline void
 _atomic_sub_int(volatile unsigned int *p, unsigned int v)
Index: sys/sys/sysctl.h
===
RCS file: /cvs/src/sys/sys/sysctl.h,v
retrieving revision 1.218
diff -u -p -r1.218 sysctl.h
--- sys/sys/sysctl.h17 May 2021 17:54:31 -  1.218
+++ sys/sys/sysctl.h13 Jul 2021 07:42:52 -
@@ -38,7 +38,8 @@
 #ifndef _SYS_SYSCTL_H_
 #define_SYS_SYSCTL_H_
 
-#include 
+/*#include */
+#include "/usr/src/sys/uvm/uvmexp.h"
 
 /*
  * Definitions for sysctl call.  The sysctl call uses a hierarchical name
Index: sys/uvm/uvm_anon.c
===
RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
retrieving revision 1.54
diff -u -p -r1.54 uvm_anon.c
--- sys/uvm/uvm_anon.c  26 Mar 2021 13:40:05 -  1.54
+++ sys/uvm/uvm_anon.c  13 Jul 2021 07:42:52 -
@@ -119,7 +119,7 @@ uvm_anfree_list(struct vm_anon *anon, st
if (anon->an_swslot != 0) {
/* This page is no longer only in swap. */
KASSERT(uvmexp.swpgonly > 0);
-   atomic_dec_int(&uvmexp.swpgonly);
+   atomic_dec_uint64(&uvmexp.swpgonly);
}
}
anon->an_lock = NULL;
Index: sys/uvm/uvm_aobj.c
===
RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v
retrieving revision 1.99
diff -u -p -r1.99 uvm_aobj.c
--- sys/uvm/uvm_aobj.c  28 Jun 2021 11:19:01 -  1.99
+++ sys/uvm/uvm_aobj.c  13 Jul 2021 07:42:52 -
@@ -1516,6 +1516,6 @@ uao_dropswap_range(struct uvm_object *uo
 */
if (swpgonlydelta > 0) {
KASSERT(uvmexp.swpgonly >= swpgonlydelta);
-   atomic_add_int(&uvmexp.swpgonly, -swpgonlydelta);
+   atomic_add_uint64(&uvmexp.swpgonly, -swpgonlydelta);
}
 }
Index: sys/uvm/uvm_km.c
===
RCS file: /cvs/src/sys/uvm/uvm_km.c,v
retrieving revision 1.145
diff 

Re: ipsec: remove unused `PolicyHead' from 'sockaddr_encap' structure

2021-07-13 Thread Vitaliy Makkoveev
On Mon, Jul 12, 2021 at 01:47:25PM +0200, Tobias Heider wrote:
> On Sun, Jul 11, 2021 at 05:33:18AM +0300, Vitaliy Makkoveev wrote:
> > This member is never set or used. Also I kept 'SENT_IP6' definition for
> > prevent the potential break of third party software. Is it ok to
> > redefine it to '0x0002'? At least openswan wants this [1].
> > 
> > 1. 
> > https://github.com/xelerance/Openswan/blob/master/include/openswan/ipsec_encap.h#L20
> 
> I wouldn't worry about third party software. Those defines are within
> #ifdef _KERNEL so everyone using those is doing it wrong anyway.
> 

'sockaddr_encap' structure defined outside _KERNEL boundaries but some
related definitions like 'sen_*' and 'SENT_*' are defined within. This
makes 'sockaddr_encap' definition inconsistent. We don't use this
structure outside kernel, so it looks like it should not be visible to
userland too.

> ok with SENT_IP6 changed to 0x0002.
> 

Commited, thanks.



bgpctl add support for RFC8050 (add-path support for MRT parser)

2021-07-13 Thread Claudio Jeker
This diff adds support to read MRT files using the new introduced _ADDPATH
types as defined in RFC8050. I also started adding MRT support to bgpd but
that depends on ADD-PATH itself.

There are a few gotchas, especially the MRT_DUMP_V2 RIB_GENERIC_ADDPATH
handling is different from all other RIB entry handling. This is a major
pain point for bgpd less so for the bgpctl parser.

Some MRT update dumps that can be downloaded and use ADDPATH do actually
use the BGP4MP_MESSAGE _ADDPATH variant for non-addpath enabled sessions.
The update messages can not be parsed because the NLRI encoding is incorrect.

I tested with a few RIB and UPDATE dumps from RIS, route-views and other
open collectors and it works for me.
-- 
:wq Claudio

Index: usr.sbin/bgpctl/bgpctl.c
===
RCS file: /cvs/src/usr.sbin/bgpctl/bgpctl.c,v
retrieving revision 1.269
diff -u -p -r1.269 bgpctl.c
--- usr.sbin/bgpctl/bgpctl.c16 Jun 2021 16:24:11 -  1.269
+++ usr.sbin/bgpctl/bgpctl.c13 Jul 2021 13:20:51 -
@@ -470,7 +470,7 @@ show(struct imsg *imsg, struct parse_res
warnx("bad IMSG_CTL_SHOW_RIB_ATTR received");
break;
}
-   output->attr(imsg->data, ilen, res->flags);
+   output->attr(imsg->data, ilen, res->flags, 0);
break;
case IMSG_CTL_SHOW_RIB_MEM:
if (imsg->hdr.len < IMSG_HEADER_SIZE + sizeof(stats))
@@ -1150,6 +1150,10 @@ show_mrt_dump(struct mrt_rib *mr, struct
ctl.local_pref = mre->local_pref;
ctl.med = mre->med;
/* weight is not part of the mrt dump so it can't be set */
+   if (mr->add_path) {
+   ctl.flags |= F_PREF_PATH_ID;
+   ctl.path_id = mre->path_id;
+   }
 
if (mre->peer_idx < mp->npeers) {
ctl.remote_addr = mp->peers[mre->peer_idx].addr;
@@ -1195,7 +1199,7 @@ show_mrt_dump(struct mrt_rib *mr, struct
if (req->flags & F_CTL_DETAIL) {
for (j = 0; j < mre->nattrs; j++)
output->attr(mre->attrs[j].attr,
-   mre->attrs[j].attr_len, req->flags);
+   mre->attrs[j].attr_len, req->flags, 0);
}
}
 }
@@ -1211,6 +1215,10 @@ network_mrt_dump(struct mrt_rib *mr, str
time_t   now;
u_int16_ti, j;
 
+   /* can't announce more than one path so ignore add-path */
+   if (mr->add_path)
+   return;
+
now = time(NULL);
for (i = 0; i < mr->nentries; i++) {
mre = &mr->entries[i];
@@ -1586,10 +1594,11 @@ show_mrt_notification(u_char *p, u_int16
 
 /* XXX this function does not handle JSON output */
 static void
-show_mrt_update(u_char *p, u_int16_t len, int reqflags)
+show_mrt_update(u_char *p, u_int16_t len, int reqflags, int addpath)
 {
struct bgpd_addr prefix;
int pos;
+   u_int32_t pathid;
u_int16_t wlen, alen;
u_int8_t prefixlen;
 
@@ -1609,12 +1618,25 @@ show_mrt_update(u_char *p, u_int16_t len
if (wlen > 0) {
printf("\n Withdrawn prefixes:");
while (wlen > 0) {
+   if (addpath) {
+   if (wlen <= sizeof(pathid)) {
+   printf("bad withdraw prefix");
+   return;
+   }
+   memcpy(&pathid, p, sizeof(pathid));
+   pathid = ntohl(pathid);
+   p += sizeof(pathid);
+   len -= sizeof(pathid);
+   wlen -= sizeof(pathid);
+   }
if ((pos = nlri_get_prefix(p, wlen, &prefix,
&prefixlen)) == -1) {
printf("bad withdraw prefix");
return;
}
printf(" %s/%u", log_addr(&prefix), prefixlen);
+   if (addpath)
+   printf(" path-id %u", pathid);
p += pos;
len -= pos;
wlen -= pos;
@@ -1655,7 +1677,7 @@ show_mrt_update(u_char *p, u_int16_t len
attrlen += 1 + 2;
}
 
-   output->attr(p, attrlen, reqflags);
+   output->attr(p, attrlen, reqflags, addpath);
p += attrlen;
alen -= attrlen;
len -= attrlen;
@@ -1664,12 +1686,24 @@ show_mrt_update(u_char *p, u_int16_t len
if (len > 0) {
printf("NLRI prefixes:");
while (len > 0) {
+ 

rsync getopt_long cleanup

2021-07-13 Thread Claudio Jeker
I never really liked the getopt_long definitions in rsync. Too much magic
and chaos.

This moves the table out of main to gain some more space and to make it a
proper read-only object. Because of this struct opts also needs to become
a global but that is OK.

Clean up the required_argument options that have no short from. Instead of
small numbers use some defines and make the values larger than any char
value (I chose 1000 and up).

Fix --no-motd, it is just a flag setting a value. So just use the
getopt_long() method for doing that.

Sort the options alphabetically with the exception of no-XYZ options which
I added below the XYZ option itself.

IMO the result is better than what was there before.
-- 
:wq Claudio

Index: main.c
===
RCS file: /cvs/src/usr.bin/rsync/main.c,v
retrieving revision 1.55
diff -u -p -r1.55 main.c
--- main.c  30 Jun 2021 13:10:04 -  1.55
+++ main.c  13 Jul 2021 17:54:13 -
@@ -269,61 +269,67 @@ fargs_parse(size_t argc, char *argv[], s
return f;
 }
 
+static struct opts  opts;
+
+#define OP_ADDRESS 1000
+#define OP_PORT1001
+#define OP_RSYNCPATH   1002
+#define OP_TIMEOUT 1003
+#define OP_VERSION 1004
+
+const struct option lopts[] = {
+{ "address",   required_argument, NULL,OP_ADDRESS },
+{ "archive",   no_argument,NULL,   'a' },
+{ "compress",  no_argument,NULL,   'z' },
+{ "del",   no_argument,&opts.del,  1 },
+{ "delete",no_argument,&opts.del,  1 },
+{ "devices",   no_argument,&opts.devices,  1 },
+{ "no-devices",no_argument,&opts.devices,  0 },
+{ "dry-run",   no_argument,&opts.dry_run,  1 },
+{ "group", no_argument,&opts.preserve_gids,1 },
+{ "no-group",  no_argument,&opts.preserve_gids,0 },
+{ "help",  no_argument,NULL,   'h' },
+{ "links", no_argument,&opts.preserve_links,   1 },
+{ "no-links",  no_argument,&opts.preserve_links,   0 },
+{ "no-motd",   no_argument,&opts.no_motd,  1 },
+{ "numeric-ids",   no_argument,&opts.numeric_ids,  1 },
+{ "owner", no_argument,&opts.preserve_uids,1 },
+{ "no-owner",  no_argument,&opts.preserve_uids,0 },
+{ "perms", no_argument,&opts.preserve_perms,   1 },
+{ "no-perms",  no_argument,&opts.preserve_perms,   0 },
+{ "port",  required_argument, NULL,OP_PORT },
+{ "recursive", no_argument,&opts.recursive,1 },
+{ "no-recursive",  no_argument,&opts.recursive,0 },
+{ "rsh",   required_argument, NULL,'e' },
+{ "rsync-path",required_argument, NULL,OP_RSYNCPATH },
+{ "sender",no_argument,&opts.sender,   1 },
+{ "server",no_argument,&opts.server,   1 },
+{ "specials",  no_argument,&opts.specials, 1 },
+{ "no-specials",   no_argument,&opts.specials, 0 },
+{ "timeout",   required_argument, NULL,OP_TIMEOUT },
+{ "times", no_argument,&opts.preserve_times,   1 },
+{ "no-times",  no_argument,&opts.preserve_times,   0 },
+{ "verbose",   no_argument,&verbose,   1 },
+{ "no-verbose",no_argument,&verbose,   0 },
+{ "version",   no_argument,NULL,   OP_VERSION },
+{ NULL,0,  NULL,   0 }
+};
+
 int
 main(int argc, char *argv[])
 {
-   struct opts  opts;
pid_tchild;
int  fds[2], sd = -1, rc, c, st, i;
struct sess   sess;
struct fargs*fargs;
char**args;
const char  *errstr;
-   const struct option  lopts[] = {
-   { "port",   required_argument, NULL,3 },
-   { "rsh",required_argument, NULL,'e' },
-   { "rsync-path", required_argument, NULL,1 },
-   { "sender", no_argument,&opts.sender,   1 },
-   { "server", no_argument,&opts.server,   1 },
-   { "dry-run",no_argument,&opts.dry_run,  1 },
-   { "version",no_argument,NULL,   2 },
-   { "archive",no_argument,NULL,   'a' },
-   { "help",   no_argument,NULL,   'h' },
-   { "compress",   no_argument,NULL,   'z' },
-   { "del",no_argument,&opts.del,  1 },
-   

Re: WIP iwx(4) Tx aggregation

2021-07-13 Thread Hrvoje Popovski
On 30.6.2021. 13:28, Stefan Sperling wrote:
> On Mon, Jun 21, 2021 at 08:37:11PM +0200, Stefan Sperling wrote:
>> This patch attempts to implement Tx aggregation support for iwx(4).
>>
>> It is not yet ready to be committed because of outstanding problems:
>>
>> - Under load the firmware throws a fatal firmware error every few minutes.
>>
>> - Starting a background scan under load can cause firmware errors and
>>   might error out when the driver attempts to flush Tx queues.
>>   However, roaming seems to be generally working while traffic is light.
>>
>> - Sometimes traffic seems to get stuck for no apparent reason and the driver
>>   won't recover without down/up. This is independent from the rx_reorder()
>>   fix which was committed today.
> Following my commits to iwx from today, here is a rebased txagg patch.
> The above issues are still present, unfortunately.


Hi,

without this diff i'm getting 25/5 Mb and with that diff 100/75Mb ..

Thank you ..

e14gen2# ifconfig iwx0
iwx0: flags=8847 mtu 1500
lladdr b0:7d:64:1e:60:e0
index 2 priority 4 llprio 3
groups: wlan egress
media: IEEE802.11 autoselect (HT-MCS3 mode 11n)
status: active
ieee80211: join hrvojewless chan 2 bssid 30:e9:8e:ad:90:3c 67%
wpakey wpaprotos wpa2 wpaakms psk wpaciphers ccmp wpagroupcipher ccmp
inet 192.168.100.2 netmask 0xff00 broadcast 192.168.100.255



OpenBSD 6.9-current (GENERIC.MP) #2: Tue Jul 13 22:10:50 CEST 2021
hrv...@e14gen2.srce.hr:/sys/arch/amd64/compile/GENERIC.MP
real mem = 7742496768 (7383MB)
avail mem = 7491899392 (7144MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.2 @ 0xbf913000 (63 entries)
bios0: vendor LENOVO version "R1AET37W (1.13 )" date 04/09/2021
bios0: LENOVO 20T6000TSC
acpi0 at bios0: ACPI 6.3
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT SSDT IVRS SSDT SSDT TPM2 SSDT MSDM BATB
HPET APIC MCFG SBST WSMT VFCT SSDT CRAT CDIT FPDT SSDT SSDT SSDT BGRT
UEFI SSDT SSDT
acpi0: wakeup devices GPP3(S3) GPP4(S4) GPP5(S3) XHC0(S3) XHC1(S3)
GP19(S3) LID_(S4) SLPB(S3)
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpihpet0 at acpi0: 14318180 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Ryzen 5 4500U with Radeon Graphics, 2370.88 MHz, 17-60-01
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 32KB 64b/line 8-way I-cache, 32KB 64b/line 8-way D-cache, 512KB
64b/line 8-way L2 cache
cpu0: ITLB 64 4KB entries fully associative, 64 4MB entries fully
associative
cpu0: DTLB 64 4KB entries fully associative, 64 4MB entries fully
associative
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=1.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD Ryzen 5 4500U with Radeon Graphics, 2370.55 MHz, 17-60-01
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu1: 32KB 64b/line 8-way I-cache, 32KB 64b/line 8-way D-cache, 512KB
64b/line 8-way L2 cache
cpu1: ITLB 64 4KB entries fully associative, 64 4MB entries fully
associative
cpu1: DTLB 64 4KB entries fully associative, 64 4MB entries fully
associative
cpu1: disabling user TSC (skew=-189)
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: AMD Ryzen 5 4500U with Radeon Graphics, 2370.55 MHz, 17-60-01
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu2: 32KB 64b/line 8-way I-cache, 32KB 64b/line 8-way D-cache, 512KB
64b/line 8-way L2 cache
cpu2: ITLB 64 4KB entries fully associative, 64 4MB entries fully
associative
cpu2: DTLB 64 4KB entries fully associative, 64 4MB entries fully

Re: dwiic(4): wait for tx empty when hitting tx limit

2021-07-13 Thread Patrick Wildt
Am Mon, Jul 05, 2021 at 07:52:28PM +0200 schrieb Mark Kettenis:
> > Date: Mon, 5 Jul 2021 19:30:28 +0200
> > From: Patrick Wildt 
> > 
> > Am Mon, Jul 05, 2021 at 07:07:24PM +0200 schrieb Mark Kettenis:
> > > > Date: Mon, 5 Jul 2021 19:02:32 +0200
> > > > From: Patrick Wildt 
> > > > 
> > > > Am Mon, Jul 05, 2021 at 06:34:31PM +0200 schrieb Mark Kettenis:
> > > > > > Date: Mon, 5 Jul 2021 00:04:24 +0200
> > > > > > From: Patrick Wildt 
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > I had trouble interfacing with a machine's IPMI through dwiic(4).  
> > > > > > What
> > > > > > I saw was that when sending 'bigger' commands, it would never 
> > > > > > receive
> > > > > > the STOP bit interrupt.
> > > > > > 
> > > > > > The trouble is, as can be seen in the log, that we want to send (it
> > > > > > says read, but it's a write OP, so it's send) 20 bytes, but the tx
> > > > > > limit says 14.
> > > > > > 
> > > > > > What we should do is send 14 bytes, then wait for it to send us the
> > > > > > tx empty interrupt (like we do when we first enable the controller),
> > > > > > and then re-read the tx limit.  The last line in the log is some
> > > > > > debug print I added for myself, but is not part of the diff.
> > > > > > 
> > > > > > With this, I was finally able to change the IPMI password and regain
> > > > > > access to the web interface after updating the BMC's firmware...
> > > > > > 
> > > > > > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 3, flags 0x00
> > > > > > dwiic0: dwiic_i2c_exec: need to read 3 bytes, can send 14 read reqs
> > > > > > dwiic0: dwiic_i2c_exec: op 5, addr 0x10, cmdlen 1, len 33, flags 
> > > > > > 0x00
> > > > > > dwiic0: dwiic_i2c_exec: need to read 33 bytes, can send 15 read reqs
> > > > > > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 20, flags 
> > > > > > 0x00
> > > > > > dwiic0: dwiic_i2c_exec: need to read 20 bytes, can send 14 read reqs
> > > > > > dwiic0: new tx limit 8
> > > > > > 
> > > > > > Opinions? ok?
> > > > > 
> > > > > I think you're on to something.  But this needs to handle I2C_F_POLL.
> > > > 
> > > > True that.  The previous code, which waits for the controller to accept
> > > > commands, just does DELAY(200), but I'm not sure that's good enough for
> > > > inbetween transfers.  One can apparently though poll through the raw
> > > > interrupt status register, where the interrupt mask isn't applied.  So
> > > > maybe like that?  Guess I should try setting ipmi to polling mode...
> > > 
> > > Polling the interrupt status register should work I suppose.  But for
> > > read operations we actually poll the DW_IC_RXFLR register.
> > 
> > Yeah, that would work for TX as well.  Maybe something like this, but
> > then the diff still needs to address what happens when we timeout and
> > there's still no tx_limit > 0.  Maybe timeout like the read stuff:
> > 
> > if (rx_avail == 0) {
> > printf("%s: timed out reading remaining %d\n",
> > sc->sc_dev.dv_xname, (int)(len - readpos));
> > sc->sc_i2c_xfer.error = 1;
> > sc->sc_busy = 0;
> > 
> > return (1);
> > }
> 
> Yes.

This works for me. ok?

Patrick

diff --git a/sys/dev/ic/dwiic.c b/sys/dev/ic/dwiic.c
index 84d97b8645b..8588b0905ea 100644
--- a/sys/dev/ic/dwiic.c
+++ b/sys/dev/ic/dwiic.c
@@ -416,6 +416,42 @@ dwiic_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t addr, 
const void *cmdbuf,
tx_limit = sc->tx_fifo_depth -
dwiic_read(sc, DW_IC_TXFLR);
}
+
+   if (I2C_OP_WRITE_P(op) && tx_limit == 0 && x < len) {
+   if (flags & I2C_F_POLL) {
+   for (retries = 1000; retries > 0; retries--) {
+   tx_limit = sc->tx_fifo_depth -
+   dwiic_read(sc, DW_IC_TXFLR);
+   if (tx_limit > 0)
+   break;
+   DELAY(50);
+   }
+   } else {
+   s = splbio();
+   dwiic_read(sc, DW_IC_CLR_INTR);
+   dwiic_write(sc, DW_IC_INTR_MASK,
+   DW_IC_INTR_TX_EMPTY);
+
+   if (tsleep_nsec(&sc->sc_writewait, PRIBIO,
+   "dwiic", MSEC_TO_NSEC(500)) != 0)
+   printf("%s: timed out waiting for "
+   "tx_empty intr\n",
+   sc->sc_dev.dv_xname);
+   splx(s);
+
+   tx_limit = sc->tx_fifo_depth -
+   dwiic_read(sc, DW_IC_TXFLR);
+   }
+
+   if (tx_limit == 0) {
+   printf("%s: timed out wri

Re: dwiic(4): wait for tx empty when hitting tx limit

2021-07-13 Thread Mark Kettenis
> Date: Tue, 13 Jul 2021 21:29:35 +0200
> From: Patrick Wildt 
> 
> Am Mon, Jul 05, 2021 at 07:52:28PM +0200 schrieb Mark Kettenis:
> > > Date: Mon, 5 Jul 2021 19:30:28 +0200
> > > From: Patrick Wildt 
> > > 
> > > Am Mon, Jul 05, 2021 at 07:07:24PM +0200 schrieb Mark Kettenis:
> > > > > Date: Mon, 5 Jul 2021 19:02:32 +0200
> > > > > From: Patrick Wildt 
> > > > > 
> > > > > Am Mon, Jul 05, 2021 at 06:34:31PM +0200 schrieb Mark Kettenis:
> > > > > > > Date: Mon, 5 Jul 2021 00:04:24 +0200
> > > > > > > From: Patrick Wildt 
> > > > > > > 
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I had trouble interfacing with a machine's IPMI through dwiic(4). 
> > > > > > >  What
> > > > > > > I saw was that when sending 'bigger' commands, it would never 
> > > > > > > receive
> > > > > > > the STOP bit interrupt.
> > > > > > > 
> > > > > > > The trouble is, as can be seen in the log, that we want to send 
> > > > > > > (it
> > > > > > > says read, but it's a write OP, so it's send) 20 bytes, but the tx
> > > > > > > limit says 14.
> > > > > > > 
> > > > > > > What we should do is send 14 bytes, then wait for it to send us 
> > > > > > > the
> > > > > > > tx empty interrupt (like we do when we first enable the 
> > > > > > > controller),
> > > > > > > and then re-read the tx limit.  The last line in the log is some
> > > > > > > debug print I added for myself, but is not part of the diff.
> > > > > > > 
> > > > > > > With this, I was finally able to change the IPMI password and 
> > > > > > > regain
> > > > > > > access to the web interface after updating the BMC's firmware...
> > > > > > > 
> > > > > > > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 3, flags 
> > > > > > > 0x00
> > > > > > > dwiic0: dwiic_i2c_exec: need to read 3 bytes, can send 14 read 
> > > > > > > reqs
> > > > > > > dwiic0: dwiic_i2c_exec: op 5, addr 0x10, cmdlen 1, len 33, flags 
> > > > > > > 0x00
> > > > > > > dwiic0: dwiic_i2c_exec: need to read 33 bytes, can send 15 read 
> > > > > > > reqs
> > > > > > > dwiic0: dwiic_i2c_exec: op 7, addr 0x10, cmdlen 2, len 20, flags 
> > > > > > > 0x00
> > > > > > > dwiic0: dwiic_i2c_exec: need to read 20 bytes, can send 14 read 
> > > > > > > reqs
> > > > > > > dwiic0: new tx limit 8
> > > > > > > 
> > > > > > > Opinions? ok?
> > > > > > 
> > > > > > I think you're on to something.  But this needs to handle 
> > > > > > I2C_F_POLL.
> > > > > 
> > > > > True that.  The previous code, which waits for the controller to 
> > > > > accept
> > > > > commands, just does DELAY(200), but I'm not sure that's good enough 
> > > > > for
> > > > > inbetween transfers.  One can apparently though poll through the raw
> > > > > interrupt status register, where the interrupt mask isn't applied.  So
> > > > > maybe like that?  Guess I should try setting ipmi to polling mode...
> > > > 
> > > > Polling the interrupt status register should work I suppose.  But for
> > > > read operations we actually poll the DW_IC_RXFLR register.
> > > 
> > > Yeah, that would work for TX as well.  Maybe something like this, but
> > > then the diff still needs to address what happens when we timeout and
> > > there's still no tx_limit > 0.  Maybe timeout like the read stuff:
> > > 
> > > if (rx_avail == 0) {
> > > printf("%s: timed out reading remaining %d\n",
> > > sc->sc_dev.dv_xname, (int)(len - readpos));
> > > sc->sc_i2c_xfer.error = 1;
> > > sc->sc_busy = 0;
> > > 
> > > return (1);
> > > }
> > 
> > Yes.
> 
> This works for me. ok?

ok kettenis@

> diff --git a/sys/dev/ic/dwiic.c b/sys/dev/ic/dwiic.c
> index 84d97b8645b..8588b0905ea 100644
> --- a/sys/dev/ic/dwiic.c
> +++ b/sys/dev/ic/dwiic.c
> @@ -416,6 +416,42 @@ dwiic_i2c_exec(void *cookie, i2c_op_t op, i2c_addr_t 
> addr, const void *cmdbuf,
>   tx_limit = sc->tx_fifo_depth -
>   dwiic_read(sc, DW_IC_TXFLR);
>   }
> +
> + if (I2C_OP_WRITE_P(op) && tx_limit == 0 && x < len) {
> + if (flags & I2C_F_POLL) {
> + for (retries = 1000; retries > 0; retries--) {
> + tx_limit = sc->tx_fifo_depth -
> + dwiic_read(sc, DW_IC_TXFLR);
> + if (tx_limit > 0)
> + break;
> + DELAY(50);
> + }
> + } else {
> + s = splbio();
> + dwiic_read(sc, DW_IC_CLR_INTR);
> + dwiic_write(sc, DW_IC_INTR_MASK,
> + DW_IC_INTR_TX_EMPTY);
> +
> + if (tsleep_nsec(&sc->sc_writewait, PRIBIO,
> + "dwiic", MSEC_TO_NSEC(500)) != 0)
> + printf("%s: timed out waiting for "
> + "t

Re: vscsi/iscsid: wait for scsi_probe to complete after connections are established

2021-07-13 Thread Ashton Fagg
Friendly weekly ping. Patches reattached.

- ajf

Ashton Fagg  writes:

> I have been working on fixing an issue (which was partially fixed by a
> diff I sent in earlier this year) with iscsid. With iscsi disks in
> /etc/fstab, sometimes the devices aren't fully up and ready before fsck
> tries to run - causing the machine to go into single user mode on boot.
>
> The diff that was merged earlier in the year added a poll routine which
> monitors for connection success before letting iscsictl return. This
> fixed the issue in some cases, however it's still only half the fix. The
> principled fix is to, additionally, wait until the scsi_probe calls are
> complete - at which point we can reasonably assume the disk device are
> ready for use. This requires some work to the vscsi driver to make this
> happen, as well as changes to both iscsid and iscsictl. The diffs
> attached here are a full implementation of this.
>
> I was still encountering this issue on one of my machines (much slower
> than my normal machine), where the connections would be up but the
> scsi_probe calls would not have completed - even with my earlier diff
> this would still cause the machine to go to single user mode. However,
> this indeed fixes that problem completely and I've been running it for a
> couple of weeks with no problems.
>
> The diffs are designed to be applied in the order they appear. In
> summary, the proposed changes are:
>
> (1) taskq.diff adds a taskq_empty function, which lets us check to see
> if a taskq is, well, empty. This is used in (2). Updates the man pages
> for taskq/task_add to reflect this.
>
> (2) vscsi.diff adds an ioctl to the vscsi driver which lets us monitor
> for device event queue completion. To aid in this it also separates
> calls to scsi_probe and scsi_detach to a dedicated taskq, rather than
> using systq. Updates the man pages for vscsi to reflect this.
>
> (3) iscsid.diff does the plumbing to actually call the new ioctl and
> provide that information back to iscsictl during status poll.
>
> (4) iscsictl.diff integrates the device queue information into the
> polling routine. Updates the man page for iscsictl to describe the new
> behavior.
>
> Based on commmit messages around vscsi it seems there was some plan to
> do this quite some years ago but it's hard for me to know what the
> proposed method was (though I assume what was envisaged might be similar
> to something like this).
>
> Feedback sought and greatly welcomed. I am basically certain there are
> ways this can be improved.
>
> Thanks,
>
> Ash

Index: share/man/man9/task_add.9
===
RCS file: /cvs/src/share/man/man9/task_add.9,v
retrieving revision 1.22
diff -u -p -u -p -r1.22 task_add.9
--- share/man/man9/task_add.9	8 Jun 2020 00:29:51 -	1.22
+++ share/man/man9/task_add.9	13 Jul 2021 23:45:59 -
@@ -20,6 +20,7 @@
 .Sh NAME
 .Nm taskq_create ,
 .Nm taskq_destroy ,
+.Nm taskq_empty ,
 .Nm taskq_barrier ,
 .Nm taskq_del_barrier ,
 .Nm task_set ,
@@ -43,6 +44,8 @@
 .Fn taskq_barrier "struct taskq *tq"
 .Ft void
 .Fn taskq_del_barrier "struct taskq *tq" "struct task *t"
+.Ft int
+.Fn taskq_empty "struct taskq *tq"
 .Ft void
 .Fn task_set "struct task *t" "void (*fn)(void *)" "void *arg"
 .Ft int
@@ -86,6 +89,9 @@ argument:
 The threads servicing the taskq will be run without the kernel big lock.
 .El
 .Pp
+.Fn taskq_empty
+indicates whether there are any queued tasks.
+.Pp
 .Fn taskq_destroy
 causes the resources associated with a previously created taskq to be freed.
 It will wait till all the tasks in the work queue are completed before
@@ -220,6 +226,9 @@ or 0 if the task was not already on the 
 .Fn task_pending
 will return non-zero if the task is queued to run, or 0 if the task
 is not queued.
+.Pp
+.Fn taskq_empty
+will return 1 if there are no tasks queued, or 0 otherwise.
 .Sh SEE ALSO
 .Xr autoconf 9 ,
 .Xr spl 9
Index: sys/kern/kern_task.c
===
RCS file: /cvs/src/sys/kern/kern_task.c,v
retrieving revision 1.31
diff -u -p -u -p -r1.31 kern_task.c
--- sys/kern/kern_task.c	1 Aug 2020 08:40:20 -	1.31
+++ sys/kern/kern_task.c	13 Jul 2021 23:46:00 -
@@ -394,6 +394,18 @@ task_del(struct taskq *tq, struct task *
 }
 
 int
+taskq_empty(struct taskq *tq)
+{
+	int rv;
+
+	mtx_enter(&tq->tq_mtx);
+	rv = TAILQ_EMPTY(&tq->tq_worklist);
+	mtx_leave(&tq->tq_mtx);
+
+	return (rv);
+}
+
+int
 taskq_next_work(struct taskq *tq, struct task *work)
 {
 	struct task *next;
Index: sys/sys/task.h
===
RCS file: /cvs/src/sys/sys/task.h,v
retrieving revision 1.18
diff -u -p -u -p -r1.18 task.h
--- sys/sys/task.h	1 Aug 2020 08:40:20 -	1.18
+++ sys/sys/task.h	13 Jul 2021 23:46:00 -
@@ -51,6 +51,8 @@ void		 taskq_barrier(struct taskq *);
 
 void		 taskq_del_barrier(struct taskq *, struct task *);
 
+int		 taskq_empty(struct taskq *);
+
 void		 task_s

Re: Sync dwctwo(4) with NetBSD

2021-07-13 Thread Martin Reindl
On Sun, Jul 11, 2021 at 05:55:14PM +0200, Marcus Glocker wrote:
> 
> Following the entire diff.  And of course, general feedback and
> comments? :-)
> 

While I can't comment on the diff itself, I gave it a spin on my
Raspberry Pi 3 Model B Plus Rev 1.3. mue(4) now works and I am happy to test
further diffs. Quick test with an uvideo(4) webcam shows isochronous transfer
still don't work but maybe this can be fixed down the road?

-m



ix(4): fix Rx hash type

2021-07-13 Thread Kevin Lo
Hi,

The diff below fixes Rx desc RSS type.  This matches what Linux and FreeBSD do.
ok?

Index: sys/dev/pci/if_ix.c
===
RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
retrieving revision 1.178
diff -u -p -u -p -r1.178 if_ix.c
--- sys/dev/pci/if_ix.c 22 Dec 2020 23:25:37 -  1.178
+++ sys/dev/pci/if_ix.c 14 Jul 2021 05:41:08 -
@@ -3071,7 +3071,8 @@ ixgbe_rxeof(struct rx_ring *rxr)
 
i = rxr->next_to_check;
while (if_rxr_inuse(&rxr->rx_ring) > 0) {
-   uint32_t hash, hashtype;
+   uint32_t hash;
+   uint16_t hashtype;
 
bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
dsize * i, dsize, BUS_DMASYNC_POSTREAD);
@@ -3101,7 +3102,8 @@ ixgbe_rxeof(struct rx_ring *rxr)
vtag = letoh16(rxdesc->wb.upper.vlan);
eop = ((staterr & IXGBE_RXD_STAT_EOP) != 0);
hash = lemtoh32(&rxdesc->wb.lower.hi_dword.rss);
-   hashtype = lemtoh32(&rxdesc->wb.lower.lo_dword.data) &
+   hashtype =
+   lemtoh16(&rxdesc->wb.lower.lo_dword.hs_rss.pkt_info) &
IXGBE_RXDADV_RSSTYPE_MASK;
 
if (staterr & IXGBE_RXDADV_ERR_FRAME_ERR_MASK) {