Re: epair/vbridge: no IPv6 traffic egress until first IPv6 packet flows in

2023-10-11 Thread Kristof Provost
On 10 Oct 2023, at 19:26, FreeBSD User wrote:
> Hello,
>
> at first: observation is below, marked [OBSERVATION].
>
> Running recent CURRENT (FreeBSD 15.0-CURRENT #26 main-n265831-3523f0677ef: 
> Mon Oct  9 14:00:42
> CEST 2023 amd64), I've configured a bridge (bridge0); the host's interface
> igb0 (I350-T2 two-port Gigabit Network Connection) is a member of that bridge,
> as are a couple of epair(4) devices belonging to a couple of jails.
>

> On the epairs as well as on the main host's igb0 NIC, both IPv4 and IPv6 are
> configured; IPv6 uses ULA and the setup doesn't have anything fancy.
>
Do not assign addresses to bridge member interfaces. Assign the addresses to 
the bridge itself.
Misconfiguring that breaks multicast, which almost certainly explains all of 
your symptoms.
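
In rc.conf that would look roughly like this (interface names taken from your
report; the addresses are placeholders, and the jail epairs get added to the
bridge in the same way):

    cloned_interfaces="bridge0"
    ifconfig_igb0="up"
    ifconfig_bridge0="addm igb0 inet 192.0.2.1/24 up"
    ifconfig_bridge0_ipv6="inet6 fd00:1::1/64"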

Best regards,
Kristof



Re: panic in crypto code

2023-10-06 Thread Kristof Provost



> On 6 Oct 2023, at 08:46, Steve Kargl  
> wrote:
> 
> On Thu, Oct 05, 2023 at 03:11:02PM -0700, Steve Kargl wrote:
>> 
>> I'll ping you off list when it's available.
>> 
> 
> Well, this is interesting.  I cannot upload the files to
> a location from which I can then put them up on freefall. :(
> 
> % scp -P1234 kernel.debug 10.95.76.21:
> kernel.debug0%  255KB 255.0KB/s   04:01 
> ETAclient_loop: send disconnect: Broken pipe
> lost connection
> % scp -P1234 vmcore.2  10.95.76.21:
> vmcore.20%  255KB 254.9KB/s   49:46 
> ETAclient_loop: send disconnect: Broken pipe
> lost connection
> 
> Looks like if_ovpn.ko is autoloaded.
> %  kldstat | grep ovpn
> 231 0x82042000 6650 if_ovpn.ko
> 
> Don't know what if_ovpn.ko does in hijacking tun0, but dying after
> 255kB is likely not correct.
> 
If_ovpn.ko is the kernel side of the DCO (data channel offload) thing for 
OpenVPN. It’s loaded and activated automatically if available. 
You can add “disable-dco” to your OpenVPN config to disable that.
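
That is a single line in the client config, e.g. (the path depends on how you
start OpenVPN):

    # /usr/local/etc/openvpn/openvpn.conf
    disable-dco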

Kristof


Re: panic in crypto code

2023-10-05 Thread Kristof Provost
On 5 Oct 2023, at 19:34, Steve Kargl wrote:
> On Thu, Oct 05, 2023 at 06:05:37PM +0200, Kristof Provost wrote:
>> Hi Steve,
>>
>> On 5 Oct 2023, at 17:36, Steve Kargl wrote:
>>> In case anyone else is using openvpn.
>>>
>>> %  pkg info openvpn
>>> openvpn-2.6.6
>>> Name   : openvpn
>>> Version: 2.6.6
>>> Installed on   : Tue Sep 19 08:48:55 2023 PDT
>>> Origin : security/openvpn
>>> Architecture   : FreeBSD:15:amd64
>>>
>>> % uname -a
>>> FreeBSD hotrats 15.0-CURRENT #1 main-n265325-9c30461dd25b:
>>> Thu Sep 14 08:09:18 PDT 2023 kargl@hotrats:$PATH/HOTRATS amd64
>>>
>>>
>>> Fatal double fault
>>> rip 0x8099b408 rsp 0xfe000e1cc000 rbp 0xfe000e1cc010
>>> rax 0x53749f62934c5349 rdx 0x53749f62934c5349 rbx 0xfe000e1cc200
>>> rcx 0x57bf32fec3cbde70 rsi 0x32e8db2f0591c5da rdi 0x832f0fb1e6d07eb0
>>> r8 0 r9 0 r10 0
>>> r11 0x60 r12 0x5af7589946bd13d9 r13 0xbeddd6a808e1dd54
>>> r14 0xcdf12bbf2708189c r15 0xeb262ae8536a7adf rflags 0x10246
>>> cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
>>> fsbase 0x1c02e381d120 gsbase 0x81a1 kgsbase 0
>>> cpuid = 0; apic id = 00
>>> panic: double fault
>>> cpuid = 0
>>> time = 1696512769
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
>>> 0x812b3060
>>> vpanic() at vpanic+0x132/frame 0x812b3190
>>> panic() at panic+0x43/frame 0x812b31f0
>>> dblfault_handler() at dblfault_handler+0x1ce/frame 0x812b32b0
>>> Xdblfault() at Xdblfault+0xd7/frame 0x812b32b0
>>> --- trap 0x17, rip = 0x8099b408, rsp = 0xfe000e1cc000, rbp = 
>>> 0xfe000e1cc010 ---
>>> gfmultword4() at gfmultword4+0x8/frame 0xfe000e1cc010
>>> gf128_mul4b() at gf128_mul4b+0x63/frame 0xfe000e1cc050
>>> AES_GMAC_Update() at AES_GMAC_Update+0x73/frame 0xfe000e1cc0b0
>>> swcr_gcm() at swcr_gcm+0x660/frame 0xfe000e1cc830
>>> swcr_process() at swcr_process+0x1a/frame 0xfe000e1cc850
>>> crypto_dispatch() at crypto_dispatch+0x42/frame 0xfe000e1cc870
>>> ovpn_transmit_to_peer() at ovpn_transmit_to_peer+0x54e/frame 
>>> 0xfe000e1cc8d0
>>> ovpn_output() at ovpn_output+0x2a2/frame 0xfe000e1cc950
>>> ip_output() at ip_output+0x11f6/frame 0xfe000e1cca40
>>> ovpn_encap() at ovpn_encap+0x3e7/frame 0xfe000e1ccac0
>>>
>>> #13 0x80ae08ce in dblfault_handler (frame=)
>>> at /usr/src/sys/amd64/amd64/trap.c:1012
>>> #14 
>>> #15 0x8099b408 in gfmultword4 (worda=3668422891496654298,
>>> wordb=9452791399630012080, wordc=6013606648173318985,
>>> wordd=6322828471639465584, x=..., tbl=0xfe000e1cc200)
>>> at /usr/src/sys/opencrypto/gfmult.c:174
>>> #16 0x8099b5d3 in gf128_mul4b (r=...,
>>> v=v@entry=0xf800076b9a64 
>>> "\3156}\373\312w\254iBnD\001ܹ˾\353&*\350Sjz߃/\017\261\346\320~\260Z\367X\231F\275\023\331St\237b\223LSI\276\335֨\b\341\335TW\2772\376\303\313\336pN\265\023\352\2054\002\a/˦9R\321\366p\f\352\204P\360\270\371\250\\\aE?7s\377\253\217b\262%\214\317m",
>>> tbl=tbl@entry=0xfe000e1cc200) at 
>>> /usr/src/sys/opencrypto/gfmult.c:268
>>> #17 0x8099ab13 in AES_GMAC_Update (ctx=0xfe000e1cc200,
>>> vdata=, len=144) at /usr/src/sys/opencrypto/gmac.c:94
>>> #18 0x80998ae0 in swcr_gcm (ses=0xf8020376a048,
>>> crp=0xf80023386c08) at /usr/src/sys/opencrypto/cryptosoft.c:505
>>> #19 0x80997c4a in swcr_process (dev=,
>>> crp=0xf80023386c08, hint=)
>>> at /usr/src/sys/opencrypto/cryptosoft.c:1680
>>>
>> Do you have a bit more information about what happened here?
>> As in: can you reproduce this, or do you have any idea what
>> was going on to trigger this?  Did anything change in your
>> setup (i.e. is if_ovpn use new, or did you update either kernel
>> or userspace or ?
>
> I updated the system on the date displayed by 'uname -a'.
> This included both base system and all installed ports;
> including openvpn.  I normally leave the system running Xorg,
> and I would find the system in a "locked-up" blank-screen
> saver state.  I assumed I was having a Xorg/drm-kmod problem,
> so I shut Xorg down last night.  The above panic was waiting
> for me this morning.  The panic happens every night.
>
> Note , I don't use if_ovpn.  This a client over a tun0 device
> through wlan0.
>
The backtrace contradicts you, but DCO is relatively transparent, so it’s quite 
possible you didn’t notice. It defaults to being enabled, and ought to just 
work.

>>
>> Do you have the full core dump to poke at?
>>
>
> Yes, I do, but it's on a home system.  I can put it up on
> my kargl@freefall later tonight (in 10-ish hours).  I'll
> include the dmesg.boot so you have some idea about the
> hardware.
>
That’d be very helpful, thanks.

Best regards,
Kristof



Re: panic in crypto code

2023-10-05 Thread Kristof Provost
Hi Steve,

On 5 Oct 2023, at 17:36, Steve Kargl wrote:
> In case anyone else is using openvpn.
>
> %  pkg info openvpn
> openvpn-2.6.6
> Name   : openvpn
> Version: 2.6.6
> Installed on   : Tue Sep 19 08:48:55 2023 PDT
> Origin : security/openvpn
> Architecture   : FreeBSD:15:amd64
>
> % uname -a
> FreeBSD hotrats 15.0-CURRENT #1 main-n265325-9c30461dd25b:
> Thu Sep 14 08:09:18 PDT 2023 kargl@hotrats:$PATH/HOTRATS amd64
>
>
> Fatal double fault
> rip 0x8099b408 rsp 0xfe000e1cc000 rbp 0xfe000e1cc010
> rax 0x53749f62934c5349 rdx 0x53749f62934c5349 rbx 0xfe000e1cc200
> rcx 0x57bf32fec3cbde70 rsi 0x32e8db2f0591c5da rdi 0x832f0fb1e6d07eb0
> r8 0 r9 0 r10 0
> r11 0x60 r12 0x5af7589946bd13d9 r13 0xbeddd6a808e1dd54
> r14 0xcdf12bbf2708189c r15 0xeb262ae8536a7adf rflags 0x10246
> cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
> fsbase 0x1c02e381d120 gsbase 0x81a1 kgsbase 0
> cpuid = 0; apic id = 00
> panic: double fault
> cpuid = 0
> time = 1696512769
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x812b3060
> vpanic() at vpanic+0x132/frame 0x812b3190
> panic() at panic+0x43/frame 0x812b31f0
> dblfault_handler() at dblfault_handler+0x1ce/frame 0x812b32b0
> Xdblfault() at Xdblfault+0xd7/frame 0x812b32b0
> --- trap 0x17, rip = 0x8099b408, rsp = 0xfe000e1cc000, rbp = 
> 0xfe000e1cc010 ---
> gfmultword4() at gfmultword4+0x8/frame 0xfe000e1cc010
> gf128_mul4b() at gf128_mul4b+0x63/frame 0xfe000e1cc050
> AES_GMAC_Update() at AES_GMAC_Update+0x73/frame 0xfe000e1cc0b0
> swcr_gcm() at swcr_gcm+0x660/frame 0xfe000e1cc830
> swcr_process() at swcr_process+0x1a/frame 0xfe000e1cc850
> crypto_dispatch() at crypto_dispatch+0x42/frame 0xfe000e1cc870
> ovpn_transmit_to_peer() at ovpn_transmit_to_peer+0x54e/frame 
> 0xfe000e1cc8d0
> ovpn_output() at ovpn_output+0x2a2/frame 0xfe000e1cc950
> ip_output() at ip_output+0x11f6/frame 0xfe000e1cca40
> ovpn_encap() at ovpn_encap+0x3e7/frame 0xfe000e1ccac0
>
> #13 0x80ae08ce in dblfault_handler (frame=)
> at /usr/src/sys/amd64/amd64/trap.c:1012
> #14 
> #15 0x8099b408 in gfmultword4 (worda=3668422891496654298,
> wordb=9452791399630012080, wordc=6013606648173318985,
> wordd=6322828471639465584, x=..., tbl=0xfe000e1cc200)
> at /usr/src/sys/opencrypto/gfmult.c:174
> #16 0x8099b5d3 in gf128_mul4b (r=...,
> v=v@entry=0xf800076b9a64 
> "\3156}\373\312w\254iBnD\001ܹ˾\353&*\350Sjz߃/\017\261\346\320~\260Z\367X\231F\275\023\331St\237b\223LSI\276\335֨\b\341\335TW\2772\376\303\313\336pN\265\023\352\2054\002\a/˦9R\321\366p\f\352\204P\360\270\371\250\\\aE?7s\377\253\217b\262%\214\317m",
> tbl=tbl@entry=0xfe000e1cc200) at /usr/src/sys/opencrypto/gfmult.c:268
> #17 0x8099ab13 in AES_GMAC_Update (ctx=0xfe000e1cc200,
> vdata=, len=144) at /usr/src/sys/opencrypto/gmac.c:94
> #18 0x80998ae0 in swcr_gcm (ses=0xf8020376a048,
> crp=0xf80023386c08) at /usr/src/sys/opencrypto/cryptosoft.c:505
> #19 0x80997c4a in swcr_process (dev=,
> crp=0xf80023386c08, hint=)
> at /usr/src/sys/opencrypto/cryptosoft.c:1680
>
Do you have a bit more information about what happened here? As in: can you 
reproduce this, or do you have any idea what was going on to trigger this? Did 
anything change in your setup (i.e. is the use of if_ovpn new, or did you 
update either the kernel or userspace, or …)?

Do you have the full core dump to poke at?

It might be a bug in the crypto code, but it could also be a bug in the if_ovpn 
code, so I’d like to work out what caused this.

Best regards,
Kristof



Re: git: 8d49fd7331bc - main - pf: remove DIOCGETRULE and DIOCGETSTATUS : net/py-libdnet and net/scapy now broken, kyua test suite damaged

2023-09-17 Thread Kristof Provost
On 14 Sep 2023, at 15:34, Mark Millard wrote:
> [I've cc'd a couple of folks that have dealt with fixing
> breakage in the past.]
>
I’ve submitted a fix for libdnet in 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=273899 because it blocks 
net/scapy, which we rely on for tests.

I do not plan to fix the other affected ports.

Best regards,
Kristof



Re: git: 8d49fd7331bc - main - pf: remove DIOCGETRULE and DIOCGETSTATUS : net/py-libdnet and net/scapy now broken, kyua test suite damaged

2023-09-14 Thread Kristof Provost
Hi Mark,

On 14 Sep 2023, at 7:37, Mark Millard wrote:
> This change leads the port net/py-libdnet to be broken:
>
> --- fw-pf.lo ---
> fw-pf.c:212:22: error: use of undeclared identifier 'DIOCGETRULE'
> if (ioctl(fw->fd, DIOCGETRULE, ) == 0 &&
> ^
> fw-pf.c:252:22: error: use of undeclared identifier 'DIOCGETRULE'
> if (ioctl(fw->fd, DIOCGETRULE, ) == 0 &&
> ^
> --- intf.lo ---
> for (cnt = 0; !matched && cnt < (int) entry->intf_alias_num; cnt++) {
> ^
> intf.c:571:2: note: previous statement is here
> if (entry->intf_addr.addr_type == ADDR_TYPE_IP &&
> ^
> --- fw-pf.lo ---
> fw-pf.c:296:28: error: use of undeclared identifier 'DIOCGETRULE'
> if ((ret = ioctl(fw->fd, DIOCGETRULE, )) < 0)
> ^
> 3 errors generated.
>
> That leads to:
>
> [00:00:41] [29] [00:00:26] Finished net/py-libdnet@py39 | 
> py39-libdnet-1.13_4: Failed: build
> [00:00:42] [29] [00:00:27] Skipping net/scapy@py39 | py39-scapy-2.5.0_1: 
> Dependent port net/py-libdnet@py39 | py39-libdnet-1.13_4 failed
>

The commit removed those ioctls because they’ve been superseded by newer 
(nvlist-based) versions.
Ports are strongly advised to use libpfctl rather than trying to deal with 
nvlists themselves.

See https://lists.freebsd.org/archives/freebsd-pf/2023-April/000345.html for an 
example of what the ports will have to do. It’s generally a trivial change.
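
For the DIOCGETSTATUS case, for instance, the change is roughly shaped like the 
sketch below (check the libpfctl headers for the exact functions and struct 
fields in the version you build against, and link with -lpfctl):

    #include <err.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <libpfctl.h>

    /* Was: ioctl(dev, DIOCGETSTATUS, &st), then reading st.running. */
    static int
    pf_is_running(void)
    {
            struct pfctl_status *st;
            int dev, running;

            dev = open("/dev/pf", O_RDONLY);
            if (dev == -1)
                    err(1, "open(/dev/pf)");
            st = pfctl_get_status(dev);     /* nvlist-based replacement */
            if (st == NULL)
                    errx(1, "pfctl_get_status");
            running = st->running;
            pfctl_free_status(st);
            close(dev);
            return (running);
    }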

Best regards,
Kristof



Re: buildkernel is broken

2022-07-07 Thread Kristof Provost

On 7 Jul 2022, at 19:00, Steve Kargl wrote:

On Thu, Jul 07, 2022 at 10:37:40AM -0600, Warner Losh wrote:

On Thu, Jul 7, 2022 at 10:37 AM Steve Kargl <
s...@troutmask.apl.washington.edu> wrote:


Thanks, but

root[216] git cherry-pick -n 37f604b49d4a
fatal: bad revision '37f604b49d4a'
root[217] pwd
/usr/src



git fetch maybe?



A cursory google search suggests that 'git fetch'
works on repositories, not single files.

I did look at the diff associated with 37f604b49d4a.
I am surprised that the commit that broke buildkernel
for me was allowed to be committed.


It was posted for review in https://reviews.freebsd.org/D35716

I’ll also point out that this commit works just fine in nearly all of 
our kernel configs, because there are very few (only one powerpc config, 
as far as I can tell) that do not have VIMAGE.
Arguably we should have a non-VIMAGE kernel config around (probably for 
amd64) so it’s more likely we spot these issues prior to commit.
Arbitrary non-default kernel configs are more likely to see issues like 
this one. I don’t think that can be avoided.



The fix in
37f604b49d4a seems rather questionable especially given
that there is no comment about why the macro is expanded
to a zero-trip loop.


I’m not sure how I could have been much more clear than this:

VNET_FOREACH() is a LIST_FOREACH if VIMAGE is set, but empty if it's
not. This means that users of the macro couldn't use 'continue' or
'break' as one would expect of a loop.

I welcome suggestions on how to improve my future commit messages.

To rephrase it a bit: VNET_FOREACH() used to be very misleading, in that 
it was only a loop with ‘options VIMAGE’, and empty without it (so any code 
within formed its own block and was executed exactly once, for the only vnet 
that exists without VIMAGE). That’s fine, unless you want to ‘continue’ or 
‘break’ the loop. That worked with VIMAGE (so the issue in the dummynet fix 
was not seen) but not without it.
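
To make that concrete, here is a stand-alone illustration (the macro names and 
bodies are made up for illustration; they are not the real definitions):

    /* Pre-fix behaviour without 'options VIMAGE': the macro expanded to
     * nothing, so the "loop body" was just a bare block. */
    #define FAKE_VNET_FOREACH_OLD(v)
    /* Post-fix behaviour: still a single pass, but a real loop statement. */
    #define FAKE_VNET_FOREACH_NEW(v)    for (int _once = 0; _once < 1; _once++)

    void
    example(void)
    {
            FAKE_VNET_FOREACH_NEW(v) {
                    continue;       /* fine: behaves like any other loop */
            }
            FAKE_VNET_FOREACH_OLD(v) {
                    /* A 'continue' or 'break' here would bind to an enclosing
                     * loop, or fail to compile if there is none. */
            }
    }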


Kristof

Re: Kernel panic on armv7 when PF is enabled

2022-05-02 Thread Kristof Provost

On 1 May 2022, at 5:13, qroxana wrote:

After git bisecting the panic started since this commit.

commit 78bc3d5e1712bc1649aa5574d2b8d153f9665113

Author: Kristof Provost <k...@freebsd.org>
Date:   Mon Feb 14 20:09:54 2022 +0100

vlan: allow net.link.vlan.mtag_pcp to be set per vnet

The primary reason for this change is to facilitate testing.

MFC after:  1 week

sys/net/if_ethersubr.c | 9 +

sys/net/if_vlan.c  | 5 +++--

2 files changed, 8 insertions(+), 6 deletions(-)

The armv7 board boots from a NFS root,

it can boot without any problem if PF is disabled.

Any help?

add host ::1: gateway lo0 fib 0: route already in table
add net fe80::: gateway ::1
add net ff02::: gateway ::1
add net :::0.0.0.0: gateway ::1
add net ::0.0.0.0: gateway ::1
Enabling pf.
Kernel page fault with the following non-sleepable locks held:
shared rm pf rulesets (pf rulesets) r = 0 (0xe3099430) locked @ 
/usr/src/sys/netpfil/pf/pf.c:6493
exclusive rw tcpinp (tcpinp) r = 0 (0xdb748d88) locked @ 
/usr/src/sys/netinet/tcp_usrreq.c:1008

stack backtrace:
#0 0xc0355cac at witness_debugger+0x7c
#1 0xc0356ef0 at witness_warn+0x3fc
#2 0xc05ec048 at abort_handler+0x1d8
#3 0xc05cb5ac at exception_exit+0
#4 0xe3083c10 at pf_syncookie_validate+0x60
#5 0xe30496a8 at pf_test+0x518
#6 0xe306d768 at pf_check_out+0x30
#7 0xc0415b44 at pfil_run_hooks+0xbc
#8 0xc0445cfc at ip_output+0xce8
#9 0xc045bc9c at tcp_default_output+0x20ac
#10 0xc0471eb4 at tcp_usr_send+0x1ac
#11 0xc0389464 at sosend_generic+0x490
#12 0xc0389790 at sosend+0x64
#13 0xc0502888 at clnt_vc_call+0x560
#14 0xc05009d8 at clnt_reconnect_call+0x170
#15 0xc01e7b14 at newnfs_request+0xb20
#16 0xc0230218 at nfscl_request+0x60
#17 0xc020d9bc at nfsrpc_getattr+0xb0
Fatal kernel mode data abort: 'Alignment Fault' on read
trapframe: 0xdf1f1c90
FSR=0001, FAR=d7840264, spsr=4013
r0 =6a228eda, r1 =dac0d785, r2 =d7840264, r3 =db5527c0
r4 =df1f1e00, r5 =dac0d75f, r6 =0018, r7 =d9422c00
r8 =c093e5e4, r9 =0001, r10=df1f1f5c, r11=df1f1d38
r12=e3098dd0, ssp=df1f1d20, slr=e3083bdc, pc =e3083c10


The commit you point at is entirely unrelated to the code where the 
panic occurred, so I’m pretty sure something went wrong in your 
bisect.


The backtrace would suggest the issue occurs in the pf_syncookie_validate() 
function, likely in the line
`if (atomic_load_64(&V_pf_status.syncookies_inflight[cookie.flags.oddeven]) == 0)`.


The obvious way for that to panic would be to call it without the 
curvnet context set, but pf_test() uses it earlier, so that’s going to 
be fine.


Given that this is unique to armv7 I’d recommend talking to the armv7 
maintainer about 64 bit atomic operations.


You can probably avoid the atomic load with this patch (and not enabling 
syncookie support):


diff --git a/sys/netpfil/pf/pf_syncookies.c b/sys/netpfil/pf/pf_syncookies.c
index 5230502be30c..c86d469d3cef 100644
--- a/sys/netpfil/pf/pf_syncookies.c
+++ b/sys/netpfil/pf/pf_syncookies.c
@@ -313,6 +313,9 @@ pf_syncookie_validate(struct pf_pdesc *pd)
 	ack = ntohl(pd->hdr.tcp.th_ack) - 1;
 	cookie.cookie = (ack & 0xff) ^ (ack >> 24);
 
+	if (V_pf_status.syncookies_mode == PF_SYNCOOKIES_NEVER)
+		return (0);
+
 	/* we don't know oddeven before setting the cookie (union) */
 	if (atomic_load_64(&V_pf_status.syncookies_inflight[cookie.flags.oddeven])
 	    == 0)

That shouldn’t be required though.

Br,
Kristof


Re: Kernel panic for if_epair

2022-02-16 Thread Kristof Provost
On 16 Feb 2022, at 11:31, qroxana wrote:
> It's running 14.0-CURRENT armv7 main-n252983-d21e71efce39
>
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex epairidx (epairidx) r = 0 (0xe2fe9160) locked @ 
> /usr/src/sys/net/if_epair.c:165
> stack backtrace:
> #0 0xc03558f8 at witness_debugger+0x7c
> #1 0xc0356b3c at witness_warn+0x3fc
> #2 0xc05eb3c8 at abort_handler+0x1d8
> #3 0xc05ca8e0 at exception_exit+0
> #4 0xc0475928 at udp_input+0x1c0
> #5 0xc0441884 at ip_input+0xa18
> #6 0xc041426c at netisr_dispatch_src+0x100
> #7 0xc040b9a0 at ether_demux+0x1c8
> #8 0xc040d22c at ether_nh_input+0x514
> #9 0xc041426c at netisr_dispatch_src+0x100
> #10 0xc040be94 at ether_input+0x8c
> #11 0xe2fd8130 at $a.8+0x128
> #12 0xc02a1ee0 at ithread_loop+0x268
> #13 0xc029e088 at fork_exit+0xa0
> #14 0xc05ca870 at swi_exit+0
> Fatal kernel mode data abort: 'Alignment Fault' on read
> trapframe: 0xe2a0baf0
> FSR=0001, FAR=e3f02a56, spsr=2013
> r0 =, r1 =0001, r2 =0001, r3 =0a0a
> r4 =, r5 =e3f02a6a, r6 =e3f02a56, r7 =0044
> r8 =0044, r9 =c0af955c, r10=0014, r11=e2a0bc10
> r12=, ssp=e2a0bb80, slr=c0441884, pc =c0475928
>
> panic: Fatal abort

That backtrace suggests an alignment fault in udp_input(), not an issue with 
if_epair.
There’s not even any mention of if_epair in that backtrace, but I suppose it’s 
remotely possible that it’s in epair_intr(), calling epair_sintr() in #11. That 
would explain why the epair lock is held, at least.

Note that the epair code has been substantially reworked recently so if you 
retry with a recent (post 24f0bfbad57b9c3cb9b543a60b2ba00e4812c286) build you 
won’t see the epair lock mentioned (assuming you can reproduce the panic), but 
again, it doesn’t look to be involved here anyway.

Kristof



Re: test-includes breaks buildworld when WITHOUT_PF is set in src.conf

2022-02-09 Thread Kristof Provost
On 9 Feb 2022, at 10:57, Gary Jennejohn wrote:
> test-includes uses pf.h when checking usage of pfvar.h.
>
> But, these lines in include/Makefile remove pf.h when WITHOUT_PF is
> set in src.conf:
>
> .if ${MK_PF} != "no"
>  INCSGROUPS+=   PF
> .endif
>
> This breaks buildworld.  The error message:
>
> In file included from net_pfvar.c:1:
> /usr/obj/usr/src/amd64.amd64/tmp/usr/include/net/pfvar.h:65:10: fatal error:
> 'netpfil/pf/pf.h' file not found
> #include 
>  ^
> 1 error generated.
> --- net_pfvar.o ---
> *** [net_pfvar.o] Error code 1
>
> make[3]: stopped in /usr/src/tools/build/test-includes
> .ERROR_TARGET='net_pfvar.o'
>
> Removing the .if/.endif fixes it for me, although there may be a better
> way to avoid the error.
>
Warner’s working on a better fix. See https://reviews.freebsd.org/D34009 for 
the discussion.

Kristof



Re: netinet & netpfil tests failing

2022-01-18 Thread Kristof Provost
On 18 Jan 2022, at 3:07, Gleb Smirnoff wrote:
> * Another factor - scapy.  The python scapy library would emit warning to 
> stderr
>   if it sees interface without any IP address.  This happens right at 'import 
> scapy'.
>   The test suite considers a test failed if it has something on stderr, even 
> if
>   it returned success.
>
> So, result is that some test (absolutely unrelated to pcbs) leaves a jail with
> interfaces, then jail is released, interfaced pop up at vnet0, and then some
> other test (absolutely unrelated to pcbs) using scapy writes a warning to 
> stderr
> and triggers failure.
>
Several of the pf scapy scripts deal with that issue by setting the scapy log 
level:
https://cgit.freebsd.org/src/tree/tests/sys/netpfil/pf/CVE-2019-5597.py#n30
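
From memory, the relevant bit at the top of those scripts looks like this 
(check the linked file for the exact logger name it silences); the important 
part is that it runs before scapy is imported:

    import logging
    logging.getLogger("scapy").setLevel(logging.CRITICAL)
    import scapy.all as sp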

So that part at least we could probably mitigate easily.

(I’m not overly fond of that decision in scapy, but didn’t want to resort to 
patching scapy to cope with our fairly specific requirements.)

Kristof



Re: WITHOUT_PF breaks buildworld

2021-12-22 Thread Kristof Provost
On 22 Dec 2021, at 8:21, Konrad Sewiłło-Jopek wrote:
> Hi,
>
> I think the reason is somewhere in tools/build/test-includes:
>
> --- net/if_pfsync.o ---
> In file included from net/if_pfsync.c:1:
> In file included from
> [...]freebsd/arm64.aarch64/tmp/usr/include/net/if_pfsync.h:56:
> [...]freebsd/arm64.aarch64/tmp/usr/include/net/pfvar.h:65:10: fatal error:
> 'netpfil/pf/pf.h' file not found
> #include <netpfil/pf/pf.h>
>  ^
> 1 error generated.
> *** [net/if_pfsync.o] Error code 1
>
> make[3]: stopped in [...]freebsd/tools/build/test-includes
> --- net/pfvar.o ---
> In file included from net/pfvar.c:1:
> [...]freebsd/arm64.aarch64/tmp/usr/include/net/pfvar.h:65:10: fatal error:
> 'netpfil/pf/pf.h' file not found
> #include <netpfil/pf/pf.h>
>  ^
> 1 error generated.
> *** [net/pfvar.o] Error code 1
>
> make[3]: stopped in [...]freebsd/tools/build/test-includes
> 2 errors
>
> make[3]: stopped in [...]freebsd/tools/build/test-includes
> *** [test-includes] Error code 2
>
> make[2]: stopped in [...]freebsd
> 1 error
>
> Best regards,
> Konrad Sewiłło-Jopek
>
>
> On Sun, 19 Dec 2021 at 12:26, Gary Jennejohn wrote:
>
>> On Sun, 19 Dec 2021 19:05:35 +0800
>> Alastair Hogge  wrote:
>>
>>> On Sunday, 19 December 2021 6:47:23 PM AWST Gary Jennejohn wrote:
 Some recent change, probably in a .mk file, breaks buildworld on HEAD
 when WITHOUT_PF is enabled in src.conf.
>>>
>>> I have had to disable WITHOUT_PF since 2020-07-27, but probably earlier.
>>>
>>
>> Hmm.  I did a successful buildworld a few days ago with WITHOUT_PF
>> enabled, so it's new breakage for me at least.
>>
>> I don't enable pf in the kernel and don't need it in userland.
>>
 Disabling WITHOUT_PF results in a successful buildworld.

 The reported error is that netpfil/pf/pf.h can't be found.
>>>
>>> Some ports depend on that too.
>>>
>>
This is the test-includes target, which validates that include files are 
self-contained (that is, you can ‘#include <$file>’ without prerequisites).
The target fails because it looks at all headers in /usr/src/sys and then tries 
to build them, but some of those headers (like the pf headers) include other 
headers that may not be getting installed because they’re disabled.

I’m not quite sure how to best fix this.

Note that this is not happening because some pf tools are still getting built; 
it is the validation target that fails.

We could potentially add the pf headers to BADHDRS depending on the WITHOUT_ 
flag, but that would mean manually maintaining badfiles.inc.
Or perhaps we should keep installing the pf headers even when WITHOUT_PF is 
set, but I’m not actually sure how we convince the build system to do that. Or 
if it’s a good idea.

Warner might have better ideas on how to fix this.

Kristof



Re: HEADS-UP: ASLR for 64-bit executables enabled by default on main

2021-11-19 Thread Kristof Provost


> On 18 Nov 2021, at 11:43, Marcin Wojtas  wrote:
> On Thu, 18 Nov 2021 at 19:07, Li-Wen Hsu  wrote:
>> 
>>> On Wed, Nov 17, 2021 at 6:30 AM Marcin Wojtas  wrote:
>>> 
>>> As of b014e0f15bc7 the ASLR (Address Space Layout
>>> Randomization) feature becomes enabled for the all 64-bit
>>> binaries by default.
>>> 
>>> Address Space Layout Randomization (ASLR) is an exploit mitigation
>>> technique implemented in the majority of modern operating systems.
>>> It involves randomly positioning the base address of an executable
>>> and the position of libraries, heap, and stack, in a process's address
>>> space. Although over the years ASLR proved to not guarantee full OS
>>> security on its own, this mechanism can make exploitation more difficult
>>> (especially when combined with other methods, such as W^X).
>>> 
>>> Tests on the tier 1 64-bit architectures demonstrated that the ASLR is
>>> stable and does not result in noticeable performance degradation,
>>> therefore it is considered safe to enable this mechanism by default.
>>> Moreover its effectiveness is increased for PIE (Position Independent
>>> Executable) binaries. Thanks to commit 9a227a2fd642 ("Enable PIE by
>>> default on 64-bit architectures"), building from src is not necessary
>>> to have PIE binaries and it is enough to control usage of ASLR in the
>>> OS solely by setting the appropriate sysctls. The defaults were toggled
>>> for the 64-bit PIE and non-PIE executables.
>>> 
>>> As for the drawbacks, a consequence of using the ASLR is more
>>> significant VM fragmentation, hence the issues may be encountered
>>> in the systems with a limited address space in high memory consumption
>>> cases, such as buildworld. As a result, although the tests on 32-bit
>>> architectures with ASLR enabled were mostly on par with what was
>>> observed on 64-bit ones, the defaults for the former are not changed
>>> at this time. Also, for the sake of safety the feature remains disabled
>>> for 32-bit executables on 64-bit machines, too.
>>> 
>>> The committed change affects the overall OS operation, so the
>>> following should be taken into consideration:
>>> * Address space fragmentation.
>>> * A changed ABI due to modified layout of address space.
>>> * More complicated debugging due to:
>>>  * Non-reproducible address space layout between runs.
>>>  * Some debuggers automatically disable ASLR for spawned processes,
>>>making target's environment different between debug and
>>>non-debug runs.
>>> 
>>> The known issues (such as PR239873 or PR253208) have been fixed in
>>> HEAD up front, however please pay attention to the system behavior after
>>> upgrading the kernel to the newest revisions.
>>> In order to confirm/rule-out the dependency of any encountered issue
>>> on ASLR it is strongly advised to re-run the test with the feature
>>> disabled - it can be done by setting the following sysctls
>>> in the /etc/sysctl.conf file:
>>> kern.elf64.aslr.enable=0
>>> kern.elf64.aslr.pie_enable=0
>>> 
>>> The change is a result of combined efforts under the auspices
>>> of the FreeBSD Foundation and the Semihalf team sponsored
>>> by Stormshield.
>>> 
>>> Best regards,
>>> Marcin
>> 
>> Thanks very much for working on this. FYI, there are some test cases
>> seem to be affected by this:
>> 
>> https://ci.freebsd.org/job/FreeBSD-main-amd64-test/19828/testReport/
>> 
>> The mkimg ones are a bit tricky, it seems the output is changed in
>> each run. We may need a way to generate reproducible results..
>> 
>> I'm still checking them, but hope more people can join and fix them.
>> 
> 
> Thanks for bringing this up! Apart from
> sys.netpfil.common.dummynet.pf_nat, the other 23 are new.

I’ve just managed to reproduce that one locally (it only happens if ipfw is 
also loaded) and will dig in soon. It’s not going to be ASLR related. You can 
ignore that failure.

Kristof 



Re: libifconfig_sfp.h does not compile for me

2021-03-11 Thread Kristof Provost

On 11 Mar 2021, at 9:51, Ronald Klop wrote:

Hi,

This 
https://cgit.freebsd.org/src/tree/lib/libifconfig/libifconfig_sfp.h 
includes libifconfig_sfp_tables.h which does not exist.


My build fails on this. I cleaned /lib/libifconfig and /sbin/ifconfig, 
but no success.
The last change in these files is a few days ago, am I the only one 
with this problem?



How are you building?

libifconfig_sfp_tables.h is a generated file, and the build should have 
created it.


It does for me:

	/usr/libexec/flua /usr/src/lib/libifconfig/sfp.lua /usr/src/lib/libifconfig/libifconfig_sfp_tables.tpl.h >libifconfig_sfp_tables.h
	/usr/libexec/flua /usr/src/lib/libifconfig/sfp.lua /usr/src/lib/libifconfig/libifconfig_sfp_tables.tpl.c >libifconfig_sfp_tables.c
	/usr/libexec/flua /usr/src/lib/libifconfig/sfp.lua /usr/src/lib/libifconfig/libifconfig_sfp_tables_internal.tpl.h >libifconfig_sfp_tables_internal.h


Although I do not understand the magical incantations in the makefile 
that make it do so, I can see that they’re there and that they do 
what’s required.


Regards,
Kristof


Re: ifa leak on VNET teardown

2021-03-06 Thread Kristof Provost

On 13 Feb 2021, at 21:58, Alexander V. Chernikov wrote:
It turns out we're leaking some ifas for loopback interfaces on VNET 
teardown:



There’s a recent bug about this as well: 253998.
The problem’s been around for a long time though. The pf tests trigger 
it from time to time, although it doesn’t appear to be 100% 
consistent, so my current feeling is that it may be racy.


I see ‘in6_purgeaddr: err=65, destination address delete failed’ 
when we do leak, and I’ve also been able to confirm this is about the 
::1 IPv6 loopback address.


Best regards,
Kristof


Re: Interface counter inaccurate

2021-02-16 Thread Kristof Provost

On 16 Feb 2021, at 0:58, Daniel Ponte wrote:

On Mon, Feb 15, 2021 at 10:25:47PM +0100, Kristof Provost wrote:

On 15 Feb 2021, at 22:09, Daniel Ponte wrote:
I've noticed that since upgrading to stable/13-n244514-18097ee2fb7c 
from

12.2-STABLE, throughput on my WAN interface (the box runs pf) is
incorrectly showing double in systat -if, as well as in vnstat and 
pftop

from ports. The LAN interface does not appear to be so afflicted.

`systat -if` doesn’t read the pf counters, so I wouldn’t expect 
that to be

related.

Those are the interface counters.
What network card and driver do you use?

Kristof


I, too, questioned the relation. They are igb(4) I210 builtin 
interfaces

(in a Protectli Vault 4).


systat -if during said speed test:

   igb1   in     11.670 Mb/s    11.670 Mb/s      1.067 GB
          out   319.975 Mb/s   320.458 Mb/s      2.062 GB

   igb0   in    640.120 Mb/s   640.690 Mb/s      6.351 GB
          out     5.824 Mb/s     5.824 Mb/s    987.050 MB


igb1 is inside, igb0 is outside. The 6GB:2GB difference in totals seen 
above
is indeed real; this machine did not initiate that much traffic on its 
own.


Even stranger, in that you appear to have the issue on only one of your 
igb interfaces.
My initial guess was that there was a driver bug causing it to double 
count the packets.


That machine has 4 ethernet ports; does it have 4 igbX interfaces as 
well? Are there any vlans configured? (My current thinking is still that 
it’s a driver issue, manifesting only on one of the interfaces because 
of a configuration difference.)


It may also be useful to try capturing packets on igb0 and correlating 
the number of captured packets with the counters.


I’m not all that familiar with the igb driver code, so I don’t know 
if I’ll be able to help much.


Best regards,
Kristof


Re: Interface counter inaccurate

2021-02-15 Thread Kristof Provost

On 15 Feb 2021, at 22:09, Daniel Ponte wrote:
I've noticed that since upgrading to stable/13-n244514-18097ee2fb7c 
from

12.2-STABLE, throughput on my WAN interface (the box runs pf) is
incorrectly showing double in systat -if, as well as in vnstat and 
pftop

from ports. The LAN interface does not appear to be so afflicted.

`systat -if` doesn’t read the pf counters, so I wouldn’t expect that 
to be related.


Those are the interface counters.
What network card and driver do you use?

Kristof


Re: Enabling AESNI by default

2020-12-31 Thread Kristof Provost

On 31 Dec 2020, at 23:09, Rodney W. Grimes wrote:

It's forever dead code on a large number of machines that do not have
the hardware for it.  I know that is a decreasing set, but imho it
would be better to somehow ONLY load the module if you had CPU
support for it.  The downside is that detection would probably have
to be in the loader as this code can be used very early on.


According to kldstat it uses all of 42KB of memory.

161 0x83313000 a290 aesni.ko

That’s such a trivial amount of memory it’s not even worth 
mentioning. Even in tiny embedded systems (and who runs tiny embedded 
systems on x86?) it’s utterly insignificant.


Even if it were significant, how many of the systems without the 
relevant hardware are ever going to run 13?


Regards,
Kristof


Re: git and the loss of revision numbers

2020-12-29 Thread Kristof Provost

On 29 Dec 2020, at 4:33, monochrome wrote:

sry forgot details:

source tree @ ead01bfe8

git -C /usr/src checkout gf20c0e331
error: pathspec 'gf20c0e331' did not match any file(s) known to git

what is the 'g' for?


That would have been a typo, I think.


git -C /usr/src checkout f20c0e331
M   sys/amd64/conf/GENERIC
HEAD is now at f20c0e331 caroot: drop $FreeBSD$ expansion from root 
bundle


yet I don't see any indication that anything changed, and now it won't 
update at all:



If something went wrong there’d be error output.
Git is *fast*, which can lead you to assume it’s not done anything 
when it has in fact done exactly what you asked. You should be on that 
commit now.



git -C /usr/src pull --ff-only
You are not currently on a branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.

git pull  


Yes: you can’t pull on a detached HEAD, because there’s no branch for git 
to merge into. Return to your original branch (presumably ‘git checkout 
main’) and then pull.
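
In other words, something like:

    git -C /usr/src log -1 --oneline   # confirm which commit you are on
    git -C /usr/src checkout main      # leave the detached HEAD
    git -C /usr/src pull --ff-only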


Regards,
Kristof


Re: HEADS UP: FreeBSD src repo transitioning to git this weekend

2020-12-22 Thread Kristof Provost


On 22 Dec 2020, at 22:50, Mark Millard wrote:

On 2020-Dec-22, at 13:31, Mark Millard  wrote:

Clone
https://git.FreeBSD.org/src.git
anon...@git.freebsd.org:src.git
g...@gitrepo.freebsd.org:src.git


Hmm. It turns out that the last 2 are links on that page and the
links expand out to:

https://cgit.freebsd.org/src/anon...@git.freebsd.org:src.git
and:
https://cgit.freebsd.org/src/g...@gitrepo.freebsd.org:src.git

So it seems that there are ways to clone that involve referencing
cgit.freebsd.org .

No, that’s just a configuration bug in cgit. I’m sure it’ll get 
fixed in due course.


The text version of the links is the correct version.

Best regards,
Kristof


Re: HEADS UP: FreeBSD src repo transitioning to git this weekend

2020-12-22 Thread Kristof Provost
On 22 Dec 2020, at 22:06, bob prohaska wrote:
> On Tue, Dec 22, 2020 at 09:34:25PM +0100, Ronald Klop wrote:
>>
>> what does "pkg install git" do for you? NB: I use "pkg install git-lite".
>> Prevents about 1000 dependencies.
>>
>
> That seems to have worked. It reported something about package management
> not being installed, but after a prompt installed pkg-static and set
> up a version of git which seems to run. Svnlite had been working without
> this step.
>
> This is for a Pi2B v 1.1, arm v7 only.
>
> Using the "mini git primer" at https://hackmd.io/hJgnfzd5TMK-VHgUzshA2g
> I tried to clone stable/12 expecting that the -beta would be gone.
>
> It looks as if I'm still jumping the gun. Although
> cgit.freebsd.org replies to ping, using
>
> bob@www:/usr % git clone cgit.freebsd.org -b stable/12 freebsd-src
>
> reports:
>
> fatal: repository 'cgit.freebsd.org' does not exist
>
That’s because you have the wrong URL for the src repo.
Try `git clone https://git.FreeBSD.org/src.git  -b stable/12 freebsd-src`

Best regards,
Kristof


Re: firewall choice

2020-11-27 Thread Kristof Provost

On 27 Nov 2020, at 9:29, tech-lists wrote:
What's the "best" [1] choice for firewalling these days, in the list's 
opinion?


There's pf, ipf and ipfw. Which is the one being most recently 
developed/updated?
I'm used to using pf, have done for over a decade. But OpenBSD's pf 
has diverged a lot more from when it first came across. There seems to 
be a lot more options.

Is FreeBSD's pf being actively developed still?

All three are actively maintained and grow new features from time to 
time.



[1] up-to-date

See above. All three are actively maintained.


low overhead, high throughput
I believe ipfw currently performs best. I can’t rank ipf and pf, 
because I’ve not seen benchmarks for ipf.



IPv6-able,

All three.


traffic shaping/queueing
Mostly ipfw, because dummynet. pf has ALTQ, but that has more 
limitations than dummynet.

I think ipf doesn’t do shaping, but I may be mistaken about that.
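
For a flavour of dummynet (interface name and numbers made up):

    # shape outbound traffic on em0 to 10 Mbit/s through a dummynet pipe
    ipfw pipe 1 config bw 10Mbit/s
    ipfw add 100 pipe 1 ip from any to any out via em0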

Best regards,
Kristof


Re: iflib/bridge kernel panic

2020-10-03 Thread Kristof Provost

On 30 Sep 2020, at 13:52, Alexander Leidinger wrote:
Quoting Kristof Provost  (from Tue, 29 Sep 2020 
23:20:44 +0200):



On 28 Sep 2020, at 16:44, Alexander Leidinger wrote:

Quoting Kristof Provost  (from Mon, 28 Sep 2020 
13:53:16 +0200):



On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
Quoting Kristof Provost  (from Sun, 27 Sep 2020 
17:51:32 +0200):
Here’s an early version of a task queue based approach: 
http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch


That still needs to be cleaned up, but this should resolve the 
sleep issue and the LOR.


There are some issues... seems like inside a jail I can't ping 
systems outside of the hardware.


Bridge setup:
  - member jail A
  - member jail B
  - member external_if of host

If I ping the router from the host, it works. If I ping from one 
jail to another, it works. If I ping from the jail to the IP of 
the external_if, it works. If I ping from a jail to the router, I 
do not get a response.


Can you check for 'failed ifpromisc' error messages in dmesg? And 
verify that all bridge member interfaces are in promiscuous mode?


I have a panic for you...:
- startup still in progress = 22 jails in startup, somewhere after a 
few jails started the panic happened

- tcpdump was running on the external interface
- a ping to a jail IP from another system was running, the first 
ping went through, then it paniced


First regarding your questions about promisc mode: no error, but the 
promisc mode is directly disabled again on all interfaces.


I think I see why you had issues with the promiscuous setting. I’ve 
updated the patch to be even more horrific than it was before.


Hmmm same behavior as before.
I haven't kept the old version of the patch, so I can't compare if I 
somehow downloaded the old version again, or if I got the updated 
one...



Okay, let’s abandon that patch. It’s ugly and it doesn’t work.

Here’s a different approach that I’m much happier with.
https://people.freebsd.org/~kp/0001-bridge-Call-member-interface-ioctl-without-NET_EPOCH.patch

It passes the regression tests with WITNESS and INVARIANTS enabled, and 
a hack in the epair ioctl() handler to make it sleep (to look a bit like 
the Intel ioctl() handler that currently trips up if_bridge).


Best,
Kristof


Re: iflib/bridge kernel panic

2020-09-29 Thread Kristof Provost



On 28 Sep 2020, at 16:44, Alexander Leidinger wrote:

Quoting Kristof Provost  (from Mon, 28 Sep 2020 
13:53:16 +0200):



On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
Quoting Kristof Provost  (from Sun, 27 Sep 2020 
17:51:32 +0200):
Here’s an early version of a task queue based approach: 
http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch


That still needs to be cleaned up, but this should resolve the 
sleep issue and the LOR.


There are some issues... seems like inside a jail I can't ping 
systems outside of the hardware.


Bridge setup:
   - member jail A
   - member jail B
   - member external_if of host

If I ping the router from the host, it works. If I ping from one 
jail to another, it works. If I ping from the jail to the IP of the 
external_if, it works. If I ping from a jail to the router, I do not 
get a response.


Can you check for 'failed ifpromisc' error messages in dmesg? And 
verify that all bridge member interfaces are in promiscuous mode?


I have a panic for you...:
 - startup still in progress = 22 jails in startup, somewhere after a 
few jails started the panic happened

 - tcpdump was running on the external interface
 - a ping to a jail IP from another system was running, the first ping 
went through, then it paniced


First regarding your questions about promisc mode: no error, but the 
promisc mode is directly disabled again on all interfaces.


I think I see why you had issues with the promiscuous setting. I’ve 
updated the patch to be even more horrific than it was before.


I can’t explain the panic, and the backtrace also doesn’t appear to 
be directly related to this patch. Not sure what’s going on with that.


Kristof


Re: iflib/bridge kernel panic

2020-09-28 Thread Kristof Provost

On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
Quoting Kristof Provost  (from Sun, 27 Sep 2020 
17:51:32 +0200):
Here’s an early version of a task queue based approach: 
http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch


That still needs to be cleaned up, but this should resolve the sleep 
issue and the LOR.


There are some issues... seems like inside a jail I can't ping systems 
outside of the hardware.


Bridge setup:
- member jail A
- member jail B
- member external_if of host

If I ping the router from the host, it works. If I ping from one jail 
to another, it works. If I ping from the jail to the IP of the 
external_if, it works. If I ping from a jail to the router, I do not 
get a response.


Can you check for 'failed ifpromisc' error messages in dmesg? And verify 
that all bridge member interfaces are in promiscuous mode?


Kristof


Re: iflib/bridge kernel panic

2020-09-27 Thread Kristof Provost

On 21 Sep 2020, at 14:16, Shawn Webb wrote:

On Mon, Sep 21, 2020 at 09:57:40AM +0200, Kristof Provost wrote:

On 21 Sep 2020, at 2:52, Shawn Webb wrote:

From latest HEAD on a Dell Precision 7550 laptop:


https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2

The last working boot environment was 14 Aug 2020. If I get some 
time to

bisect commits, I'll try to figure out the culprit.


Try https://reviews.freebsd.org/D26418


That seems to fix the kernel panic. dmesg gets spammed with a freak
ton of these LOR messages now:

Here’s an early version of a task queue based approach: 
http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch


That still needs to be cleaned up, but this should resolve the sleep 
issue and the LOR.


Best regards,
Kristof


Re: iflib/bridge kernel panic

2020-09-23 Thread Kristof Provost
On 23 Sep 2020, at 19:37, xto...@hotmail.com wrote:
> Kristof Provost wrote:
>> On 21 Sep 2020, at 2:52, Shawn Webb wrote:
>>>>  From latest HEAD on a Dell Precision 7550 laptop:
>>>
>>> https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2
>>>
>>> The last working boot environment was 14 Aug 2020. If I get some time to
>>> bisect commits, I'll try to figure out the culprit.
>>>
>> Try https://reviews.freebsd.org/D26418
>
> Anything stopping this from being integrated?

Yes, it’s not correct.

I’ve got this on my todo list. I think I know how to fix it better.

Best regards,
Kristof


Re: iflib/bridge kernel panic

2020-09-21 Thread Kristof Provost
On 21 Sep 2020, at 2:52, Shawn Webb wrote:
>> From latest HEAD on a Dell Precision 7550 laptop:
>
> https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2
>
> The last working boot environment was 14 Aug 2020. If I get some time to
> bisect commits, I'll try to figure out the culprit.
>
Try https://reviews.freebsd.org/D26418

Best regards,
Kristof


Re: bridge/igb panic: sleepq_add: td 0xfffffe01bbce5300 to sleep on wchan 0xffffffff8157d9a0 with sleeping prohibited

2020-09-12 Thread Kristof Provost
On 11 Sep 2020, at 19:06, Gleb Smirnoff wrote:
>   Kristof,
>
> can you please take a look? IMHO, the problem is that with r360345
> the bridge_ioctl() is fully covered by epoch. IMHO, should be either
> more fine grained covered, or use internal locking, because some of
> the code downstream (driver ioctl) may sleep.
>
How does https://reviews.freebsd.org/D26418 look?

Best regards,
Kristof


Re: Plans for git

2020-09-03 Thread Kristof Provost

On 3 Sep 2020, at 19:56, Chris wrote:

Why was the intention to switch NOT announced as such MUCH sooner?

There was discussion about a possible switch to git on the freebsd-git 
mailing list as early as February 2017: 
https://lists.freebsd.org/pipermail/freebsd-git/2017-February/92.html


Ed gave a talk about FreeBSD and git back in 2018: 
https://www.youtube.com/watch?v=G8wQ88d85s4


The Git Transition Working group was mentioned in the quarterly status 
reports a year ago: 
https://www.freebsd.org/news/status/report-2019-07-2019-09.html and 
https://www.freebsd.org/news/status/report-2019-04-2019-06.html


Regards,
Kristof


Re: somewhat reproducible vimage panic

2020-07-23 Thread Kristof Provost

On 23 Jul 2020, at 11:00, Bjoern A. Zeeb wrote:

On 23 Jul 2020, at 8:09, Kristof Provost wrote:


On 23 Jul 2020, at 9:19, Kristof Provost wrote:

On 23 Jul 2020, at 0:15, John-Mark Gurney wrote:

So, it's pretty easy to trigger, just attach a couple USB ethernet
adapters, in my case, they were ure, but likely any two spare 
ethernet

interfaces will work, and wire them back to back..


I’ve been able to trigger it using epair as well:

`sudo sh testinterfaces.txt epair0a epair0b`

I did have to comment out the waitcarrier() check.

I’ve done a little bit of digging, and I think I’m starting to 
see how this breaks.


This always affects the jailed vlan interfaces. They’re getting 
deleted, but the ifp doesn’t go away just yet because it’s still 
in use by the multicast code.

The multicast code does its cleanup in task queues,


Wow, did I miss that back then? Did I review a change and not notice? 
Sorry if that was the case.


Vnet teardown is blocking and forceful.
Doing deferred cleanup work isn’t a good idea at all.
I think that is the real problem here.

I’d rather have us fix this than putting more bandaids into the 
code.


Yeah, agreed. I think hselasky has a better fix: 
https://reviews.freebsd.org/D24914


I just saw his e-mail in a different thread.

Best regards,
Kristof


Re: somewhat reproducible vimage panic

2020-07-23 Thread Kristof Provost

On 23 Jul 2020, at 9:19, Kristof Provost wrote:

On 23 Jul 2020, at 0:15, John-Mark Gurney wrote:

So, it's pretty easy to trigger, just attach a couple USB ethernet
adapters, in my case, they were ure, but likely any two spare 
ethernet

interfaces will work, and wire them back to back..


I’ve been able to trigger it using epair as well:

`sudo sh testinterfaces.txt epair0a epair0b`

I did have to comment out the waitcarrier() check.

I’ve done a little bit of digging, and I think I’m starting to see 
how this breaks.


This always affects the jailed vlan interfaces. They’re getting 
deleted, but the ifp doesn’t go away just yet because it’s still in 
use by the multicast code.
The multicast code does its cleanup in task queues, so by the time it 
gets around to doing that the ifp is already marked as dying and the 
vnet is gone.
There are still references to the ifp though, and when the multicast 
code tries to do its cleanup we get the panic.


This hack stops the panic for me, but I don’t know if this is the best 
solution:


diff --git a/sys/net/if.c b/sys/net/if.c
index 59dd38267cf..bd0c87eddf1 100644
--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -3681,6 +3685,10 @@ if_delmulti_ifma_flags(struct ifmultiaddr *ifma, int flags)
 		ifp = NULL;
 	}
 #endif
+
+	if (ifp && ifp->if_flags & IFF_DYING)
+		return;
+
 	/*
 	 * If and only if the ifnet instance exists: Acquire the address lock.
 	 */
diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c
index 39fc82c5372..6493e2a5bfb 100644
--- a/sys/netinet/in_mcast.c
+++ b/sys/netinet/in_mcast.c
@@ -623,7 +623,7 @@ inm_release(struct in_multi *inm)
 
 	/* XXX this access is not covered by IF_ADDR_LOCK */
 	CTR2(KTR_IGMPV3, "%s: purging ifma %p", __func__, ifma);
-	if (ifp != NULL) {
+	if (ifp != NULL && (ifp->if_flags & IFF_DYING) == 0) {
 		CURVNET_SET(ifp->if_vnet);
 		inm_purge(inm);
 		free(inm, M_IPMADDR);

Best regards,
Kristof


Re: somewhat reproducible vimage panic

2020-07-23 Thread Kristof Provost
On 23 Jul 2020, at 0:15, John-Mark Gurney wrote:
> So, it's pretty easy to trigger, just attach a couple USB ethernet
> adapters, in my case, they were ure, but likely any two spare ethernet
> interfaces will work, and wire them back to back..
>
I’ve been able to trigger it using epair as well:

`sudo sh testinterfaces.txt epair0a epair0b`

I did have to comment out the waitcarrier() check.

Best regards,
Kristof


Bridge project update (Week of April 21st)

2020-04-26 Thread Kristof Provost

Hi,

This is likely the last update.

The main change (and the final test case) has been committed: 
https://svnweb.freebsd.org/changeset/base/360345

(And https://svnweb.freebsd.org/changeset/base/360346)

I intend to MFC this to stable/12 in due course, but it will do no harm 
to let it get a bit more testing in CURRENT first.


The Foundation wrote about this project here: 
https://www.freebsdfoundation.org/blog/500-if_bridge-performance-improvement/
There will be an in-depth article on this work in the upcoming May/June 
issue of FreeBSD Journal.


Best regards,
Kristof Provost


Re: CFT: if_bridge performance improvements

2020-04-24 Thread Kristof Provost

On 22 Apr 2020, at 18:15, Xin Li wrote:

On 4/22/20 01:45, Kristof Provost wrote:

On 22 Apr 2020, at 10:20, Xin Li wrote:

Hi,

On 4/14/20 02:51, Kristof Provost wrote:

Hi,

Thanks to support from The FreeBSD Foundation I’ve been able to 
work on

improving the throughput of if_bridge.
It changes the (data path) locking to use the NET_EPOCH 
infrastructure.

Benchmarking shows substantial improvements (x5 in test setups).

This work is ready for wider testing now.

It’s under review here: https://reviews.freebsd.org/D24250

Patch for CURRENT: https://reviews.freebsd.org/D24250?download=true
Patches for stable/12:
https://people.freebsd.org/~kp/if_bridge/stable_12/

I’m not currently aware of any panics or issues resulting from 
these

patches.


I have observed the following panic with latest stable/12 after 
applying
the stable_12 patchset; it appears to be a race-condition-related NULL 
pointer dereference, but I haven't taken a deeper look yet.
pointer deference, but I haven't took a deeper look yet.

The box have 7 igb(4) NICs, with several bridge and VLAN configured
acting as a router.  Please let me know if you need additional
information; I can try -CURRENT as well, but it would take some time 
as
the box is relatively slow (it's a ZFS based system so I can create 
a
separate boot environment for -CURRENT if needed, but that would 
take
some time as I might have to upgrade the packages, should there be 
any

ABI breakages).

Thanks for the report. I don’t immediately see how this could 
happen.


Are you running an L2 firewall on that bridge by any chance? An 
earlier
version of the patch had issues with a stray unlock in that code 
path.


I don't think I have an L2 firewall (I assume that means filtering based on
MAC address like what can be done with e.g. ipfw?  The bridges were
created on vlan interfaces though, do they count as L2 firewall?), the
system is using pf with a few NAT rules:



That backtrace looks identical to the one Peter reported, up to and 
including the offset in the bridge_input() function.
Given that there’s no likely way to end up with a NULL mutex either, I 
have to assume that it’s a case of trying to unlock a mutex that isn’t 
held, and the most likely reason is that you ran into the same problem Peter 
ran into.


The current version of the patch should resolve it.

Best regards,
Kristof


Re: CFT: if_bridge performance improvements

2020-04-22 Thread Kristof Provost

On 22 Apr 2020, at 10:20, Xin Li wrote:

Hi,

On 4/14/20 02:51, Kristof Provost wrote:

Hi,

Thanks to support from The FreeBSD Foundation I’ve been able to 
work on

improving the throughput of if_bridge.
It changes the (data path) locking to use the NET_EPOCH 
infrastructure.

Benchmarking shows substantial improvements (x5 in test setups).

This work is ready for wider testing now.

It’s under review here: https://reviews.freebsd.org/D24250

Patch for CURRENT: https://reviews.freebsd.org/D24250?download=true
Patches for stable/12: 
https://people.freebsd.org/~kp/if_bridge/stable_12/


I’m not currently aware of any panics or issues resulting from 
these

patches.


I have observed the following panic with latest stable/12 after 
applying

the stable_12 patchset, it appears like a race condition related NULL
pointer deference, but I haven't took a deeper look yet.

The box have 7 igb(4) NICs, with several bridge and VLAN configured
acting as a router.  Please let me know if you need additional
information; I can try -CURRENT as well, but it would take some time 
as

the box is relatively slow (it's a ZFS based system so I can create a
separate boot environment for -CURRENT if needed, but that would take
some time as I might have to upgrade the packages, should there be any
ABI breakages).


Thanks for the report. I don’t immediately see how this could happen.

Are you running an L2 firewall on that bridge by any chance? An earlier 
version of the patch had issues with a stray unlock in that code path.


Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Bridge project update (Week of April 14th)

2020-04-18 Thread Kristof Provost

Hi,

Again, relatively little to report on.
The review on the main patch is ongoing. It will likely be committed 
next week.


I launched a call for testing, and a number of people have done so. Only 
one major issue was reported, on the stable/12 version of the patch. 
That’s since been resolved.


Best regards,
Kristof Provost
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: CFT: if_bridge performance improvements

2020-04-16 Thread Kristof Provost

On 16 Apr 2020, at 8:34, Pavel Timofeev wrote:

Hi!
Thank you for your work!
Do you know if epair suffers from the same issue as tap?

I’ve not tested it, but I believe that epair scales significantly 
better than tap.
It has a per-cpu mutex (or more accurately, a mutex in each of its 
per-cpu structures), so I’d expect much better throughput from epair 
than you’d see from tap.


Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: FreeBSD CI Weekly Report 2020-04-12

2020-04-15 Thread Kristof Provost

On 15 Apr 2020, at 16:49, Olivier Cochard-Labbé wrote:
On Wed, Apr 15, 2020 at 4:10 PM Kristof Provost  
wrote:




The problem appears to be that
/usr/local/lib/python3.7/site-packages/scapy/arch/unix.py is 
misparsing

the `netstat -rnW` output.



Shouldn't scapy use the libxo output of netstat to mitigate this 
regression

?



That would likely help, yes. I’m going to leave that decision up to 
the maintainer, because I’m not going to do the work :)


I’m also not sure how “stable” we want the netstat output to be.

Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: FreeBSD CI Weekly Report 2020-04-12

2020-04-15 Thread Kristof Provost

On 15 Apr 2020, at 15:34, Kristof Provost wrote:

On 15 Apr 2020, at 0:37, Li-Wen Hsu wrote:
(Please send the followup to freebsd-testing@ and note Reply-To is 
set.)


FreeBSD CI Weekly Report 2020-04-12
===

Here is a summary of the FreeBSD Continuous Integration results for 
the period

from 2020-04-06 to 2020-04-12.

During this period, we have:

* 1801 builds (94.0% (+0.4) passed, 6.0% (-0.4) failed) of buildworld 
and
  buildkernel (GENERIC and LINT) were executed on aarch64, amd64, 
armv6,

  armv7, i386, mips, mips64, powerpc, powerpc64, powerpcspe, riscv64,
  sparc64 architectures for head, stable/12, stable/11 branches.
* 288 test runs (25.1% (-24.6) passed, 29.9% (+10.6) unstable, 45.1% 
(+14.1)
  exception) were executed on amd64, i386, riscv64 architectures for 
head,

  stable/12, stable/11 branches.
* 30 doc and www builds (83.3% (-1.3) passed, 16.7% (+1.3) failed)

Test case status (on 2020-04-12 23:59):
| Branch/Architecture | Total | Pass   | Fail | Skipped  |
| --- | - | -- |  |  |
| head/amd64  | 7744 (+4) | 7638 (+19) | 14 (+5)  | 92 (-20) |
| head/i386   | 7742 (+4) | 7628 (+15) | 16 (+5)  | 98 (-16) |
| 12-STABLE/amd64 | 7508 (0)  | 7449 (-3)  | 1 (+1)   | 58 (+2)  |
| 12-STABLE/i386  | 7506 (0)  | 7425 (-17) | 2 (+2)   | 79 (+15) |
| 11-STABLE/amd64 | 6882 (0)  | 6829 (-6)  | 1 (+1)   | 52 (+5)  |
| 11-STABLE/i386  | 6880 (0)  | 6749 (-82) | 80 (+80) | 51 (+2)  |


(The statistics from experimental jobs are omitted)

If any of the issues found by CI are in your area of interest or 
expertise

please investigate the PRs listed below.

The latest web version of this report is available at
https://hackmd.io/@FreeBSD-CI/report-20200412 and archive is 
available at

https://hackmd.io/@FreeBSD-CI/ , any help is welcome.

## News

* The test env now loads the required module for firewall tests.

* New armv7 job is added (to replace armv6 one):
  * FreeBSD-head-armv7-testvm
  The images are available at https://artifact.ci.freebsd.org
  FreeBSD-head-armv7-test is ready but needs test env update.

## Failing jobs

* https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc6_build/
  * See console log for the error details.

## Failing tests

* https://ci.freebsd.org/job/FreeBSD-head-amd64-test/
  * local.kyua.integration.cmd_about_test.topic__authors__installed
  * sys.netipsec.tunnel.empty.v4
  * sys.netipsec.tunnel.empty.v6
  * sys.netpfil.common.forward.ipf_v4
  * sys.netpfil.common.forward.ipfw_v4
  * sys.netpfil.common.forward.pf_v4
  * sys.netpfil.common.tos.ipfw_tos
  * sys.netpfil.common.tos.pf_tos
  * sys.netpfil.pf.forward.v4
I can’t actually reproduce this failure in my test VM, but with the 
ci test VM I can reproduce the problem.
It’s not related to pf, because the sanity check ping we do before 
we set up pf already fails.
Or rather pft_ping.py sends an incorrect packet, because `ping` does 
get the packet to go where it’s supposed to go.


Scapy seems to fail to find the source IP address, so we get this:

	12:12:22.152652 IP 0.0.0.0 > 198.51.100.3: ICMP echo request, id 0, 
seq 0, length 12


I have a vague recollection that we've seen this problem before, but
I can’t remember what we did about it.


In all likelihood most of the other netpfil tests fail for exactly the 
same reason.


The problem appears to be that 
/usr/local/lib/python3.7/site-packages/scapy/arch/unix.py is misparsing 
the `netstat -rnW` output.


For reference, this is the output in the test VM:

Routing tables

Internet:
Destination        Gateway       Flags   Nhop#    Mtu    Netif Expire
127.0.0.1          link#2        UH          1  16384      lo0
192.0.2.0/24       link#4        U           2   1500  epair0a
192.0.2.1          link#4        UHS         1  16384      lo0
198.51.100.0/24    192.0.2.2     UGS         3   1500  epair0a

Internet6:
Destination                       Gateway   Flags   Nhop#    Mtu    Netif Expire
::/96                             ::1       UGRS        4  16384      lo0
::1                               link#2    UH          1  16384      lo0
:::0.0.0.0/96                     ::1       UGRS        4  16384      lo0
fe80::/10                         ::1       UGRS        4  16384      lo0
fe80::%lo0/64                     link#2    U           3  16384      lo0
fe80::1%lo0                       link#2    UHS         2  16384      lo0
fe80::%epair0a/64                 link#4    U           5   1500  epair0a
fe80::3d:9dff:fe7c:d70a%epair0a   link#4    UHS         1  16384      lo0
fe80::%epair1a/64                 link#6    U           6   1

Re: FreeBSD CI Weekly Report 2020-04-12

2020-04-15 Thread Kristof Provost

On 15 Apr 2020, at 0:37, Li-Wen Hsu wrote:
(Please send the followup to freebsd-testing@ and note Reply-To is 
set.)


FreeBSD CI Weekly Report 2020-04-12
===

Here is a summary of the FreeBSD Continuous Integration results for 
the period

from 2020-04-06 to 2020-04-12.

During this period, we have:

* 1801 builds (94.0% (+0.4) passed, 6.0% (-0.4) failed) of buildworld 
and
  buildkernel (GENERIC and LINT) were executed on aarch64, amd64, 
armv6,

  armv7, i386, mips, mips64, powerpc, powerpc64, powerpcspe, riscv64,
  sparc64 architectures for head, stable/12, stable/11 branches.
* 288 test runs (25.1% (-24.6) passed, 29.9% (+10.6) unstable, 45.1% 
(+14.1)
  exception) were executed on amd64, i386, riscv64 architectures for 
head,

  stable/12, stable/11 branches.
* 30 doc and www builds (83.3% (-1.3) passed, 16.7% (+1.3) failed)

Test case status (on 2020-04-12 23:59):
| Branch/Architecture | Total | Pass   | Fail | Skipped  |
| --- | - | -- |  |  |
| head/amd64  | 7744 (+4) | 7638 (+19) | 14 (+5)  | 92 (-20) |
| head/i386   | 7742 (+4) | 7628 (+15) | 16 (+5)  | 98 (-16) |
| 12-STABLE/amd64 | 7508 (0)  | 7449 (-3)  | 1 (+1)   | 58 (+2)  |
| 12-STABLE/i386  | 7506 (0)  | 7425 (-17) | 2 (+2)   | 79 (+15) |
| 11-STABLE/amd64 | 6882 (0)  | 6829 (-6)  | 1 (+1)   | 52 (+5)  |
| 11-STABLE/i386  | 6880 (0)  | 6749 (-82) | 80 (+80) | 51 (+2)  |

(The statistics from experimental jobs are omitted)

If any of the issues found by CI are in your area of interest or 
expertise

please investigate the PRs listed below.

The latest web version of this report is available at
https://hackmd.io/@FreeBSD-CI/report-20200412 and archive is available 
at

https://hackmd.io/@FreeBSD-CI/ , any help is welcome.

## News

* The test env now loads the required module for firewall tests.

* New armv7 job is added (to replace armv6 one):
  * FreeBSD-head-armv7-testvm
  The images are available at https://artifact.ci.freebsd.org
  FreeBSD-head-armv7-test is ready but needs test env update.

## Failing jobs

* https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc6_build/
  * See console log for the error details.

## Failing tests

* https://ci.freebsd.org/job/FreeBSD-head-amd64-test/
  * local.kyua.integration.cmd_about_test.topic__authors__installed
  * sys.netipsec.tunnel.empty.v4
  * sys.netipsec.tunnel.empty.v6
  * sys.netpfil.common.forward.ipf_v4
  * sys.netpfil.common.forward.ipfw_v4
  * sys.netpfil.common.forward.pf_v4
  * sys.netpfil.common.tos.ipfw_tos
  * sys.netpfil.common.tos.pf_tos
  * sys.netpfil.pf.forward.v4
I can’t actually reproduce this failure in my test VM, but with the ci 
test VM I can reproduce the problem.
It’s not related to pf, because the sanity check ping we do before we 
set up pf already fails.
Or rather pft_ping.py sends an incorrect packet, because `ping` does get 
the packet to go where it’s supposed to go.


Scapy seems to fail to find the source IP address, so we get this:

	12:12:22.152652 IP 0.0.0.0 > 198.51.100.3: ICMP echo request, id 0, seq 
0, length 12


I have a vague recollection that we've seen this problem before, but I
can’t remember what we did about it.


In all likelihood most of the other netpfil tests fail for exactly the 
same reason.


Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


CFT: if_bridge performance improvements

2020-04-14 Thread Kristof Provost

Hi,

Thanks to support from The FreeBSD Foundation I’ve been able to work 
on improving the throughput of if_bridge.
It changes the (data path) locking to use the NET_EPOCH infrastructure. 
Benchmarking shows substantial improvements (x5 in test setups).


This work is ready for wider testing now.

It’s under review here: https://reviews.freebsd.org/D24250

Patch for CURRENT: https://reviews.freebsd.org/D24250?download=true
Patches for stable/12: 
https://people.freebsd.org/~kp/if_bridge/stable_12/


I’m not currently aware of any panics or issues resulting from these 
patches.


Do note that if you run a Bhyve + tap on bridges setup the tap code 
suffers from a similar bottleneck and you will likely not see major 
improvements in single VM to host throughput. I would expect, but have 
not tested, improvements in overall throughput (i.e. when multiple VMs 
send traffic at the same time).


Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Bridge project update (Week of April 6th)

2020-04-11 Thread Kristof Provost

Hi,

There’s relatively little to report on this week.
The main patch is still pending review. I’ll give that another week or 
so, so there may not be much to report on next week either.


There is one new test ready to be committed:

 - https://reviews.freebsd.org/D24337

I initially expected not to be able to MFC these changes back to 12, but 
that may be possible after all.

There is an experimental patch set for stable/12 here:
https://people.freebsd.org/~kp/if_bridge/stable_12/

Best regards,
Kristof Provost
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Bridge project update (Week of March 30th)

2020-04-04 Thread Kristof Provost

Hi,

A productive week!

I’ve posted the main body of the patch for review:

 - https://reviews.freebsd.org/D24249

   A preparatory patch. Mostly a mechanical substitution of LIST -> 
CK_LIST


 - https://reviews.freebsd.org/D24250

   The main work. This changes the bridge data path to be mostly 
lockless (the only exception is when we have to add or update an rtnode).


 - https://reviews.freebsd.org/D24251

   Another test case, for PR 216510. That bug was fixed more or less by 
accident during this work.


I’ve also run performance testing with these patches, and I’m pretty 
happy with the results. The test shows an increase in throughput from 
3.7Mpps to 18.6Mpps.


The flame graphs also clearly show we’re no longer contending on the 
bridge mutex:


 - before: https://people.freebsd.org/~kp/if_bridge/unmodified.svg
 - after: https://people.freebsd.org/~kp/if_bridge/unicast.svg

I’ll give D245250 another week or two for reviews. It’s a relatively 
small patch, considering, but it’s complex and important.
I also intend to add another test case for a cleanup issue that’s 
since been fixed in D24250.


Best regards,
Kristof Provost
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Bridge project update (Week of March 23rd)

2020-03-28 Thread Kristof Provost

Hi,

This week I got a prototype patch running, along the ideas discussed 
last week. Essentially, we keep the BRIDGE_LOCK for any add/delete of 
interfaces or rt entries, but use CK_LIST and epoch in the data path.
This is a relatively straightforward change, and currently passes the 
regression tests (WITNESS/INVARIANTS enabled). I’ve also run traffic 
through it without issue.
My current test setup fails to generate sufficient packets to truly 
stress the code. I’m hoping to get access to a more suitable setup 
next week so I can usefully measure the improvement.
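
To sketch the approach in code (a simplified illustration only; the
structure layout, list head and field names below are placeholders, not
the actual if_bridge patch):

/*
 * Sketch of the locking split: writers still take the bridge mutex,
 * while the data path relies only on the network epoch. Names are
 * simplified and do not match the real code exactly.
 */
struct rtnode {
	CK_LIST_ENTRY(rtnode)	rt_next;
	uint8_t			rt_addr[ETHER_ADDR_LEN];
	struct ifnet		*rt_ifp;
};

/* Writer: additions and removals remain serialized by the bridge lock. */
static void
rt_insert(struct bridge_softc *sc, struct rtnode *rt)
{
	BRIDGE_LOCK(sc);
	CK_LIST_INSERT_HEAD(&sc->sc_rtlist, rt, rt_next);
	BRIDGE_UNLOCK(sc);
}

/* Data path: readers only need to be inside the network epoch. */
static struct ifnet *
rt_lookup(struct bridge_softc *sc, const uint8_t *dst)
{
	struct rtnode *rt;

	NET_EPOCH_ASSERT();
	CK_LIST_FOREACH(rt, &sc->sc_rtlist, rt_next) {
		if (memcmp(rt->rt_addr, dst, ETHER_ADDR_LEN) == 0)
			return (rt->rt_ifp);
	}
	return (NULL);
}

The point is that forwarding-path lookups no longer contend on the
bridge mutex; only topology changes do.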


The prototype patch is also running on my home gateway right now. So far 
so good!
Assuming no major issues come up in the next week or two I hope to post 
the patch for initial review in the near future.


Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Bridge project update (Week of March 16th)

2020-03-22 Thread Kristof Provost

Hi,

A quick status update this week.

I’ve been preoccupied with the AsiaBSDcon replacement hackathon 
(Thanks for organising hrs@!) and repeated changes to my travel plans.


I did get the chance to discuss my ideas with manu@, which was very 
helpful. I believe I have a good idea of how to approach the 
epoch-ification now, and hope to have a functional prototype in a few 
weeks.


Best regards,
Kristof Provost
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


bridge project update (Week of March 9th)

2020-03-15 Thread Kristof Provost

Hi,

As this is the first status report sent to a wider audience I’ll try 
to give a bit of background information.


I’m working on a performance improvement project for if_bridge. Right 
now it’s a big bottleneck for a number of different scenarios (e.g. 
for VNET jail or VM hosts).
if_bridge currently has a single mutex to protect its internal data 
structures. As a result it’s nowhere near as fast as it could be.


I’ve started the project by adding a number of tests to ensure that I 
don’t break things (or at least not everything) during this project.
A number of tests have already been committed. One more will go in soon 
(https://reviews.freebsd.org/D23961). They all live in 
/usr/tests/sys/net/if_bridge_test.


Aside from that I’ve been investigating the possibility of using the 
NET_EPOCH to improve bridge throughput. It’s very early, of course, 
but I’m investigating the possibility of keeping the bridge lock, but 
removing it from bridge_input/bridge_output/… (i.e. the data path), 
instead relying on NET_EPOCH to ensure that the important data 
structures don’t go away while we’re processing packets.


Part of that work was building my own understanding of how the epoch 
system is supposed to work. Very briefly (and with the caveat that 
I’ve only just started looking at it): Use lockless lists (CK_*). 
Objects should remain valid (i.e. not free()d) in between 
NET_EPOCH_ENTER() and NET_EPOCH_EXIT(). To accomplish this the object 
can be freed through a NET_EPOCH_CALL() callback, which will only be 
done once all CPUs have left their NET_EPOCH_(ENTER|EXIT) sections. This 
requires an epoch_context, which can best be placed in the to-be freed 
structure.
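
A minimal illustration of that pattern, using a made-up object type
("foo") rather than anything from the bridge code:

/*
 * Sketch of the lifecycle described above. The writer-side lock is
 * omitted; only the epoch-related pieces are shown.
 */
struct foo {
	CK_LIST_ENTRY(foo)	f_entry;
	int			f_value;
	struct epoch_context	f_epoch_ctx;	/* lives in the object */
};

static CK_LIST_HEAD(foo_head, foo) foo_list;	/* CK_LIST_INIT()ed at attach */

static void
foo_free_cb(epoch_context_t ctx)
{
	struct foo *f;

	f = __containerof(ctx, struct foo, f_epoch_ctx);
	free(f, M_TEMP);
}

static void
foo_remove(struct foo *f)
{
	/* Writers are still serialized by a regular lock (not shown). */
	CK_LIST_REMOVE(f, f_entry);
	/* The free only runs once every CPU has left its epoch section. */
	NET_EPOCH_CALL(foo_free_cb, &f->f_epoch_ctx);
}

static bool
foo_exists(int value)
{
	struct epoch_tracker et;
	struct foo *f;
	bool found = false;

	NET_EPOCH_ENTER(et);
	CK_LIST_FOREACH(f, &foo_list, f_entry) {
		if (f->f_value == value) {
			found = true;
			break;
		}
	}
	NET_EPOCH_EXIT(et);
	return (found);
}

The key design point is that the epoch_context is embedded in the object
itself, so deferring the free requires no extra allocation.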


Feel free to get in touch if you have questions, remarks or suggestions.

Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: llvm 10 libomp build error

2020-03-12 Thread Kristof Provost

On 12 Mar 2020, at 16:58, Ronald Klop wrote:

Hi,

Clean build after svn update gives:

cc: error: no such file or directory: 
'/data/src/freebsd-current/contrib/llvm-pr

oject/openmp/runtime/src/thirdparty/ittnotify/ittnotify_static.c'
cc: error: no input files
*** [ittnotify_static.pico] Error code 1

make[5]: stopped in /data/src/freebsd-current/lib/libomp

The file ittnotify_static.c does indeed not exist. A .cpp version does 
exist.



[builder@sjakie ~]$ more /etc/src.conf
KERNCONF?=GENERIC-NODEBUG
# Don't build these
WITHOUT_LLVM_TARGET_ALL=true
WITHOUT_LPR=true
WITHOUT_PROFILE=true
WITHOUT_SENDMAIL=true
WITHOUT_TESTS=true


What is the advice? Currently rebuilding with -j 1, but that will take
hours to build the new clang again.


Try removing /usr/obj/* (or whatever your object directory is) and 
rebuilding.

That worked for me.

Best,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: kyua test

2020-01-31 Thread Kristof Provost

On 31 Jan 2020, at 7:34, Clay Daniels wrote:
I've started running kyua test when I load the weekly current 
snapshot, and
I'm a little confused about if I should run kyua test as user or root. 
In
order to make the /usr/ports/devel/kyua port you need to be root and I 
have

just been doing the test as root, but I notice in the instructions I'm
using in the test(7) manpage it says:

$ kyua test -k /usr/tests/Kyuafile

Which suggested to me to run as user with the $ (not #)

Of course, when I run it as user as I'm doing right now, it skips some
tests that are only for root. I guess I could use a little advice.

Some tests require root, some do not. It depends on what you want to 
test.
All tests that require root should announce this in their configuration, 
so running tests as a regular user should work, but you’ll end up with 
more skipped tests than if you run them as root.


I personally mostly care about network (and specifically pf) tests, so I 
tend to always run them as root. If you care about (e.g.) grep tests 
they should just work as a regular user.


Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: FreeBSD CI Weekly Report 2019-06-09

2019-06-15 Thread Kristof Provost

On 15 Jun 2019, at 11:35, Kristof Provost wrote:

On 12 Jun 2019, at 16:49, Li-Wen Hsu wrote:

* https://ci.freebsd.org/job/FreeBSD-head-i386-test/
* Same as amd64:
* sys.netinet.socket_afinet.socket_afinet_bind_zero
* Others:
* sys.netpfil.pf.forward.v6
* sys.netpfil.pf.forward.v4
* sys.netpfil.pf.set_tos.v4


I’ve finally gotten around to taking a look at this, and it appears 
to not be a pf problem. forward:v4 already fails at its sanity check, 
before it configures pf.


It creates a vnet jail, telling it to route traffic through, and then 
we run a sanity check with pft_ping.py.
Scapy tries to resolve the MAC address of the gateway (jail, 
192.0.2.1). The jail replies, but scapy never picks up the reply, so 
the traffic looks like this:


	13:19:29.953468 02:be:b4:57:9f:0a > ff:ff:ff:ff:ff:ff, ethertype ARP 
(0x0806), length 42: Request who-has 192.0.2.2 tell 192.0.2.1, length 
28
	13:19:29.953572 02:be:b4:57:9f:0b > 00:a0:98:b2:48:59, ethertype ARP 
(0x0806), length 42: Reply 192.0.2.2 is-at 02:be:b4:57:9f:0b, length 
28
	13:19:32.082843 02:be:b4:57:9f:0a > ff:ff:ff:ff:ff:ff, ethertype IPv4 
(0x0800), length 52: 192.0.2.1 > 198.51.100.3: ICMP echo request, id 
0, seq 0, length 18


The jail doesn’t forward the broadcast ICMP echo request and the 
test fails.


My current guess is that it’s related to bpf. It’s interesting to 
note that it fails on i386, but succeeds on amd64.



I’ve done a little dtracing, and I think that points at bpf too:

#!/usr/sbin/dtrace -s

fbt:kernel:bpf_buffer_uiomove:entry
{
tracemem(arg1, 1500, arg2);
stack();
}

Results in:

  1  49539 bpf_buffer_uiomove:entry
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
     0: ce 0e 05 5d 17 ea 00 00 2a 00 00 00 2a 00 00 00  ...]*...*...
    10: 12 00 ff ff ff ff ff ff 02 fd 10 30 e6 0a 08 06  ...0
    20: 00 01 08 00 06 04 00 01 00 a0 98 b2 48 59 c0 00  HY..
    30: 02 01 00 00 00 00 00 00 c0 00 02 02 ce 0e 05 5d  ...]
    40: 60 ea 00 00 2a 00 00 00 2a 00 00 00 12 00 00 a0  `...*...*...
    50: 98 b2 48 59 02 fd 10 30 e6 0b 08 06 00 01 08 00  ..HY...0
    60: 06 04 00 02 02 fd 10 30 e6 0b c0 00 02 02 00 a0  ...0

70: 98 b2 48 59 c0 00 02 01  ..HY

  kernel`bpfread+0x137
  kernel`dofileread+0x6d
  kernel`kern_readv+0x3b
  kernel`sys_read+0x48
  kernel`syscall+0x2b4
  0xffc033b7

So, we see the ARP request through bpf, but we don’t see the reply, 
despite tcpdump capturing it. I have no idea how that’d happen, so 
I’d very much like someone more familiar with bpf to take a look at 
this problem.


Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: FreeBSD CI Weekly Report 2019-06-09

2019-06-15 Thread Kristof Provost

On 12 Jun 2019, at 16:49, Li-Wen Hsu wrote:

* https://ci.freebsd.org/job/FreeBSD-head-i386-test/
* Same as amd64:
* sys.netinet.socket_afinet.socket_afinet_bind_zero
* Others:
* sys.netpfil.pf.forward.v6
* sys.netpfil.pf.forward.v4
* sys.netpfil.pf.set_tos.v4


I’ve finally gotten around to taking a look at this, and it appears to 
not be a pf problem. forward:v4 already fails at its sanity check, 
before it configures pf.


It creates a vnet jail, telling it to route traffic through, and then we 
run a sanity check with pft_ping.py.
Scapy tries to resolve the MAC address of the gateway (jail, 192.0.2.1). 
The jail replies, but scapy never picks up the reply, so the traffic 
looks like this:


	13:19:29.953468 02:be:b4:57:9f:0a > ff:ff:ff:ff:ff:ff, ethertype ARP 
(0x0806), length 42: Request who-has 192.0.2.2 tell 192.0.2.1, length 28
	13:19:29.953572 02:be:b4:57:9f:0b > 00:a0:98:b2:48:59, ethertype ARP 
(0x0806), length 42: Reply 192.0.2.2 is-at 02:be:b4:57:9f:0b, length 28
	13:19:32.082843 02:be:b4:57:9f:0a > ff:ff:ff:ff:ff:ff, ethertype IPv4 
(0x0800), length 52: 192.0.2.1 > 198.51.100.3: ICMP echo request, id 0, 
seq 0, length 18


The jail doesn’t forward the broadcast ICMP echo request and the test 
fails.


My current guess is that it’s related to bpf. It’s interesting to 
note that it fails on i386, but succeeds on amd64.


--
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic with r346530 [Re: svn commit: r346530 - in head/sys: netinet netinet6]

2019-04-22 Thread Kristof Provost

On 22 Apr 2019, at 12:25, Enji Cooper wrote:
Either the sys/netinet/ or sys/netipsec/ tests triggered the panic. 
Not sure which right now.


That looks to be happening during a vnet jail teardown, so it’s likely 
the sys/netipsec or sys/netpfil/pf tests.


I’ve done a quick test with the pf tests, and they provoke this panic:

	panic: mtx_lock() of destroyed mutex @ 
/usr/src/sys/netinet/ip_reass.c:628

cpuid = 0
time = 1555939645
KDB: stack backtrace:
	db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe0091d68530

vpanic() at vpanic+0x19d/frame 0xfe0091d68580
panic() at panic+0x43/frame 0xfe0091d685e0
__mtx_lock_flags() at __mtx_lock_flags+0x12e/frame 0xfe0091d68630
ipreass_cleanup() at ipreass_cleanup+0x86/frame 0xfe0091d68670
	if_detach_internal() at if_detach_internal+0x786/frame 
0xfe0091d686f0

if_detach() at if_detach+0x3d/frame 0xfe0091d68710
lo_clone_destroy() at lo_clone_destroy+0x16/frame 0xfe0091d68730
	if_clone_destroyif() at if_clone_destroyif+0x21f/frame 
0xfe0091d68780

if_clone_detach() at if_clone_detach+0xb8/frame 0xfe0091d687b0
vnet_loif_uninit() at vnet_loif_uninit+0x26/frame 0xfe0091d687d0
vnet_destroy() at vnet_destroy+0x124/frame 0xfe0091d68800
prison_deref() at prison_deref+0x29d/frame 0xfe0091d68840
sys_jail_remove() at sys_jail_remove+0x28f/frame 0xfe0091d68890
amd64_syscall() at amd64_syscall+0x276/frame 0xfe0091d689b0
	fast_syscall_common() at fast_syscall_common+0x101/frame 
0xfe0091d689b0
	--- syscall (508, FreeBSD ELF64, sys_jail_remove), rip = 0x80031e12a, 
rsp = 0x7fffe848, rbp = 0x7fffe8d0 ---

KDB: enter: panic
[ thread pid 1223 tid 100501 ]
Stopped at  kdb_enter+0x3b: movq$0,kdb_why
db>

To reproduce:

kldload pfsync
cd /usr/tests/sys/netpfil/pf
sudo kyua test

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Networking panic on 12 - found the cause

2019-02-12 Thread Kristof Provost
On 2019-02-12 13:54:21 (-0600), Eric van Gyzen  wrote:
> I see the same behavior on head (and stable/12).
> 
> (kgdb) f
> #16 0x80ce5331 in ether_output_frame (ifp=0xf80003672800,
> m=0xf8000c88b100) at /usr/src/sys/net/if_ethersubr.c:468
> 468   switch (pfil_run_hooks(V_link_pfil_head, &m, ifp, PFIL_OUT,
> PFIL_OUT,
> 
>0x80ce5321 <+81>:  mov%gs:0x0,%rax
>0x80ce532a <+90>:  mov0x500(%rax),%rax
> => 0x80ce5331 <+97>:  mov0x28(%rax),%rax
> 
> I think this is part of the V_link_pfil_head.  I'm not very familiar
> with vnet.  Does this need a CURVNET_SET(), maybe in garp_rexmit()?
> 
Yes. I posted a proposed patch in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235699

Basically we get called through a timer, so there's no vnet context. It
needs to be set, and then we can safely use any V_ variables.
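
The shape of the fix is roughly the following (a sketch of the pattern
only; the exact field access in the committed patch may differ):

/*
 * Callout handlers run without a vnet context of their own, so one
 * must be established before touching any V_ variable (or code that
 * uses them, such as the pfil hooks).
 */
static void
garp_rexmit(void *arg)
{
	struct in_ifaddr *ia = arg;

	/* Borrow the vnet of the interface that owns the address. */
	CURVNET_SET(ia->ia_ifa.ifa_ifp->if_vnet);

	/* ... transmit the gratuitous ARP as before ... */

	CURVNET_RESTORE();
}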

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: 12.0-BETA1 vnet with pf firewall

2018-10-30 Thread Kristof Provost

On 30 Oct 2018, at 14:29, Bjoern A. Zeeb wrote:

On 30 Oct 2018, at 12:23, Kristof Provost wrote:
I’m not too familiar with this part of the vnet code, but it looks 
to me like we’ve got more per-vnet variables that was originally 
anticipated, so we may need to just increase the allocated space.


Can you elfdump -a the two modules and see how big their set_vnet 
section sizes are?  I see:


pf.ko:  sh_size: 6664
ipl.ko: sh_size: 2992


I see exactly the same numbers.

VNET_MODMIN is two pages (8k).  So yes, that would exceed the module 
space.
Having 6.6k global variable space is a bit excessive?  Where does that 
come from?  multicast used to have a similar problem in the past that 
it could not be loaded as a module as it had a massive array there and 
we changed it to be malloced and that reduced it to a pointer.


0f38 l O set_vnet   0428 
vnet_entry_pfr_nulltable
That’s a default table. It’s large because it uses MAXPATHLEN for 
the pfrt_anchor string.


0b10 l O set_vnet   03d0 
vnet_entry_pf_default_rule
Default rule. Rules potentially contain names, tag names, interface 
names, … so it’s a large structure.


1370 l O set_vnet   0690 
vnet_entry_pf_main_anchor
Anchors use MAXPATHLEN for the anchor path, so that’s 1024 bytes right 
away.


 l O set_vnet   0120 
vnet_entry_pf_status



pf status. Mostly counters.

I’ll see about putting moving those into the heap on my todo list.

Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: 12.0-BETA1 vnet with pf firewall

2018-10-30 Thread Kristof Provost

On 29 Oct 2018, at 4:41, Kristof Provost wrote:
So we panic because we dereference a NULL pointer in strncmp(), which 
happens because nprogtab = 13 but ef->progtab[12] has NULL pointers.


It’s not clear to me why that happens, but it’s something to go 
on. I do wonder if this isn’t a bit of a red herring too. It might 
be an error in the error path (because we pass through 
linker_file_unload()). link_elf_load_file() increments ef->nprogtab 
for SHT_X86_64_UNWIND, so perhaps the error handling doesn’t cope 
with that.


It looks like the root of the problem (failure to load) is in this line 
of link_elf_load_file():


ef->progtab[pb].addr =
vnet_data_alloc(shdr[i].sh_size);

The allocation of the vnet data fails. Bumping VNET_MODMIN in 
sys/net/vnet.c makes the load of ipfilter and pf succeed.


I’m not too familiar with this part of the vnet code, but it looks to 
me like we’ve got more per-vnet variables that was originally 
anticipated, so we may need to just increase the allocated space.


Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: 12.0-BETA1 vnet with pf firewall

2018-10-28 Thread Kristof Provost

On 28 Oct 2018, at 14:39, Rodney W. Grimes wrote:

Bjoern A. Zeeb wrote:

On 28 Oct 2018, at 15:31, Ernie Luzar wrote:

Tested with host running ipfilter and vnet running pf. Tried 
loading
pf from host console or from vnet console using kldload pf.ko 
command

and get this error message;

linker_load_file: /boot/kernel/pf.ko-unsupported file type.

Looks like the 12.0 version of pf, which is supposed to work in a vnet
independent of what firewall is running on the host, is not working.


You cannot load pf from inside a jail (with or without vnet).  
Kernel
modules are global objects loaded from the base system or you 
compile
the devices into the kernel;  it is their state which is 
virtualised.


If you load multiple firewalls they will all be available to the 
base
system and all jails+vnet.  Whichever you configure in which one is 
up

to you.  Just be careful as an unconfigured firewall might have a
default action affecting the outcome of the overall decision.

For example you could have:

a base system using ipfilter and setting pf to default accept 
everything
and a jail+vnet using pf and setting ipfilter there to accept 
everything.



Hope that clarifies some things.

/bz



Hello Bjoern.

What you said is correct for 10.x & 11.x. But I am talking about
12.0-beta1.  I have the ipfilter options enabled in rc.conf of the 
host
and on boot ipfilter starts just like it all ways does. Now to prep 
the

host for pf in a vnet jail, I issue from the host console the
"kldload pf.ko" command and get this error message;

linker_load_file: /boot/kernel/pf.ko-unsupported file type.

Something is wrong here. This is not supposed to happen according to 
your

post above.

Remember that in 12.0 vimage is included in the base system kernel.


Confirmed, if I boot a clean install and issue:
kldload ipfilter.ko
kldload pf.ko
my dmesg has:
IP Filter: v5.1.2 initialized.  Default = pass all, Logging = enabled
linker_load_file: /boot/kernel/pf.ko - unsupported file type


Yeah, something’s very, very broken somewhere.

On head loading both pf and ipfilter panics:

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address   = 0x0
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x80c8a1d0
stack pointer   = 0x28:0xfe0088955340
frame pointer   = 0x28:0xfe0088955340
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 940 (kldload)
[ thread pid 940 tid 100473 ]
Stopped at  strncmp+0x10:   movzbl  (%rdi,%rcx,1),%r8d
db> bt
Tracing pid 940 tid 100473 td 0xf80007599000
strncmp() at strncmp+0x10/frame 0xfe0088955340
	link_elf_lookup_set() at link_elf_lookup_set+0x64/frame 
0xfe0088955390
	sdt_kld_unload_try() at sdt_kld_unload_try+0x39/frame 
0xfe00889553d0
	linker_file_unload() at linker_file_unload+0xeb/frame 
0xfe0088955430
	link_elf_load_file() at link_elf_load_file+0x152/frame 
0xfe00889554f0
	linker_load_module() at linker_load_module+0x97a/frame 
0xfe0088955800

kern_kldload() at kern_kldload+0xf1/frame 0xfe0088955850
sys_kldload() at sys_kldload+0x5b/frame 0xfe0088955880
amd64_syscall() at amd64_syscall+0x278/frame 0xfe00889559b0
	fast_syscall_common() at fast_syscall_common+0x101/frame 
0xfe00889559b0
	--- syscall (304, FreeBSD ELF64, sys_kldload), rip = 0x8002d2f7a, rsp = 
0x7fffe588, rbp = 0x7fffeb00 ---


While I’d recommend very strongly against trying to mix firewalls we 
obviously shouldn’t panic.


This doesn’t appear to be specifically either firewalls fault though, 
as the panic happens during the linking of the module, not 
initialisation of the firewall. Also, it happens regardless of load 
order (so ipfilter first or pf first).


(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:366
	#2  0x8046576b in db_dump (dummy=, 
dummy2=, dummy3=, dummy4=) at 
/usr/src/sys/ddb/db_command.c:574
	#3  0x80465539 in db_command (last_cmdp=, 
cmd_table=, dopager=) at 
/usr/src/sys/ddb/db_command.c:481
	#4  0x804652b4 in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:534
	#5  0x804684cf in db_trap (type=, 
code=) at /usr/src/sys/ddb/db_main.c:252
	#6  0x80be71c7 in kdb_trap (type=12, code=0, 
tf=0xfe0088955280) at /usr/src/sys/kern/subr_kdb.c:693
	#7  0x81073f51 in trap_fatal (frame=0xfe0088955280, eva=0) 
at /usr/src/sys/amd64/amd64/trap.c:921
	#8  0x81074072 in trap_pfault (frame=0xfe0088955280, 
usermode=) at /usr/src/sys/amd64/amd64/trap.c:765
	#9  0x8107369a in 

Re: 12.0-BETA1 vnet with pf firewall

2018-10-28 Thread Kristof Provost


> On 28 Oct 2018, at 12:56, Ernie Luzar  wrote:
> 
> Bjoern A. Zeeb wrote:
>>> On 28 Oct 2018, at 15:31, Ernie Luzar wrote:
>>> Tested with host running ipfilter and vnet running pf. Tried loading pf 
>>> from host console or from vnet console using kldload pf.ko command and get 
>>> this error message;
>>> 
>>> linker_load_file: /boot/kernel/pf.ko-unsupported file type.
>>> 
>>> Looks like the 12.0 version of pf which is suppose to work in vnet 
>>> independent of what firewall is running on the host is not working.
>> You cannot load pf from inside a jail (with or without vnet).  Kernel 
>> modules are global objects loaded from the base system or you compile the 
>> devices into the kernel;  it is their state which is virtualised.
>> If you load multiple firewalls they will all be available to the base system 
>> and all jails+vnet.  Whichever you configure in which one is up to you.  
>> Just be careful as an unconfigured firewall might have a default action 
>> affecting the outcome of the overall decision.
>> For example you could have:
>> a base system using ipfilter and setting pf to default accept everything
>> and a jail+vnet using pf and setting ipfilter there to accept everything.
>> Hope that clarifies some things.
>> /bz
> 
> Hello Bjoern.
> 
> What you said is correct for 10.x & 11.x. But I an talking about 12.0-beta1.  
> I have the ipfilter options enabled in rc.conf of the host and on boot 
> ipfilter starts just like it all ways does. Now to prep the host for pf in a 
> vnet jail, I issue from the host console the
> "kldload pf.ko" command and get this error message;
> 
> linker_load_file: /boot/kernel/pf.ko-unsupported file type.
> 
> Something is wrong here. This is not suppose to happen according to your post 
> above.
> 
> Remember that in 12.0 vimage is included in the base system 

That sounds like something’s wrong with your install and the kernel module does 
not match the kernel. 

How did you install?

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: vnet & firewalls in 12.0

2018-10-18 Thread Kristof Provost

On 18 Oct 2018, at 11:33, Ernie Luzar wrote:
Wanting to get a head start on using 12.0 and vnet jails with in jail 
firewall.


1. Will Vimage be compiled as a module in the 12.0 kernel and be 
included in the base system release?


vimage is a kernel option, not a module. It affects the entire kernel, 
and cannot be loaded as a module. It’s either enabled or not (and 
it’s enabled in 12.0).


1.a. Has the boot time console log message about vimage being "highly 
experimental" been removed?



Yes. It was removed around the time it was enabled by default.

2. Has the pf firewall been fixed so it can now run in a vnet jail or 
multiple vnet jails with out concern for which firewall is running on 
the host?



Yes. The automated pf tests rely on vimage.


2.a. Is each vnet/pf log only viewable from it's vnet jail console?

Yes, assuming you mean pflog output. Log files can of course be read 
from the host.



2.b. Will pf/kernel module auto load on first call from a vnet jail?

No. The decision to load the pf module is made by the host. If the 
module is not loaded no jail will be able to use it. Jails may not load 
kernel modules, for obvious reasons.



2.c. Does vnet/pf NAT work?


Yes.

Best regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Bad DHCP Checksums over VLANs

2018-09-16 Thread Kristof Provost

On 16 Sep 2018, at 23:05, Eric van Gyzen wrote:

On 9/15/18 1:06 AM, Kurt Jaeger wrote:

Can you disable all the options of the NIC ?

ifconfig igb0 -rxcsum -txcsum -wol -tso4 -vlanmtu -vlanhwtag 
-vlanhwcsum -vlanhwtso


Try to disable everything that can be disabled, e.g. LRO etc.


Disabling vlanhwtag works around the problem.

Also note that only DHCP traffic has this problem.  If I assign an 
address manually, all traffic flows normally.  Maybe the problem is in 
the BPF send path.



I had a similar issue, where -vlanhwtag also fixed it.
That was on an I210 (igb) card (in a FreeNAS mini XL).

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ifnet use after free

2018-08-25 Thread Kristof Provost

You may be right about that. With memguard (on ifnet) it implicates pf:

pfi_cleanup_vnet() at pfi_cleanup_vnet+0xa4/frame 0xfe084f775320
vnet_pf_uninit() at vnet_pf_uninit+0x85f/frame 0xfe084f7757c0
vnet_destroy() at vnet_destroy+0x12c/frame 0xfe084f7757f0
prison_deref() at prison_deref+0x29d/frame 0xfe084f775830
sys_jail_remove() at sys_jail_remove+0x28a/frame 0xfe084f775880
amd64_syscall() at amd64_syscall+0x28c/frame 0xfe084f7759b0
fast_syscall_common() at fast_syscall_common+0x101/frame 
0xfe084f7759b0
--- syscall (508, FreeBSD ELF64, sys_jail_remove), rip = 0x8003130da, 
rsp = 0x7fffe848, rbp = 0x7fffe8d0 ---


I’ll investigate that. Sorry for the noise.
Thanks for the pointer to memguard. Very useful.

Kristof

On 25 Aug 2018, at 19:44, Matthew Macy wrote:


I'll take a look. But it's likely to not be the OP's issue. For future
reference memguard on the memory type in question is extremely useful 
in

catching use after free.

-M

On Sat, Aug 25, 2018 at 05:51 Kristof Provost  wrote:


On 25 Aug 2018, at 0:47, Kristof Provost wrote:

On 25 Aug 2018, at 0:26, Matthew Macy wrote:

On Fri, Aug 24, 2018 at 15:25 Shawn Webb 
wrote:

Hey All,

Somewhere in the last month or so, a use after free was introduced. I
don't have the time right now to bisect the commits and figure out
which commit introduced the breakage. Attached is the core.txt (which
seems nonsensical because the dump is reporting on a different
thread). If the core.txt gets scrubbed, I've posted it here:
https://gist.github.com/796ea88cec19a1fd2a85f4913482286a

Do you have any guidance on how to reproduce? The hardenedbsd rev 
isn’t

useful - the svn commit that it’s based against is what is needed.

For what it’s worth, it’s not a hardenedbsd thing. I’ve been 
chasing the

same one (same offset, same allocation size, same most recent user).
Something gets set to zero/NULL. 8 bytes on amd64, so presumably a 
pointer.


I currently only trigger it on a development branch, but I’ll see 
if I can

clean that up into something I can share tomorrow.

In my test scenario it happens after shutdown of a vnet jail with a 
few
interfaces in it (including a pfsync interface which will disappear 
with

the jail), and new jails are started. It’s pretty reliable.

At a guess something’s wrong with the delayed cleanup of ifnets and 
vnet

shutdown.

I see this:

Memory modified after free 0xf800623ab000(2040) val=0 @ 
0xf800623ab398

panic: Most recently used by ifnet

cpuid = 7
time = 1535199812
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe008c8e13c0

vpanic() at vpanic+0x1a3/frame 0xfe008c8e1420
panic() at panic+0x43/frame 0xfe008c8e1480
mtrash_ctor() at mtrash_ctor+0x81/frame 0xfe008c8e14a0
uma_zalloc_arg() at uma_zalloc_arg+0x72c/frame 0xfe008c8e1510
malloc() at malloc+0x9a/frame 0xfe008c8e1560
if_alloc() at if_alloc+0x23/frame 0xfe008c8e1590
epair_clone_create() at epair_clone_create+0x239/frame 
0xfe008c8e1610
if_clone_createif() at if_clone_createif+0x4a/frame 
0xfe008c8e1660

ifioctl() at ifioctl+0x852/frame 0xfe008c8e1750
kern_ioctl() at kern_ioctl+0x2ba/frame 0xfe008c8e17b0
sys_ioctl() at sys_ioctl+0x15e/frame 0xfe008c8e1880
amd64_syscall() at amd64_syscall+0x28c/frame 0xfe008c8e19b0
fast_syscall_common() at fast_syscall_common+0x101/frame 
0xfe008c8e19b0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80047b74a, rsp = 
0x7fffe208, rbp = 0x7fffe250 ---

KDB: enter: panic
[ thread pid 1426 tid 100466 ]
Stopped at  kdb_enter+0x3b: movq$0,kdb_why
db>

It does require a couple of bug fixes in pfsync to trigger. You can 
get

them from the pfsync_vnet branch in
https://github.com/kprovost/freebsd/tree/pfsync_vnet

After that:
kldload pfsync
pkg install scapy
cd /usr/tests/sys/netpfil/pf
kyua test

It should panic reliably.

Regards,
Kristof




___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ifnet use after free

2018-08-25 Thread Kristof Provost



On 25 Aug 2018, at 0:47, Kristof Provost wrote:


On 25 Aug 2018, at 0:26, Matthew Macy wrote:
On Fri, Aug 24, 2018 at 15:25 Shawn Webb  
wrote:



Hey All,

Somewhere in the last month or so, a use after free was introduced. 
I

don't have the time right now to bisect the commits and figure out
which commit introduced the breakage. Attached is the core.txt 
(which

seems nonsensical because the dump is reporting on a different
thread). If the core.txt gets scrubbed, I've posted it here:
https://gist.github.com/796ea88cec19a1fd2a85f4913482286a



Do you have any guidance on how to reproduce? The hardenedbsd rev 
isn’t

useful - the svn commit that it’s based against is what is needed.

For what it’s worth, it’s not a hardenedbsd thing. I’ve been 
chasing the same one (same offset, same allocation size, same most 
recent user). Something gets set to zero/NULL. 8 bytes on amd64, so 
presumably a pointer.


I currently only trigger it on a development branch, but I’ll see if 
I can clean that up into something I can share tomorrow.


In my test scenario it happens after shutdown of a vnet jail with a 
few interfaces in it (including a pfsync interface which will 
disappear with the jail), and new jails are started. It’s pretty 
reliable.


At a guess something’s wrong with the delayed cleanup of ifnets and 
vnet shutdown.



I see this:

	Memory modified after free 0xf800623ab000(2040) val=0 @ 
0xf800623ab398

panic: Most recently used by ifnet

cpuid = 7
time = 1535199812
KDB: stack backtrace:
	db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe008c8e13c0

vpanic() at vpanic+0x1a3/frame 0xfe008c8e1420
panic() at panic+0x43/frame 0xfe008c8e1480
mtrash_ctor() at mtrash_ctor+0x81/frame 0xfe008c8e14a0
uma_zalloc_arg() at uma_zalloc_arg+0x72c/frame 0xfe008c8e1510
malloc() at malloc+0x9a/frame 0xfe008c8e1560
if_alloc() at if_alloc+0x23/frame 0xfe008c8e1590
	epair_clone_create() at epair_clone_create+0x239/frame 
0xfe008c8e1610

if_clone_createif() at if_clone_createif+0x4a/frame 0xfe008c8e1660
ifioctl() at ifioctl+0x852/frame 0xfe008c8e1750
kern_ioctl() at kern_ioctl+0x2ba/frame 0xfe008c8e17b0
sys_ioctl() at sys_ioctl+0x15e/frame 0xfe008c8e1880
amd64_syscall() at amd64_syscall+0x28c/frame 0xfe008c8e19b0
	fast_syscall_common() at fast_syscall_common+0x101/frame 
0xfe008c8e19b0
	--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80047b74a, rsp = 
0x7fffe208, rbp = 0x7fffe250 ---

KDB: enter: panic
[ thread pid 1426 tid 100466 ]
Stopped at  kdb_enter+0x3b: movq$0,kdb_why
db>

It does require a couple of bug fixes in pfsync to trigger. You can get 
them from the pfsync_vnet branch in 
https://github.com/kprovost/freebsd/tree/pfsync_vnet


After that:
kldload pfsync
pkg install scapy
cd /usr/tests/sys/netpfil/pf
kyua test

It should panic reliably.

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [CURRENT, ezjail] rc, rc.subr and other rc. scripts gone: ezjail fails on 12-ALPHA

2018-08-25 Thread Kristof Provost

On 25 Aug 2018, at 9:06, O. Hartmann wrote:
I'm using ezjail-admin from ports (most recent tree, up to date as of 
today, at Revision:
478001, FreeBSD is FreeBSD 12.0-ALPHA3 #455 r338309: Sat Aug 25 
07:10:45 CEST 2018 amd64,

the jails sources are at revision 338309).

Updates of the jail's base is taken from /usr/src or another path (in 
case of another
path, ezjail-admin update -i requires the setting of env 
MAKEOBJDIRPREFIX= some/place).


Updating jails and creating new jails has worked for a while, but 
recently, newly created
jails fail to start because the initial start routine bringing up the 
jail dumps an error

about /bin/sh /etc/rc missing!

Investigating a completely fresh ezjail setup (on ZFS) reveals that 
neither in fulljail,
newjail or basejail any of the initial rc-scripts rc or rc.subr is 
present any more! I
stopped looking for other missing scripts since rc and rc.subr are 
crucial and essential
to the system, so I guess there has been introduced a sort of bug 
recently in the way
FreeBSD 12 is going to handle/keep/store rc scripts in the source tree 
or their

installation and ezjai didn't catch up so far.

I already filed a PR (see Bug 230822), but I'm unsure whether this is 
a "real" bug or I

did just miss some important changes and I didn't catch up.


Yep, it’s a real problem. I ran into it myself a few weeks ago.

Many of the scripts and files in sys/etc have been moved, for pkg base.
This, combined with ezjail doing the install wrong, breaks your jails.
Brad posted a patch to the ezjail mailing list. I can't immediately
find an archive link, but this should fix it:


--- ezjail-admin2018-08-12 09:41:46.750946000 +0200
+++ /usr/local/bin/ezjail-admin 2018-08-12 09:42:42.86318 +0200
@@ -1053,7 +1053,7 @@

 # make and setup our world, then split basejail and newjail
	 cd "${ezjail_sourcetree}" && env DESTDIR="${ezjail_jailfull}" make 
${ezjail_installaction} || exerr "Error: The command 'make 
${ezjail_installaction}' failed.\n  Refer to the error report(s) above."
	-cd "${ezjail_sourcetree}/etc" && env DESTDIR="${ezjail_jailfull}" 
make distribution || exerr "Error: The command 'make distribution' 
failed.\n  Refer to the error report(s) above."
	+cd "${ezjail_sourcetree}" && env DESTDIR="${ezjail_jailfull}" make 
distribution || exerr "Error: The command 'make distribution' failed.\n  
Refer to the error report(s) above."

 ezjail_splitworld

   fi # installaction="none"

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ifnet use after free

2018-08-24 Thread Kristof Provost

On 25 Aug 2018, at 0:26, Matthew Macy wrote:
On Fri, Aug 24, 2018 at 15:25 Shawn Webb  
wrote:



Hey All,

Somewhere in the last month or so, a use after free was introduced. I
don't have the time right now to bisect the commits and figure out
which commit introduced the breakage. Attached is the core.txt (which
seems nonsensical because the dump is reporting on a different
thread). If the core.txt gets scrubbed, I've posted it here:
https://gist.github.com/796ea88cec19a1fd2a85f4913482286a



Do you have any guidance on how to reproduce? The hardenedbsd rev 
isn’t

useful - the svn commit that it’s based against is what is needed.

For what it’s worth, it’s not a hardenedbsd thing. I’ve been 
chasing the same one (same offset, same allocation size, same most 
recent user). Something gets set to zero/NULL. 8 bytes on amd64, so 
presumably a pointer.


I currently only trigger it on a development branch, but I’ll see if I 
can clean that up into something I can share tomorrow.


In my test scenario it happens after shutdown of a vnet jail with a few 
interfaces in it (including a pfsync interface which will disappear with 
the jail), and new jails are started. It’s pretty reliable.


At a guess something’s wrong with the delayed cleanup of ifnets and 
vnet shutdown.


Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: buildworld fails: fatal error: 'netpfil/pf/pf.h' file not found

2018-04-10 Thread Kristof Provost

On 9 Apr 2018, at 13:10, Vladimir Zakharov wrote:

On Mon, Apr 09, 2018, Kristof Provost wrote:

On 9 Apr 2018, at 10:50, Vladimir Zakharov wrote:

For several days buildworld fails for me with the following 
error. Cleaning

and
rebuilding didn't help.

===> tests/sys/netpfil/pf/ioctl (all)
--- validation ---
(cd /usr/src/tests/sys/netpfil/pf/ioctl && 
DEPENDFILE=.depend.validation

NO_SUBDIR=1 make -f /usr/src/tests/sys/netpfil/pf/ioctl/Makefile
_RECURSING_PROGS=t PROG=validation )
Building 
/home/obj/usr/src/amd64.amd64/tests/sys/netpfil/pf/ioctl/

validation.o
--- validation.o ---
In file included from 
/usr/src/tests/sys/netpfil/pf/ioctl/validation.c:35:
/home/obj/usr/src/amd64.amd64/tmp/usr/include/net/pfvar.h:49:10: 
fatal

error: 'netpfil/pf/pf.h' file not found
#include 
^


It should be fully fixed as of r332358.
Thanks for the report.

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: buildworld fails: fatal error: 'netpfil/pf/pf.h' file not found

2018-04-09 Thread Kristof Provost

On 9 Apr 2018, at 10:50, Vladimir Zakharov wrote:
For several days buildworld fails for me with the following error. 
Cleaning and

rebuilding didn't help.

===> tests/sys/netpfil/pf/ioctl (all)
--- validation ---
(cd /usr/src/tests/sys/netpfil/pf/ioctl &&  
DEPENDFILE=.depend.validation  NO_SUBDIR=1 make -f 
/usr/src/tests/sys/netpfil/pf/ioctl/Makefile _RECURSING_PROGS=t  
PROG=validation )
Building 
/home/obj/usr/src/amd64.amd64/tests/sys/netpfil/pf/ioctl/validation.o

--- validation.o ---
In file included from 
/usr/src/tests/sys/netpfil/pf/ioctl/validation.c:35:
/home/obj/usr/src/amd64.amd64/tmp/usr/include/net/pfvar.h:49:10: fatal 
error: 'netpfil/pf/pf.h' file not found

#include 
 ^
1 error generated.
*** [validation.o] Error code 1

make[8]: stopped in /usr/src/tests/sys/netpfil/pf/ioctl


My /etc/src.conf (I have PF switched off):


Ah, that’s my fault. I didn’t consider people who’d switch off pf 
when I added the new ioctl tests.


You can work around the issue by removing the new tests yourself, or by 
building pf in anyway (it won’t do anything unless you load the module 
and activate it).


This should be a workaround for you until I can commit a better fix:

	diff --git a/tests/sys/netpfil/pf/Makefile 
b/tests/sys/netpfil/pf/Makefile

index c055e6840bd..259e1275d9c 100644
--- a/tests/sys/netpfil/pf/Makefile
+++ b/tests/sys/netpfil/pf/Makefile
@@ -3,7 +3,6 @@
 PACKAGE=   tests

 TESTSDIR=   ${TESTSBASE}/sys/netpfil/pf
-TESTS_SUBDIRS+=ioctl

 ATF_TESTS_SH+= pass_block \
forward \

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


RFC: mallocarray()

2018-01-04 Thread Kristof Provost

Hi,

I’d like to make it easier to avoid integer overflow issues in the 
kernel.
It’d be a lot nicer to have a malloc function figure this out for us, 
so I’d like mallocarray().


https://reviews.freebsd.org/D13766
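
To sketch the intended use (the (nmemb, size, type, flags) signature
follows the proposal in the review; "struct foo" and M_TEMP are just
example placeholders):

/*
 * Usage sketch. The exact overflow behaviour is as per the review,
 * not this mail.
 */
static struct foo *
foo_alloc(size_t count)
{
	/*
	 * Classic pattern: the multiplication can silently overflow when
	 * "count" is attacker-influenced, yielding an undersized buffer:
	 *
	 *	return (malloc(count * sizeof(struct foo), M_TEMP,
	 *	    M_WAITOK | M_ZERO));
	 */

	/* mallocarray() does the multiplication and refuses to overflow. */
	return (mallocarray(count, sizeof(struct foo), M_TEMP,
	    M_WAITOK | M_ZERO));
}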

Are there any objections to this?

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: VNET jail and dhclient

2017-11-16 Thread Kristof Provost
On 16 Nov 2017, at 14:04, KOT MATPOCKuH wrote:
> Hello, all!
>
> I'm got same problem...
>
Can you show how you call dhclient? What FreeBSD version are you running?

What’s the output of `sysctl kern.chroot_allow_open_directories`?

Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: VNET jail and dhclient

2017-10-10 Thread Kristof Provost

On 10 Oct 2017, at 23:10, Oleg Ginzburg wrote:
What is your FreeBSD version? This problem reproduced on FreeBSD 12 
only.

/var/empty is exist and trivial test:


I’m running r324317 on CURRENT, yes.

What arguments are you calling dhclient with?
Clearly there’s a difference between what you’re doing and what 
I’m doing.



I'm not sure if this is an fd leak (due to pidfile_remove at the end of
dhclient); nevertheless, closing the pid fd in my jail/FreeBSD 12 before
chroot solves the dhclient issue.


I would not expect an open file descriptor to be a problem, unless 
perhaps you’ve got an open directory and 
kern.chroot_allow_open_directories is unset.


Regards,
Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: Attn: CI/Jenkins people; Run bhyve instance for testing pf

2017-07-21 Thread Kristof Provost

On 20 Jul 2017, at 18:24, Nikos Vassiliadis wrote:

It would be great if you use vnet jails for that. I am not
sure regarding the per-vnet pf functionality but I have seen
many bug fixes hitting the tree since last year. You can ask
on freebsd-virtualizat...@freebsd.org or freebsd...@freebsd.org
to learn more about it.


It’s starting to become usable, yes.


Pf within a jail should behave more or less like the "normal" one.
Plus you will be testing per-vnet functionality, which the project
needs anyhow, in one go.

It *should* behave the same, but the fact is that a setup like that 
tests vnet pf, not just pf.
Ideally we should have both setups, but the priority should be on the 
setup most people use today, which is not vnet enabled.


Regards,
Kristof

Re: zfs recv panic

2017-05-16 Thread Kristof Provost

On 16 May 2017, at 19:58, Andriy Gapon wrote:

On 16/05/2017 16:49, Kristof Provost wrote:

On 16 May 2017, at 15:41, Andriy Gapon wrote:

On 10/05/2017 12:37, Kristof Provost wrote:

I have a reproducible panic on CURRENT (r318136) doing
(jupiter) # zfs send -R -v zroot/var@before-kernel-2017-04-26 | nc 
dual 1234

(dual) # nc -l 1234 | zfs recv -v -F tank/jupiter/var

For clarity, the receiving machine is CURRENT r318136, the sending 
machine is

running a somewhat older CURRENT version.

The receiving machine panics a few seconds in:

receiving full stream of zroot/var@before-kernel-2017-04-03 into
tank/jupiter/var@before-kernel-2017-04-03
panic: solaris assert: dbuf_is_metadata(db) == arc_is_metadata(buf) 
(0x0 ==
0x1), file: 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c,

line: 2007


could you please try to revert commits related to the compressed 
send and see if
that helps?  I assume that the sending machine does not have (does 
not use) the

feature while the target machine is capable of the feature.

The commits are: r317648 and r317414.  Not that I really suspect 
that change,

but just to eliminate the possibility.


Those commits appear to be the trigger.
I’ve not changed the sender, but with those reverted I don’t see 
the panic any

more.


Thank you for testing.
Do you still have the old kernel / module and the crash dump?
It would be interesting to poke around in frame 14.



This contains the kernel and crash files:
https://www.sigsegv.be/files/zfs_recv_kernel_crash.tar.bz2

I was running r318356 at the time of this panic.

Regards,
Kristof

Re: zfs recv panic

2017-05-16 Thread Kristof Provost

On 16 May 2017, at 15:41, Andriy Gapon wrote:

On 10/05/2017 12:37, Kristof Provost wrote:

I have a reproducible panic on CURRENT (r318136) doing
(jupiter) # zfs send -R -v zroot/var@before-kernel-2017-04-26 | nc 
dual 1234

(dual) # nc -l 1234 | zfs recv -v -F tank/jupiter/var

For clarity, the receiving machine is CURRENT r318136, the sending 
machine is

running a somewhat older CURRENT version.

The receiving machine panics a few seconds in:

receiving full stream of zroot/var@before-kernel-2017-04-03 into
tank/jupiter/var@before-kernel-2017-04-03
panic: solaris assert: dbuf_is_metadata(db) == arc_is_metadata(buf) 
(0x0 ==
0x1), file: 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c,

line: 2007


could you please try to revert commits related to the compressed send 
and see if
that helps?  I assume that the sending machine does not have (does not 
use) the

feature while the target machine is capable of the feature.

The commits are: r317648 and r317414.  Not that I really suspect that 
change,

but just to eliminate the possibility.


Those commits appear to be the trigger.
I’ve not changed the sender, but with those reverted I don’t see the 
panic any more.


Regards,
Kristof

zfs recv panic

2017-05-10 Thread Kristof Provost

Hi,

I have a reproducible panic on CURRENT (r318136) doing
(jupiter) # zfs send -R -v zroot/var@before-kernel-2017-04-26 | nc dual 
1234

(dual) # nc -l 1234 | zfs recv -v -F tank/jupiter/var

For clarity, the receiving machine is CURRENT r318136, the sending 
machine is running a somewhat older CURRENT version.


The receiving machine panics a few seconds in:

receiving full stream of zroot/var@before-kernel-2017-04-03 into 
tank/jupiter/var@before-kernel-2017-04-03
panic: solaris assert: dbuf_is_metadata(db) == arc_is_metadata(buf) (0x0 
== 0x1), file: 
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c, line: 
2007

cpuid = 0
time = 1494408122
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe0120cad930

vpanic() at vpanic+0x19c/frame 0xfe0120cad9b0
panic() at panic+0x43/frame 0xfe0120cada10
assfail3() at assfail3+0x2c/frame 0xfe0120cada30
dbuf_assign_arcbuf() at dbuf_assign_arcbuf+0xf2/frame 0xfe0120cada80
dmu_assign_arcbuf() at dmu_assign_arcbuf+0x170/frame 0xfe0120cadad0
receive_writer_thread() at receive_writer_thread+0x6ac/frame 
0xfe0120cadb70

fork_exit() at fork_exit+0x84/frame 0xfe0120cadbb0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0120cadbb0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 7 tid 100672 ]
Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
db>


kgdb backtrace:
#0  doadump (textdump=0) at pcpu.h:232
#1  0x803a208b in db_dump (dummy=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>) at /usr/src/sys/ddb/db_command.c:546
#2  0x803a1e7f in db_command (cmd_table=<optimized out>) at /usr/src/sys/ddb/db_command.c:453
#3  0x803a1bb4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:506
#4  0x803a4c7f in db_trap (type=<optimized out>, code=<optimized out>) at /usr/src/sys/ddb/db_main.c:248
#5  0x80a93cb3 in kdb_trap (type=3, code=-61456, tf=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
#6  0x80ed3de6 in trap (frame=0xfe0120cad860) at /usr/src/sys/amd64/amd64/trap.c:537
#7  0x80eb62f1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0x80a933eb in kdb_enter (why=0x8143d8f5 "panic", msg=<optimized out>) at cpufunc.h:63
#9  0x80a51cf9 in vpanic (fmt=<optimized out>, ap=0xfe0120cad9f0) at /usr/src/sys/kern/kern_shutdown.c:772
#10 0x80a51d63 in panic (fmt=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:710
#11 0x8262b26c in assfail3 (a=<optimized out>, lv=<optimized out>, op=<optimized out>, rv=<optimized out>, f=<optimized out>, l=<optimized out>) at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
#12 0x822ad892 in dbuf_assign_arcbuf (db=0xf8008f23e560, buf=0xf8008f09fcc0, tx=0xf8008a8d5200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2007
#13 0x822b87f0 in dmu_assign_arcbuf (handle=<optimized out>, offset=0, buf=0xf8008f09fcc0, tx=0xf8008a8d5200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1542
#14 0x822bf7fc in receive_writer_thread (arg=0xfe0120a1d168) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c:2284
#15 0x80a13704 in fork_exit (callout=0x822bf150 <receive_writer_thread>, arg=0xfe0120a1d168, frame=0xfe0120cadbc0) at /usr/src/sys/kern/kern_fork.c:1038
#16 0x80eb682e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#17 0x in ?? ()

Let me know if there’s any other information I can provide, or things 
I can test.
Fortunately the target machine is not a production machine, so I can 
panic it as often as required.


Regards,
Kristof

Re: VNET branch destiny

2017-04-11 Thread Kristof Provost

On 10 Apr 2017, at 12:10, peter.b...@bsd4all.org wrote:
There have been issues with pf if I recall correctly. I currently have 
issues with stable, pf and vnet. There is an issue with pf table 
entries when an interface is moved to a different vnet.


Does anyone know if there is a specific fix for this that hasn’t been 
ported to stable? I haven’t had the time to test this on current.


I’m currently aware of at least some issues with pf and vnet, even in 
CURRENT.
Not that one though, so can you make sure there’s a bug report with as 
much detail as possible?

Please also cc me (k...@freebsd.org) on the report.

Thanks,
Kristof

Re: IPV6 TCP6 Slow Connect

2016-02-12 Thread Kristof Provost

> On 12 Feb 2016, at 10:18, Larry Rosenman <l...@lerctr.org> wrote:
> 
> On 2016-02-11 20:50, Larry Rosenman wrote:
>> On 2016-02-11 14:40, Larry Rosenman wrote:
>>> On 2016-02-11 14:25, Kristof Provost wrote:
>>> On 11 Feb 2016, at 21:23, Larry Rosenman <l...@lerctr.org> wrote:
>>> From which system(s) perspective do you want the packet captures?
>>> (Firewall, FreeBSD, Windows)? I wouldn't expect it to make much of a 
>>> difference in this case.
>>> Let's start with whatever is easiest.
>>> Regards,
>>> Kristof
>> I'll try and get these tonight when I am home, after a meetup.
>> (will be late US/CST).
>> --
>> Larry Rosenman http://www.lerctr.org/~ler
>> Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
>> US Mail: 7011 W Parmer Ln, Apt 1115, Austin, TX 78729-6961
>> at http://www.lerctr.org/~ler/FreeBSD/win10.pcap
> the 403 issue is fixed (forgot to chmod the file)
>> http://www.lerctr.org/~ler/FreeBSD/fbsd11.pcap
> URL corrected.
> 
At first glance the only difference is that FreeBSD includes timestamps (and 
has a larger window scaling factor).
It might be worth turning that off to see if it makes a difference.

sysctl net.inet.tcp.rfc1323=0 should do the trick.

Regards,
Kristof


Re: IPV6 TCP6 Slow Connect

2016-02-12 Thread Kristof Provost

> On 12 Feb 2016, at 15:29, Larry Rosenman  wrote:
> 
> On 2016-02-12 08:13, Larry Rosenman wrote:
>> 
>> sysctl net.inet.tcp.rfc1323=0
>> makes it work
> Shouldn't the stack do the right thing here?  For the record, the other side
> is also FreeBSD (10.2-STABLE).
> 
Yes, but it’s possible that there’s a problem with the pf scrubbing of the 
window scaling or timestamp options.

I have a vague recollection of having looked at that in the past.
Bug 172648 also claims there is/was an issue with checksums in that case, but 
I’ve never been able to reproduce it.

Regards,
Kristof

Re: IPV6 TCP6 Slow Connect

2016-02-12 Thread Kristof Provost

> On 12 Feb 2016, at 15:33, Larry Rosenman <l...@lerctr.org> wrote:
> 
> On 2016-02-12 08:31, Kristof Provost wrote:
>>> On 12 Feb 2016, at 15:29, Larry Rosenman <l...@lerctr.org> wrote:
>>> On 2016-02-12 08:13, Larry Rosenman wrote:
>>>> sysctl net.inet.tcp.rfc1323=0
>>>> makes it work
>>> Shouldn't the stack do the right thing here?  For the record, the other side
>>> is also FreeBSD (10.2-STABLE).
>> Yes, but it’s possible that there’s a problem with the pf scrubbing of
>> the window scaling or timestamp options.
>> I have a vague recollection of having looked at that in the past.
>> Bug 172648 also claims there is/was an issue with checksums in that
>> case, but I’ve never been able to reproduce it.
>> Regards,
>> Kristof
> Ok.  Since I can reproduce this at will, and the 2 firewalls are pfSense, how 
> can I help?

I’ll still need to reproduce it locally to fix it, but it might be interesting 
to know if the packet is dropped by the router, or sent out again with an 
incorrect checksum.
Can you take a capture on the WAN interface and see if the TCP SYN makes it out 
(if it does, I’d expect the checksum to be wrong) or not?
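
Something along these lines should do it (the interface name is a guess,
substitute your WAN NIC):

	tcpdump -ni em0 -w wan6.pcap 'ip6 and tcp'

and then check with tcpdump -vv -r wan6.pcap (or wireshark) whether the SYN
shows up, and whether its checksum is flagged as incorrect.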

Regards,
Kristof

Re: IPV6 TCP6 Slow Connect

2016-02-11 Thread Kristof Provost

> On 11 Feb 2016, at 21:23, Larry Rosenman  wrote:
> 
> From which system(s) perspective do you want the packet captures?
> (Firewall, FreeBSD, Windows)?

I wouldn’t expect it to make much of a difference in this case.
Let’s start with whatever is easiest.

Regards,
Kristof

Re: IPV6 TCP6 Slow Connect

2016-02-11 Thread Kristof Provost
On 2016-02-10 20:38:02 (-0600), Larry Rosenman  wrote:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206904
> 
> I've also posted lots of info to freebsd-net, and not gotten any 
> response.
> 
> Summary:
> 
> Cable Modem-> EM0 on a pfSense Firewall (FreeBSD 10.1, pfSense 2.2.6)
>set to dhcp6, and ask for a /56 prefix
> 
> EM1->LAN, set to track interface, prefix id 0, radvd running advertising 
> the 00 /64
> LAN-> borg.lerctr.org using lagg0, SLAAC (rtsold).  Gets an address, 
> icmp6 works, tcp6 times out.
>  -> win10 SLAAC, tcp6 works fine, as does icmp6.
> 
For this I'd start by taking packet captures of both the FreeBSD tcp6
connection and the win10 tcp6 connection. Finding the difference between
the two will likely go a long way in finding the cause.

Regards,
Kristof



Re: forwarding didn't work if wlan0 is member of a bridge

2015-12-23 Thread Kristof Provost
On 2015-12-23 08:08:29 (-0700), Sergey Manucharian  wrote:
> I believe this is related to the fact that wifi adapter cannot have more
> that one MAC address. And that becomes true when it's a member of a
> bridge. There exist some tricky ways to overcome that though.
> 
That's true, but that only applies in station mode (i.e. as a wifi
client). If I'm reading the report right Olivier is using wifi0 as an
access point here.

Regards,
Kristof


Re: pf NAT and VNET Jails

2015-11-10 Thread Kristof Provost
On 2015-11-09 21:47:01 (-0500), Shawn Webb  wrote:
> I found the problem: it seems that the new Intel Haswell graphics
> support (which I've been running with) is at odds somehow with pf NAT.
> Removing Haswell graphics support means working pf NAT.
> 
That's ... very strange.

I've built the drm-i915-update-38 branch of 
http://github.com/freebsd/freebsd-base-graphics.git,
but still haven't managed to reproduce the problem.
It is of course entirely possible that it would only manifest if the
haswell graphics are actually in use. In that case there's little I can
do as I don't have haswell hardware I could test on.

Regards,
Kristof



Re: Panic with PF on current.

2015-11-06 Thread Kristof Provost

> On 06 Nov 2015, at 01:12, Daniel Dettlaff  wrote:
> I have interesting verbose output with backtrace (not panic) from one of my 
> VMs: http://s.verknowsys.com/f0d457ce9420399baaf531012c33eb81.png
> It’s triggered by autostarting jail on bridged vlan interface (no VNET 
> feature enabled)
> 
This seems to be a lock ordering issue between ZFS and devfs.
It appears to have been around for a while too. I found PR 142878 with the same 
witness warning.

Regards,
Kristof

Re: r289932 causes pf reversion - breaks rules with broadcast destination

2015-11-06 Thread Kristof Provost
On 2015-11-05 12:39:22 (-0500), Tom Uffner  wrote:
> So, if my rule was "working" due to false positive in a comparison that has
> now been fixed, how many other address comparisons were affected by this
> error?
> 
> There are 36 occurrences of PF_ANEQ in pf.c and 2 in if_pfsync.c
> 
Most of them are an optimisation check. They're used in the NAT paths to
see if addresses need to be rewritten (and checksums updated) or not.
That's probably part of the reason it took so long to notice the bug in
the macro: in most cases a false positive only slowed things down a
little, it didn't actually produce an incorrect result.

I think I've reproduced your problem with very simple rules:
pass out
block out proto icmp
pass out log on vtnet0 proto icmp from any to vtnet0:broadcast
pass out log on vtnet0 proto icmp from any to 172.16.2.1

With those rules I can ping to 172.16.2.255 (vtnet0 has
172.16.2.2/24), but not to 172.16.2.1.
If I remove the broadcast rule I suddenly can ping to 172.16.2.1.

I suspect I've also found the source of the problem:
pf_addr_wrap_neq() uses PF_ANEQ(), but sets address family 0.
As a result of the fix that now means we always return false there.

Can you give this a quick test:

diff --git a/sys/netpfil/pf/pf.c b/sys/netpfil/pf/pf.c
index 1dfc37d..762b82e 100644
--- a/sys/netpfil/pf/pf.c
+++ b/sys/netpfil/pf/pf.c
@@ -1973,9 +1973,9 @@ pf_addr_wrap_neq(struct pf_addr_wrap *aw1, struct pf_addr_wrap *aw2)
 	switch (aw1->type) {
 	case PF_ADDR_ADDRMASK:
 	case PF_ADDR_RANGE:
-		if (PF_ANEQ(&aw1->v.a.addr, &aw2->v.a.addr, 0))
+		if (PF_ANEQ(&aw1->v.a.addr, &aw2->v.a.addr, AF_INET6))
 			return (1);
-		if (PF_ANEQ(&aw1->v.a.mask, &aw2->v.a.mask, 0))
+		if (PF_ANEQ(&aw1->v.a.mask, &aw2->v.a.mask, AF_INET6))
 			return (1);
 		return (0);
 	case PF_ADDR_DYNIFTL:

Regards,
Kristof


Re: r289932 causes pf reversion - breaks rules with broadcast destination

2015-11-05 Thread Kristof Provost
On 2015-11-04 20:31:35 (-0500), Tom Uffner  wrote:
> Commit r289932 causes pf rules with broadcast destinations (and some but not 
> all rules after them in pf.conf) to be silently ignored. This is bad.
> 
Thanks for the report.

What version did you test exactly?

There was an issue with r289932 that was fixed in r289940, so if you're
in between those two can you test with something after r289940?

Thanks,
Kristof


Re: pf NAT and VNET Jails

2015-11-05 Thread Kristof Provost

> On 05 Nov 2015, at 17:25, Shawn Webb  wrote:
> I've figured it out. I've removed all rules and went with a barebones config.
> 
> Right now, the laptop I'm using for NAT has an outbound interface of wlan0
> with an IP of 129.6.251.181 (from DHCP). The following line works:
> 
> nat on wlan0 from any to any -> 129.6.251.181
> 
> The following line doesn't:
> 
> nat on wlan0 from any to any -> (wlan0)
> 
> Nor does this:
> 
> nat on wlan0 from any to any -> wlan0
> 
> From the Handbook, the lines that don't work are preferred, especially the first
> non-working line, since using (wlan0) would cause pf to pick up wlan0's IP
> dynamically (which is good, since wlan0 is DHCP'd).
> 
> So it seems at some point of time, doing NAT dynamically broke.
> 

So far I’ve had no luck reproducing this.
With pf.conf:
nat on vtnet0 from any to any -> (vtnet0)
pass in
pass out

And setup code:
ifconfig bridge0 create
ifconfig epair0 create
ifconfig epair0a up
ifconfig epair0b up
ifconfig bridge0 addm epair0a

jail -c name=test host.hostname=test vnet persist
ifconfig epair0b vnet test

ifconfig bridge0 inet 10.0.0.1/24

jexec test ifconfig epair0b 10.0.0.2/23
jexec test route add default 10.0.0.1

# Activate routing
sysctl net.inet.ip.forwarding=1

pfctl -e
pfctl -g -f pf.conf

Then I run jexec test ping 8.8.8.8, which works as expected.

My home router is running CURRENT, uses vnet jails and also doesn’t seem to be 
triggering the problem.

Perhaps we’re still missing a component of the problem, but right now I have no 
idea what that would be.

Hmm. Perhaps… do you happen to know in what order things are done during 
startup?
Perhaps it’s related to the fact that wlan0 is both wifi and DHCP, in the sense 
that pf is configured before the IP is assigned to the interface.

Can you try reloading pf with the (wlan0) rule? (Just pfctl -g -f /etc/pf.conf 
should do the trick).

Regards,
Kristof




Re: r289932 causes pf reversion - breaks rules with broadcast destination

2015-11-05 Thread Kristof Provost

> On 05 Nov 2015, at 18:39, Tom Uffner  wrote:
> 
> Tom Uffner wrote:
>> Commit r289932 causes pf rules with broadcast destinations (and some but not
>> all rules after them in pf.conf) to be silently ignored. This is bad.
> 
>> I do not understand the pf code well enough to see why this change caused
>> the breakage, but I suspect that it might expose some deeper problem and
>> should not simply be reverted.
> 
> OK, so here is why I don't want to simply back this out and have a "working"
> firewall again:
> 
> Apparently PF_ANEQ was prone to false positives when comparing IPv4 addrs.
> This is what r289932 and r289940 fixed. For IPv4 it does not matter where
> in bits 32-127 the address mismatch occurs or what order the garbage data
> is tested. That is all the paren fix in r289940 changes. It might be relevant
> for v6, but doesn't matter here.
> 
Yes, that’s right. 

I haven’t yet had the time to look at your problem in any depth.
I’m currently working on a different pf issue, but this one is also high on my 
priority list. Hopefully I’ll get round to it in the next few days, but please 
do prod me 
if you hear nothing.

Regards,
Kristof


Re: pf NAT and VNET Jails

2015-11-02 Thread Kristof Provost

> On 02 Nov 2015, at 14:47, Shawn Webb  wrote:
> 
> On Sunday, 01 November 2015 07:16:34 AM Julian Elischer wrote:
>> On 11/1/15 2:50 AM, Shawn Webb wrote:
>>> I'm at r290228 on amd64. I'm not sure which revision I was on last when it
>>> last worked, but it seems VNET jails aren't working anymore.
>>> 
>>> I've got a bridge, bridge1, with an IP of 192.168.7.1. The VNET jails set
>>> their default route to 192.168.7.1. The host simply NATs outbound from
>>> 192.168.7.0/24 to the rest of the world. The various epairs get added to
>>> bridge1 and assigned to each jail. Pretty simple setup. That worked until
>>> today. When I do tcpdump on my public-facing NIC, I see that NAT isn't
>>> applied. When I run `ping 8.8.8.8` from the jail, the jail's
>>> 192.168.7.0/24
>>> address gets sent on the wire.
>>> 
>>> Let me know what I can do to help debug this further.
>> 
>> send the list your setup script/settings?
> 
> I'm using iocage to start up the jails. Here's a pasted output of `iocage get 
> all mutt-hardenedbsd`: http://ix.io/lLG

Can you add your pf.conf too?

I’ll try upgrading my machine to something beyond 290228 to see if I can 
reproduce it.
It’s on r289635 now, and seems to be fine. My VNET jails certainly get their 
traffic NATed.

Thanks,
Kristof


Re: pf NAT and VNET Jails

2015-11-02 Thread Kristof Provost

> On 02 Nov 2015, at 15:07, Shawn Webb <shawn.w...@hardenedbsd.org> wrote:
> 
> On Monday, 02 November 2015 02:59:03 PM Kristof Provost wrote:
>> 
>> Can you add your pf.conf too?
>> 
>> I’ll try upgrading my machine to something beyond 290228 to see if I can
>> reproduce it. It’s on r289635 now, and seems to be fine. My VNET jails
>> certainly get their traffic NATed.
> 
> Sorry about that! I should've included it. It's pasted here: http://ix.io/lLI
> 
> It's probably not the most concise. This is a laptop that can have one of 
> three interfaces online: re0 (ethernet on the laptop), wlan0 (you can guess 
> what that is), or ue0 (usb tethering from my phone). I used to be able to 
> specify NATing like that and pf would automatically figure out which outgoing 
> device to use. Seems like that's broken now.
> 
I’ve updated my machine and things still seem to be working.
As you said, it’s probably related to the multiple nat entries.

I’ll have to make a test setup, which’ll take a bit of time, especially 
since I’m messing with  the host machine at the moment.

Regards,
Kristof


sysctl -a panic on VIMAGE kernels

2015-08-09 Thread Kristof Provost
Hi,

I’ve run into a reproducible panic on a VIMAGE kernel with ‘sysctl -a’.

Relevant backtrace bits:
#8  0x80e7dd28 in trap (frame=0xfe01f16b26a0)
at /usr/src/sys/amd64/amd64/trap.c:426
#9  0x80e5e6a2 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:235
#10 0x80cea67d in uma_zone_get_cur (zone=0x0)
at /usr/src/sys/vm/uma_core.c:3006
#11 0x80cec029 in sysctl_handle_uma_zone_cur (
oidp=0x818a7c90, arg1=0xfe00010c0438, arg2=0,
req=0xfe01f16b2868) at /usr/src/sys/vm/uma_core.c:3580
#12 0x80a28614 in sysctl_root_handler_locked (oid=0x818a7c90,
arg1=0xfe00010c0438, arg2=0, req=0xfe01f16b2868)
at /usr/src/sys/kern/kern_sysctl.c:183
#13 0x80a27d70 in sysctl_root (arg1=<value optimized out>,
    arg2=<value optimized out>) at /usr/src/sys/kern/kern_sysctl.c:1694
#14 0x80a28372 in userland_sysctl (td=0x0, name=0xfe01f16b2930,
    namelen=<value optimized out>, old=<value optimized out>,
    oldlenp=<value optimized out>, inkernel=<value optimized out>,
    new=<value optimized out>, newlen=<value optimized out>,
    retval=<value optimized out>, flags=0)
    at /usr/src/sys/kern/kern_sysctl.c:1798
#15 0x80a28144 in sys___sysctl (td=0xf8000b1e49a0,
uap=0xfe01f16b2a40) at /usr/src/sys/kern/kern_sysctl.c:1724

In essence, what happens is that we end up in sysctl_handle_uma_zone_cur() and 
arg1 is a pointer to NULL, 
so we call uma_zone_get_cur(zone); with zone == NULL.

There’s been a bit of churn around tcp_reass_zone, and I think the latest 
version is wrong.
It marks the sysctl as CTLFLAG_VNET, but the exposed variable is not 
VNET_DEFINE().
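
(For contrast, and purely as an illustration — this is not what tcp_reass.c
does: a variable that really is virtualized would be declared along these
lines,

	VNET_DEFINE(uma_zone_t, tcp_reass_zone);
	#define	V_tcp_reass_zone	VNET(tcp_reass_zone)

and only then does a CTLFLAG_VNET sysctl make sense, because the handler has
to resolve the per-vnet copy.)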

The following fixes it for me:

diff --git a/sys/netinet/tcp_reass.c b/sys/netinet/tcp_reass.c
index 77d8940..3913ef3 100644
--- a/sys/netinet/tcp_reass.c
+++ b/sys/netinet/tcp_reass.c
@@ -84,7 +84,7 @@ SYSCTL_INT(_net_inet_tcp_reass, OID_AUTO, maxsegments, CTLFLAG_RDTUN,
     "Global maximum number of TCP Segments in Reassembly Queue");
 
 static uma_zone_t tcp_reass_zone;
-SYSCTL_UMA_CUR(_net_inet_tcp_reass, OID_AUTO, cursegments, CTLFLAG_VNET,
+SYSCTL_UMA_CUR(_net_inet_tcp_reass, OID_AUTO, cursegments, 0,
     &tcp_reass_zone,
     "Global number of TCP Segments currently in Reassembly Queue");

Regards,
Kristof

Re: sysctl -a panic on VIMAGE kernels

2015-08-09 Thread Kristof Provost
On 2015-08-09 13:36:35 (+0300), Gleb Smirnoff gleb...@freebsd.org wrote:
 On Sun, Aug 09, 2015 at 12:28:22PM +0200, Kristof Provost wrote:
 K The following fixes it for me:
 K 
 K diff --git a/sys/netinet/tcp_reass.c b/sys/netinet/tcp_reass.c
 K index 77d8940..3913ef3 100644
 K --- a/sys/netinet/tcp_reass.c
 K +++ b/sys/netinet/tcp_reass.c
 K @@ -84,7 +84,7 @@ SYSCTL_INT(_net_inet_tcp_reass, OID_AUTO, maxsegments, 
 CTLFLAG_RDTUN,
 K  "Global maximum number of TCP Segments in Reassembly Queue");
 K 
 K  static uma_zone_t tcp_reass_zone;
 K -SYSCTL_UMA_CUR(_net_inet_tcp_reass, OID_AUTO, cursegments, CTLFLAG_VNET,
 K +SYSCTL_UMA_CUR(_net_inet_tcp_reass, OID_AUTO, cursegments, 0,
 K  &tcp_reass_zone,
 K  "Global number of TCP Segments currently in Reassembly Queue");
 
 Right, if a variable isn't virtualized, the CTLFLAG_VNET must be removed.
 
 Patrick, how is your progress with improved reassembly?
 
Any opposition to me committing the above patch? It'll at least make us
stop panic()ing and I don't think it'll make Patrick's life any harder.

Regards,
Kristof


Re: pf crash on -current

2015-02-24 Thread Kristof Provost
On 2015-02-24 08:05:47 (+0100), Kristof Provost kris...@sigsegv.be wrote:
 On 2015-02-23 17:23:55 (-0800), Davide Italiano dav...@freebsd.org wrote:
  The bt you posted suggest this could be stack overflow, probably due
  to infinite recursion.
  Also, as a wild guess, just looking at the stacktrace, I think this
  might be related to the recent ipv6 fragment changes. Try to back them
  out, and see if things gets more stable ( r278831 and r278843).
  
 That's almost certainly what it is.
 
After a bit of fiddling around I've managed to reproduce this locally.

Essentially we get caught in a loop of defragmenting and refragmenting:
Fragmented packets come in on one interface and get collected until we
can defragment it. At that point the defragmented packet is handed back
to the ip stack (at the pfil point in ip6_input()). Normal processing
continues.
Eventually we figure out that the packet has to be forwarded and we end
up at the pfil hook in ip6_forward(). After doing the inspection on the
defragmented packet we see that the packet has been defragmented and
because we're forwarding we have to refragment it. That's indicated by
the presence of the PF_REASSEMBLED tag.

In pf_refragment6() we remove that tag, split the packet up again and
then ip6_forward() the individual fragments.
Those fragments hit the pfil hook on the way out, so they're
collected until we can reconstruct the full packet, at which point we're
right back where we left off and things continue until we run out of
stack.

There are two reasons Allan is seeing this and no one else has so far.

The first is that he's scrubbing both on input and output. My own tests
have always been done with 'scrub in all fragment reassemble', rather
than 'scrub all fragment reassemble' so I didn't see this problem.

The second is that he's got an internal interface with a higher MTU,
so the refragmentation actually works for him.
There's an open problem where ip6_forward() drops the defragmented
packet before the pfil(PFIL_OUT) hook because it's too big for the
output interface.
If the last patch of my series (https://reviews.freebsd.org/D1815) had
been merged as well more people would have been affected.

One possible fix for Allan's problem would be to tag the fragments after
refragmentation so that pf ignores them. After all, the defragmented
packet has already been inspected so there's no point in checking the
fragments again.
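
Roughly something like this (untested sketch; the tag name is made up and
would need a proper PACKET_TAG_* value or a pf-private cookie, variable
names are only indicative):

	/* in pf_refragment6(), mark each fragment we just generated: */
	mtag = m_tag_get(PF_REFRAGMENTED, 0, M_NOWAIT);
	if (mtag != NULL)
		m_tag_prepend(m, mtag);

	/* in pf's v6 entry point (pf_test6()), near the top: */
	if (m_tag_find(m, PF_REFRAGMENTED, NULL) != NULL)
		return (PF_PASS);	/* already inspected before refragmentation */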

I have the feeling there's a way to fix this problem and the issue D1815
tries to fix in one go though. I'll need to think about it a bit more.

Regards,
Kristof


Re: pf crash on -current

2015-02-23 Thread Kristof Provost
On 2015-02-23 17:23:55 (-0800), Davide Italiano dav...@freebsd.org wrote:
 On Mon, Feb 23, 2015 at 5:17 PM, Allan Jude allanj...@freebsd.org wrote:
  Upgraded my router today, because it was approaching the 24 uptime days
  of doom
 
  Now, it likes to die on me, a lot
 
 
 
 The bt you posted suggest this could be stack overflow, probably due
 to infinite recursion.
 Also, as a wild guess, just looking at the stacktrace, I think this
 might be related to the recent ipv6 fragment changes. Try to back them
 out, and see if things gets more stable ( r278831 and r278843).
 
That's almost certainly what it is.

Allan, can you give me a bit more information about your setup?
Specifically the pf rules, the network interfaces, the IP(v6)
addresses and the routes?

Thanks,
Kristof


[PATCH] minstat: default width is terminal width, not 74

2014-12-18 Thread Kristof Provost
The man page states that:
'-w widthWidth of ASCII-art plot in characters, default is 74.'

This is not entirely correct. The mini-help is more accurate:
'-w : width of graph/test output (default 74 or terminal width)'

In other words: the man page fails to explain that ministat will default
to the terminal width, not 74. It will only fall back to 74 if 'COLUMNS'
is not set and ioctl(TIOCGWINSZ) fails.
---
 usr.bin/ministat/ministat.1 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/usr.bin/ministat/ministat.1 b/usr.bin/ministat/ministat.1
index ea31c23..4550a09 100644
--- a/usr.bin/ministat/ministat.1
+++ b/usr.bin/ministat/ministat.1
@@ -68,7 +68,7 @@ See
 .Xr strtok 3
 for details.
 .It Fl w Ar width
-Width of ASCII-art plot in characters, default is 74.
+Width of ASCII-art plot in characters, default is terminal width or 74.
 .El
 .Pp
 A sample output could look like this:
-- 
2.1.3
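
For reference, the behaviour the new wording describes (file names made up):

	$ ministat -w 74 before.txt after.txt         # force the documented 74 columns
	$ COLUMNS=100 ministat before.txt after.txt   # width taken from the environment/terminal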



Re: zfs kernel panic, known incompatibilities with clang CPUTYPE/COPTFLAGS?

2013-06-29 Thread Kristof Provost
On 2013-06-29 18:49:16 (+0200), Martin Matuska m...@freebsd.org wrote:
 This was an obvious error by me - I forgot to register zfs_ioc_jail and
 zfs_ioc_unjail using the new functions.
 Amazing no one noticed this until it got merged down to stable/8.
 
 In addition, I see no need to log these operations to the zpool history
 as they cause no on-disk changes, so I have disabled logging for these
 calls.
 Please test the patch from current in r252380.
 
 http://svnweb.freebsd.org/base?view=revisionrevision=252380
 
This fixes the panic for me (on stable/9).

Thanks,
Kristof


Re: zfs kernel panic, known incompatibilities with clang CPUTYPE/COPTFLAGS?

2013-06-27 Thread Kristof Provost
On 2013-06-24 22:08:01 (+0200), Alexander Leidinger alexan...@leidinger.net 
wrote:
 On Mon, 24 Jun 2013 12:15:18 +0200
 Kristof Provost kris...@sigsegv.be wrote:
 
  For what it's worth, I'm running into exactly the same problem.
  (amd64, stable/9). I have no custom settings in /etc/make.conf
  or /etc/src.conf
 
 I had a short discussion with the maintainer of our stress-test-suite,
 he was able to create a test-case which triggers the problem.
 
I've been bisecting for a little bit, and while I'm not 100% sure yet,
there is one likely culprit right now: r249643.
It's an MFC with a number of ZFS changes relating to a refactoring of
the ioctl() interface. 

It is, unfortunately, also a rather large commit.

Regards,
Kristof


Re: zfs kernel panic, known incompatibilities with clang CPUTYPE/COPTFLAGS?

2013-06-24 Thread Kristof Provost
On 2013-06-14 23:07:02 (+0200), Alexander Leidinger alexan...@leidinger.net 
wrote:
 The bt in the minidump is useless, but I made a bt directly
 in the kernel debugger:
 ---snip---
 Fatal trap 12: page fault while in kernel mode
 cpuid = 2; apic id = 02
 fault virtual address   = 0x0
 fault code  = supervisor read instruction, page not present
 instruction pointer = 0x20:0x0
 stack pointer   = 0x28:0xff839e79d930
 frame pointer   = 0x28:0xff839e79d9e0
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 2183 (zfs)
 
 db> bt
 Tracing pid 2356
 uart_sab82532_class() at 0
 devfs_ioctl_f() at devfs_ioctl_f+0xf0
 kern_ioctl() at kern_ioctl+0x1d7
 sys_ioctl() at sys_ioctl+0x142
 ---snip---
 
For what it's worth, I'm running into exactly the same problem. (amd64,
stable/9). I have no custom settings in /etc/make.conf or /etc/src.conf

Regards,
Kristof


OpenRD-CL support

2012-04-08 Thread Kristof Provost
Hi,

Based on the work from arm/156814 I've got a working config and device
tree for the OpenRD-CL.

It successfully boots over NFS, both network interfaces as well as the
cesa (crypto accelerator) work.

The patch:

diff --git a/sys/arm/conf/OPENRD-CL b/sys/arm/conf/OPENRD-CL
new file mode 100644
index 000..25707ed
--- /dev/null
+++ b/sys/arm/conf/OPENRD-CL
@@ -0,0 +1,81 @@
+#
+# Custom kernel for OpenRD Client/Ultimate devices.
+#
+# $FreeBSD$
+#
+
+ident  OPENRD-CL
+include	../mv/kirkwood/std.sheevaplug
+
+options 	SOC_MV_KIRKWOOD
+makeoptions	MODULES_OVERRIDE=
+
+makeoptions	DEBUG=-g		#Build kernel with gdb(1) debug symbols
+makeoptions	WERROR=-Werror
+makeoptions	INVARIANTS
+
+options 	SCHED_4BSD		#4BSD scheduler
+options 	INET			#InterNETworking
+options 	INET6			#IPv6 communications protocols
+options 	FFS			#Berkeley Fast Filesystem
+options 	NFSCL			#New Network Filesystem Client
+options 	NFSLOCKD		#Network Lock Manager
+options 	NFS_ROOT		#NFS usable as /, requires NFSCL
+options 	BOOTP
+options 	BOOTP_NFSROOT
+options 	BOOTP_NFSV3
+options 	BOOTP_WIRED_TO=mge0
+
+# Root fs on USB device
+#options 	ROOTDEVNAME=\"ufs:/dev/da0a\"
+
+options 	SYSVSHM			#SYSV-style shared memory
+options 	SYSVMSG			#SYSV-style message queues
+options 	SYSVSEM			#SYSV-style semaphores
+options 	_KPOSIX_PRIORITY_SCHEDULING #Posix P1003_1B real-time extensions
+options 	MUTEX_NOINLINE
+options 	RWLOCK_NOINLINE
+options 	NO_FFS_SNAPSHOT
+options 	NO_SWAPPING
+
+# Debugging
+options 	ALT_BREAK_TO_DEBUGGER
+options 	DDB
+options 	KDB
+
+# Pseudo devices
+device		random
+device		pty
+device		loop
+
+# Serial ports
+device		uart
+
+# Networking
+device		ether
+device		mge			# Marvell Gigabit Ethernet controller
+device		mii
+device		e1000phy
+device		bpf
+options 	HZ=1000
+options 	DEVICE_POLLING
+device		vlan
+
+device		cesa			# Marvell security engine
+device		crypto
+device		cryptodev
+
+# USB
+options 	USB_DEBUG		# enable debug msgs
+device		usb
+device		ehci
+device		umass
+device		scbus
+device		pass
+device		da
+
+# Flattened Device Tree
+options 	FDT
+options 	FDT_DTB_STATIC
+makeoptions	FDT_DTS_FILE=openrd-cl.dts
+
diff --git a/sys/boot/fdt/dts/openrd-cl.dts b/sys/boot/fdt/dts/openrd-cl.dts
new file mode 100644
index 000..6d11779
--- /dev/null
+++ b/sys/boot/fdt/dts/openrd-cl.dts
@@ -0,0 +1,340 @@
+/*
+ * Copyright (c) 2009-2010 The FreeBSD Foundation
+ * All rights reserved.
+ *
+ * This software was developed by Semihalf under sponsorship from
+ * the FreeBSD Foundation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * OpenRD-Client/Ultimate Device Tree Source.
+ *
+ * $FreeBSD$
+ */
+
+/dts-v1/;
+
+/ {
+	model = "mrvl,OpenRD-CL";
+	compatible = "OpenRD-CL";
+	#address-cells = <1>;
+	#size-cells = <1>;
+
+	aliases {
+		ethernet0 = &enet0;
+		ethernet1 = &enet1;
+		mpp = &MPP;
+		pci0 = &pci0;
+		serial0 = &serial0;
+		serial1 = &serial1;
+		soc = &SOC;
+		sram = &SRAM;
+	};
+
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		cpu@0 

Build error in bin/sh/jobs.c if DEBUG=2

2012-04-01 Thread Kristof Provost
While chasing down an odd issue with alignment faults I activated
debugging in bin/sh.
bin/sh/Makefile has a commented out line (# DEBUG_FLAGS+= -g -DDEBUG=2
-fno-inline) to do this so that's what I did.

This fails to compile in bin/sh/jobs.c in vforkexecshell().
The debug TRACE() tries to print variables which don't exist.

The patch below fixes the compilation problem, but I'm unsure if it's
printing the relevant information.

Regards,
Kristof


diff --git a/bin/sh/jobs.c b/bin/sh/jobs.c
index 335d2ca..9027b8c 100644
--- a/bin/sh/jobs.c
+++ b/bin/sh/jobs.c
@@ -893,8 +893,7 @@ vforkexecshell(struct job *jp, char **argv, char **envp, 
const char *path, int i
struct jmploc jmploc;
struct jmploc *savehandler;
 
-	TRACE(("vforkexecshell(%%%td, %p, %d) called\n", jp - jobtab, (void *)n,
-	    mode));
+	TRACE(("vforkexecshell(%%%td, %d) called\n", jp - jobtab, idx));
INTOFF;
flushall();
savehandler = handler;


Re: SIOCGIFADDR broken on 9.0-RC1?

2011-11-15 Thread Kristof Provost
On 2011-11-15 18:10:01 (+0100), GR free...@gomor.org wrote:
 more insights since my last post. Here is a small code to trigger the bug 
 (end of email).
 When you run it on 9.0-RC1, it gets an alias address instead of the main inet 
 address:
 
 % ./get-ip re0  
 inet: 192.168.2.10
 # Main address being 192.168.1.148
 
 On 8.2-RELEASE, all goes well:
 % ./get-ip re0
 inet: PUBLIC_IP4
 
 Is something broken, or a behaviour has changed since 8.2-RELEASE?
 

I think the relevant bit of the code is found in sys/netinet/in.c.

If your ioctl doesn't specify an IP address we end up in this bit:
	TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
		iap = ifatoia(ifa);
		if (iap->ia_addr.sin_family == AF_INET) {
			if (td != NULL &&
			    prison_check_ip4(td->td_ucred,
			    &iap->ia_addr.sin_addr) != 0)
				continue;
			ia = iap;
			break;
		}
	}

The 'ia' pointer is later used to return the IP address. 

In other words: it returns the first address on the interface
of type AF_INET (which isn't assigned to a jail). 

I think the order of the addresses is not fixed, or rather it depends on 
the order in which you assign addresses. In the handling of SIOCSIFADDR
new addresses are just appended:

	TAILQ_INSERT_TAIL(&ifp->if_addrhead, ifa, ifa_link);

I don't believe this has changed since 8.0. Is it possible something
changed in the network initialisation, leading to the addresses being
assigned in a different order?
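
If the program needs one specific address rather than whichever entry the
kernel happens to list first, enumerating with getifaddrs(3) sidesteps the
ordering question entirely. Rough, untested sketch (prints every inet
address on re0, in list order):

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <ifaddrs.h>
	#include <netinet/in.h>
	#include <arpa/inet.h>
	#include <stdio.h>
	#include <string.h>

	int
	main(void)
	{
		struct ifaddrs *ifap, *ifa;
		char buf[INET_ADDRSTRLEN];

		if (getifaddrs(&ifap) != 0)
			return (1);
		for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
			if (strcmp(ifa->ifa_name, "re0") != 0 ||
			    ifa->ifa_addr == NULL ||
			    ifa->ifa_addr->sa_family != AF_INET)
				continue;
			/* Convert and print each IPv4 address on re0. */
			inet_ntop(AF_INET,
			    &((struct sockaddr_in *)ifa->ifa_addr)->sin_addr,
			    buf, sizeof(buf));
			printf("inet: %s\n", buf);
		}
		freeifaddrs(ifap);
		return (0);
	}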

Eagerly awaiting to be told I'm wrong,
Kristof



Re: [OpenRD Ultimate] e1000phy(88E1149/88E1121) has a initialize issue

2010-06-20 Thread Kristof Provost
On 2010-06-20 21:03:51 (+0900), Norikatsu Shigemura n...@freebsd.org wrote:
 On Sun, 13 Jun 2010 22:13:31 +0200
 Kristof Provost kris...@sigsegv.be wrote:
 I have a OpenRD Ultimate, which has two GbE ports - if_mge(4).  But
 I couldn't use mge1 like following.  So I tried to investigate.
   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
   - - - - - - - -
   Jun 13 05:02:14 sidearms kernel: mge1: watchdog timeout
   Jun 13 05:02:14 sidearms kernel: mge1: Timeout on link-up
   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
   - - - - - - - -
  I believe the mge(4) driver incorrectly configures the PHY address for
  the second interface. Can you give the attached patch a try?
 
   Thank you. I think so, too.  And, by FDT, I suggest following
   patch.
 
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
 - - - - - -
 /* Tell the MAC where to find the PHY so autoneg works */
 -	miisc = LIST_FIRST(&sc->mii->mii_phys);
 -	MGE_WRITE(sc, MGE_REG_PHYDEV, miisc->mii_phy);
 +	MGE_WRITE(sc, MGE_REG_PHYDEV, sc->phyaddr);
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
 - - - - - -
 
I think that's correct, but I haven't been able to test it on my board
yet. Does this work for you on a board with two GbE ports? If so I'll
try to get someone to commit it.

Regards,
Kristof



Re: [OpenRD Ultimate] e1000phy(88E1149/88E1121) has a initialize issue

2010-06-13 Thread Kristof Provost
On 2010-06-13 23:37:23 (+0900), Norikatsu Shigemura n...@freebsd.org wrote:
 Hi yongari!
 
   I have a OpenRD Ultimate, which has two GbE ports - if_mge(4).  But
   I couldn't use mge1 like following.  So I tried to investigate.
 
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
 - - - - - -
 Jun 13 05:02:14 sidearms kernel: mge1: watchdog timeout
 Jun 13 05:02:14 sidearms kernel: mge1: Timeout on link-up
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
 - - - - - -
 
I believe the mge(4) driver incorrectly configures the PHY address for
the second interface. Can you give the attached patch a try?

I'm not familiar with the PHY code so I won't comment on those changes.

Regards,
Kristof

Index: sys/dev/mge/if_mge.c
===
--- sys/dev/mge/if_mge.c	(revision 208113)
+++ sys/dev/mge/if_mge.c	(working copy)
@@ -606,7 +606,6 @@
 mge_attach(device_t dev)
 {
 	struct mge_softc *sc;
-	struct mii_softc *miisc;
 	struct ifnet *ifp;
 	uint8_t hwaddr[ETHER_ADDR_LEN];
 	int i, error ;
@@ -690,9 +689,9 @@
 	}
 	sc->mii = device_get_softc(sc->miibus);
 
-	/* Tell the MAC where to find the PHY so autoneg works */
-	miisc = LIST_FIRST(&sc->mii->mii_phys);
-	MGE_WRITE(sc, MGE_REG_PHYDEV, miisc->mii_phy);
+	/* Tell the MAC where to find the PHY so autoneg works 
+	 * We assume a static mapping (see mge_miibus_readreg) */
+	MGE_WRITE(sc, MGE_REG_PHYDEV, device_get_unit(dev) + MII_ADDR_BASE);
 
 	/* Attach interrupt handlers */
 	for (i = 0; i  2; ++i) {
