Re: zfs native encryption best practices on RELENG13

2021-04-23 Thread Xin Li via freebsd-stable
On 4/23/21 13:53, mike tancsa wrote:
> Starting to play around with RELENG_13 and wanted to explore ZFS' built-in
> encryption.  Is there a best practices doc anywhere on how to do full disk
> encryption that's not GELI based?  There are lots for GELI, but nothing I
> could find for native OpenZFS encryption on FreeBSD.
> 
> i.e. the box gets rebooted and you enter a passphrase at the boot loader
> prompt to allow it to boot, that kind of thing?

I think the loader does not support native OpenZFS encryption yet.
However, you can encrypt non-essential datasets on a boot pool (that is,
if com.datto:encryption is "active" AND the bootfs dataset is not
encrypted, you can still boot from it).

BTW, instead of entering a passphrase at the loader prompt, if / is not
encrypted, it's also possible to do something like this:
https://lists.freebsd.org/pipermail/freebsd-security/2012-August/006547.html

Personally I'd probably go with GELI (or some other kind of full disk
encryption) regardless of whether OpenZFS's native encryption is used,
because my primary goal is to be able to just throw away bad disks when
they are removed from production [1].  If the pool is not fully encrypted,
there is always a chance that sensitive data has landed on some unencrypted
dataset and never gets fully overwritten.

[1] Also keep in mind: https://xkcd.com/538/

Cheers,



OpenPGP_signature
Description: OpenPGP digital signature


Re: [pf] stable/12: block by OS broken

2021-02-17 Thread Xin Li via freebsd-stable
On 2/17/21 22:57, Xin Li wrote:
> On 2/17/21 22:35, Kristof Provost wrote:
>> On 18 Feb 2021, at 6:01, Xin Li wrote:
>>
>>> Hi,
>>>
>>> It appears that some change between 939430f2377 (December 31) and
>>> b4bf7bdeb70 (today) on stable/12 has broken pf in a way that the
>>> following rule:
>>>
>>> block in quick proto tcp from any os "Linux" to any port ssh
>>>
>>> would get interpreted as:
>>>
>>> block drop in quick proto tcp from any to any port = 22
>>>
>>> (and block all SSH connections instead of just the ones initiated from
>>> Linux).
>>
>> Thanks for the report. I think I see the problem.
>>
>> Can you test this patch?
>>
>> diff --git a/sys/netpfil/pf/pf_ioctl.c b/sys/netpfil/pf/pf_ioctl.c
>> index 593a38d4a360..458c6af3fa5e 100644
>> --- a/sys/netpfil/pf/pf_ioctl.c
>> +++ b/sys/netpfil/pf/pf_ioctl.c
>> @@ -1623,7 +1623,7 @@ pf_rule_to_krule(const struct pf_rule *rule, struct pf_krule *krule)
>>  	/* Don't allow userspace to set evaulations, packets or bytes. */
>>  	/* kif, anchor, overload_tbl are not copied over. */
>> -	krule->os_fingerprint = krule->os_fingerprint;
>> +	krule->os_fingerprint = rule->os_fingerprint;
>>  	krule->rtableid = rule->rtableid;
>>  	bcopy(rule->timeout, krule->timeout, sizeof(krule->timeout));
>>
>> With any luck we’ll be able to include the fix in 13.0.
> 
> Thanks, I'll try this on a -CURRENT box which is exhibiting the same
> issue and report back as soon as possible.

And I can confirm that this fixed the issue on -CURRENT, thanks for the
quick fix!
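
For anyone wondering why a one-line change has this effect, here is a
minimal sketch of the bug class. It is not the pf source: the struct and
function names are hypothetical stand-ins, and it assumes that the kernel
rule starts out zeroed and that a fingerprint of 0 means "match any OS".
Under those assumptions the self-assignment leaves the fingerprint at 0,
so the rule matches every client.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily trimmed stand-ins for pf's rule structures. */
typedef uint32_t osfp_t;
#define OSFP_ANY ((osfp_t)0)	/* assumption: 0 means "match any OS" */

struct user_rule   { osfp_t os_fingerprint; int rtableid; };
struct kernel_rule { osfp_t os_fingerprint; int rtableid; };

/* Buggy conversion: the fingerprint is assigned to itself. */
void rule_to_krule_buggy(const struct user_rule *rule, struct kernel_rule *krule)
{
	assert(rule != NULL && krule != NULL);
	memset(krule, 0, sizeof(*krule));	/* assumption: destination starts zeroed */
	krule->os_fingerprint = krule->os_fingerprint;	/* no-op, stays 0 */
	krule->rtableid = rule->rtableid;
}

/* Fixed conversion: the fingerprint is copied from the userspace rule. */
void rule_to_krule_fixed(const struct user_rule *rule, struct kernel_rule *krule)
{
	assert(rule != NULL && krule != NULL);
	memset(krule, 0, sizeof(*krule));
	krule->os_fingerprint = rule->os_fingerprint;
	krule->rtableid = rule->rtableid;
}

/* Returns 1 when a packet with fingerprint pkt_os matches the rule. */
int os_matches(const struct kernel_rule *krule, osfp_t pkt_os)
{
	return krule->os_fingerprint == OSFP_ANY ||
	    krule->os_fingerprint == pkt_os;
}
```

With a userspace rule carrying some non-zero fingerprint, the buggy
conversion yields a kernel rule for which os_matches() is true for any
packet, while the fixed one only matches that fingerprint.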

Cheers,





Re: [pf] stable/12: block by OS broken

2021-02-17 Thread Xin Li via freebsd-stable
On 2/17/21 22:35, Kristof Provost wrote:
> On 18 Feb 2021, at 6:01, Xin Li wrote:
> 
>> Hi,
>>
>> It appears that some change between 939430f2377 (December 31) and
>> b4bf7bdeb70 (today) on stable/12 has broken pf in a way that the
>> following rule:
>>
>> block in quick proto tcp from any os "Linux" to any port ssh
>>
>> would get interpreted as:
>>
>> block drop in quick proto tcp from any to any port = 22
>>
>> (and block all SSH connections instead of just the ones initiated from
>> Linux).
> 
> Thanks for the report. I think I see the problem.
> 
> Can you test this patch?
> 
> diff --git a/sys/netpfil/pf/pf_ioctl.c b/sys/netpfil/pf/pf_ioctl.c
> index 593a38d4a360..458c6af3fa5e 100644
> --- a/sys/netpfil/pf/pf_ioctl.c
> +++ b/sys/netpfil/pf/pf_ioctl.c
> @@ -1623,7 +1623,7 @@ pf_rule_to_krule(const struct pf_rule *rule, struct pf_krule *krule)
>  	/* Don't allow userspace to set evaulations, packets or bytes. */
>  	/* kif, anchor, overload_tbl are not copied over. */
> -	krule->os_fingerprint = krule->os_fingerprint;
> +	krule->os_fingerprint = rule->os_fingerprint;
>  	krule->rtableid = rule->rtableid;
>  	bcopy(rule->timeout, krule->timeout, sizeof(krule->timeout));
> 
> With any luck we’ll be able to include the fix in 13.0.

Thanks, I'll try this on a -CURRENT box which is exhibiting the same
issue and report back as soon as possible.

Cheers,






[pf] stable/12: block by OS broken

2021-02-17 Thread Xin Li via freebsd-stable
Hi,

It appears that some change between 939430f2377 (December 31) and
b4bf7bdeb70 (today) on stable/12 has broken pf in a way that the
following rule:

block in quick proto tcp from any os "Linux" to any port ssh

would get interpreted as:

block drop in quick proto tcp from any to any port = 22

(and block all SSH connections instead of just the ones initiated from
Linux).

Cheers,





Re: CFT: if_bridge performance improvements

2020-04-24 Thread Xin Li via freebsd-stable
On 4/24/20 06:42, Kristof Provost wrote:
> On 22 Apr 2020, at 18:15, Xin Li wrote:
>> On 4/22/20 01:45, Kristof Provost wrote:
>>> On 22 Apr 2020, at 10:20, Xin Li wrote:
>>>> Hi,
>>>>
>>>> On 4/14/20 02:51, Kristof Provost wrote:
>>>>> Hi,
>>>>>
>>>>> Thanks to support from The FreeBSD Foundation I’ve been able to work on
>>>>> improving the throughput of if_bridge.
>>>>> It changes the (data path) locking to use the NET_EPOCH infrastructure.
>>>>> Benchmarking shows substantial improvements (x5 in test setups).
>>>>>
>>>>> This work is ready for wider testing now.
>>>>>
>>>>> It’s under review here: https://reviews.freebsd.org/D24250
>>>>>
>>>>> Patch for CURRENT: https://reviews.freebsd.org/D24250?download=true
>>>>> Patches for stable/12:
>>>>> https://people.freebsd.org/~kp/if_bridge/stable_12/
>>>>>
>>>>> I’m not currently aware of any panics or issues resulting from these
>>>>> patches.
>>>>
>>>> I have observed the following panic with the latest stable/12 after
>>>> applying the stable_12 patchset. It looks like a race-condition-related
>>>> NULL pointer dereference, but I haven't taken a deeper look yet.
>>>>
>>>> The box has 7 igb(4) NICs, with several bridges and VLANs configured,
>>>> acting as a router.  Please let me know if you need additional
>>>> information; I can try -CURRENT as well, but it would take some time as
>>>> the box is relatively slow (it's a ZFS based system so I can create a
>>>> separate boot environment for -CURRENT if needed, but that would take
>>>> some time as I might have to upgrade the packages, should there be any
>>>> ABI breakages).
>>>>
>>> Thanks for the report. I don’t immediately see how this could happen.
>>>
>>> Are you running an L2 firewall on that bridge by any chance? An earlier
>>> version of the patch had issues with a stray unlock in that code path.
>>
>> I don't think I have an L2 firewall (I assume that means filtering based
>> on MAC address, like what can be done with e.g. ipfw?  The bridges were
>> created on vlan interfaces though, do they count as an L2 firewall?); the
>> system is using pf with a few NAT rules:
>>
> 
> That backtrace looks identical to the one Peter reported, up to and
> including the offset in the bridge_input() function.
> Given that there’s no likely way to end up with a NULL mutex either I
> have to assume that it’s a case of trying to unlock an unlocked mutex, and
> the most likely reason is that you ran into the same problem Peter ran
> into.
> 
> The current version of the patch should resolve it.

Thanks, I'd like to report that after applying the patch from Peter the
system seems to survive without problem.
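
As a userland analogy only (this is not the if_bridge code, and kernel
mutexes behave differently), the following sketch shows what a stray
unlock looks like with a POSIX error-checking mutex. The function name
stray_unlock_result is made up for illustration. An error-checking mutex
reports the second, unbalanced unlock as an error instead of silently
corrupting state.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Locks and unlocks an error-checking mutex once, then unlocks it a
 * second time. Returns the result of that stray second unlock.
 */
int stray_unlock_result(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t mtx;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&mtx, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	rc = pthread_mutex_lock(&mtx);
	assert(rc == 0);
	rc = pthread_mutex_unlock(&mtx);	/* balanced unlock */
	assert(rc == 0);

	rc = pthread_mutex_unlock(&mtx);	/* stray unlock: mutex is not held */

	pthread_mutex_destroy(&mtx);
	return rc;
}
```

Here the stray unlock comes back as EPERM. A lock implementation without
such checking has no equivalent way to report the mistake, which fits
the suspicion that the panic came out of the unlock slow path.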

Cheers,






Re: CFT: if_bridge performance improvements

2020-04-22 Thread Xin Li via freebsd-stable
On 4/22/20 01:45, Kristof Provost wrote:
> On 22 Apr 2020, at 10:20, Xin Li wrote:
>> Hi,
>>
>> On 4/14/20 02:51, Kristof Provost wrote:
>>> Hi,
>>>
>>> Thanks to support from The FreeBSD Foundation I’ve been able to work on
>>> improving the throughput of if_bridge.
>>> It changes the (data path) locking to use the NET_EPOCH infrastructure.
>>> Benchmarking shows substantial improvements (x5 in test setups).
>>>
>>> This work is ready for wider testing now.
>>>
>>> It’s under review here: https://reviews.freebsd.org/D24250
>>>
>>> Patch for CURRENT: https://reviews.freebsd.org/D24250?download=true
>>> Patches for stable/12:
>>> https://people.freebsd.org/~kp/if_bridge/stable_12/
>>>
>>> I’m not currently aware of any panics or issues resulting from these
>>> patches.
>>
>> I have observed the following panic with the latest stable/12 after
>> applying the stable_12 patchset. It looks like a race-condition-related
>> NULL pointer dereference, but I haven't taken a deeper look yet.
>>
>> The box has 7 igb(4) NICs, with several bridges and VLANs configured,
>> acting as a router.  Please let me know if you need additional
>> information; I can try -CURRENT as well, but it would take some time as
>> the box is relatively slow (it's a ZFS based system so I can create a
>> separate boot environment for -CURRENT if needed, but that would take
>> some time as I might have to upgrade the packages, should there be any
>> ABI breakages).
>>
> Thanks for the report. I don’t immediately see how this could happen.
> 
> Are you running an L2 firewall on that bridge by any chance? An earlier
> version of the patch had issues with a stray unlock in that code path.

I don't think I have an L2 firewall (I assume that means filtering based
on MAC address, like what can be done with e.g. ipfw?  The bridges were
created on vlan interfaces though, do they count as an L2 firewall?); the
system is using pf with a few NAT rules:

$ sudo pfctl -s rules
anchor "miniupnpd" all
pass in quick inet6 proto tcp from  to any flags S/SA keep state
block drop in quick inet6 proto tcp from !  to  flags S/SA
block drop in quick proto tcp from any os "Linux" to any port = ssh
pass out on igb6 inet proto tcp from (igb6) to any port = domain flags S/SA keep state queue dns
pass out on igb6 inet proto udp from (igb6) to any port = domain keep state queue dns
pass in on igb6 proto tcp from any to (igb6) port = http flags S/SA modulate state queue(web, ack)
pass in on igb6 proto tcp from any to (igb6) port = https flags S/SA modulate state queue(web, ack)
pass out on igb6 inet proto tcp from (igb6) to any flags S/SA modulate state queue bulk
block drop in quick on igb6 proto tcp from  to any port = ssh label "ssh bruteforce"
block drop in on igb6 from  to any

Cheers,





Re: CFT: if_bridge performance improvements

2020-04-22 Thread Xin Li via freebsd-stable
Hi,

On 4/14/20 02:51, Kristof Provost wrote:
> Hi,
> 
> Thanks to support from The FreeBSD Foundation I’ve been able to work on
> improving the throughput of if_bridge.
> It changes the (data path) locking to use the NET_EPOCH infrastructure.
> Benchmarking shows substantial improvements (x5 in test setups).
> 
> This work is ready for wider testing now.
> 
> It’s under review here: https://reviews.freebsd.org/D24250
> 
> Patch for CURRENT: https://reviews.freebsd.org/D24250?download=true
> Patches for stable/12: https://people.freebsd.org/~kp/if_bridge/stable_12/
> 
> I’m not currently aware of any panics or issues resulting from these
> patches.

I have observed the following panic with the latest stable/12 after
applying the stable_12 patchset. It looks like a race-condition-related
NULL pointer dereference, but I haven't taken a deeper look yet.

The box has 7 igb(4) NICs, with several bridges and VLANs configured,
acting as a router.  Please let me know if you need additional
information; I can try -CURRENT as well, but it would take some time as
the box is relatively slow (it's a ZFS based system so I can create a
separate boot environment for -CURRENT if needed, but that would take
some time as I might have to upgrade the packages, should there be any
ABI breakages).

===

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x20
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x80c286d5
stack pointer           = 0x28:0x824cb840
frame pointer           = 0x28:0x824cb850
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 (if_io_tqg_0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1587541913
KDB: stack backtrace:
#0 0x80c117a5 at kdb_backtrace+0x65
#1 0x80bc588e at vpanic+0x17e
#2 0x80bc5703 at panic+0x43
#3 0x810d2310 at trap_pfault+0
#4 0x810d235f at trap_pfault+0x4f
#5 0x810d19b8 at trap+0x288
#6 0x810aae1c at calltrap+0x8
#7 0x80ba5c96 at __mtx_unlock_sleep+0xb6
#8 0x8248f4c7 at bridge_input+0x877
#9 0x80cd5c47 at ether_nh_input+0x207
#10 0x80cf1e4a at netisr_dispatch_src+0xca
#11 0x80cd4f0b at ether_input+0x4b
#12 0x80cdf1a3 at vlan_input+0x1f3
#13 0x80cd4ae1 at ether_demux+0x121
#14 0x80cd5d7b at ether_nh_input+0x33b
#15 0x80cf1e4a at netisr_dispatch_src+0xca
#16 0x80cd4f0b at ether_input+0x4b
#17 0x80cee41c at iflib_rxeof+0xadc
Uptime: 6m6s
Dumping 848 out of 16313 MB:..2%..12%..21%..31%..42%..51%..61%..72%..82%..91%


Backtrace:

(kgdb) #0  doadump () at src/sys/amd64/include/pcpu_aux.h:55
#1  0x80bc54a5 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:451
#2  0x80bc58e6 in vpanic (fmt=,
ap=) at /usr/src/sys/kern/kern_shutdown.c:880
#3  0x80bc5703 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:807
#4  0x810d2310 in trap_fatal (frame=,
eva=) at /usr/src/sys/amd64/amd64/trap.c:925
#5  0x810d235f in trap_pfault (frame=0x824cb780,
usermode=, signo=,
ucode=) at src/sys/amd64/include/pcpu_aux.h:55
#6  0x810d19b8 in trap (frame=0x824cb780)
at /usr/src/sys/amd64/amd64/trap.c:407
#7  0x810aae1c in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:289
#8  0x80c286d5 in turnstile_broadcast (ts=0x0, queue=0)
at /usr/src/sys/kern/subr_turnstile.c:880
#9  0x80ba5c96 in __mtx_unlock_sleep (c=0xf80013351430, v=0)
at /usr/src/sys/kern/kern_mutex.c:1041
#10 0x8248f4c7 in bridge_input (ifp=,
m=) at src/sys/amd64/include/atomic.h:221
#11 0x80cd5c47 in ether_nh_input (m=)
at /usr/src/sys/net/if_ethersubr.c:631
#12 0x80cf1e4a in netisr_dispatch_src (proto=5,
source=, m=)
at /usr/src/sys/net/netisr.c:1124
#13 0x80cd4f0b in ether_input (ifp=0xf800060dc000, m=0x0)
at /usr/src/sys/net/if_ethersubr.c:787
#14 0x80cdf1a3 in vlan_input (ifp=0xf800036d6800,
m=0xf8001d65fc00) at /usr/src/sys/net/if_vlan.c:1291
#15 0x80cd4ae1 in ether_demux (ifp=0xf800036d6800,
m=) at /usr/src/sys/net/if_ethersubr.c:832
#16 0x80cd5d7b in ether_nh_input (m=)
at /usr/src/sys/net/if_ethersubr.c:667
#17 0x80cf1e4a in netisr_dispatch_src (proto=5,
source=, m=)
at /usr/src/sys/net/netisr.c:1124
#18 0x80cd4f0b in ether_input (ifp=0xf800036d6800,
m=0xf80013939c00) at /usr/src/sys/net/if_ethersubr.c:787
#19 0x80cee41c in iflib_rxeof (rxq=,
budget=) at /usr/src/sys/net/iflib.c:2873
#20 0x80ce87b3 in _task_fn_rx (context=0xf800036d6000)
at