Re: wireguard reconfiguration reliability

2024-03-21 Thread Paul B. Henson
On Thu, Mar 21, 2024 at 12:23:06PM +0300, Vitaliy Makkoveev wrote:

> wg(4) diff was committed to -current. Does the problem exist in upcoming
> 7.5?

Oh, I didn't know a fix had been committed, the referenced thread didn't
mention a final one. Thanks, I'll take a look.



Re: wireguard reconfiguration reliability

2024-03-20 Thread Paul B. Henson
On Wed, Mar 20, 2024 at 09:56:06PM +0100, Kirill Miazine wrote:

> Like in this thread, I guess:
> 
> https://marc.info/?t=16964239631=1=2

Yes, that is likely the issue we're hitting. Seems last message is from
10/2023 and the issue wasn't resolved :(, so I guess it's a known
problem with no solution on the horizon.

Next time I'll try your workaround of batching the commands up (ifconfig
wg1 down; ifconfig wg1 delete; ifconfig wg1 destroy) rather than running
one at a time and keep my fingers crossed I win the race condition :).

Thanks for the help...
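
A minimal sketch of the batched teardown mentioned above, assuming a POSIX sh. The function name wg_teardown and the IFCONFIG override are hypothetical, added only so the sequence can be dry-run. It does not address the underlying race; it only issues the three steps back to back.

```sh
# Hypothetical helper: run the three teardown steps from the thread in a
# single shell invocation. IFCONFIG can be overridden (e.g. with echo)
# for a dry run; the override is not part of ifconfig(8).
IFCONFIG=${IFCONFIG:-ifconfig}

wg_teardown() {
	ifname=$1
	# Same order as the workaround: down, delete, destroy. The steps
	# are separated by ';' so a failing step does not stop the next.
	$IFCONFIG "$ifname" down; $IFCONFIG "$ifname" delete; $IFCONFIG "$ifname" destroy
}
```

Run as root, this would be called as "wg_teardown wg1", followed by "sh /etc/netstart wg1" to recreate the interface.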



Re: wireguard reconfiguration reliability

2024-03-20 Thread Paul B. Henson

On 3/20/2024 9:21 AM, Zack Newman wrote:


> clients in rdomain(4) 0. Last week I ran ifconfig wg1 destroy, replaced
> the wgkey and wgpsk for one of the three wgpeers in the second interface,
> and ran sh /etc/netstart wg1. Once I did this, the server seemingly froze:


That's similar to what we see, although generally the entire server 
doesn't die, just the ifconfig command wedges and can't be killed, and 
the box can't be rebooted cleanly.


Thanks for the feedback…



Re: openbsd vm with SR-IOV vf nic

2024-03-20 Thread Paul B. Henson

On 3/20/2024 2:46 AM, Jonathan Matthew wrote:


> mcx(4) supports virtual functions, mostly because they're identical to
> physical functions from the driver's perspective, so all we had to do
> was add the device IDs.


Ah, that wasn't readily apparent; I didn't see anything in the man page 
mentioning SR-IOV or virtual functions, nor in the source code when I 
went to take a peek now. I guess if you're familiar with Mellanox you'd 
be aware the vf was basically the same as the physical card and 
presumably supported by the same driver.




> bnxt(4) could support virtual functions pretty easily, since they're
> largely the same as physical functions, but some work would be required
> there.


Cool, thanks for the pointers…



Re: wireguard reconfiguration reliability

2024-03-20 Thread Paul B. Henson

On 3/20/2024 1:44 AM, Kirill Miazine wrote:


> actually I checked, and I do use wgpka on clients, but not on the
> server -- I don't remember why I didn't...


In our case the server is on an Internet accessible address, whereas the 
clients are behind a NAT firewall. We also have keepalives enabled on 
the clients (to maintain their NAT mapping) but not on the server (as if 
the client isn't sending its keepalives the server isn't going to get 
through anyway).
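
A minimal sketch of the client-side keepalive setup described above, using the wgpeer and wgpka keywords from ifconfig(8). The function name, the IFCONFIG override, the placeholder key and the 25 second interval are all assumptions for illustration, not values from this thread.

```sh
# IFCONFIG can be overridden (e.g. with echo) for a dry run; the
# override is hypothetical and not part of ifconfig(8).
IFCONFIG=${IFCONFIG:-ifconfig}

# Set a persistent keepalive towards one peer on a client behind NAT.
# $1 = wg interface, $2 = peer public key, $3 = interval in seconds.
wg_client_keepalive() {
	$IFCONFIG "$1" wgpeer "$2" wgpka "$3"
}
```

On a client this would be called as "wg_client_keepalive wg0 <server-public-key> 25"; the server side is simply left without wgpka.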


A scenario where it stops but then works again as soon as traffic is 
sent does kind of sound like a firewall or NAT timeout issue?  We don't 
have that problem, if we leave it completely alone it generally works 
indefinitely with no issues. It's just when we try to modify the 
configuration that things sometimes go sideways.


Thanks for the data point…



openbsd vm with SR-IOV vf nic

2024-03-19 Thread Paul B. Henson
Is it very common for people to be running openbsd boxes under
virtualization and using an SR-IOV vf nic? I'm curious what cards people
are using.

It looks like the only available driver is iavf, for the Intel 700
cards? Are there any other drivers I missed?

We have some systems with Intel X550 cards in them, based on the 82599
chipset, which openbsd doesn't currently support. Yuichiro NAITO ported
a driver from netbsd:

https://marc.info/?l=openbsd-tech&m=168722323125036&w=2

We tested it under 7.3, and then an updated version for 7.4, and it's
been working great. At one point yasuoka@ had said he would review and
merge it, but it looks like that hasn't happened yet and I haven't heard
back the last couple of times I tried to ask him about it (I assume he's
busy with other things and don't want to bug him any more).

So I was just wondering if there are any other available drivers I might
have missed for other cards we might have, if anybody else was
interested in X550/82599 vf support, and if maybe any other dev might be
willing to take a look at it and possibly commit it.

Thanks much...



wireguard reconfiguration reliability

2024-03-19 Thread Paul B. Henson
We're using wireguard to set up VPN connections from various systems
deployed on-prem at customer sites to central openbsd boxes to route
internal traffic between the remote boxes and the internal network.

After a fresh reboot with a given configuration, everything works great.
The problem we have is when we later add or remove a remote system and
try to reconfigure the wireguard interface on the central servers.

Sometimes the new system just won't work, or oddly the new system works
fine but an existing system that was working breaks 8-/. When that
happens, we generally have to reboot it, at which point everything
works.

Occasionally ifconfig on the wg interface just wedges completely. When
that happens, it won't reboot cleanly; we have to hard reset it.

Has anyone else seen this type of behavior? I'm not sure how common it
is to have regular ongoing changes to wireguard like we are doing, so it
might not pop up often.

Thanks much...



Intel 10G X550T sr-iov virtual function driver

2023-04-28 Thread Paul B. Henson
I recently migrated an OpenBSD vm running under qemu/kvm to a new server
which has an Intel 10G X550T NIC (Intel Corporation Ethernet Converged
Network Adapter X550-T2) and am passing a vf through to the vm.
Unfortunately, it appears openbsd doesn't have a driver for this
virtualized device?

The dmesg output shows:

vendor "Intel", unknown product 0x1565 (class network subclass ethernet,
rev 0x0 0)

I see in the current PCI device list:

https://github.com/openbsd/src/blob/master/sys/dev/pci/pcidevs

there is support for the native card itself (INTEL X550T id 0x1563) but
nothing for the virtual function 0x1565.

Are there any plans to support this card as a virtual function in a vm?

Thanks much...



Re: /etc/bsd.re-config - change a device?

2021-11-30 Thread Paul B. Henson
On Tue, Nov 30, 2021 at 11:13:26PM -0500, Nick Holland wrote:

> hint: snapshots that do what you need beat releases that don't.

Granted; or I could just apply that patch to the 7.0 stable source and
copy in the new config binary :). I doubt if there will be any binary patches
that would overwrite it before 7.1 comes out.

Saving me the trouble of tweaking the kernel by hand on the rare
occasion a kernel patch comes by probably isn't worth running a
snapshot, at least in my case.

Thanks...



Re: /etc/bsd.re-config - change a device?

2021-11-30 Thread Paul B. Henson

Thanks much for the info guys; something to look forward to in 7.1 :).

On 11/30/2021 4:17 AM, Stuart Henderson wrote:

On 2021-11-30, Paul de Weerd  wrote:

On Tue, Nov 30, 2021 at 08:46:34AM -, Stuart Henderson wrote:
| On 2021-11-29, Paul B. Henson  wrote:
| > I'm upgrading to OpenBSD 7 and I was happy to see the new support for
| > /etc/bsd.re-config to allow modified kernels to be automatically
| > rebuilt. However, one of the changes I need to make is updating the IRQ
| > on com2, as my bios assigns it a non-standard value 8-/.
| >
| > I can't figure out how to do that? Is it supported? When I put "change
| > com2" in /etc/bsd.re-config, config interactively asks me:
| >
| > change [n]
| >
| > I tried "change com2 y" and "change com2", then "y" on the next line,
| > but the first gave an error and the second still prompted interactively.
| >
| > Are the only changes supported by /etc/bsd.re-config those that don't
| > need further input?
|
| Currently yes. jcs@ has a diff to change this but it needs review.

I believe this has been committed on November 20:

https://marc.info/?l=openbsd-cvs&m=163737802014911&w=2

However, that means that it won't work in OpenBSD 7.0, you will need
to run something newer (which, at the moment, means -current /
snapshots).


Ah good catch, thanks.





/etc/bsd.re-config - change a device?

2021-11-29 Thread Paul B. Henson
I'm upgrading to OpenBSD 7 and I was happy to see the new support for
/etc/bsd.re-config to allow modified kernels to be automatically
rebuilt. However, one of the changes I need to make is updating the IRQ
on com2, as my bios assigns it a non-standard value 8-/.

I can't figure out how to do that? Is it supported? When I put "change
com2" in /etc/bsd.re-config, config interactively asks me:

change [n]

I tried "change com2 y" and "change com2", then "y" on the next line,
but the first gave an error and the second still prompted interactively.

Are the only changes supported by /etc/bsd.re-config those that don't
need further input?

Thanks...



Re: umb0 broke in 6.9

2021-06-16 Thread Paul B. Henson

On 6/14/2021 4:54 PM, Stuart Henderson wrote:


> find when the problem started .. with 6.9 userland you can probably get
> away with just booting the relevant older kernel for a test for probably
> most/maybe all of the way back to 6.8.


So I booted the 6.8 kernel, and everything seemed to be mostly working, 
but the umb interface still wasn't initialized properly :(. I was 
thinking I'd have to do a fresh install of 6.8 and start the test from 
there, but then I considered that one thing I still hadn't done was a 
cold power cycle. I have a remote console on the serial port of the 
system, but it doesn't have built-in remote power control and it's not 
hooked up to a remote control power switch, so it's not that convenient 
to deal with power.


I coordinated a cold power cycle and booted up the 6.8 kernel, and the 
umb interface worked :). I then booted the 6.9 kernel, and it also 
worked? By default the kernel allocates the device to the umsm driver 
(it would be nice if the umb driver took priority instead), so the first 
6.9 boot after the install used that driver until I disabled it and 
rebooted. I thought perhaps the 6.9 version of that driver put the card 
in a bad state, so I tried booting the 6.9 kernel with it enabled, and 
then booting it again with it disabled. But the umb interface was still 
working after that test.


So it seems that somehow the upgrade process put the hardware in a bad 
state where it would not initialize, and a cold power cycle seems to 
have sorted that out. I wasn't able to reproduce the issue doing some 
testing, so I guess I will write it off as "that's odd" and just be 
happy it seems to be working reliably now :).


Thanks much for your assistance looking at it…



Re: umb0 broke in 6.9

2021-06-14 Thread Paul B. Henson
On Mon, Jun 14, 2021 at 08:07:15AM -, Stuart Henderson wrote:

> just add "#define UMB_DEBUG" to if_umb.c and send the full dmesg output.

Hmm, that didn't work, I also needed to update umb_debug = 1 in the
code? After that, I got a little output, full dmesg included below but
the umb part looks like:

umb0 at uhub0 port 3 configuration 1 interface 12 "Sierra Wireless,
Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A"
rev 2.10/0.06 addr 2
umb0: NCM align=4 div=4 rem=0
umb0: Only NTB16 format supported.
umb0: -> snd MBIM_OPEN_MSG (tid 1)
umb0: vers 1.0
umb0: stop: reached state DOWN
umb0: init: opening ...
umb0: -> snd MBIM_OPEN_MSG (tid 2)
umb0: init: opening ...
umb0: -> snd MBIM_OPEN_MSG (tid 3)
umb0: stop: reached state DOWN

This seems kind of like the original problem I had with the card when it
was attached to the internal USB2 minipci slot rather than to the
external USB3 one:

http://openbsd-archive.7691.n7.nabble.com/umb-device-SIM-has-no-PIN-td331358.html

Maybe a change in the USB code broke it?


OpenBSD 6.9-stable (GENERIC.MP) #12: Mon Jun 14 15:54:43 PDT 2021
r...@obsd-bld.pbhware.com:/sys/arch/amd64/compile/GENERIC.MP
real mem = 4261011456 (4063MB)
avail mem = 4116484096 (3925MB)
User Kernel Config
UKC> disable umsm
361 umsm* disabled
UKC> quit
Continuing...
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xcff9f020 (7 entries)
bios0: vendor coreboot version "v4.6.3" date 20171030
bios0: PC Engines PC Engines apu3
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S1 S2 S4 S5
acpi0: tables DSDT FACP SSDT TCPA APIC HEST SSDT SSDT HPET
acpi0: wakeup devices PWRB(S4) PBR4(S4) PBR5(S4) PBR6(S4) PBR7(S4) PBR8(S4) 
UOH1(S3) UOH2(S3) UOH3(S3) UOH4(S3) UOH5(S3) UOH6(S3) XHC0(S4)
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD GX-412TC SOC, 998.40 MHz, 16-30-01
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 
16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD GX-412TC SOC, 998.13 MHz, 16-30-01
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
cpu1: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 
16-way L2 cache
cpu1: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu1: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: AMD GX-412TC SOC, 998.13 MHz, 16-30-01
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
cpu2: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 
16-way L2 cache
cpu2: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu2: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu2: disabling user TSC (skew=-144)
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: AMD GX-412TC SOC, 998.13 MHz, 16-30-01
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
cpu3: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 
16-way L2 cache
cpu3: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu3: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 4 pa 0xfec0, version 21, 24 pins
ioapic1 at mainbus0: apid 5 pa 

6.9 kernel compile fails

2021-06-14 Thread Paul B. Henson
I'm trying to compile a kernel with some debugging enabled for a problem
I'm having with umb, and now my problem has turned into an error
compiling the kernel :). After getting the error on my source tree, which
had been updated from 6.8, I whacked it and did a fresh checkout, but it
still shows up:

-bash-5.1$ pwd
/sys/arch/amd64/compile/GENERIC.MP
-bash-5.1$ make
make: don't know how to make /usr/src/sys/dev/pci/drm/i915/dvo_ch7017.c
(prerequisite of: dvo_ch7017.o)
Stop in /sys/arch/amd64/compile/GENERIC.MP

It looks like that file is at:

/usr/src/sys/dev/pci/drm/i915/display/dvo_ch7017.c

not where it's looking:

/usr/src/sys/dev/pci/drm/i915/dvo_ch7017.c

I created a symlink and then it complained about a missing header file,
so I made another symlink, and it complained about another C file, etc
etc, until I finally just ran:

ln -s display/* .

A whole bunch of stuff compiled, then it complained about:

make: don't know how to make
/usr/src/sys/dev/pci/drm/i915/i915_gem_clflush.c (prerequisite of:
i915_gem_clflush.o)

so cue:

ln -s gem/* .
ln -s gt/* .
ln -s uc/* .

and it trundled along for a while, then:

make: don't know how to make /usr/src/sys/dev/isa/asmc.c (prerequisite
of: asmc.o)

Finally, after:

ln -s ../acpi/asmc.c .

it finished compiling and the resultant kernel seems to work.
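
A minimal sketch of the symlink workaround above as one reusable step, assuming a POSIX sh. The helper name link_subdirs is hypothetical. It only reproduces the "ln -s sub/* ." commands and says nothing about why make was looking in the wrong place.

```sh
# Hypothetical helper: symlink every entry of the named subdirectories
# into the directory itself, like running "ln -s sub/* ." for each one.
link_subdirs() {
	dir=$1
	shift
	(
		cd "$dir" || exit 1
		for sub in "$@"; do
			for f in "$sub"/*; do
				name=${f##*/}
				# Skip unmatched globs and names that already exist.
				[ -e "$f" ] || continue
				[ -e "$name" ] || [ -L "$name" ] || ln -s "$f" .
			done
		done
	)
}
```

For the i915 case this would be "link_subdirs /usr/src/sys/dev/pci/drm/i915 display gem gt uc". Existing files are left alone, so rerunning it is harmless.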

It seems odd the stable kernel source would be broken, but I'm not sure
what I might have done wrong? It's a fresh checkout, and there's not
much to compiling it. The box doing the compiling was updated from 6.8, I
haven't tried on a box with a fresh 6.9 install.

Thanks...





umb0 broke in 6.9

2021-06-13 Thread Paul B. Henson
I just upgraded a box that has a cell data card in it and it no longer
seems to work :(. The card is:

umb0 at uhub0 port 3 configuration 1 interface 12 "Sierra Wireless,
Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A"
rev 2.10/0.06 addr 2

The contents of /etc/hostname.umb0 are just:

apn r.ispsn

The interface shows:

umb0: flags=8811 mtu 1500
index 6 priority 6 llprio 3
roaming disabled registration unknown
state down cell-class none
SIM not initialized PIN required
APN r.ispsn
status: down

There is no PIN on the SIM. It was working fine right before the upgrade.
The only umb change I see in the changelog is:

Added vid/pid table to umb(4) allowing matching to alternate configurations.

I'm not sure what that means or if my config needs something changed to
work again? Any suggestions appreciated. The card is in an external
minipci adapter connected via USB3. The server is a PC Engines apu3 which
actually has an internal minipci connector, but I couldn't get that to
work as internally it was connected via USB2 and there were issues with
that chipset. I vaguely recall it was actually failing something like
this 8-/.

Thanks...



Re: OpenLDAP under 6.8 - no intermediate certs in chain

2020-11-16 Thread Paul B. Henson

On 11/16/2020 6:52 AM, Stuart Henderson wrote:


> ...actually I have now added a workaround to the databases/openldap port
> in 6.8-stable to disable TLS 1.3, so either rebuild or wait for -stable
> packages and it should fix things.


Cool, I was actually already building from source in order to enable 
modules. I updated my ports tree and rebuilt, looks good now, thanks 
much for the quick fix.


It still does behave a little bit differently; under 6.7 it was 
including the root CA in the chain sent by the server, under 6.8 it is 
only including the intermediate, not the root. Which I actually prefer, 
as sending the root is a waste of time, the client needs to have that 
itself anyway in order to validate the chain in the first place.




Re: OpenLDAP under 6.8 - no intermediate certs in chain

2020-11-16 Thread Paul B. Henson

On 11/16/2020 2:30 AM, Stuart Henderson wrote:


> Yes OpenLDAP is broken with TLS 1.3 server-side unless you have that
> commit (or build LibreSSL with TLS 1.3 server support disabled). As far
> as I can tell there's no method to disable TLS 1.3 via config.


Hmm, yah, you can disable old versions, but I don't think there is any 
way to disable newer ones.




Re: OpenLDAP under 6.8 - no intermediate certs in chain

2020-11-16 Thread Paul B. Henson

On 11/15/2020 10:18 PM, Brad Smith wrote:

> I remember seeing this commit recently. Not sure if this is your problem
> or not.
>
> https://marc.info/?l=openbsd-cvs&m=160511882917510&w=2


That definitely looks like it, thanks for the pointer.



OpenLDAP under 6.8 - no intermediate certs in chain

2020-11-15 Thread Paul B. Henson
I just updated one of my servers running 6.7 to 6.8, and am having a
problem with openldap. I have the intermediate cert and root CA in a
file referenced by the openldap config:

TLSCACertificateFile /etc/openldap/cabundle.crt

Under 6.7 with the openldap port from that version, this results in the
chain being served:

Certificate chain
 0 s:CN = ldap-netsvc.pbhware.com
   i:C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
 1 s:C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
   i:O = Digital Signature Trust Co., CN = DST Root CA X3
 2 s:O = Digital Signature Trust Co., CN = DST Root CA X3
   i:O = Digital Signature Trust Co., CN = DST Root CA X3

However, under 6.8 with the newer openldap 2.4.53 port, only the server
cert itself is being served, not the intermediate or root:

Certificate chain
 0 s:CN = ldap-netsvc.pbhware.com
   i:C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3

This of course causes clients to fail to validate the server cert :(.
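
A minimal sketch for comparing the two cases above by counting the certificates listed under "Certificate chain" in openssl s_client output. The helper name chain_length is hypothetical, and the host and port in the usage note are placeholders rather than details from this post.

```sh
# Count the certificates listed under "Certificate chain" in
# "openssl s_client" output read from stdin. Each certificate has one
# subject line of the form " N s:...".
chain_length() {
	grep -cE '^ *[0-9]+ s:'
}
```

Something like "openssl s_client -connect ldap.example.com:636 -showcerts </dev/null 2>/dev/null | chain_length" would then print 3 for the 6.7 output shown above and 1 for the 6.8 output.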

I'm running openldap 2.4.53 on other operating systems and as far as I
know there's no change in behavior with it. So I'm guessing there's an
interoperability issue between openbsd libressl and openldap that's
causing this problem?

Do I need to configure something differently? Any other suggestions?

Thanks much...



Re: pfsync interface in carp group

2020-06-09 Thread Paul B. Henson

On 6/9/2020 1:42 PM, Markus Wernig wrote:


> Neither jumbo frames nor multicast will prevent group demotion when the
> other side of a crosslink cable goes physically down. Only not having
> the sync interface in the carp group will.


True. But I think he was just discussing general best practices, not 
things specific to the issue I raised. Using jumbo frames would reduce 
the packet overhead if the number of states to be sent in a particular 
transmission would have exceeded the default MTU. In my scenario, it 
looks like I don't have enough traffic for that to occur, at least not 
on a regular basis.




Re: pfsync interface in carp group

2020-06-09 Thread Paul B. Henson

On 6/9/2020 7:36 AM, Stuart Henderson wrote:


> IME the best setup for pfsync between 2 machines is to use a dedicated
> cross-connect (preferably configured for jumbo frames). Obviously that's
> not possible with >2 machines though.


Hmm, I had never considered using jumbo frames. It looks like based on 
the traffic level on my systems, the packets are generally below the 
default 1500 MTU anyway though, so it probably wouldn't help.


12:16:27.564940 lisa-bart.pbhware.com: PFSYNCv6 len 896
12:16:28.023806 lisa-bart.pbhware.com: PFSYNCv6 len 712
12:16:28.195774 bart-lisa.pbhware.com: PFSYNCv6 len 276
12:16:28.207817 lisa-bart.pbhware.com: PFSYNCv6 len 528
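
A minimal sketch for checking a longer capture against the 1500 byte MTU, assuming saved tcpdump output in the same format as the lines above. The helper name max_pfsync_len and the capture file name in the usage note are hypothetical.

```sh
# Print the largest "len" value seen in tcpdump-style lines on stdin.
# Prints 0 if no "len N" pair is found.
max_pfsync_len() {
	awk '{
		for (i = 1; i < NF; i++)
			if ($i == "len" && $(i + 1) + 0 > max)
				max = $(i + 1) + 0
	} END { print max + 0 }'
}
```

With the output saved to a file, "max_pfsync_len < pfsync-capture.txt" gives the largest packet seen; for the four lines above that is 896.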


> I'm undecided what's best to do with the group by default. But I never
> use syncpeer for that config, just the default with multicast, which
> I think is quite common - changing group based on whether or not
> syncpeer is used doesn't make sense to me.


I guess multicast would work too for a direct peer relationship, but it 
just seemed more accurate to explicitly configure the two peers.


Some documentation regarding the interaction of pfsync and the carp 
group might be helpful, along with a suggestion to remove the carp group 
from the pfsync interface in certain deployment scenarios. It would also 
be nice to document the dependency on the two rule sets being exactly 
identical in order to properly replicate rule specific state timeouts 
between them. It took me a while to sort out why that was failing. Maybe 
I will try to write something up; the source for the pfsync man page is 
in CVS, where is the source for webpages such as:


https://www.openbsd.org/faq/pf/carp.html#pfsyncop

Thanks…



Re: pfsync interface in carp group

2020-06-08 Thread Paul B. Henson

On 6/8/2020 6:29 AM, Philipp Buehler wrote:


> did you follow some "howto" and set net.inet.carp.preempt=1?


Well, if you consider the official OpenBSD documentation a "how-to", 
then yes :).


In the example in https://www.openbsd.org/faq/pf/carp.html under the 
section "Combining CARP and pfsync for Failover" it says:


! enable preemption and group interface failover
# sysctl net.inet.carp.preempt=1
# echo 'net.inet.carp.preempt=1' >> /etc/sysctl.conf

As well as in the example in 'man pfsync':

The following must also be added to /etc/sysctl.conf:

   net.inet.carp.preempt=1
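
A minimal sketch of an idempotent variant of the ">>" append quoted above, so that rerunning the setup does not duplicate the line. The helper name ensure_line is hypothetical; it is not something the FAQ or the man page provides.

```sh
# Hypothetical helper: append a line to a file only if it is not already
# present as an exact, whole-line match.
ensure_line() {
	file=$1
	line=$2
	grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

As root this would be "ensure_line /etc/sysctl.conf 'net.inet.carp.preempt=1'", alongside the one-off "sysctl net.inet.carp.preempt=1" for the running system.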


One of my firewalls has newer hardware and more power than the other, it 
is the primary. If I reboot it and the load fails over to the secondary, 
I want the load to automatically come back to the primary once it is 
available again.


Thanks…



Re: pfsync interface in carp group

2020-06-08 Thread Paul B. Henson

On 6/7/2020 5:21 PM, Markus Wernig wrote:


> I don't see that behaviour on my carp pair. Are you using a cross-link
> cable between the two firewalls? (You shouldn't, in my experience.)


Yes, I am using a direct link between the two physical firewalls. It 
seems to be the configuration recommended by the documentation?


https://www.openbsd.org/faq/pf/carp.html

"The firewalls are connected back-to-back using a crossover cable on em1."

As well as in 'man pfsync':

"Only run the pfsync protocol on a trusted network - ideally a network 
dedicated to pfsync messages such as a crossover cable between two 
firewalls."


"A crossover cable connects the two firewalls via their sis2 interfaces."

Is this no longer a best practice?



pfsync interface in carp group

2020-06-07 Thread Paul B. Henson
I've had a pair of redundant firewalls using pfsync for years. I've 
noticed in the past that whenever I rebooted the secondary firewall, the 
carp interfaces on the primary would flip to backup and then back to 
master as the secondary one rebooted. I never really noticed any issues 
with it, so I just ignored it.


Since upgrading to 6.7 though, if I have an active ssh connection to the 
carp IP address on the primary when I reboot the secondary and these 
interface flip-flops occur, my connection is dropped, which is undesirable.


It looks like this is happening because by default the pfsync interface 
is in the carp group, so when it goes down, it demotes all of the carp 
interfaces on the system. I can see why this would be useful for a setup 
using multicast and more than two firewalls, as if you are not 
synchronizing states, you are probably not the best choice to be the 
active owner of the virtual IP addresses.


However, for only two firewalls, when you're using the syncpeer 
directive for the pfsync interface, it seems it would be better not to 
default to belonging to the carp group? With only two firewalls, if one 
of them has broken synchronization, so does the other, so is there any 
real point in trying to migrate away from the one that's currently master?


I updated my configuration to remove the pfsync interface from the carp 
group and now when I reboot there are no issues with the carp interfaces 
changing state or connections being dropped. Would it make sense to not 
automatically include the pfsync interface in the carp group if it is 
using the syncpeer directive?
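
A minimal sketch of one way to take an interface out of the carp group, using the -group keyword from ifconfig(8). The post does not say exactly how the configuration was changed, so this is an assumption; the function name and the IFCONFIG override are hypothetical.

```sh
# IFCONFIG can be overridden (e.g. with echo) for a dry run; the
# override is hypothetical and not part of ifconfig(8).
IFCONFIG=${IFCONFIG:-ifconfig}

# Remove an interface from the carp interface group.
leave_carp_group() {
	$IFCONFIG "$1" -group carp
}
```

As root this would be "leave_carp_group pfsync0". A matching "-group carp" line in /etc/hostname.pfsync0 would presumably make it persistent, but check hostname.if(5) before relying on that.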


Thanks…



Re: pfsync and rule specific state timeouts

2020-06-07 Thread Paul B. Henson

On 6/5/2020 11:15 PM, obs...@loopw.com wrote:


> 1)  “egress” can be used to reference the external nic in a rule,
> instead of having a specific IP.  Egress is defined as the nic with
> the default route.
>
>   pass in quick log on egress inet proto tcp to (egress) port 22


Ah, I think I've seen that in the past but did not remember it offhand. 
Thanks; although these boxes run OSPF and the default route changes 
depending on the network state, so I'm not sure that this would work.



> 2)  Both of the firewall IP addresses can be in a rule if egress is
> not suitable for your topology, something like this will sync over
> cleanly with pfsync:
>
>   pass in quick log on $ext_if inet proto tcp to { $fw1_ext $fw2_ext } port 22


I thought about doing that, but I ended up just making a table with a 
single IP address in it, each router having the appropriate IP address 
in the table, and the rule referencing the table being exactly the same 
on both. Everything is working properly now.
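
A minimal sketch of the single-address table approach described above. The table name <self_ext>, the file path and the use of a "persist file" table are assumptions; the post only says that each router has its own address in a table and that the rule referencing it is identical on both.

```sh
# Hypothetical helper: write this host's own address into a table file.
# The path passed in and the table name <self_ext> are illustrative only.
write_self_table() {
	file=$1
	addr=$2
	printf '%s\n' "$addr" > "$file"
}

# Print the pf.conf lines that would stay identical on both firewalls.
# The heredoc is quoted so $ext_if is left for pf to expand, not sh.
shared_rules() {
	cat <<'EOF'
table <self_ext> persist file "/etc/pf.self_ext"
pass in quick on $ext_if proto tcp from any to <self_ext> port ssh
EOF
}
```

Each firewall would run "write_self_table /etc/pf.self_ext <its own address>" once, while the text printed by shared_rules goes into pf.conf unchanged on both, so only the table contents differ.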


I do still wonder if this requirement is documented anywhere? I've been 
looking, and could not find it. It was very confusing trying to sort out 
why my states were mysteriously disappearing, I ended up having to add 
some extra debugging code in the kernel to figure out what was happening.


Thanks…



pfsync and rule specific state timeouts

2020-06-05 Thread Paul B. Henson
Where is it documented that in order for pfsync to properly synchronize 
rule specific state timeouts that the rule sets on the systems being 
synchronized must be *exactly* the same?


I have a pair of redundant firewalls synchronizing state, and recently 
added a couple rules that increase the default timeout for a UDP connection:


pass out quick on $ext_if proto udp tagged VOIP_UDP keep state 
(udp.multiple 360)
pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP 
keep state (udp.multiple 360)


Despite the timeout being set to six minutes, the states kept 
disappearing after approximately a minute of idle time. After spending a 
lot of time trying to debug it, I finally figured out that the states 
replicated to the backup firewall received the default one minute 
timeout rather than the six minute timeout specified by the rule, and 
when they expired on the backup firewall, they were deleted from the 
primary firewall.


After further debugging, I discovered that pfsync on the receiving 
system only applies the rule specific timeout if the entire rule set is 
exactly identical on both systems. While my rule set was functionally 
identical on both systems, it was not exactly the same, having rules 
such as:


pass in quick on $ext_if proto tcp from any to $ext_if port ssh

which had the primary IP address on each system substituted, resulting 
in a rule set that was "different".
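
A minimal sketch for spotting this kind of mismatch by comparing the ruleset checksum on both firewalls. It assumes the verbose output of "pfctl -v -s info" contains a line of the form "Checksum: 0x..."; the helper name pf_checksum is hypothetical.

```sh
# Extract the ruleset checksum from "pfctl -v -s info" output on stdin.
# Assumes a line whose first field is "Checksum:" followed by the value.
pf_checksum() {
	awk '$1 == "Checksum:" { print $2; exit }'
}
```

Running "pfctl -v -s info | pf_checksum" as root on each firewall and comparing the two values shows whether the loaded rule sets are considered identical.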


This seems overly strict. What if two systems being used as redundant 
firewalls had different network cards? This would make the names of the 
interfaces different, resulting in rule sets that were not the same, 
preventing per-rule state timeouts from being properly applied.


I can understand you wouldn't want to apply the wrong timeout, but it 
seems that validating a per rule checksum rather than an entire rule set 
checksum would be more flexible. Both the rule number and the rule 
content on both of these systems for these rules are exactly the same. 
It is just other rules that have a different IP address given that each 
system has its own separate IP address in addition to the virtual carp 
address...




state replication bug in pfsync?

2020-06-04 Thread Paul B. Henson
I've been trying to diagnose a mysterious issue where a UDP state 
disappears before it's supposed to expire. I finally tracked it down to 
pfsync. On the primary server, the state entries look like:


all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:02:21, expires in 00:04:59, 34:34 pkts, 17887:20606 bytes, 
rule 64
all udp 96.251.22.157:58308 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:02:21, expires in 00:04:59, 34:34 pkts, 17887:20606 bytes, 
rule 49


They shouldn't expire for five minutes. However, the same states, at the 
same time, on the backup server:


Thu Jun  4 18:17:27 PDT 2020
all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:02:22, expires in 00:00:00, 0:0 pkts, 0:0 bytes
all udp 96.251.22.157:58308 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE


expire. And then the synchronization from the backup to the primary 
removes them.


These two systems share a carp vip, and other than the macro defining 
the local IP address of each individual system, pf.conf is exactly the 
same on both.


How come when the state is transferred to the backup after initially 
being created on the primary, the state on the backup has the default 
timeout for udp multiple rather than the custom one defined in my rules:


match out on $ext_if from 10.128.0.0/16 nat-to $ext_vip
pass out quick on $ext_if proto udp tagged VOIP_UDP keep state 
(udp.multiple 360)
pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP 
keep state (udp.multiple 360)



That doesn't seem right. Am I missing something?

Thanks much.



Re: lost pf state - disappeared before expiration?

2020-05-18 Thread Paul B. Henson

On 5/17/2020 8:40 PM, Strahil Nikolov wrote:

> What is your  conf  having as  a timeout ?

Both of the rules explicitly override the default timeout with a six 
minute rule level timeout:



pass in quick on vlan110 proto udp from any to port = 9430 tag
VOIP_UDP keep state (udp.multiple 360)

pass out quick on $ext_if proto udp tagged VOIP_UDP keep state 
(udp.multiple 360)


Which is being successfully applied, as shown by the states, which start 
out with a six minute expiration:


all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE 

   age 00:00:02, expires in 00:06:00, 24:23 pkts, 12163:13840 bytes, 
rule 63
all udp 96.251.22.157:55205 (10.128.110.73:9430) -> 198.148.6.55:9430 
MULTIPLE:MULTIPLE
   age 00:00:02, expires in 00:06:00, 24:23 pkts, 12163:13840 bytes, 
rule 48, source-track


However, once a minute has passed, and the expiration shows five minutes 
left:



all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:02:21, expires in 00:05:00, 29:29 pkts, 14166:18501 bytes, rule 63
all udp 96.251.22.157:55205 (10.128.110.73:9430) -> 198.148.6.55:9430
   MULTIPLE:MULTIPLE
   age 00:02:21, expires in 00:05:00, 29:29 pkts, 14166:18501 bytes, rule 48, source-track


Both of the states simply disappear. Interestingly, I believe the default 
multiple:multiple timeout is one minute, which makes me wonder if for 
some reason the default timeout is being applied to these states even though 
their rules have an explicit longer timeout. That seems buggy, unless there is 
something wrong with my configuration. Even so, it doesn't seem right for a 
state that says it has five minutes left to go away.
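
The arithmetic behind that suspicion can be sketched in a few lines of Python. This is only a model of "last packet time plus timeout", not pf's code; the function name is made up, and the 60 second default for udp.multiple is my assumption.

```python
def seconds_until_expiry(idle_seconds, timeout):
    """Remaining lifetime of a state whose last packet was idle_seconds ago."""
    return max(timeout - idle_seconds, 0)

RULE_TIMEOUT = 360     # udp.multiple as set on the rule
DEFAULT_TIMEOUT = 60   # assumed default udp.multiple

# Compare what each timeout would predict as the state sits idle.
for idle in (0, 30, 60):
    print(idle,
          seconds_until_expiry(idle, RULE_TIMEOUT),
          seconds_until_expiry(idle, DEFAULT_TIMEOUT))
```

After 60 idle seconds the rule-level timeout still leaves 300 seconds, which is what the state display shows, while the default would leave 0, which is when the states actually vanish.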


Thanks for the input…



lost pf state - disappeared before expiration?

2020-05-17 Thread Paul B. Henson
I'm trying to set a longer timeout on a udp state, and for some reason it
seems to be disappearing before the expiration 8-/.

There are 3 rules involved:

pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP keep 
state (udp.multiple 360)

pass out quick on $ext_if proto udp tagged VOIP_UDP keep state (udp.multiple 
360)

match out on $ext_if from 10.128.0.0/16 nat-to { $ext_vip } sticky-address

I turned on pf debugging, when the connection is created I see:


May 17 15:36:39 lisa /bsd: pf: key search, in on vlan110: UDP wire: (0) 
10.128.110.73:9430 198.148.6.55:9430
May 17 15:36:39 lisa /bsd: pf: key setup: UDP wire: (0) 10.128.110.73:9430 
198.148.6.55:9430 stack: (0) -
May 17 15:36:39 lisa /bsd: pf: key search, out on em2: UDP wire: (0) 
198.148.6.55:9430 10.128.110.73:9430
May 17 15:36:39 lisa /bsd: pf: key setup: UDP wire: (0) 198.148.6.55:9430 
96.251.22.157:63529 stack: (0) 198.148.6.55:9430 10.128.110.73:9430

and there are state entries:

all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:02:21, expires in 00:05:00, 29:29 pkts, 14166:18501 bytes, rule 63
all udp 96.251.22.157:55205 (10.128.110.73:9430) -> 198.148.6.55:9430   
MULTIPLE:MULTIPLE
   age 00:02:21, expires in 00:05:00, 29:29 pkts, 14166:18501 bytes, rule 48, 
source-track

However, right after the 5 minute mark the states disappear. The last pf log
entries are:

May 17 15:38:47 lisa /bsd: pf: key search, in on vlan110: UDP wire: (0) 
10.128.110.73:9430 198.148.6.55:9430
May 17 15:38:47 lisa /bsd: pf: key search, out on em2: UDP wire: (0) 
198.148.6.55:9430 10.128.110.73:9430

I was hoping to see something about expiration in the pf debug logs but
this is all that appears to be available.

Any idea why these states would go away when there is 5 minutes left
before the expiration?

Thanks much...



mysteriously disappearing pf state entries

2020-05-08 Thread Paul B. Henson
I'm running OpenBSD 6.6 operating as an inter-VLAN and border router 
using pf. Recently I wanted to use a nondefault state timeout for some 
UDP traffic traversing from my voip subnet to a provider off site.


Within pf, there are three rules involved. The first is for traffic 
coming from the voip subnet, which gets a six minute state timeout and a 
tag:


pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP 
keep state (udp.multiple 360)


Then there is a NAT rule:

match out on $ext_if from 10.128.0.0/16 nat-to { $ext_vip } sticky-address

And a rule giving the traffic going out to the Internet a six minute 
timeout as well:


pass out quick on $ext_if proto udp tagged VOIP_UDP keep state 
(udp.multiple 360)



This initially looks like it worked; after the initial connection:

bash-5.0# pfctl -v -s state | grep -A1 '110.73:9430'
all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:00:00, expires in 00:06:00, 7:5 pkts, 4451:2203 bytes, rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:00:00, expires in 00:06:00, 7:5 pkts, 4451:2203 bytes, rule 
48, source-track


There are two states, one with the internal addressing and one for the 
NAT translation, both with six minute timeouts. As time goes by:


all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:00:08, expires in 00:05:54, 31:31 pkts, 16469:18285 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:00:08, expires in 00:05:54, 31:31 pkts, 16469:18285 bytes, 
rule 48, source-track


all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:00:20, expires in 00:05:42, 31:31 pkts, 16469:18285 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:00:20, expires in 00:05:42, 31:31 pkts, 16469:18285 bytes, 
rule 48, source-track


More packets are seen, resetting the timeout:

all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:00:23, expires in 00:05:58, 32:32 pkts, 16872:19073 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:00:23, expires in 00:05:58, 32:32 pkts, 16872:19073 bytes, 
rule 48, source-track


all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:00:38, expires in 00:05:43, 32:32 pkts, 16872:19073 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:00:38, expires in 00:05:43, 32:32 pkts, 16872:19073 bytes, 
rule 48, source-track


again:

all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:00:41, expires in 00:06:00, 33:33 pkts, 17275:19931 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:00:41, expires in 00:06:00, 33:33 pkts, 17275:19931 bytes, 
rule 48, source-track


all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:00:58, expires in 00:05:43, 33:33 pkts, 17275:19931 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:00:58, expires in 00:05:43, 33:33 pkts, 17275:19931 bytes, 
rule 48, source-track


etc, etc:

all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:02:26, expires in 00:05:52, 37:37 pkts, 18863:23594 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:02:26, expires in 00:05:52, 37:37 pkts, 18863:23594 bytes, 
rule 48, source-track


Until finally, there are no more packets for a while:

all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:02:36, expires in 00:05:54, 47:46 pkts, 24551:29876 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:02:36, expires in 00:05:54, 47:46 pkts, 24551:29876 bytes, 
rule 48, source-track


all udp 198.148.6.55:9430 <- 10.128.110.73:9430   MULTIPLE:MULTIPLE
   age 00:03:31, expires in 00:04:59, 47:46 pkts, 24551:29876 bytes, 
rule 63
all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 
   MULTIPLE:MULTIPLE
   age 00:03:31, expires in 00:04:59, 47:46 pkts, 24551:29876 bytes, 
rule 48, source-track


After this, the next time I look a couple seconds later, the state is 
gone? It reproducibly seems to disappear a minute after the last traffic 
is seen on the connection. Yet the timeout says 5 minutes are left?


Why would the state be removed when it still had five minutes left 
before it expired? I know if it were a TCP state, it might go away 
before the timeout expires if the connection is shut down. But this is a 
UDP state. What would cause it to go away before the timeout expiration? 
Is there something 

isc bind - error sending response: would block

2018-11-16 Thread Paul B. Henson
I recently updated a couple servers that were running OpenBSD 6.3 with bind
9.11.3 to OpenBSD 6.4 and bind 9.11.4pl2. Since then, I've been getting a large
number of "error sending response: would block" log messages:

Nov 15 11:03:58 lisa named[79587]: client @0x6f2f02bc440 10.128.30.77#65198 
(p64-keyvalueservice.icloud.com): view internal: error sending response: would 
block

Nov 15 11:07:42 lisa named[79587]: client @0x6f325b7a440 10.128.0.19#1851 
(alt1.gmail-smtp-in.l.google.com): view internal: error sending response: would 
block

I reviewed the article at https://kb.isc.org/docs/aa-00717 ; but it's not clear
whether this is just a warning message, and it tries again and successfully
responds to the client, or whether it's a hard error and the client never gets a
response. I wasn't getting any errors before the upgrade, and I don't think the
load on these servers is anywhere near high enough to cause them to be overloaded.

Any thoughts on what might be going on? New bug in bind? Change in OpenBSD?
So far I haven't gotten a response on the bind mailing list.

Thanks...



Re: smtpd new "relay as" syntax?

2018-10-31 Thread Paul B. Henson
On Wed, Oct 31, 2018 at 08:07:09PM -0400, TronDD wrote:

> Mail-from in the action options, I believe.

Ah, yes; that seems to work, thanks. The previous implementation was
documented as:

If the as parameter is specified, smtpd(8) will rewrite
the sender advertised in the SMTP session.  address may
be a user, a domain prefixed with `@', or an email
address, causing smtpd(8) to rewrite the user-part, the
domain-part, or the entire address, respectively.

whereas this just said:

mail-from mailaddr
Use mailaddr as the MAIL FROM address within the SMTP
transaction.

It wasn't clear it would do the same rewriting; I thought at first it
just took a single email address.
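
For reference, the three cases from the old "as" description can be modeled like this. It is a Python sketch of the documented behavior quoted above, not smtpd's implementation; the function name and sample addresses are made up.

```python
def rewrite_sender(sender, spec):
    """Rewrite sender the way the old "as" text describes:
    "user" replaces the user part, "@domain" replaces the domain part,
    and "user@domain" replaces the entire address."""
    user, _, domain = sender.partition("@")
    if spec.startswith("@"):
        return f"{user}{spec}"      # keep user, swap domain
    if "@" in spec:
        return spec                 # full replacement
    return f"{spec}@{domain}"       # swap user, keep domain

print(rewrite_sender("root@host.domain.com", "@domain.com"))
print(rewrite_sender("root@host.domain.com", "postmaster"))
print(rewrite_sender("root@host.domain.com", "noreply@domain.com"))
```

The first call is the masquerading case I was after: the machine name is dropped and only the domain remains.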



smtpd new "relay as" syntax?

2018-10-31 Thread Paul B. Henson
I just upgraded to OpenBSD 6.4, and I'm trying to figure out how to do
this with the new syntax:

accept from local for any relay via smtp://smtp.domain.com as "@domain.com"

This would rewrite the outbound message to masquerade as being from the
TLD rather than a specific machine. Right now I've got:

action local_relay relay host smtp.domain.com
match from local for any action local_relay

But this doesn't do the rewriting. The only thing I see in the man page
talks about 'senders <users> [masquerade]' which seems to be for
authenticated users.

Am I missing something obvious?

Thanks...



Re: opensmtpd / ldap unreliable

2018-05-26 Thread Paul B. Henson
On Sat, May 26, 2018 at 08:16:28AM +0200, Gilles Chehade wrote:

> please do so we have more people able to test

Done, thanks.

What are your thoughts design-wise on dealing with ldap not being
available at startup? Should layer 7 issues (ldap auth failed, etc) be
handled differently than transport level issues (connection
refused/timed out)?



Re: opensmtpd / ldap unreliable

2018-05-24 Thread Paul B. Henson
> From: Gilles Chehade
> Sent: Wednesday, May 23, 2018 1:20 PM
> 
> That's bad but could easily be fixed if you want to help us

So I dropped in the latest table-ldap from git, and it still failed
authentications after an LDAP server outage. It looks like the check is only
in the table_ldap_check function? I'm not sure what that's for, but it
doesn't seem to be called at all when doing authentication. I added a
similar check into the table_ldap_lookup function, and also had to reorder
the functions in the file a bit due to errors like this:

table_ldap.c:92:15: warning: implicit declaration of function 'ldap_open' is
invalid in C99 
  [-Wimplicit-function-declaration]   

Afterwards, opensmtpd successfully reconnected to LDAP and performed
authentication after an LDAP outage :).

users[14726]: debug: table_ldap: ldap_query:
filter=(&(objectClass=uidObject)(uid=henson)), ret=0
users[14726]: debug: table-ldap: reconnecting
users[14726]: info: table-ldap: closed previous connection
users[14726]: debug: ldap server accepted credentials
users[14726]: debug: table_ldap: ldap_query:
filter=(&(objectClass=uidObject)(uid=henson)), ret=1


Here's what my changes currently are. I can submit a pull request on github
if you'd like. Thanks.

diff --git a/extras/tables/table-ldap/table_ldap.c b/extras/tables/table-ldap/table_ldap.c
index 88c9ffd..9d20526 100644
--- a/extras/tables/table-ldap/table_ldap.c
+++ b/extras/tables/table-ldap/table_ldap.c
@@ -74,45 +74,6 @@ table_ldap_update(void)
return 1;
 }
 
-static int
-table_ldap_check(int service, struct dict *params, const char *key)
-{
-   int ret;
-
-   switch(service) {
-   case K_ALIAS:
-   case K_DOMAIN:
-   case K_CREDENTIALS:
-   case K_USERINFO:
-   case K_MAILADDR:
-   if ((ret = ldap_run_query(service, key, NULL, 0)) >= 0) {
-   return ret;
-   }
-   log_debug("debug: table-ldap: reconnecting");
-   if (!(ret = ldap_open())) {
-   log_warnx("warn: table-ldap: failed to connect");
-   }
-   return ret;
-   default:
-   return -1;
-   }
-}
-
-static int
-table_ldap_lookup(int service, struct dict *params, const char *key, char *dst, size_t sz)
-{
-   switch(service) {
-   case K_ALIAS:
-   case K_DOMAIN:
-   case K_CREDENTIALS:
-   case K_USERINFO:
-   case K_MAILADDR:
-   return ldap_run_query(service, key, dst, sz);
-   default:
-   return -1;
-   }
-}
-
 static int
 table_ldap_fetch(int service, struct dict *params, char *dst, size_t sz)
 {
@@ -361,6 +322,32 @@ err:
return 0;
 }
 
+static int
+table_ldap_lookup(int service, struct dict *params, const char *key, char *dst, size_t sz)
+{
+   int ret;
+
+   switch(service) {
+   case K_ALIAS:
+   case K_DOMAIN:
+   case K_CREDENTIALS:
+   case K_USERINFO:
+   case K_MAILADDR:
+   if ((ret = ldap_run_query(service, key, dst, sz)) > 0) {
+   return ret;
+   }
+   log_debug("debug: table-ldap: reconnecting");
+   if (!(ret = ldap_open())) {
+   log_warnx("warn: table-ldap: failed to connect");
+   return ret;
+   }
+   return ldap_run_query(service, key, dst, sz);
+   default:
+   return -1;
+   }
+}
+
+
 static int
 ldap_query(const char *filter, char **attributes, char ***outp, size_t n)
 {
@@ -498,6 +485,31 @@ end:
return ret;
 }
 
+static int
+table_ldap_check(int service, struct dict *params, const char *key)
+{
+   int ret;
+
+   switch(service) {
+   case K_ALIAS:
+   case K_DOMAIN:
+   case K_CREDENTIALS:
+   case K_USERINFO:
+   case K_MAILADDR:
+   if ((ret = ldap_run_query(service, key, NULL, 0)) >= 0) {
+   return ret;
+   }
+   log_debug("debug: table-ldap: reconnecting");
+   if (!(ret = ldap_open())) {
+   log_warnx("warn: table-ldap: failed to connect");
+   }
+   return ret;
+   default:
+   return -1;
+   }
+}
+
+
 int
 main(int argc, char **argv)
 {




Re: opensmtpd / ldap unreliable

2018-05-23 Thread Paul B. Henson
> From: Gilles Chehade
> Sent: Wednesday, May 23, 2018 1:20 PM
> 
> That's bad but could easily be fixed if you want to help us

Definitely; I'll pull the latest github head down and see if that fixes the
LDAP connection recovery after startup issue, and then I can try any
suggestions to make it more reliable at startup or possibly fiddle with that
code myself.

> That would be a bad idea... it's experimental :-p

I did see that mentioned circa 2013, but I guess I kind of hoped it had
moved beyond that by now :).

Thanks much.



Re: opensmtpd / ldap unreliable

2018-05-23 Thread Paul B. Henson

> From: justina colmena
> Sent: Tuesday, May 22, 2018 9:08 PM
> 
> Are they being started in the wrong order at boot time?

The LDAP server in use is not running on the local OpenBSD system. It might not 
be available due to an underlying network issue or some other problem that 
temporarily prevents successful connections/queries.

> What you ask is a very general question: If A depends on B, and B is
> missing, how do expect A to behave?

In this specific case, I expect A to complain it was unable to contact B, to 
continue initializing, return temporary failures for any operation which 
requires B, and reattempt a connection to B on a regular basis until it is 
successful. From a reliability and fault tolerance perspective, falling over and 
dying doesn't seem a very good choice for the circumstances.
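
A rough Python sketch of the behavior I would expect follows. It is not OpenSMTPD code; the class names, the connect callable, and the search method are hypothetical stand-ins for whatever the table backend really uses.

```python
class TempFail(Exception):
    """Backend unreachable; the caller should answer with a temporary failure."""

class ReconnectingTable:
    def __init__(self, connect):
        self._connect = connect   # hypothetical callable returning a connection
        self._conn = None         # no connection attempt at startup

    def _ensure(self):
        # Connect lazily so an unavailable backend does not abort startup.
        if self._conn is None:
            try:
                self._conn = self._connect()
            except OSError as exc:
                raise TempFail(str(exc)) from exc
        return self._conn

    def lookup(self, key):
        # Try the current connection, then reconnect and retry once.
        for _ in range(2):
            conn = self._ensure()
            try:
                return conn.search(key)
            except OSError:
                self._conn = None   # drop the broken connection
        raise TempFail("backend unavailable")

def unreachable():
    raise ConnectionRefusedError("ldap server down")

table = ReconnectingTable(unreachable)   # construction succeeds regardless
try:
    table.lookup("henson")
except TempFail as exc:
    print("tempfail:", exc)
```

Every later lookup attempts a fresh connection, so service resumes on its own once B is reachable again.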




opensmtpd / ldap unreliable

2018-05-22 Thread Paul B. Henson
So I recently converted my opensmtpd server to use ldap as the backend
for user authentication. It seems it's a bit intolerant of ldap issues?

If the ldap server isn't available when opensmtpd is started, it says it
started:

# /etc/rc.d/smtpd start
smtpd(ok)

But it isn't there:

# ps -aux | grep smtpd
root 89090  0.0  0.0   304  1208 p6  S+p     5:52PM    0:00.00 grep smtpd

And it's not really obvious why:

May 22 17:52:51 bart smtpd[46044]: info: OpenSMTPD 6.0.4 starting
May 22 17:52:51 bart smtpd[23325]: warn: table-proc: pipe closed
May 22 17:52:51 bart smtpd[23325]: lookup: table-proc: exiting
May 22 17:52:51 bart smtpd[73239]: smtpd: process lka socket closed

Starting in debug mode:

# smtpd -d
info: OpenSMTPD 6.0.4 starting
users[43283]: debug: reading key "url" -> "ldap://localhost:3389"
users[43283]: debug: reading key "basedn" ->
users[43283]: debug: reading key "username" ->
users[43283]: debug: reading key "password" ->
users[43283]: debug: reading key "credentials_filter" -> 
"(&(objectClass=uidObject)(uid=%s))"
users[43283]: debug: parsing attribute "credentials_attributes" (2) -> 
"uid,description"
users[43283]: debug: done reading config
users[43283]: warn: aldap_parse
users[43283]: fatal: failed to connect
warn: table-proc: pipe closed
lookup: table-proc: exiting
smtpd: process lka socket closed

You can see it looks like it fails to connect to the ldap server at
startup and just dies.

Further, if the ldap server is up at startup, but ever restarts or has
the connection broken, authentication just fails:

May 21 13:22:10 bart smtpd[42132]: warn: user credentials lookup fail for 
users:henson

The opensmtpd process needs to be restarted before authentication works
again.

In debug mode, it shows:

users[7295]: debug: table_ldap: ldap_query:
filter=(&(objectClass=uidObject)(uid=henson)), ret=0
5e46e2fabbf8d72e smtp event=authentication user=henson
address=134.71.249.41 host=134.71.249.41 result=permfail

Is it expected that the ldap support is currently not production ready?
I see in a presentation from back in 2013 that ldap was classified
experimental at the time, but it's not clear if that's still the case.

I see in the repo at

https://github.com/OpenSMTPD/OpenSMTPD-extras/blob/master/extras/tables/table-ldap/table_ldap.c

there's a change to add ldap reconnection support:

https://github.com/OpenSMTPD/OpenSMTPD-extras/commit/04e4c521b34d1987af915ff97dcb0d87daf122b0#diff-369c0fcbfbc85bf2cdad7dba1131b872

but it's dated 7/27/2017, and the last github release seems to be
201601072302 (although the openbsd port appears to be 201703132115, I
guess it's not downloading it from github?).

It looks like the code in head still fails to start if the ldap server
isn't available when opensmtpd is started though.

Is anybody using opensmtpd with ldap in production? If so, how are you
working around this issue?

Thanks...



Re: pcengines apu boards

2018-01-28 Thread Paul B. Henson
On Wed, Jan 17, 2018 at 12:56:04PM +0100, Christopher Zimmermann wrote:

> I have the same problem and have tried to hunt the bug, but failed so
> far. Have you already identified the quirks linux and freebsd use to
> fix this problem?

No :(, I worked on it for a while but kernel hacking isn't my
speciality. I don't think the specific quirks I was initially trying to
port would have fixed it anyway, as they seemed mainly aimed at data
transfers and I couldn't even get the miniPCI card to get hot plugged
and detected while testing with an external miniPCI to USB adapter
plugged into the internal EHCI header.

I ended up just using the external adapter plugged into the xHCI ports
exposed outside the case. Annoying not to be able to just have it inside
the case, but it works like a champ in this configuration.



Re: rdomain/rtable

2017-12-24 Thread Paul B. Henson
Thanks for the info. I don't want to move any interfaces to a
non-default routing domain, I just want to be able to run a process with
a different default route. I can make that work, via the route -T 10
exec you mention after setting a default route in that domain.

But I can't seem to get traffic for my local subnet sent out my
internal interface, even after I add a route to it in the non-default
routing domain. Dunno, maybe I'm missing something.

I set it up like:

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            24.x.x.x           UGS        0        2     -     8 umb0
10.0/16            10.128.0.20        UGS        0        0     -     8 em0

But 'ping 10.128.0.20' shows the packets going out umb0, not em0?
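
One thing worth checking here is whether that static route covers the ping target at all. The Python sketch below assumes the abbreviated destination "10.0/16" stands for 10.0.0.0/16; that reading is an assumption, and this is only a guess at one possible cause.

```python
import ipaddress

# Assumption: "10.0/16" in the routing table abbreviates 10.0.0.0/16.
route = ipaddress.ip_network("10.0.0.0/16")
target = ipaddress.ip_address("10.128.0.20")

in_route = target in route
in_wider = target in ipaddress.ip_network("10.128.0.0/16")
print(in_route)   # False
print(in_wider)   # True
```

If that reading is right, 10.128.0.20 falls outside the static route and would match only the default route, which would explain packets leaving via umb0.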

Thanks again.

On Sat, Dec 23, 2017 at 05:07:37PM +0100, Sebastian Benoit wrote:
> 
> When you create a new routing domain, for example by adding an interface to
> a routing domain (e.g. ifconfig umb0 rdomain 10), you create a new routing
> table 10. It will be empty until you add an address on umb0 or, for example
> add your default route.
> 
> This routing table will be used to forward packets that are "in that routing
> domain" (the packet is marked with the rdomain or rather the rtable it will
> use). How does the packet get marked?
> 
> Three ways:
> 
> * with pf, as you have discovered. As the manpage documents, the
> mark needs to be set before route lookup is done.
> 
> * when a packet comes in on an interface in rdomain 10, it will stay in
> rdomain 10 (unless pf changes it).
> 
> * a packet is generated on the local machine by a process that "is in that
> routing domain". I.e. processes are also marked with a rdomain.
> 
> To start a process in a specific rdomain (10), use "route -T 10 exec
> command", for example
> 
>   route -T 10 exec ping -n ip
> 
> or even
> 
>   route -T 10 exec ksh
> 
> Processes spawned by that shell will inherit the rdomain.
> 
> Note that i used -n in the ping example. DNS resolving using the resolvers
> in resolv.conf might not work, as long as those resolvers are not reachable
> in rdomain 10.
> 
> Hope this helps ...



Re: Solved IPMI, but I can't get onto network to outside

2017-12-21 Thread Paul B. Henson
On Thu, Dec 21, 2017 at 12:52:33PM -0700, Chris Bennett wrote:

> > > IP: 104.217.196.248/29
> > > Gateway: 104.217.196.249
> > > Netmask: 255.255.255.248
> > >
> > 
> > What is your network interface?
> > 
> 
> I have two, em0 and em1
> 
> em0:
> inet 104.217.196.248 255.255.255.248
> 
> And I admit I really don't see what IP addresses I get
> with 104.217.196.248/29.

That's not the IP address you're supposed to use, that's the subnet
they've allocated you.

See:

http://www.subnet-calculator.com/subnet.php

104.217.196.248 is the network address, you can't assign that to an
actual host. The usable IP addresses in that subnet are
104.217.196.249-104.217.196.254, 104.217.196.249 is your gateway, so
that leaves you 104.217.196.250-104.217.196.254 to assign to your
systems. 104.217.196.255 is the broadcast address for the subnet.

Update your hostname.em0 to use 104.217.196.250 and make sure your
/etc/mygate file contains 104.217.196.249.
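
If you'd rather check the arithmetic yourself than trust a web calculator, Python's standard ipaddress module gives the same answer. The variable names here are just for illustration.

```python
import ipaddress

net = ipaddress.ip_network("104.217.196.248/29")
hosts = list(net.hosts())            # usable addresses, excluding network/broadcast

print(net.network_address)           # 104.217.196.248
print(net.broadcast_address)         # 104.217.196.255
print(hosts[0], hosts[-1])           # 104.217.196.249 104.217.196.254

# Remove the provider's gateway to see what is left for your own systems.
gateway = ipaddress.ip_address("104.217.196.249")
assignable = [h for h in hosts if h != gateway]
print(assignable[0], assignable[-1], len(assignable))  # 104.217.196.250 104.217.196.254 5
```

That leaves five addresses for your own machines once the gateway is set aside.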



rdomain/rtable

2017-12-19 Thread Paul B. Henson
I've got a box with an LTE cellular modem in it whose purpose is to provide
a backup connection to the Internet if the hardwire service goes down. It's
running OSPF to connect to the rest of the network, and the only time any
traffic should go over the cellular link (which is slower and bandwidth
capped) is if the hardwire interconnection is down, including ideally
traffic generated from the system itself.

I have that part working, by adding in a local static default route to the
cellular gateway with less priority than the OSPF default route. However,
for testing purposes, I'd like to be able to poke out the cellular link on
an as-needed basis without having to switch the entire box over to using it.
Virtual routing tables looked perfect for this purpose, as I could just
spawn a single process with a different default route; we do something
similar with network namespaces under Linux.

However, I can't quite get it to work. What I'd really like is to be able to
make a copy of the current system routing table, then change one thing about
it. However, a new rdomain shows up with no routes or interfaces in the
routing table. I can add the new default route pointing out the cellular
link, and get traffic to go out there. But I haven't sorted out how to make
all the traffic for my internal network still go through the internal link
rather than get sent out the default route. While ideally all the OSPF
routes would propagate to the other routing domain, I tried just adding a
static route to the /16 for our internal address space:

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            24.x.x.x           UGS        0        6     -     8 umb0
10.0/16            10.128.0.21        UGS        0        0     -     8 em0

That doesn't work; the documentation says you need to get pf to pass packets
across routing domains. However, it says:

rtable number
Used to select an alternate routing table for the routing lookup.
Only effective before the route lookup happened, i.e. when
filtering inbound.

Unfortunately, for traffic originating from the system itself, there isn't
really an "inbound" interface? So I'm not sure what pf rule would make this
work. Is it just not possible, or am I missing something?

Thanks much.



Re: help updating EHCI driver

2017-12-07 Thread Paul B. Henson
> From: Martin Pieuchot
> Sent: Thursday, December 7, 2017 3:18 AM
> 
> Which issue are you having?

Sorry, there was more context in an earlier thread. Basically, I have a pc 
engines APU3 board which has AMD Hudson-2 EHCI USB ports on it. If devices are 
plugged in when the system boots and the ports are initialized, the operating 
system sees they are there. However, if you hot plug a device after it is 
booted, or remove a device that was plugged in at boot, the system does not 
notice the change in state. Also, the Sierra wireless LTE modem I was trying to 
use does not function, the driver sends an open message over the USB bus to the 
device and then nothing ever comes back. Once the system is booted, the 
interrupt count for the EHCI ports from 'vmstat -i' never increases. The xHCI 
USB ports on the same system seemed to work fine, detecting hot plug/remove 
events, and properly initializing the wireless modem.

> What makes you think that the quirks below
> will help?  What do you mean with 'work fine on those systems'?  If they
> work fine, which issues are you having?

The same board when booted up under either Linux or FreeBSD appears to have 
fully functional EHCI USB ports; they detect hot plug/remove events, and under 
linux the wireless modem is initialized and can successfully pass traffic 
(FreeBSD doesn't have a driver for it, so I was unable to test it there).

I honestly don't know if these specific quirks will resolve the issue under 
OpenBSD, all I know is that there is something Linux/FreeBSD is doing different 
regarding the USB hardware, and porting these quirks seemed a good place to 
start. I'm not really a low level hardware/device driver guy, so I'm flying a 
bit blind. Someone told me there are known issues with amd USB ports in 
general, such as ath based USB wireless cards not working, so these might help 
that problem even if they don't fix mine.

> It depends how the controller is connected to the host.  If you look at
> the PCI glue driver, dev/pci/ehci_pci.c you'll see
> 
> 115:  /* Map I/O registers */
> 116:  if (pci_mapreg_map(pa, PCI_CBMEM, PCI_MAPREG_TYPE_MEM, 0,
> 117:      &sc->sc.iot, &sc->sc.ioh, NULL, &sc->sc.sc_size, 0))
> 
> Then in the EHCI driver, dev/usb/ehci.c, these are accessed via the
> EREAD/EWRITE/EOREAD/EOWRITE macros.

Ah, ok; it appears that pci_mapreg_map defines the start of the region as:
 
ex = pa->pa_ioex;
if (ex != NULL) {
start = max(PCI_IO_START, ex->ex_start);

with:

#define PCI_IO_START  0

So the starting memory address is either 0 or pa->pa_ioex from the struct 
pci_attach_args that was passed into ehci_pci_attach.

Given the existing reads:

sc->sc.sc_offs = EREAD1(&sc->sc, EHCI_CAPLENGTH);
#define EHCI_CAPLENGTH 0x00

I'm pretty sure it's not zero, so it must be the one from pa_ioex.

> But maybe you just want to use pci_conf_read()/pci_conf_write()?

Hmm, given my lack of detailed knowledge of this area I can't say for sure. 
However, there are two different things being done in the linux code I am 
referring to:

outb_p(0xe0, 0xcd6);

This I believe writes the byte 0xe0 to the I/O port at address 0xcd6, whereas 
this:

pci_write_config_dword(amd_chipset.nb_dev, 0xe4, val);

writes the contents of the variable val to the PCI configuration register 
located at 0xe4. Those are two different operations, right?

So wherever the linux code writes to an absolute I/O port for the USB device, 
such as 0xcd6, I can subtract the beginning of the mapped region as stored in 
pa_ioex from it to arrive at the appropriate offset value to use with 
EREAD/WRITE?

> Is low power mode enabled on OpenBSD?

Based on the comment in the linux code:

"The hardware normally enables the A-link power management feature, which
lets the system lower the power consumption in idle states."

I believe that is the default behavior of the hardware unless you explicitly do 
something otherwise with it.

> > The other quirk involves never having an empty frame list; I have
> > implemented the logic to detect when that is required, but haven't even
> > come close to wrapping my head around actually implementing the quirk
> > itself.
> 
> For which transfer type is this quirk required, isochronous only?

The explanation of this quirk is:

"EHCI controller on AMD SB700/SB800/Hudson-2/3 platforms may
read/write memory space which does not belong to it when
there is NULL pointer with T-bit set to 1 in the frame list
table. To avoid the issue, the frame list link pointer
should always contain a valid pointer to a inactive qh."

I am also unfortunately not that expert in the underlying USB hardware level 
protocol, but the quirk is referenced in two functions, scan_isoc:

if (!ehci->use_dummy_qh ||
q.itd->hw_next != EHCI_LIST_END(ehci))
*hw_p = q.itd->hw_next;
else
 

Re: 3g modem support

2017-12-06 Thread Paul B. Henson
> From: Marko Cupac
> Sent: Wednesday, December 6, 2017 2:47 AM
> 
> ...which suggests some Sierra Wireless modems, none of which are
> available for purchase in the country I live in.

I've got the MC7455, which I believe is basically the same as the EM7455. 
Presumably this might be one of the cards you say you can't get though. I 
haven't been able to thoroughly test it given my issues with the APU3, but a 
friend of mine played with it a bit under vmware and it seems functional under 
both the umsm driver with PPP and the umb driver in MBIM mode, although in 
order to use the latter you need to disable the umsm module as it claims the 
device with a higher priority.

I ended up ordering one of these:

https://www.amazon.com/gp/product/B01JGCSPEA/ref=oh_aui_detailpage_o00_s00?ie=UTF8=1

and will most likely connect the miniPCI card to the external xHCI ports on the 
APU3 which at least at initial glance seems functional. Kind of kludgy, but 
better than the other fallback plan of using linux for this deployment :).




help updating EHCI driver

2017-12-05 Thread Paul B. Henson
I'm trying to port some quirks for AMD USB chipsets from other operating
systems to OpenBSD to hopefully resolve issues I am having with the pc
engines APU3 EHCI ports, as they seem to work fine on those systems.
I've got a pretty rough draft of one of them, which disables low-power
mode during transfers, but would appreciate a little clarification on
device I/O as I'm not generally a device driver developer.

Under Linux, the kernel uses absolute addresses when it's doing port I/O
to a device, so that's what I am referencing in their implementation. In
OpenBSD I see that a driver maps a handle to a region of memory and then
uses offsets from the base of that region for port I/O. It looks like
the EHCI driver code has already mapped that region and the handle is
available for me to use, but I don't see where that mapping was made or
how to figure out what the base was in order to turn the absolute
addresses I have into appropriate offsets to use with the openbsd API?

Then, for some of the chipsets, in addition to poking at the USB device
itself to twiddle the low-power mode, you also have to muck with the
northbridge configuration. I think I gathered the device information,
although I don't know that was the correct way to do so; but I need to
map the I/O region for it to a handle so I can modify it. If a driver for
one device needs to write to a different device is it supposed to call
bus_space_map on its own to get a mapping, or can it somehow get access
to the existing one already in place for that device?

Finally, low power mode is supposed to be disabled whenever there are
asynchronous transfers occurring, and then re-enabled once they
complete. I'm not sure I've put the calls in the right place, and I know
I haven't handled the case where transfers fail or are canceled rather
than complete.
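
One way to cover the failed and cancelled cases is to count transfers in flight and only touch the hardware on the 0 to 1 and 1 to 0 transitions. The sketch below is a userland model of that idea, not driver code: struct pll_quirk and the pll_* functions are hypothetical, the register writes are replaced by a boolean, and locking is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-controller state for the low-power quirk. */
struct pll_quirk {
	unsigned int active;	/* transfers currently in flight */
	bool lowpower_enabled;	/* stand-in for the real hardware state */
};

/* Stand-in for the register writes that toggle low-power mode. */
static void
pll_set_lowpower(struct pll_quirk *q, bool enable)
{
	q->lowpower_enabled = enable;
}

/* Call when a transfer is queued: the first one disables low-power mode. */
void
pll_xfer_start(struct pll_quirk *q)
{
	if (q->active++ == 0)
		pll_set_lowpower(q, false);
}

/*
 * Call exactly once per started transfer, whether it completed, failed,
 * or was cancelled: the last one out re-enables low-power mode.
 */
void
pll_xfer_done(struct pll_quirk *q)
{
	assert(q->active > 0);
	if (--q->active == 0)
		pll_set_lowpower(q, true);
}
```

With this shape the abort and error paths only need to reach the same pll_xfer_done() call that the normal completion path uses.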

The other quirk involves never having an empty frame list; I have
implemented the logic to detect when that is required, but haven't even
come close to wrapping my head around actually implementing the quirk
itself.

In any case, here is my current laughable diff, advice and corrections
most appreciated.


Index: pci/ehci_pci.c
===
RCS file: /cvs/src/sys/dev/pci/ehci_pci.c,v
retrieving revision 1.30
diff -u -p -r1.30 ehci_pci.c
--- pci/ehci_pci.c  20 Jul 2016 09:48:06 -  1.30
+++ pci/ehci_pci.c  6 Dec 2017 02:46:24 -
@@ -66,6 +66,8 @@ struct ehci_pci_softc {
 };
 
 int ehci_sb700_match(struct pci_attach_args *pa);
+int ehci_amd_pll_quirk_match(struct pci_attach_args *pa);
+int ehci_amd_pll_quirk_match_nb(struct pci_attach_args *pa);
 
 #define EHCI_SBx00_WORKAROUND_REG  0x50
 #define EHCI_SBx00_WORKAROUND_ENABLE   (1 << 3)
@@ -111,6 +113,7 @@ ehci_pci_attach(struct device *parent, s
char *devname = sc->sc.sc_bus.bdev.dv_xname;
usbd_status r;
int s;
+   struct pci_attach_args amd_pa;
 
/* Map I/O registers */
if (pci_mapreg_map(pa, PCI_CBMEM, PCI_MAPREG_TYPE_MEM, 0,
@@ -131,6 +134,86 @@ ehci_pci_attach(struct device *parent, s
 
/* Handle quirks */
switch (PCI_VENDOR(pa->pa_id)) {
+   case PCI_VENDOR_AMD:
+   /* AMD errata indicates 8111 chipset EHCI is broken */
+   if (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_AMD_8111_EHCI) {
+   printf("%s: AMD 8111 EHCI broken, skipping", devname);
+   goto disestablish_ret;
+   }
+   if (pci_find_device(&amd_pa, ehci_amd_pll_quirk_match)) {
+   sc->sc.amd_chipset.rev = PCI_REVISION(amd_pa.pa_class);
+   if (PCI_PRODUCT(amd_pa.pa_id) == PCI_PRODUCT_ATI_SBX00_SMB) {
+   if (sc->sc.amd_chipset.rev >= 0x10 &&
+   sc->sc.amd_chipset.rev <= 0x1f)
+   sc->sc.amd_chipset.gen = AMD_CHIPSET_SB600;
+   else if (sc->sc.amd_chipset.rev >= 0x30 &&
+sc->sc.amd_chipset.rev <= 0x3f)
+   sc->sc.amd_chipset.gen = AMD_CHIPSET_SB700;
+   else if (sc->sc.amd_chipset.rev >= 0x40 &&
+sc->sc.amd_chipset.rev <= 0x4f)
+   sc->sc.amd_chipset.gen = AMD_CHIPSET_SB800;
+   else
+   sc->sc.amd_chipset.gen = AMD_CHIPSET_UNKNOWN;
+
+   }
+   else if (PCI_PRODUCT(amd_pa.pa_id) == PCI_PRODUCT_AMD_HUDSON2_SMB) {
+   if (sc->sc.amd_chipset.rev >= 0x11 &&
+   sc->sc.amd_chipset.rev <= 0x14)
+   sc->sc.amd_chipset.gen = AMD_CHIPSET_HUDSON2;
+   else if (sc->sc.amd_chipset.rev >= 0x15 &&
+

Re: pcengines apu boards

2017-12-04 Thread Paul B. Henson
> From: Marko Cupac
> Sent: Monday, December 4, 2017 3:54 AM
> 
> I have just ordered one APU3b4, as I wanted to test mobile provider as
> a backup link. I see it probably won't be any good as OpenBSD router
> (yet), but at least I'll be able to test and give feedback.

Assuming you're planning to use an internal Mini PCI card, unless you have more 
luck than me, it's not going to work :(. I'm hoping I will be able to fix the 
EHCI driver to be happier with the AMD USB chipset, but at this point I'm still
fumbling with it :).




Re: pcengines apu boards

2017-12-02 Thread Paul B. Henson
On Sat, Dec 02, 2017 at 10:40:14PM +1000, Douglas Ray wrote:

> On the APU3a4 the internal USB headers were broken.
> I had email from pcengines (March 2017) saying this would
> be addressed in the APU3b series., but we went for APU2.

I have an APU3b series; they fixed the incorrect pinout on the internal
usb headers. The internal EHCI ports work fine under both linux and
freebsd connected to a USB backplate I'm testing with. It's definitely a
disagreement between the AMD EHCI USB chipset and OpenBSD. I'm
going to see if I can port some of the workarounds and quirks for that
chipset from linux/freebsd to the openbsd driver and see if I have any
luck getting it working; drivers aren't my strong suit but we'll see
what happens. In the worst case I guess I'll use an external miniPCI to
USB adapter and connect my LTE modem to the external xHCI ports, they
seem to work fine under OpenBSD.

Thanks...



Re: broken EHCI USB on AMD chipset?

2017-12-01 Thread Paul B. Henson
> From: Stefan Sperling
> Sent: Friday, December 1, 2017 10:35 AM
> 
> Problems with ehci(4) on AMD SB700 are known.
> For instance, athn(4) USB devices don't work on such ports.

Interesting; that's a similar device to the LTE network modem I'm working
with.

> Could you try adding missing workarounds to our EHCI driver to fix
> your problem? That would probably help with other known issues, too.

Hmm, sadly low level device drivers aren't my area of expertise :(. I was
trying to compare the Linux driver, but it is structured quite differently
than the openbsd one. Now that I see that the FreeBSD one works, at least as
far as hot plug/remove goes, and since it is more similar to the openbsd
driver, I'll see if I can pick anything out of it that I can make sense of
to add to openbsd.

I did find a section of code in OpenBSD's ehci_pci.c with a comment of
"Enable workaround for dropped interrupts as required" which was being
applied to ATI chipsets; I was excited for a moment as that seems to be
exactly the problem being experienced and AMD bought ATI, so I hoped perhaps
enabling that for my AMD chipset would do something, but unfortunately all
it did was result in interrupt timed out messages.

There's another function called ehci_sb700_match that's looking for an ATI
chipset, which controls whether or not to "apply the ATI SB600/SB700
workaround", those are also the names of the AMD controllers, I'm going to
look to see if perhaps those should be applied or not and if they do
anything.

If my shooting in the dark comes up with anything promising I'll bring it
back to show somebody who knows what they're doing :).

Thanks.



Re: broken EHCI USB on AMD chipset?

2017-11-30 Thread Paul B. Henson
On Tue, Nov 28, 2017 at 08:03:05PM -0800, Paul B. Henson wrote:

> The EHCI ports seem to work fine under Linux, including the LTE modem
> when attached to them, so this seems to be an issue with openbsd, not
> faulty hardware per se.

I tested FreeBSD on this box as well, it detected the EHCI ports as:

usbus1: EHCI version 1.0
usbus1 on ehci0
usbus1: 480Mbps High Speed USB v2.0
usbus2: EHCI version 1.0
usbus2 on ehci1
usbus2: 480Mbps High Speed USB v2.0
ugen1.1:  at usbus1
ugen2.1:  at usbus2
uhub0:  on usbus1
 on usbus2
ugen1.2:  at usbus1
uhub3: on usbus1
ugen2.2:  at usbus2
uhub4: on usbus2

As far as I can tell the ports work ok under FreeBSD, detecting hot plug
and removal of devices, and the interrupt count from vmstat -i increases
when doing so. FreeBSD doesn't support the Sierra Wireless card I have
but I'm guessing it would work.

So it just seems to be an issue with OpenBSD and this board or USB
chipset or something. I turned on debugging in the ehci and uhub code,
but when I plug something in nothing whatsoever happens, so that wasn't
very useful. Any suggestions on other debugging to enable or any other
approach to figure out what's going on here?

Thanks...



Re: pcengines apu boards

2017-11-30 Thread Paul B. Henson
> From: Eike Lantzsch
> Sent: Thursday, November 30, 2017 3:12 PM
> 
> here: APU2C4 with one SATA drive of 6TB and one 4TB via USB3 and an

Hmm, I didn't think the apu2 had USB3, but double checking the specs I see
it does. My friend that said he had an APU2 must actually have an original
APU, as his board doesn't have USB3. Yeah, the external xHCI USB3 ports work
fine on my APU3, it's the EHCI ones that are screwed up, they are only
available via two internal headers or if you use the Mini PCI slot. There
probably aren't very many people that are routing the internal USB headers
to external connectors, so unless somebody is using a USB Mini PCI expansion
card on an APU2/3, they probably aren't using the EHCI controller.

Thanks for the info.



Re: pcengines apu boards

2017-11-30 Thread Paul B. Henson
> From: Bryan Everly
> Sent: Thursday, November 30, 2017 2:46 PM
> 
> I'm running my primary firewall at home on an apu2...

Cool. Have you ever tried using an internal Mini PCI card in it?



Re: pcengines apu boards

2017-11-30 Thread Paul B. Henson
> From: Base Pr1me
> Sent: Thursday, November 30, 2017 2:08 PM
> 
> I run 5 apu2 devices with no problems. I don't have any apu3 devices ... yet.

Thanks for the feedback. Do you by any chance have any USB type Mini PCI cards 
installed internally? I initially noticed the issue with a mini PCI LTE modem 
card. Then I realized it was a more generic USB problem; I believe the apu2 has 
USB1 and USB2 ports, the apu3 has two USB3 ports externally, and then the mini 
PCI and a couple of internal headers are USB2. The USB3 ports, using the xHCI 
driver, work fine, I suppose in the worst case I could use an external Mini PCI 
to USB adapter and plug the card in outside of the case, but that just seems so 
kludgy.

I actually found a friend locally who had an apu2 board, he couldn't get the LTE 
card to work on the internal mini PCI slot, which also appeared to be EHCI 
based, and it would sometimes work and sometimes not plugged into the external 
USB ports. It was really weird, when plugged into the same external port, 
sometimes the device would show up on the EHCI bus (and not work) and sometimes 
it would show up on the OHCI bus (and work). He didn't seem to have any trouble 
with USB flash drives on the EHCI bus on his apu2 though.




pcengines apu boards

2017-11-30 Thread Paul B. Henson
I was wondering if anybody is successfully running openbsd on pcengines apu
boards? I have one of their APU3 series, specifically an apu3b4 with OpenBSD
6.2 on it but I can't get the USB2 EHCI ports functioning correctly (for one
thing, they don't detect a hot plugged device), I'm not sure if it's an
issue with the ehci driver and the amd ehci chipset or possibly something in
the bios acpi tables. But just as a data point, it would be interesting to
know if the problem is specific to my board or endemic to the design, so if
anyone has an APU series board with fully functional USB2 ports on the ehci
controller, I would much appreciate hearing which board it is, which
specific AMD chipset is driving the controller, and what bios version you
are running (and what OpenBSD version too).

Thanks much.



broken EHCI USB on AMD chipset?

2017-11-28 Thread Paul B. Henson
I have a pcengines APU 3 system, which has both USB3 and USB2 ports:

ehci0 at pci0 dev 18 function 0 "AMD Hudson-2 USB2" rev 0x39: apic 4 int 18
ehci1 at pci0 dev 19 function 0 "AMD Hudson-2 USB2" rev 0x39: apic 4 int 18

xhci0 at pci0 dev 16 function 0 "AMD Bolton xHCI" rev 0x11: msi

The USB2 ports seem to be broken. I initially was having trouble getting
an LTE modem to work, but then noticed more general underlying issues.
If a USB device is connected when the system boots, it will find it;
however, if you hot plug a USB device after the system is up, it doesn't
notice. Further, if you unplug a device while the system is up, it
doesn't notice it was removed. It appears that the system isn't
receiving interrupts for the EHCI USB devices, the interrupt count from
vmstat -i :

irq101/ehci0   190
irq101/ehci1   460

does not change when I plug in or remove a device, and I think the
reason the LTE modem is not working is because the cell modem driver never
receives a response from the commands sent to the modem, presumably because
the interrupt notifying the USB driver the data is ready to read is
never seen/handled.

The xhci usb3 ports work fine, they hot plug/remove devices correctly,
the lte modem works when plugged into them, and the vmstat interrupt
count for irq99/xhci0 increases when devices are using those ports.

The EHCI ports seem to work fine under Linux, including the LTE modem
when attached to them, so this seems to be an issue with openbsd, not
faulty hardware per se. The Linux EHCI driver does have a couple of
workarounds for AMD chipsets, though I'm not sure if either of them is
relevant here; one involves disabling low power mode during transfers
and the other says:

"EHCI controller on AMD SB700/SB800/Hudson-2/3 platforms may
read/write memory space which does not belong to it when
there is NULL pointer with T-bit set to 1 in the frame list
table. To avoid the issue, the frame list link pointer
should always contain a valid pointer to a inactive qh"

I don't see anything specifically discussing flaky interrupts. Any
thoughts on what might be going on here with USB and how to fix it?

Thanks...




Re: umb device, SIM has no PIN?

2017-11-24 Thread Paul B. Henson
On Fri, Nov 24, 2017 at 11:08:25AM +, Stuart Henderson wrote:

> > booted under openbsd. The umb driver doesn't support accessing the card
> > directly for debugging and diagnostics?
> 
> Correct, you can't get at those from OpenBSD atm.

That's a bummer; guess you wouldn't care too much if things were working
:), but when you're trying to sort out why they're not it sure would be
nice. Then if you're placing an external antenna the signal strength
readings are cool.

> I don't have it handy to check now, but IIRC that's similar to what I
> see on MC8805 after adding the ID for fcc auth.

Interestingly, I tried the card in an external miniPCI to USB adapter,
and it worked fine? Without any driver changes or adding the fcc auth
ID. But when the card is installed directly in the system it doesn't
work :(. But it works fine under Linux, so it's not that the system
hardware is broken or incompatible.

It's a PC Engines APU 3, maybe the OpenBSD USB drivers for this board
aren't working quite right? With the external adapter, the driver sends
the MBIM_OPEN_MSG, gets an interrupt, receives a response, parses the
response, and moves on. Installed on the board, the driver sends the
MBIM_OPEN_MSG, and then nothing happens. No interrupt, no response,
nothing.

Jul 23 18:00:35 maggie /bsd: uhub1 at usb1 configuration 1 interface 0 "AMD EHCI root hub" rev 2.00/1.00 addr 1
Jul 23 18:00:35 maggie /bsd: uhub2 at uhub1 port 1 configuration 1 interface 0 "Advanced Micro Devices product 0x7900" rev 2.00/0.18 addr 2
Jul 23 18:00:35 maggie /bsd: umb0 at uhub2 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3

It looks like the card is on a stacked hub or something when installed
in the box? When it's plugged in on the external adapter:

Jul 23 18:15:38 maggie /bsd: umb1 at uhub0 port 4 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 2

It's connected directly to a controller. They seem to init the same:

Jul 23 18:00:35 maggie /bsd: umb0: umb_attach
Jul 23 18:00:35 maggie /bsd: umb0: ctrl_len=4096, maxpktlen=1422, cap=0x20
Jul 23 18:00:35 maggie /bsd: umb0: ctrl-ifno#12: ep-ctrl=5, data-ifno#13: ep-rx=4, ep-tx=3
Jul 23 18:00:35 maggie /bsd: umb0: rx/tx size 16384/16384
Jul 23 18:00:35 maggie /bsd: umb0: umb_open
Jul 23 18:00:35 maggie /bsd: umb0: umb_ctrl_msg
Jul 23 18:00:35 maggie /bsd: umb0: -> snd MBIM_OPEN_MSG (tid 1)
Jul 23 18:00:35 maggie /bsd: umb0: sent MBIM_OPEN_MSG (tid 1)
Jul 23 18:00:35 maggie /bsd:0:   01 00 00 00 10 00 00 00 01 00 00 00 00 10 00 00

Jul 23 18:21:20 maggie /bsd: umb1: umb_attach
Jul 23 18:21:20 maggie /bsd: umb1: ctrl_len=4096, maxpktlen=1422, cap=0x20
Jul 23 18:21:20 maggie /bsd: umb1: ctrl-ifno#12: ep-ctrl=5, data-ifno#13: ep-rx=4, ep-tx=3
Jul 23 18:21:20 maggie /bsd: umb1: rx/tx size 16384/16384
Jul 23 18:21:20 maggie /bsd: umb1: umb_open
Jul 23 18:21:20 maggie /bsd: umb1: umb_ctrl_msg
Jul 23 18:21:20 maggie /bsd: umb1: -> snd MBIM_OPEN_MSG (tid 1)
Jul 23 18:21:20 maggie /bsd: umb1: sent MBIM_OPEN_MSG (tid 1)
Jul 23 18:21:20 maggie /bsd:0:   01 00 00 00 10 00 00 00 01 00 00 00 00 10 00 00
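
For reference, the 16 bytes in those dumps read as four little-endian 32-bit fields. The sketch below rebuilds them under my reading of the dump against the other log lines (message type 1, total length 16, transaction id, then the 4096-byte control transfer size reported as ctrl_len); the function and macro names are hypothetical and are not taken from the umb driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBIM_OPEN_MSG_TYPE	1u	/* message type seen in the dump */
#define MBIM_OPEN_MSG_LEN	16u	/* 12-byte header + 4-byte payload */

/* Store a 32-bit value in little-endian byte order. */
static void
put_le32(uint8_t *p, uint32_t v)
{
	p[0] = v & 0xff;
	p[1] = (v >> 8) & 0xff;
	p[2] = (v >> 16) & 0xff;
	p[3] = (v >> 24) & 0xff;
}

/*
 * Build the 16-byte open message: type, total length, transaction id,
 * then the maximum control transfer size.
 */
void
mbim_encode_open(uint8_t buf[16], uint32_t tid, uint32_t max_ctrl)
{
	assert(buf != NULL);
	put_le32(buf + 0, MBIM_OPEN_MSG_TYPE);
	put_le32(buf + 4, MBIM_OPEN_MSG_LEN);
	put_le32(buf + 8, tid);
	put_le32(buf + 12, max_ctrl);
}
```

Calling mbim_encode_open(buf, 1, 4096) produces the same byte sequence both logs show, which fits the observation that the request goes out identically in both cases and only the response differs.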

But it's only when connected externally that the card actually generates
an interrupt and sends a response:

Jul 23 18:15:48 maggie /bsd: umb1: umb_intr
Jul 23 18:15:48 maggie /bsd: umb1: umb_intr: response available
Jul 23 18:15:48 maggie /bsd: umb1: umb_get_response_task
Jul 23 18:15:48 maggie /bsd: umb1: umb_decode_response
Jul 23 18:15:48 maggie /bsd: umb1: got response: len 16


Any thoughts on how to diagnose what might be a USB driver issue as
opposed to an LTE card issue 8-/?

Thanks...



Re: umb device, SIM has no PIN?

2017-11-23 Thread Paul B. Henson

> The card is a Sierra Wireless MC7455; to get it working with the umb

Looking at the source code, I see that there's a workaround for the
EM7455 card, something about requiring an "FCC Authentication" command?
From what I understand the MC7455 is the same as the EM7455 other than
form factor, so I added it to the list for that workaround and also
turned on debugging in the driver. Here's what it has to say now:

Jul 23 18:12:41 maggie /bsd: umb0 at uhub2 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3
Jul 23 18:12:41 maggie /bsd: umb0: ctrl_len=4096, maxpktlen=1422, cap=0x20
Jul 23 18:12:41 maggie /bsd: umb0: ctrl-ifno#12: ep-ctrl=5, data-ifno#13: ep-rx=4, ep-tx=3
Jul 23 18:12:41 maggie /bsd: umb0: rx/tx size 16384/16384
Jul 23 18:12:41 maggie /bsd: umb0: -> snd MBIM_OPEN_MSG (tid 1)
Jul 23 18:12:41 maggie /bsd: umb0: sent MBIM_OPEN_MSG (tid 1)
Jul 23 18:12:41 maggie /bsd:0:   01 00 00 00 10 00 00 00 01 00 00 00 00 10 00 00
Jul 23 18:12:41 maggie /bsd: umb0: vers 1.0
Jul 23 18:12:41 maggie /bsd: ugen0 at uhub2 port 3 configuration 1 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3


Jul 23 18:13:31 maggie /bsd: umb0: stop: reached state DOWN
Jul 23 18:13:59 maggie /bsd: umb0: init: opening ...
Jul 23 18:13:59 maggie /bsd: umb0: -> snd MBIM_OPEN_MSG (tid 2)
Jul 23 18:13:59 maggie /bsd: umb0: sent MBIM_OPEN_MSG (tid 2)
Jul 23 18:13:59 maggie /bsd:0:   01 00 00 00 10 00 00 00 02 00 00 00 00 10 00 00
Jul 23 18:14:29 maggie /bsd: umb0: state change timeout
Jul 23 18:14:29 maggie /bsd: umb0: init: opening ...
Jul 23 18:14:29 maggie /bsd: umb0: -> snd MBIM_OPEN_MSG (tid 3)
Jul 23 18:14:29 maggie /bsd: umb0: sent MBIM_OPEN_MSG (tid 3)
Jul 23 18:14:29 maggie /bsd:0:   01 00 00 00 10 00 00 00 03 00 00 00 00 10 00 00
Jul 23 18:14:59 maggie /bsd: umb0: state change timeout

Not sure where to go from here.



umb device, SIM has no PIN?

2017-11-22 Thread Paul B. Henson
I'm trying to get an LTE card working in MBIM mode with the umb device
driver, but it just keeps saying "SIM not initialized PIN required". The
SIM isn't PIN locked, as far as I know the SIM has no PIN. I've tested
the card and SIM under linux on the exact same system and was able to
get it working fine just by supplying the APN.

The card is a Sierra Wireless MC7455; to get it working with the umb
driver I did have to disable the umsm driver as for some reason that one
claimed it first. Once that driver was disabled the umb driver seemed
happy with it:

umb0 at uhub2 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3
ugen0 at uhub2 port 3 configuration 1 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3

After boot, the interface looked like:

umb0: flags=8810 mtu 1500
index 6 priority 0 llprio 3
roaming disabled registration unknown
state down cell-class none
SIM not initialized PIN required
status: down

I set the APN and tried to bring it up:

umb0: flags=8811 mtu 1500
index 6 priority 0 llprio 3
roaming disabled registration unknown
state down cell-class none
SIM not initialized PIN required
APN r.ispsn
status: down

But it still just says the SIM is not initialized. After a minute or two,
it starts logging these to the console:

umb0: state change timeout
umb0: state change timeout
umb0: state change timeout
umb0: state change timeout


Am I missing something? This card isn't listed explicitly as being
compatible, is there a problem with the driver and this particular card?

Under linux, the serial control interfaces were available as USB devices
so you could poke at the card with AT commands, I don't see any listed
booted under openbsd. The umb driver doesn't support accessing the card
directly for debugging and diagnostics?

Thanks...




Re: kernel reordering and config -e

2017-11-22 Thread Paul B. Henson
On Wed, Nov 22, 2017 at 04:45:59PM +, Kevin Chadwick wrote:

> I believe the second scenario would need /dev/mem access making it a
> larger change than it first appears (config with a new option could
> possibly save the original kernel file and compare the two kernel
> files).

Ah, I didn't mean that; I meant save your interactive 'config -e'
session in a file that could be played back later. IE, you run 'config
-e - /etc/ukc.conf ...', then type 'change x', 'disable y' etc,
and then when you 'quit', config would write a transcript of your
changes to /etc/ukc.conf such that 'config -e -

Re: kernel reordering and config -e

2017-11-21 Thread Paul B. Henson
On Tue, Nov 21, 2017 at 09:49:37AM +, Dimitris Papastamos wrote:

> This is what I do in rc.shutdown to handle this case:
> 
> /usr/bin/printf "disable inteldrm*\nquit\n" | /usr/sbin/config -ef /bsd
> /bin/sha256 -h /var/db/kernel.SHA256 /bsd

Cool, thanks for the suggestion; that should be good as long as the box
doesn't panic or otherwise have an unclean shutdown.



Re: kernel reordering and config -e

2017-11-21 Thread Paul B. Henson
On Mon, Nov 20, 2017 at 02:01:56PM -0700, Theo de Raadt wrote:

> If someone wants to solve this fully there have been some proposals
> for keeping track of the instruction sequence, and attempting to
> reapply it upon each relink in the build directory.  There just hasn't
> been any scripting changes to do that from anyone, and it isn't on my
> radar as important.

Ah, rather than make binary changes to the object files that get linked,
just redo the changes to the resultant kernel binary every time it is
generated. That's definitely simpler to do with the existing tools.

I see someone made a suggestion that you replied to with a classic
"where's the patch" :), I don't think he was suggesting someone else do
it but more looking for guidance on what you'd consider acceptable
before spending time on it.

For example, would the basic "user manually constructs a text file with
config commands by hand that then just gets passed to config on stdin"
approach he mentioned be good enough to commit? Or would you want
something more integrated into config where it would have a new command
that would generate a file based on the current session, and a new
option to process changes from a file rather than interactively? It
looks like it would be difficult to detect errors in the first scenario,
and I don't know if that would be an issue.

Thanks...



Re: kernel reordering and config -e

2017-11-20 Thread Paul B. Henson
On Mon, Nov 20, 2017 at 08:37:43AM +, Roderick wrote:

> Commenting out the line "/usr/libexec/reorder_kernel &" at the
> end of rc?
> 
> I suspect it is not forseen not to benefice of KARL.

No, actually, if the hash of the kernel is different than expected, the
reorder_kernel aborts and doesn't generate a new one. So you don't need
to do anything explicitly after the config -e to avoid your change being
wiped out. What I did was update the saved hash with one matching my
modified kernel (not quite understanding what was going on yet) which
caused KARL to wipe my changes out with the default.



Sierra Wireless MC7455 LTE cell network card

2017-11-19 Thread Paul B. Henson
I'm trying to get the subject card to work under OpenBSD 6.2; it works
fine under Linux so I know the card itself and its SIM etc are correctly
configured and functional.

The card is set to MBIM mode, and I'd like to use the umb driver rather
than the umsm driver so as not to have to muck with PPP. It seems this card
is detected first by the umsm driver though, as I had to disable that
driver for the card to be picked up by umb. The umb man page says
"Devices which fail to provide a conforming MBIM implementation will
probably be attached as some other driver", does this indicate the
MC7455 (as opposed to the EM7455, which is explicitly listed as
compatible) isn't recognized as an MBIM device? It seems to work fine in
MBIM mode under linux, and the umb driver does find it once umsm is
disabled.

Is there any way to access the serial interface of the device under
openbsd in order to execute diagnostic AT commands? Under linux in
addition to the network device the card also generates a few USB serial
devices, one of which can be used to run commands on it. I saw such
devices with the umsm driver, but I don't see any with the umb driver. I
haven't gotten any farther than installing the card and getting the umb
driver to recognize it at this point, but it would be nice to be able to
poke at it and see what the card has to say for itself.

Thanks...



Re: kernel reordering and config -e

2017-11-19 Thread Paul B. Henson
On Mon, Nov 20, 2017 at 06:50:30AM +0100, Sebastien Marie wrote:

> When it did that, it uses the object (I didn't recall the exact name)
> with the previous mentioned array, with *default* configuration. So the
> previous modification done with config(8) is cleared.

Yeah, I figured that out after I updated the saved KARL hash and then my
box came up with no serial console :).

> For me, there is currently no way to ask config(8) to alter the right
> file in /usr/share/relink/kernel to "ship" the modification in all
> future generated KARL kernels.

I thought that might be the case; maybe someday config(8) will be
extended to work with the object files as well as the kernel binary
itself to allow that.

> - makes your changes in /usr/src/sys, build and install a new no-GENERIC
>   kernel (and do it at each upgrade)

If I do that, can the resultant object files (which will have my com2
irq change) be used with KARL? Hmm, it seems like all I really need to
do is compile a new com_isa.o and drop it in to the existing directory?
Or replace whichever object file contains the constant I need to change;
it's not like I'm modifying code or making any drastic changes... Hmm,
I'll have to compile a new kernel and poke at it; it'll just be a matter
of remembering to redo it after patches, but I already had to redo the
config -e anyway.

Thanks...



kernel reordering and config -e

2017-11-19 Thread Paul B. Henson
I just updated a server to 6.2; unfortunately this box has an oddball
SOL com2 on irq10 so I need to run 'config -e' on the kernel to update
it and make the serial console work. I noticed afterwards in the boot
messages it was complaining about kernel reordering failures, and
thinking I was fixing it, I updated the file /var/db/kernel.SHA256 with
the hash of my modified kernel. I quickly discovered that resulted in a
successfully reordered kernel with a stock com2 irq :(.

I didn't see anything in the config man page or faq about interaction
between kernel reordering and config on a binary kernel. In hindsight I
see that the hash check is to keep from replacing a locally modified
kernel.  Is there a supported way to both fix hardcoded settings on a
stock kernel and use reordering? Or do you need to update your settings
in the config and compile a kernel from scratch? If you do, does
/usr/share/compile automatically get populated with your new kernel
objects and reordering just starts working, or do you need to do
something manually to get it running with a locally compiled kernel?

Thanks...



Re: OpenBSDI 6.1 some Warnings when using OpenLDAP Tools

2017-08-10 Thread Paul B. Henson
On Wed, Aug 09, 2017 at 09:06:19AM +0200, Markus Rosjat wrote:

> this is more an info then a problem though since it seems to work.
> When I use the slap tool like slapcat I get a size mismatch warning like 
> this

Heh, we were just talking about that:

https://marc.info/?l=openbsd-misc=150199443929908=2



Re: WARNING: symbol(icudt58_dat) size mismatch, relink your program

2017-08-05 Thread Paul B. Henson
On Sat, Aug 05, 2017 at 12:35:24AM +, Stuart Henderson wrote:

> The ports@ list is a better venue for ports-related queries,
> please see this: https://marc.info/?l=openbsd-ports=150157643516239=2

Ah, ok, thanks for the pointer.

> This is not preventing programs from running.

Hmm, I could've sworn I got that message and then slapd failed to start.
Dunno, maybe I got confused. Once I'm done working with openldap mdb I'll
start over from scratch and try again and see what happens.

Thanks for the info...



Re: WARNING: symbol(icudt58_dat) size mismatch, relink your program

2017-08-03 Thread Paul B. Henson
On Thu, Aug 03, 2017 at 05:33:15PM -0400, Predrag Punosevac wrote:

> It is well known issue.
> 
> https://marc.info/?l=openbsd-misc=149271724912565=2
> 
> It seems to be benign at least for my use case.

Yah, I saw that discussion from back in April, but then it just stopped
with no resolution. I'm not sure what your use case is, but as far as I
can tell, it's preventing programs linked against libicuuc.so from
running? So not too benign for me 8-/. But fortunately downgrading to
the 6.0 version of the port seems to have worked around the issue.

Thanks...



Re: openldap port mdb support

2017-08-03 Thread Paul B. Henson
On Mon, Jul 10, 2017 at 07:34:11AM +, Stuart Henderson wrote:

> Feel free to try it, I believe the required patch to force MDB_WRITEMAP
> is still in there..but I don't think there were any major changes upstream
> since the last attempt so I wouldn't hold out too much hope for it working
> straight off.

Hmm, as you said, trying to use mdb resulted in crashes. My initial debugging
led to the cause of this as a NULL mdb environment, and ironically the
root cause of that turned out to be the OpenBSD specific MDB_WRITEMAP
patch 8-/.

    if ( !(flags & MDB_WRITEMAP) ) {
        Debug( LDAP_DEBUG_ANY,
            LDAP_XSTRING(mdb_db_open) ": database \"%s\" does not have writemap. "
            "This is required on systems without unified buffer cache.\n",
            be->be_suffix[0].bv_val, rc, 0 );
        goto fail;
    }

There are two problems with it; first, it accesses the local flags variable
before it is initialized to mdb->mi_dbenv_flags shortly thereafter, so the
value checked is random and the if block nondeterministically triggers, and
second, it doesn't assign a failure value to rc before it jumps to fail, so
the function returns successfully but with a closed be, and the code keeps
going but later segfaults because of the NULL mdb environment.

I updated the patch and moved the check to be after the flags initialization:

flags = mdb->mi_dbenv_flags;

and added an assignment to rc on failure:

rc = MDB_INCOMPATIBLE;

I then tweaked the mdb test suite to always enable MDB_WRITEMAP, and so far
it's been running for 20 minutes with no errors, crashes, or failures.

Right now it's compiled with "-O0 -ggdb"; if everything keeps looking good, I'll
recompile it normally and do more testing.
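
For illustration, the two fixes described above can be sketched in
isolation. This is a minimal, self-contained sketch and not the actual port
patch: struct env_info, db_open_sketch(), WRITEMAP_FLAG and ERR_INCOMPATIBLE
are hypothetical stand-ins for the real back-mdb structures and the
MDB_WRITEMAP / MDB_INCOMPATIBLE constants. It only shows the ordering (load
the flags before testing them) and the error propagation (set rc before
jumping to the failure path).

```c
/* Hypothetical stand-ins; the real code uses MDB_WRITEMAP / MDB_INCOMPATIBLE. */
#define WRITEMAP_FLAG    0x80000
#define ERR_INCOMPATIBLE (-1)

/* Hypothetical, trimmed-down stand-in for the backend's private state. */
struct env_info {
	unsigned int dbenv_flags;	/* configured environment flags */
	int          env_open;		/* 1 while the environment is usable */
};

/*
 * Returns 0 on success and a non-zero error code on failure, so the
 * caller never continues with a torn-down environment.
 */
int
db_open_sketch(struct env_info *mi)
{
	unsigned int flags;
	int rc = 0;

	/* Fix 1: initialize the local copy before it is tested. */
	flags = mi->dbenv_flags;

	if (!(flags & WRITEMAP_FLAG)) {
		/* Fix 2: record the failure before taking the error path. */
		rc = ERR_INCOMPATIBLE;
		goto fail;
	}

	mi->env_open = 1;
	return rc;

fail:
	mi->env_open = 0;	/* environment is closed on failure */
	return rc;
}
```

With both changes, a configuration without the writemap flag fails
deterministically and visibly at open time rather than crashing later.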



Re: WARNING: symbol(icudt58_dat) size mismatch, relink your program

2017-08-03 Thread Paul B. Henson
On Wed, Aug 02, 2017 at 05:37:40PM -0700, Paul B. Henson wrote:
> I'm trying to compile openldap from ports under 6.1, and running it
> fails with the error:
> 
> slapd:/usr/local/lib/libicuuc.so.12.0: /usr/local/lib/libicudata.so.12.0
> : WARNING: symbol(icudt58_dat) size mismatch, relink your program

I ended up checking out the 6.0 version of textproc/icu (57.1) into my 6.1
ports tree and compiling that, which seems to work fine. There must just
be some weird issue with 58.2 under OpenBSD.



WARNING: symbol(icudt58_dat) size mismatch, relink your program

2017-08-02 Thread Paul B. Henson
I'm trying to compile openldap from ports under 6.1, and running it
fails with the error:

slapd:/usr/local/lib/libicuuc.so.12.0: /usr/local/lib/libicudata.so.12.0
: WARNING: symbol(icudt58_dat) size mismatch, relink your program

I see there was some discussion of this back around April, but no
resolution, and I didn't see anything since then. Evidently it impacts
anything that uses textproc/icu from what I could tell. I poked around
with it a bit but nothing jumped out as to why it's doing this. The
symbol seems to be defined in libicudata.so and accessed by libicuuc.so.
The actual object file in the distribution that contains it is
dynamically generated. I have the exact same version running ok on a
linux box so it doesn't seem to be an issue with the code itself.

Has anyone figured out what's going on with this code under openbsd
that's causing it to fail like this?

Thanks...



openldap port mdb support

2017-07-10 Thread Paul B. Henson
mdb has been disabled in the openldap port since what looks like
2015/02/16; I was wondering if anyone has tried it since then to see if
maybe the issues with it have been resolved? The other backends are
deprecated upstream, it would be nice to get mdb working under openbsd.

I'm going to try enabling it and running through the tests and see how
things turn out but I was just curious if anyone else had worked with it
in the past couple of years.

Thanks...



Re: ipmi driver broken

2017-06-29 Thread Paul B. Henson
> From: Ted Unangst
> Sent: Wednesday, June 28, 2017 8:50 PM
> 
> i'm afraid i won't make a very good ipmi maintainer, but i think i applied the
> patch in the right spot.

Cool, thanks; much appreciated.



Re: ipmi driver broken

2017-06-29 Thread Paul B. Henson
> From: Theo de Raadt
> Sent: Wednesday, June 28, 2017 8:41 PM
> 
> If you want it working, you will need to get it fixed.  On all
> machines, so that we can renable it.

I definitely don't want to be one of those entitled people demanding work
from developers without providing anything that you trounce upon ;). But
that's a bit of a big ask, make it work on all machines? I've got four
different models of supermicro servers that I certainly can do testing on,
although as I said, on these particular servers as far as I can tell (other
than the watchdog) the driver seems to work fine.

> Let me explain how we work.

I understand; really, I'm not asking you guys to invest a significant amount
of effort in improving the driver, or even technically "fixing" any new
issues or problems with it. I was only kindly requesting that you put back a
line that appears to have accidentally been deleted a few revisions ago that
broke it. So unless you're intentionally sabotaging it in preparation for
the ritual sacrifice :)?

It's too bad nobody else finds value in it; it provides sensors that aren't
otherwise available, provides access to the system event log for event data,
allows access to the management interface without needing to go through the
network, and ideally would allow access to the hardware watchdog.
Unfortunately I don't have expertise in low level hardware device driver
development so while I could be a tester I can't be a primary maintainer. So
if you guys end up scrapping it, I will be sad but that's the way it is. But
until then, given it works for me, it doesn't hurt to use it :). Or to ask
for one line to be put back so it would work in the shipped kernel; unless I
suppose said request results in it getting scrapped ;).

Thanks.




Re: ipmi driver broken

2017-06-28 Thread Paul B. Henson
On Wed, Jun 28, 2017 at 06:31:34PM -0400, Predrag Punosevac wrote:

> My understanding is that ipmi driver used by ipmitool is disabled
> intensionally due to the security problems. IPMI pose a grave security
> risk.

IPMI on the SP is available whether or not the openbsd driver is enabled
or in use; my understanding as to why it's disabled by default is that
it's not necessarily considered stable. I've never had an issue with it,
at least not for the limited use I make of it.

> As you probably know OpenBSD comes with its own sensoring
> framework. You probably want to check out

Yes; I actually want the ipmi driver loaded so it can supply data to
said framework:

hw.sensors.ipmi0.temp0=34.00 degC (System Temp), OK
hw.sensors.ipmi0.temp1=40.00 degC (Peripheral Temp), OK
hw.sensors.ipmi0.fan0=4875 RPM (FAN 1), OK
hw.sensors.ipmi0.fan1=3000 RPM (FAN 2), OK
hw.sensors.ipmi0.fan2=3150 RPM (FAN 3), OK
hw.sensors.ipmi0.fan3=5100 RPM (FAN 4), OK
hw.sensors.ipmi0.fan4=3300 RPM (FAN A), OK
hw.sensors.ipmi0.volt0=0.71 VDC (Vcore), OK
hw.sensors.ipmi0.volt1=3.23 VDC (3.3VCC), OK
hw.sensors.ipmi0.volt2=12.14 VDC (12V), OK
hw.sensors.ipmi0.volt3=1.53 VDC (VDIMM), OK
hw.sensors.ipmi0.volt4=4.99 VDC (5VCC), OK
hw.sensors.ipmi0.volt5=-12.49 VDC (-12V), OK
hw.sensors.ipmi0.volt6=3.17 VDC (VBAT), OK
hw.sensors.ipmi0.volt7=3.36 VDC (VSB), OK
hw.sensors.ipmi0.volt8=3.23 VDC (AVCC), OK
hw.sensors.ipmi0.indicator0=Off (Chassis Intru), OK

There's more sensor data available via the IPMI interface than the
kernel supplies without it. It's also useful to be able to view the SEL
without having to loop over the network to the SP management IP. On my
linux boxes I also use the ipmi hardware watchdog, but last time I tried
that on openbsd it just kept rebooting continuously 8-/. Guess that's
one of the parts that's not stable :), but I can't remember the last
time one of my openbsd boxes wedged up anyway.

Anyway, thanks for the thoughts; but I do still want a working ipmi :).
No biggie to add one line and recompile the kernel, but it would be nice
to get fixed. It's still disabled by default out of the box, you have to
explicitly reconfigure your kernel to enable it.



ipmi driver broken

2017-06-28 Thread Paul B. Henson
I noticed back when I upgraded to 5.9 the ipmi driver stopped working,
it just said:

ipmi0: get header fails
ipmi0: no SDRs IPMI disabled

I found the following post at the time which appeared to point out the
issue and suggest a fix:

http://openbsd-archive.7691.n7.nabble.com/fix-for-quot-ipmi0-get-header-fails-quot-td299427.html

After applying this and installing the resulting kernel, ipmi worked
fine. I skipped 6.0, but just updated my boxes to 6.1, and see the same
ipmi failures. It looks like this fix hasn't been applied, the code in
head is still missing this line. I applied it again to my 6.1 kernel and
it still seems to make ipmi work fine as far as I can tell.

Is there anyone maintaining ipmi or someone with commit privs that might
be kind enough to apply this so the next release version would have
working ipmi?

Thanks much...



Re: what all touches the carp demote counter?

2016-10-14 Thread Paul B. Henson
On Fri, Oct 14, 2016 at 01:27:42PM -0700, Paul B. Henson wrote:
> Arg, I'm still having issues with the carp demote counter. I disabled
> ospfd for now, but something is still changing it. After a reboot
> without ospfd, the counter is changing between 0 and 1:

Ah, I tracked it down. I had configured another carp interface on the
new system which didn't yet have a corresponding interface on the old
system. I have the carp interfaces configured with explicit peer
addresses rather than using multicast, and evidently the inability to
send a packet to the peer was causing the other carp interface to
twiddle the global carp demote counter, which popped up once I cranked
up the carp log level:

Oct 14 15:21:48 lisa /bsd: carp: carp1 demoted group carp by -1 to 2 (< snderrors)
Oct 14 15:21:52 lisa /bsd: carp1: ip_output failed: 64
Oct 14 15:21:54 lisa /bsd: carp: carp1 demoted group carp by 1 to 3 (> snderrors)
Oct 14 15:21:55 lisa /bsd: carp1: ip_output failed: 64
Oct 14 15:22:14 lisa /bsd: carp: carp1 demoted group carp by -1 to 2 (< snderrors)
Oct 14 15:22:18 lisa /bsd: carp1: ip_output failed: 64
Oct 14 15:22:20 lisa /bsd: carp: carp1 demoted group carp by 1 to 3 (> snderrors)

It doesn't do this if I remove the carppeer and use the default multicast;
that's an unexpected side effect of configuring a carppeer that might be
worth documenting. A down carppeer on one interface can impact the
functionality of all carp interfaces on the system.
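
When several carp interfaces share one demote counter, it can help to pull
the interface name and the adjustment out of kernel log lines like the ones
above. The following is a rough sketch that assumes the message format
matches the lines quoted in this message exactly; struct demote_event and
parse_demote_line() are hypothetical names invented for the example and are
not part of any OpenBSD tool.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one "demoted group" log entry. */
struct demote_event {
	char ifname[16];	/* interface that adjusted the counter, e.g. "carp1" */
	char group[16];		/* interface group whose counter changed */
	int  delta;		/* signed adjustment */
	int  total;		/* counter value after the adjustment */
};

/*
 * Returns 1 and fills *ev if the line is a demote message, 0 otherwise.
 * Assumes the "carp: IF demoted group GROUP by N to M" layout shown above.
 */
int
parse_demote_line(const char *line, struct demote_event *ev)
{
	const char *p = strstr(line, "carp: ");

	if (p == NULL)
		return 0;
	return sscanf(p, "carp: %15s demoted group %15s by %d to %d",
	    ev->ifname, ev->group, &ev->delta, &ev->total) == 4;
}
```

Feeding each line of /var/log/messages through a helper like this and
printing ifname next to delta shows which interface is responsible for each
change, which is how the carp1 send errors stood out here.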



Re: what all touches the carp demote counter?

2016-10-14 Thread Paul B. Henson
Arg, I'm still having issues with the carp demote counter. I disabled
ospfd for now, but something is still changing it. After a reboot
without ospfd, the counter is changing between 0 and 1:

bash-4.3# ifconfig -g carp
carp: carp demote count 1

bash-4.3# ifconfig -g carp
carp: carp demote count 0

bash-4.3# ifconfig -g carp
carp: carp demote count 1

bash-4.3# ifconfig -g carp
carp: carp demote count 0

And the carp interface is flapping:

Oct 14 13:17:17 lisa /bsd: carp0: state transition: BACKUP -> MASTER
Oct 14 13:17:23 lisa /bsd: carp0: state transition: MASTER -> BACKUP
Oct 14 13:17:43 lisa /bsd: carp0: state transition: BACKUP -> MASTER
Oct 14 13:17:49 lisa /bsd: carp0: state transition: MASTER -> BACKUP
Oct 14 13:18:08 lisa /bsd: carp0: state transition: BACKUP -> MASTER

There's not too much running; smtpd, sshd, npppd, dhcpd. Any suggestions
as to what might be screwing with the carp demote value?

Thanks...


root         1  0.0  0.0   440   520 ??  Is   1:14PM  0:01.01 /sbin/init
root     21696  0.0  0.0  1044  1296 ??  Isp  1:14PM  0:00.00 syslogd: [priv] (syslogd)
_syslogd 22103  0.0  0.0  1044  1388 ??  Sp   1:14PM  0:00.07 /usr/sbin/syslogd
_pflogd   5335  0.0  0.0   684   400 ??  Sp   1:14PM  0:00.02 pflogd: [running] -s 160 -i pfl
root     27252  0.0  0.0   620   600 ??  Is   1:14PM  0:00.00 pflogd: [priv] (pflogd)
_ntp     16170  0.0  0.0   636  1472 ??  Isp  1:14PM  0:00.02 ntpd: dns engine (ntpd)
_ntp     15754  0.0  0.0   688  1540 ??  S [...]

> I'm setting up a second router that's going to sit next to an existing
> one and become a redundant failover system. The current one is in
> production, and I've been converting some of the existing LAN subnets on it
> to use carp interfaces and making them primary and the new box
> secondary. I also set up a carp interface on the WAN side and made the
> new box primary for testing as that didn't exist before. That all
> worked fine when I set it up by hand, but when I rebooted the new box,
> the old box stayed primary for everything including the WAN interface,
> which I tracked down to the carp demote counter, which ended up at 2 on
> the new box after the reboot:
> 
> bash-4.3# ifconfig -g carp
> carp: carp demote count 2
> 
> After I manually decreased the demote counter by 2 back to 0 the WAN
> interface master switched back to the new box.
> 
> I'm not sure what's doing that at boot? I am running ospfd on the box,
> but I don't have any demote statements in my configuration. I'm also
> running npppd, but I don't see anything about that and carp demotion.
> What else might be setting carp demotion values?
> 
> Thanks...



Re: what all touches the carp demote counter?

2016-10-12 Thread Paul B. Henson
On Wed, Oct 12, 2016 at 08:37:59AM +0200, mxb wrote:

> But as R0me0 stated, you should probably re-check your configuration.

The configuration checked out. I rebooted a few more times, and I
couldn't reproduce the problem. I still have no idea why the carp
demotion counter was set to 2 the first time I rebooted. It doesn't seem
to be doing it anymore. Thanks for all the suggestions, though; they
helped to verify everything was set up right.



Re: what all touches the carp demote counter?

2016-10-11 Thread Paul B. Henson
On Tue, Oct 11, 2016 at 08:44:05AM +0200, mxb wrote:

> Master-Backup setup with pfsync in place, means that you synchronize
> states between boxes.  Then Master is rebooted, it becomes out-of-sync
> then it comes to states.  So until it is in sync with Backup (which
> became Master after reboot), it will not become Master.
> 
> This process is auto. Just need to wait.

I haven't set up pfsync yet, I need to upgrade the old box first. Right
now I'm just working with carp. Does pfsync fiddle with the carp
demotion value even if it's not configured?

Thanks...



Re: what all touches the carp demote counter?

2016-10-10 Thread Paul B. Henson
On Mon, Oct 10, 2016 at 09:43:56PM -0300, R0me0 *** wrote:

> Did you adjust advskew value on the machine you want to be Backup ?

Yes, the backup has an advskew of 5 and the primary an advskew of 1. As
I mentioned, when I first configured the interfaces by hand the two
systems properly negotiated master/backup roles, it was only after I
rebooted the one that was supposed to be primary on this interface that
it came up as backup, and I traced it to the fact that the carp demote value
was set to 2. When I manually changed the carp demote value to 0, the
system once again pre-empted the master role on the interface.

I'm just not sure what is twiddling with the carp demotion value. Unless
ospfd does it by default? The man page for the config file reads like it
would only do it if you explicitly include the demote keyword in the
area or interface section.

Thanks for the suggestion though.



what all touches the carp demote counter?

2016-10-10 Thread Paul B. Henson
I'm setting up a second router that's going to sit next to an existing
one and become a redundant failover system. The current one is in
production, and I've been converting some of the existing LAN subnets on it
to use carp interfaces and making them primary and the new box
secondary. I also set up a carp interface on the WAN side and made the
new box primary for testing as that didn't exist before. That all
worked fine when I set it up by hand, but when I rebooted the new box,
the old box stayed primary for everything including the WAN interface,
which I tracked down to the carp demote counter, which ended up at 2 on
the new box after the reboot:

bash-4.3# ifconfig -g carp
carp: carp demote count 2

After I manually decreased the demote counter by 2 back to 0 the WAN
interface master switched back to the new box.

I'm not sure what's doing that at boot? I am running ospfd on the box,
but I don't have any demote statements in my configuration. I'm also
running npppd, but I don't see anything about that and carp demotion.
What else might be setting carp demotion values?

Thanks...



no SDRs IPMI disabled?

2016-04-02 Thread Paul B. Henson
I just installed 5.9 on a Supermicro X11SSL-F board, and tried to enable
the ipmi driver. During boot, it shows:

ipmi0 at mainbus0: version 2.0 interface KCS iobase 0xca2/2 spacing 1
iic0: skipping sensors to avoid ipmi0 interactions
ipmi0: get header fails
ipmi0: no SDRs IPMI disabled
ipmi at mainbus0 not configured

Any suggestions on how to make this work? The full dmesg is:


OpenBSD 5.9 (GENERIC.MP) #1888: Fri Feb 26 01:20:19 MST 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 16976416768 (16189MB)
avail mem = 16457711616 (15695MB)
User Kernel Config
UKC> enable ipmi
401 ipmi0 enabled
UKC> quit
Continuing...
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x7fb95000 (59 entries)
bios0: vendor American Megatrends Inc. version "1.0b" date 12/29/2015
bios0: Supermicro Super Server
acpi0 at bios0: rev 2
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG HPET SSDT LPIT SSDT SSDT SSDT DBGP DBG2 SSDT SSDT UEFI SSDT DMAR EINJ ERST BERT HEST
acpi0: wakeup devices PEGP(S4) PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PXSX(S4) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) RP12(S4) PXSX(S4) RP13(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.85 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
cpu4 at mainbus0: apid 1 (application processor)
cpu4: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz
cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
cpu4: 256KB 64b/line 8-way L2 cache
cpu4: smt 1, core 0, package 0
cpu5 at mainbus0: apid 3 (application processor)
cpu5: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz
cpu5: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
cpu5: 

Re: Supermicro X11SSL-F freezes probing USB 3

2016-03-31 Thread Paul B. Henson
On Wed, Mar 30, 2016 at 03:34:25PM -0400, Sonic wrote:

> Ahha! Who would have thought... com0 was the ticket. Thanks much!

Sweet, glad to hear you got it working. Usually the IPMI SOL comes after
the physical serial ports, I've never seen it be the first one. But hey,
it's Dell :).

Maybe now that 5.9 is out (a month early, nice, just in time for my new
box) one of the devs will have time to take a look at the skylake
usb 3 issues.



Re: Supermicro X11SSL-F freezes probing USB 3

2016-03-31 Thread Paul B. Henson
On Tue, Mar 29, 2016 at 10:46:15PM -0400, Sonic wrote:

> The IPMI is part of Dell's iDRAC stuff and the only thing I've found
[...]
> may be the iDRAC license level as well, anything above the "basic"
> level, providing a limited feature set, requires purchasing a license

Eeew. We've got some HP gear that requires an extra cost license to make
the remote kvm gui head work past the bootloader which is ridiculous
(but technically, I don't think remote kvm is part of the base IPMI
standard), but the IPMI SOL serial port??? That's just crazy. I've never
used Dell and never will for servers; desktops/notebooks, sure, but
servers? Nah. Sun gear was pretty good until Oracle killed them off, we
used IBM for a while until they sold it off to Lenovo and policy
wouldn't let us buy from a non-US company (like the gear itself doesn't
come from China anyway). Right now we're using HP at my dayjob and it's
working out ok. I pretty much use supermicro for personal gear and
sidejobs, it's generally good stuff. At least my IPMI SOL port works :).

Good luck :).



Re: Supermicro X11SSL-F freezes probing USB 3

2016-03-29 Thread Paul B. Henson
On Tue, Mar 29, 2016 at 07:06:41PM -0400, Sonic wrote:
> On Tue, Mar 29, 2016 at 6:15 PM, Paul B. Henson <hen...@acm.org> wrote:
> > stty com1 115200
> > set tty com1
> 
> Yes, tried that with no luck, SOL still stops forwarding. The box does

Hmm, that sounds broken. Are you sure you've got the right serial port
and baud rate? Once you switch the boot loader to serial, it's no longer
a matter of "forwarding", it's direct serial access as far as the
bootloader/OS is concerned. The BIOS forwarding piece is out of the
picture. Unless your IPMI serial port implementation is broken you've
probably got the wrong settings. Double check in the BIOS what the IPMI
serial port settings are (which port, speed, etc) and make sure they
match what you tell the bootloader.

> rely on ssh for everything. But there's always some possible problem
> that it would be nice to be able to plug in a keyboard and monitor to
> work with.

Can't you use the IPMI virtual head? I can't remember the last time I
used a physical anything with a server. Rack and forget... Unless a
power supply blows.



Re: Supermicro X11SSL-F freezes probing USB 3

2016-03-29 Thread Paul B. Henson
On Tue, Mar 29, 2016 at 04:55:05PM -0400, Sonic wrote:

> Unfortunately that option isn't available for me. The IPMI SOL on this
> Dell stops forwarding the console once the system boots.

The usb keyboard should still work when the bootloader is running,
that's being handled by the BIOS. You just need to determine what
port/baud rate the IPMI serial port is on your system, and when the
bootloader shows up, just type for example:

stty com1 115200
set tty com1

That will switch the bootloader to directly use the serial port, and
when you boot the OS, it will do so as well. After these two commands,
just 'boot -c' as usual to disable xhci, and continue on the serial
port. Real servers don't need heads or keyboards ;).

At least on my supermicro box, the default bios setting also allows me
to type that on the IPMI serial console as well, in addition to the boot
up messages it also forwards the bootloader by default, it doesn't stop
forwarding until the OS itself loads.

If you install over the serial console, the OS will by default be
installed to use it too. So maybe you don't need that keyboard after all
:).



Re: Supermicro X11SSL-F freezes probing USB 3

2016-03-28 Thread Paul B. Henson
On Mon, Mar 28, 2016 at 03:06:39PM -0400, Sonic wrote:

> If I wait long enough the install will finally finish booting but the
> keyboard (no ps2 ports) doesn't work.

Could I trouble you to be more specific as to the duration of "long
enough" :)? I think my patience ran out after about 15-20 minutes. So it
eventually boots without disabling xhci, but the USB doesn't work in the
end anyway? I'm installing via an IPMI virtual serial port so the lack
of keyboard isn't really an issue for me, I can live without USB but as
the box won't be going live for a few weeks I thought I'd see if any
devs wanted me to try anything on it before I just moved forward without
USB support. I've got -current set up and ready to patch and compile to
test stuff on it if I can. It would be nice to get it working for
situations like yours where it's needed.

I booted a FreeBSD 10.2 livecd on it, and that initialized the xhci
chipset fine and usb devices seem to work ok. I tried to compare the
drivers, they share a bit in common but they're also quite different and
it doesn't help that I'm not really a low level driver guy 8-/. I'm sure
the new Skylake stuff just needs some minor tweak to make it happy.

Thanks...



Supermicro X11SSL-F freezes probing USB 3

2016-03-07 Thread Paul B. Henson
I just put together a new server with a Supermicro X11SSL-F motherboard
and a Xeon E3-1240L v5 processor, and was trying to install openbsd 5.8
on it. The install cd freezes while booting after it probes the USB 3
devices:

>>> xhci probe won
xhci0 at pci0 dev 20 function 0 "Intel 100 Series xHCI" rev 0x31: msi
>>> probing for usb*
>>> usb probe returned 1
>>> usb probe won
usb0 at xhci0: USB revision 3.0
>>> probing for uhub*
>>> uhub probe returned 10
>>> uhub probe won
uhub0 at usb0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
[system freezes here]


I also tried the latest snapshot install cd, same problem. If I disable
xhci, the installer boots successfully, although I haven't actually
tried installing yet. I don't really need usb on this box, so I'm not
really concerned if it's not going to work. It's not going into
production for a few weeks though, so if anybody's interested in looking
at why it's broken I could provide further details or test possible
fixes. Here's a dmesg of the snapshot install kernel booted without
xhci:

OpenBSD 5.9-current (RAMDISK_CD) #1737: Sun Mar  6 19:18:13 MST 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/RAMDISK_CD
real mem = 16976416768 (16189MB)
avail mem = 16460062720 (15697MB)
User Kernel Config
UKC> disable xhci
 98 xhci* disabled
UKC> quit
Continuing...
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x7fb95000 (59 entries)
bios0: vendor American Megatrends Inc. version "1.0b" date 12/29/2015
bios0: Supermicro Super Server
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG HPET SSDT LPIT SSDT SSDT SSDT DBGP DBG2 SSDT SSDT UEFI SSDT DMAR EINJ ERST BERT HEST
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.73 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1, IBE
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PEG0)
acpiprt2 at acpi0: bus -1 (PEG1)
acpiprt3 at acpi0: bus -1 (PEG2)
acpiprt4 at acpi0: bus 2 (RP09)
acpiprt5 at acpi0: bus 3 (RP10)

skylake Xeon, C232 chipset, i210-AT ethernet

2015-12-17 Thread Paul B. Henson
I'm about to build a server with a supermicro X11SSL-F motherboard and a
Xeon E3-1240L v5 processor. The SATA ports should be AHCI compliant, and
it looks like the i210-AT ethernet is supported by the em driver, so I
think everything should work ok. But it's pretty new stuff, so I wanted
to check and see if anybody was aware of any problems or issues with
OpenBSD 5.8 and the latest Intel processors and chipsets before I pulled
the trigger on it.

Thanks much!



Re: npppd with two pppx interfaces causes kernel panic

2014-03-20 Thread Paul B. Henson
> From: YASUOKA Masahiko
> Sent: Wednesday, March 19, 2014 9:44 PM
>
> > Should I just keep an eye on the changelog for mention of pppx
> > changes to tell when it's safe to try again?
>
> Sorry I cannot understand the point of this question.

Sorry to be confusing; I switched to tun because of this bug, but I would
prefer to use pppx once it has been fixed. The question was regarding how to
know when the bug has been fixed. As I am unaware of any publicly accessible
bug tracking system for openbsd, I was thinking of just watching the change
log (http://www.openbsd.org/plus.html) for any mention of pppx.

Thanks.



Re: npppd with two pppx interfaces causes kernel panic

2014-03-20 Thread Paul B. Henson
> From: Jonathan Gray
> Sent: Thursday, March 20, 2014 3:36 AM
>
> The following diff prevents the panic here:

Interesting, given the XXX, it seems somebody was already a little
suspicious of this section :).

From a cursory glance, it seems pppx_dev_lookup is supposed to return data
about a particular instance if it is in use, or NULL otherwise. So it seems
for some reason pppxclose is being called for a device that isn't open? While
avoiding a panic is a plus :), it seems there is perhaps a logic error
somewhere else resulting in an attempt to close a device that isn't open?
 
> Index: if_pppx.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.26
> diff -u -p -r1.26 if_pppx.c
> --- if_pppx.c	19 Oct 2013 14:46:30 -0000	1.26
> +++ if_pppx.c	20 Mar 2014 10:21:04 -0000
> @@ -590,7 +590,8 @@ pppxclose(dev_t dev, int flags, int mode
>
>  	rw_enter_write(&pppx_devs_lk);
>
> -	pxd = pppx_dev_lookup(dev);
> +	if ((pxd = pppx_dev_lookup(dev)) == NULL)
> +		return (ENXIO);
>
>  	/* XXX */
>  	while ((pxi = LIST_FIRST(&pxd->pxd_pxis)))
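
The guard in that diff can be illustrated in userland. This is only a sketch of the pattern (look up, bail out with ENXIO on NULL, otherwise proceed); the `demo_*` names and the table are hypothetical stand-ins, not the real if_pppx.c structures or locking.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-device record; not the kernel's type. */
struct demo_dev {
    int unit;   /* minor number */
    int open;   /* non-zero while the device is open */
};

static struct demo_dev demo_devs[] = { { 0, 1 }, { 1, 0 } };

/* Return the record for an open unit, or NULL if it is not open. */
static struct demo_dev *
demo_dev_lookup(int unit)
{
    size_t i;

    for (i = 0; i < sizeof(demo_devs) / sizeof(demo_devs[0]); i++)
        if (demo_devs[i].unit == unit && demo_devs[i].open)
            return &demo_devs[i];
    return NULL;
}

/* Close with the guard: return ENXIO instead of dereferencing NULL. */
static int
demo_close(int unit)
{
    struct demo_dev *d;

    if ((d = demo_dev_lookup(unit)) == NULL)
        return ENXIO;
    d->open = 0;
    return 0;
}
```

Closing unit 0 succeeds once and returns ENXIO on a second attempt; without the NULL check the second call would dereference a null pointer, which is presumably the kind of fault the panic reflects.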



npppd can't open /dev/pppx1

2014-03-19 Thread Paul B. Henson
I set up an L2TP VPN with npppd recently using pppx, and other than some
routing issues with ospfd it works great. I'm trying to add a second VPN
connection, but that doesn't seem to work using pppx.

With this config:

interface pppx0 address 10.128.120.1 ipcp IPCP_admin
interface pppx1 address 10.128.120.129 ipcp IPCP

bind tunnel from L2TP_ipv4 authenticated by LOCAL_admin to pppx0
bind tunnel from L2TP_ipv4 authenticated by LOCAL to pppx1

npppd won't start:

# npppd -d
2014-03-19 14:08:27:NOTICE: Starting npppd pid=28792 version=5.0.0
2014-03-19 14:08:27:WARNING: pptpd GRE protocol not allowed
2014-03-19 14:08:27:NOTICE: Load configuration
from='/etc/npppd/npppd.conf' successfully.
2014-03-19 14:08:27:INFO: pppx0 Started pppx
2014-03-19 14:08:27:ERR: pppx1 open(/dev/pppx1) failed: No such file or 
directory

If I switch to tun instead of pppx:

interface tun0 address 10.128.120.1 ipcp IPCP_admin
interface tun1 address 10.128.120.129 ipcp IPCP
bind tunnel from L2TP_ipv4 authenticated by LOCAL_admin to tun0
bind tunnel from L2TP_ipv4 authenticated by LOCAL to tun1

it works fine:

# npppd -d
2014-03-19 14:14:28:NOTICE: Starting npppd pid=3355 version=5.0.0
2014-03-19 14:14:28:WARNING: pptpd GRE protocol not allowed
2014-03-19 14:14:28:NOTICE: Load configuration
from='/etc/npppd/npppd.conf' successfully.
2014-03-19 14:14:28:INFO: tun0 Started ip4addr=10.128.120.1
2014-03-19 14:14:28:INFO: tun1 Started ip4addr=10.128.120.129

Is there any way to make two VPN connections work with pppx, or are you
stuck with tun for that scenario?

Thanks...



Re: npppd can't open /dev/pppx1

2014-03-19 Thread Paul B. Henson
D'oh, I finally realized I needed to go to /dev and MAKEDEV pppx1 8-/.

Now it's working fine. I had thought pppx was one of those magic
clonable devices that you didn't need to explicitly create; I guess I
was mistaken. When I was testing the vpn, there were pppx1 and pppx2
interfaces that showed up in ifconfig for the clients, which I guess led
me to believe I didn't have to do anything special to use pppx1 in the
npppd config.
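
For anyone wanting to check for the missing node from a program rather than by eye, here is a minimal sketch. `is_char_device` is a hypothetical helper written for this note, not something npppd or the base system provides; it only uses stat(2) and S_ISCHR.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Return 1 if path exists and is a character device node, 0 otherwise
 * (including when the node is missing, the case behind the
 * "No such file or directory" error).
 */
static int
is_char_device(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return 0;
    return S_ISCHR(sb.st_mode) ? 1 : 0;
}
```

If `is_char_device("/dev/pppx1")` returns 0, the node still has to be created in /dev with MAKEDEV before npppd can open it.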

Thanks, and sorry for the noise.


On Wed, Mar 19, 2014 at 02:29:35PM -0700, Paul B. Henson wrote:
> I set up an L2TP VPN with npppd recently using pppx, and other than some
> routing issues with ospfd it works great. I'm trying to add a second VPN
> connection, but that doesn't seem to work using pppx.
>
> With this config:
>
> interface pppx0 address 10.128.120.1 ipcp IPCP_admin
> interface pppx1 address 10.128.120.129 ipcp IPCP
>
> bind tunnel from L2TP_ipv4 authenticated by LOCAL_admin to pppx0
> bind tunnel from L2TP_ipv4 authenticated by LOCAL to pppx1
>
> npppd won't start:
>
> # npppd -d
> 2014-03-19 14:08:27:NOTICE: Starting npppd pid=28792 version=5.0.0
> 2014-03-19 14:08:27:WARNING: pptpd GRE protocol not allowed
> 2014-03-19 14:08:27:NOTICE: Load configuration
> from='/etc/npppd/npppd.conf' successfully.
> 2014-03-19 14:08:27:INFO: pppx0 Started pppx
> 2014-03-19 14:08:27:ERR: pppx1 open(/dev/pppx1) failed: No such file or
> directory
>
> If I switch to tun instead of pppx:
>
> interface tun0 address 10.128.120.1 ipcp IPCP_admin
> interface tun1 address 10.128.120.129 ipcp IPCP
> bind tunnel from L2TP_ipv4 authenticated by LOCAL_admin to tun0
> bind tunnel from L2TP_ipv4 authenticated by LOCAL to tun1
>
> it works fine:
>
> # npppd -d
> 2014-03-19 14:14:28:NOTICE: Starting npppd pid=3355 version=5.0.0
> 2014-03-19 14:14:28:WARNING: pptpd GRE protocol not allowed
> 2014-03-19 14:14:28:NOTICE: Load configuration
> from='/etc/npppd/npppd.conf' successfully.
> 2014-03-19 14:14:28:INFO: tun0 Started ip4addr=10.128.120.1
> 2014-03-19 14:14:28:INFO: tun1 Started ip4addr=10.128.120.129
>
> Is there any way to make two VPN connections work with pppx, or are you
> stuck with tun for that scenario?
>
> Thanks...



npppd with two pppx interfaces causes kernel panic

2014-03-19 Thread Paul B. Henson
After successfully setting up an L2TP VPN with npppd and pppx, I tried
to add a second VPN subnet with a different authentication base. I was
working remotely, and after starting npppd in debug mode:

bash-4.2# npppd -d
2014-03-19 14:41:50:NOTICE: Starting npppd pid=32407 version=5.0.0
2014-03-19 14:41:50:WARNING: pptpd GRE protocol not allowed
2014-03-19 14:41:51:NOTICE: Load configuration
from='/etc/npppd/npppd.conf' successfully.
2014-03-19 14:41:51:INFO: pppx0 Started pppx
2014-03-19 14:41:51:INFO: pppx1 Started pppx
2014-03-19 14:41:51:INFO: Listening /var/run/npppd_ctl (npppd_ctl)
2014-03-19 14:41:51:INFO: ipcp=IPCP_admin pool
dyn_pool=[10.128.120.0/25] pool=[10.128.120.0/25]
2014-03-19 14:41:51:INFO: ipcp=IPCP pool dyn_pool=[10.128.120.128/25]
pool=[10.128.120.128/25]
2014-03-19 14:41:51:INFO: Loading pool config successfully.

the box stopped responding :(. When I got on site, it was frozen and
nonresponsive. I rebooted, and on the way back up it panic'd when
starting npppd:

starting early daemons: syslogd pflogd named ntpd isakmpd npppd.
uvm_fault(0xfe812f620e00, 0x30, 0, 1) - e
fatal page fault in supervisor mode
trap type 6 code 0 rip 81385b40 cs 8 rflags 10257 cr2  30 cpl 0
rsp 8000221fdd38
panic: trap type 6, code=0, pc=81385b40
Starting stack trace...
panic() at panic+0xf5
trap() at trap+0x7f1
--- trap (number 6) ---
mtx_enter() at mtx_enter
VOP_KQFILTER() at VOP_KQFILTER+0x2b
kqueue_register() at kqueue_register+0x332
sys_kevent() at sys_kevent+0x115
syscall() at syscall+0x249
--- syscall (number 270) ---
end of kernel
end trace frame: 0x11be0a5e, count: 250
0x11be006eca6a:

It then said "Syncing disks" and sat there for 30 minutes, at which
point I gave up, booted in single user, and disabled npppd.
Unfortunately I don't have a serial console logger at the moment, so
while I assume it hit the same panic when I was working remotely, I don't
have logs for it. This is a 5.4 box with a generic kernel, other than
using config -e to enable ipmi and change the irq for com2.

Any thoughts on this? Here is the npppd config that causes it to blow
up:

authentication LOCAL_admin type local {
        users-file "/etc/npppd/npppd-users"
        username-suffix "@admin"
}
authentication LOCAL type local {
        users-file "/etc/npppd/npppd-users"
}

tunnel L2TP_ipv4 protocol l2tp {
        listen on 96.251.22.154
        # l2tp-require-ipsec yes # buggy, doesn't work currently
}

ipcp IPCP_admin {
        pool-address "10.128.120.0/25"
        dns-servers 10.128.0.4
        allow-user-selected-address no
}
ipcp IPCP {
        pool-address "10.128.120.128/25"
        dns-servers 10.128.0.4
        allow-user-selected-address no
}

interface pppx0 address 10.128.120.1 ipcp IPCP_admin
interface pppx1 address 10.128.120.129 ipcp IPCP

bind tunnel from L2TP_ipv4 authenticated by LOCAL_admin to pppx0
bind tunnel from L2TP_ipv4 authenticated by LOCAL to pppx1



Re: npppd with two pppx interfaces causes kernel panic

2014-03-19 Thread Paul B. Henson
On Thu, Mar 20, 2014 at 10:22:51AM +0900, YASUOKA Masahiko wrote:

> pppx will be fixed.

Great :). This is a known bug then? Should I just keep an eye on the
changelog for mention of pppx changes to tell when it's safe to try
again?

> You can use tun(4) instead if you want to use multiple interfaces for
> that purpose.

Yes, I switched to tun for now pending the ability to have multiple pppx
interfaces defined. It was a rather big surprise for the box to
disappear on me while I was working with it; I don't have any out-of-band
access to it, so it was offline until I got to it <sigh>.

Thanks...



Re: ospfd and L2VPN routes

2014-03-05 Thread Paul B. Henson
> From: YASUOKA Masahiko
> Sent: Wednesday, March 05, 2014 1:48 AM

> framed-ip-netmask in npppd-user to set the netmask of the route to
> the PPP link.  But it is not to set the client netmask (on iPhone).
>
> AFAIK to set the client netmask, DHCP inform can be used.

Hmm, I thought the VPN client picked up its IP address from IPCP? How would
you get DHCP involved? Also, it's not really the netmask of the client's IP
address, but the mask for the route from the client to the VPN'd network
that needs to be tweaked.



Re: ospfd and L2VPN routes

2014-03-05 Thread Paul B. Henson
> From: YASUOKA Masahiko
> Sent: Wednesday, March 05, 2014 3:20 AM

>   % ospfctl show fib | grep 128
>   *56 10.128.120.0/24      127.0.0.1
>   *56 10.128.120.213/32    10.0.0.1

Interesting, not only does it show a /24 route, it looks like it has it
marked as valid. Is this with pppx or tun? IIRC, when I tested tun ospfd
found a /24 route, but still didn't propagate it.

> And do you know why the ospfd doesn't use the whole /24?

No, not a clue :(. I tried posting an inquiry to the tech list in the hope
someone familiar with the guts of ospfd might comment, but no responses so
far.

> Even if tun(4) is used, packets are processed in-kernel.

Ah, I had thought pipex was just for pppx, but now I see in the man page it
says pipex is used with tun(4) and pppx. What's the difference between tun
and pppx in the context of npppd then? Is there any particular reason to use
one versus the other for an L2TP VPN?

Thanks much.



Re: npppd ipcp pool address configuration

2014-03-01 Thread Paul B. Henson
On Sat, Mar 01, 2014 at 12:56:16PM +0900, YASUOKA Masahiko wrote:
> Currently the parser needs to surrounding the address-mask with double
> quote like below:
>
>   pool-address "10.128.120.0/24"

Ah, yes; that's much better:

2014-03-01 15:59:13:INFO: ipcp=IPCP pool dyn_pool=[10.128.120.0/24]
pool=[10.128.120.0/24]

> And also we can use
>
>   pool-address 10.128.120.0:255.255.255.0

Yes, that also results in a nice clean pool:

2014-03-01 16:00:43:INFO: ipcp=IPCP pool dyn_pool=[10.128.120.0/24]
pool=[10.128.120.0/24]

I think I'll stick with the quotes for now, I like CIDR notation :).

I had actually looked a bit at the parser before I posted, nothing
initially jumped out, but now that you point it out as the culprit I see
that the pool address is defined as a STRING, and based on:

#define allowed_in_string(x) \
	(isalnum(x) || (ispunct(x) && x != '(' && x != ')' && \
	x != '{' && x != '}' && x != '<' && x != '>' && \
	x != '!' && x != '=' && x != '/' && x != '#' && \
	x != ','))

The literal / is not allowed in a string, whereas the literal : is.

From a quick look, I don't really see anything that would be broken by
allowing a / in a non-quoted string? Would just removing the explicit
restriction on the / in the above define fix it, or am I missing
something more subtle?
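
To make that reading of the lexer concrete, here is a minimal standalone sketch of how the character test splits an unquoted address. The macro is written out with `&&` between the tests and with `<` and `>` in the exclusion list, which is an assumption about the characters lost from the define as posted; `unquoted_token_len` is a hypothetical helper for illustration and is not part of npppd's parser.

```c
#include <ctype.h>
#include <stddef.h>

/* Character test as read from the define above ('<' and '>' assumed). */
#define allowed_in_string(x) \
    (isalnum(x) || (ispunct(x) && x != '(' && x != ')' && \
    x != '{' && x != '}' && x != '<' && x != '>' && \
    x != '!' && x != '=' && x != '/' && x != '#' && \
    x != ','))

/*
 * Hypothetical helper: length of the leading run of characters that
 * would be accepted into an unquoted string by the test above.
 */
static size_t
unquoted_token_len(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0' && allowed_in_string((unsigned char)s[n]))
        n++;
    return n;
}
```

For 10.128.120.0/24 the run stops at the / after 12 characters, whereas 10.128.120.0:255.255.255.0 is consumed in full (26 characters), which matches the colon form working without quotes.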

> As the default, npppd doesn't use the local tunnel endpoint address
> and broadcast addresses in class network (10.0.0.0 and 10.255.255.255)
> for the clients.  Do you worry about 10.128.120.0 or 10.128.120.255 in
> this case?

Technically, given they are all /32 point to point links, you could actually
make the local side .0 and use everything up to .255 (I think), but
I'm ok with losing two addresses to conform to usual /24 semantics and
less risk of confusion at some point. Setting the pppx IP to 10.128.120.1
and the ipcp pool to 10.128.120.0/24 will work perfectly for me, thanks
much for the help.
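
As a rough worked example of the address budget under the usual subnet semantics (an assumption for illustration, not a statement of what npppd itself reserves): the pool's own network and broadcast addresses are set aside, along with the local interface address if it falls inside the pool. `usable_client_addrs` is a hypothetical helper; addresses are in host byte order.

```c
#include <stdint.h>

/*
 * Count addresses left for clients in pool/prefix after setting aside
 * the pool's network address, its broadcast address, and the local
 * address when it lies strictly between them. Valid for prefix 1..30.
 */
static uint32_t
usable_client_addrs(uint32_t pool, int prefix, uint32_t local)
{
    uint32_t mask = 0xFFFFFFFFu << (32 - prefix);
    uint32_t network = pool & mask;
    uint32_t broadcast = network | ~mask;
    uint32_t count = broadcast - network - 1;   /* hosts in between */

    if (local > network && local < broadcast)
        count--;
    return count;
}
```

With the pool 10.128.120.0/24 (0x0A807800) and the local address 10.128.120.1 (0x0A807801), this leaves 253 addresses for clients.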



Re: ospfd and L2VPN routes

2014-03-01 Thread Paul B. Henson
On Sat, Mar 01, 2014 at 01:48:06PM +0900, YASUOKA Masahiko wrote:
> > on the other side? Right now it looks like the client is setting a
> > route to 10.0.0.0/8 across the tunnel, that should actually be
> > 10.128.0.0/16, would setting the netmask in npppd-users fix that remote
> > route? Can I set the netmask but still let the client get a dynamic IP?
>
> My answer was wrong.  Assigning statically or netmask to the client is
> not related the ospf problem, I'm sorry.

No worries, I appreciate the help :). I tried setting the netmask in
npppd-users, but that didn't change the /8 route the iPhone client set. From
a little investigation, it doesn't look like there's any way to set the
client netmask for the l2tp vpn route? The client just does whatever it
wants, it seems: some assume a class-based route (/8 in the case of my
10.128 address), and some seem to just assume a /24 8-/. You'd
think defining the client netmask would be part of the protocol, but
unless I'm missing something, I guess it's not.
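
For reference, the class-based guess such a client appears to make can be sketched as below. This only encodes the historical class A/B/C rules; `classful_prefix_len` is a hypothetical helper, and whether any particular client really derives its route this way is an assumption based on the /8 seen here.

```c
#include <stdint.h>

/*
 * Prefix length implied by the historical class A/B/C rules for an
 * IPv4 address in host byte order, or -1 for class D/E addresses.
 */
static int
classful_prefix_len(uint32_t addr)
{
    uint8_t first = (uint8_t)(addr >> 24);

    if (first < 128)
        return 8;       /* class A */
    if (first < 192)
        return 16;      /* class B */
    if (first < 224)
        return 24;      /* class C */
    return -1;          /* multicast or reserved: no classful mask */
}
```

For 10.128.120.109 (0x0A80786D) this yields 8, i.e. a route to 10.0.0.0/8 rather than the 10.128.0.0/16 that is actually wanted.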

> npppd set a /32 route for a VPN client and delete it when the link
> down.
>
> > Isn't each instance of pppx for the VPN a /32 route to the remote
> > IP?
>
> You had 16 /32 routes.  Don't you mean you had 32 VPN clients
> actually, right?

I only had one or two test clients connected at a time. But it looks
like ospfd picks up the route when a VPN client connects, but then
doesn't drop it when it disconnects, so the routes pile up.

After reloading the fib with no vpn clients, there are no /32 routes:

# ospfctl fib reload
reload request sent.

# ospfctl show fib | grep 120

I connect a client and a route shows up (but isn't advertised to the other
ospf connected routers):

# ospfctl show fib | grep 120
  4 10.128.120.109/32  10.128.120.1

I disconnect the client, it's still there:

# ospfctl show fib | grep 120
  4 10.128.120.109/32  10.128.120.1

I reconnect the client, it receives a different IP, and there are now
two routes:

# ospfctl show fib | grep 120
  4 10.128.120.109/32  10.128.120.1
  4 10.128.120.155/32  10.128.120.1

Disconnect, still two routes:

# ospfctl show fib | grep 120
  4 10.128.120.109/32  10.128.120.1
  4 10.128.120.155/32  10.128.120.1

Definitely something not quite right with ospfd and npppd l2tp vpns.

Thanks...



Re: ospfd and L2VPN routes

2014-03-01 Thread Paul B. Henson
On Sat, Mar 01, 2014 at 07:41:10PM +0900, YASUOKA Masahiko wrote:

> I could repeat the problem.  ospfd seems not to be able to use routes
> set by npppd.  The problem seems to be come from pppx(4)'s behavior of
> its link state.
>
> Using tun(4) instead of pppx(4) avoid the problem.

If I switch npppd to use tun, after startup (before any clients even
connect) ospfd shows three routes:

# ospfctl show fib | grep 120
*56 10.128.120.0/24  127.0.0.1
  4 10.128.120.1/32  10.128.120.1
*56 10.128.120.1/32  127.0.0.1

Kind of odd ones though, routing the whole /24 to localhost? After a
client connects, it adds another /32 to it:

# ospfctl show fib | grep 120
*56 10.128.120.0/24  127.0.0.1
  4 10.128.120.1/32  10.128.120.1
*56 10.128.120.1/32  127.0.0.1
 56 10.128.120.12/32 10.128.120.1

And after the client disconnects, it goes away:

# ospfctl show fib | grep 120
*56 10.128.120.0/24  127.0.0.1
  4 10.128.120.1/32  10.128.120.1
*56 10.128.120.1/32  127.0.0.1

However, it's still not advertised by ospfd to other routers, so the
client still can't communicate with the network beyond the openbsd
router terminating the vpn connection.

So while tun seems to avoid the pppx problem of dangling routes, it
still doesn't solve the issue of ospfd not advertising them <sigh>.

pppx is more efficient than tun, right? It keeps packets in the kernel
rather than bouncing them through userspace?

No ospfd experts that would like to chime in :)? The vpn /32 routes
don't seem to be marked as either valid or connected...

Thanks...


