Re: Can't figure out what's taking up space on /

2021-08-03 Thread Greg Thomas
I thought Paul's advice only applied if I were trying to figure it out
before rebooting?  I'd already rebooted before sending my first email.



On Tue, Aug 3, 2021 at 10:40 PM Otto Moerbeek  wrote:

> On Tue, Aug 03, 2021 at 12:39:54PM -0700, Greg Thomas wrote:
>
> > I'm definitely suffering from filesystem corruption on root.  I had
> > rebooted last night with no change.
> >
> > I have no options for mounting root.
> >
> > grits# cat /etc/fstab
> > 16a27b4b4549ce04.b none swap sw
> > 16a27b4b4549ce04.a / ffs rw 1 1
> > 16a27b4b4549ce04.k /home ffs rw,nodev,nosuid 1 2
> > 16a27b4b4549ce04.d /tmp ffs rw,nodev,nosuid 1 2
> > 16a27b4b4549ce04.f /usr ffs rw,nodev 1 2
> > 16a27b4b4549ce04.g /usr/X11R6 ffs rw,nodev 1 2
> > 16a27b4b4549ce04.h /usr/local ffs rw,wxallowed,nodev 1 2
> > 16a27b4b4549ce04.j /usr/obj ffs rw,nodev,nosuid 1 2
> > 16a27b4b4549ce04.i /usr/src ffs rw,nodev,nosuid 1 2
> > 16a27b4b4549ce04.e /var ffs rw,nodev,nosuid 1 2
> > /dev/sd1c /backup ffs rw,nodev,nosuid 1 2
> >
> > I need to upgrade so I can do that from scratch.  This is my backup
> server
> > so the configuration is pretty simple.
> >
> > Not sure fsck output helps here?
> >
> > grits# fsck /dev/sd0a
> > ** /dev/rsd0a (NO WRITE)
> > ** Last Mounted on /
> > ** Root file system
> > ** Phase 1 - Check Blocks and Sizes
> > ** Phase 2 - Check Pathnames
> > ** Phase 3 - Check Connectivity
> > ** Phase 4 - Check Reference Counts
> > ** Phase 5 - Check Cyl groups
> > 12852 files, 469195 used, 35516 free (44 frags, 4434 blocks, 0.0%
> > fragmentation)
> >
> > Anyway, I'll reinstall unless someone has more learning experiences for
> me.
> >
> > And thank you to Paul for giving a quick explanation of the difference
> > between df and du.
> >
> > Thanks all!
>
> fsck looks normal for a mounted filesystem.
>
> but did you try following Paul's advice to find an open file that has
> no directory entry? That is not corruption, but explains why more
> storage is in use than du shows.
>
> -Otto
>
> >
> >
> >
> > On Tue, Aug 3, 2021 at 11:39 AM Ali Farzanrad 
> > wrote:
> >
> > > I also suspected that it is a filesystem corruption.
> > > Do you have `async` mount option on your root?
> > >
> > > Sebastien Marie  wrote:
> > > > On Tue, Aug 03, 2021 at 10:03:44AM +0200, Paul de Weerd wrote:
> > > > > df shows you how much data you can write to an fs, while du shows
> the
> > > > > disk usage of files it can find.  If it can't find a file (because
> > > > > it's been deleted), it won't account for it.  But if it's been
> deleted
> > > > > and still held open by some process, it would still consume disk
> > > > > space.
> > > > >
> > > > > So it looks like a process has a file open on the root filesystem
> that
> > > > > has been deleted.  You're looking for a root-owned process that is
> > > > > (probably) long-running.  My guess the file is in /dev/ (that's my
> > > > > crystal ball talking though).
> > > > >
> > > > > Easiest way out is generally to reboot - this stops all processes
> > > > > (d0h), dus freeing up all the resources they had tied up, including
> > > > > files that had been deleted from the filesystem.  But going through
> > > > > your process list to see if you can spot something that may have
> done
> > > > > this can be a good learning experience.  In general, base OpenBSD
> > > > > daemons don't behave this way.
> > > >
> > > > I agree with Paul: you should have a running process which hold
> > > > descriptor on unlinked file.
> > > >
> > > > fstat(1) could be used to see list of opened files, and specially
> > > > unlinked files:
> > > >
> > > >  INUM   The inode number of the file.  It will be followed by an
> > > asterisk
> > > > (‘*’) if the inode is unlinked from disk.
> > > >
> > > >
> > > > $ fstat | grep -F '* -'
> > > > [...]
> > > > semarie  chrome   537   25 /tmp   48* -rw---   rwp
> > >  279793
> > > > [...]
> > > >
> > > > here, chrome (pid 537) has descriptor 25 opened to a file on /tmp
> > > > inode=48 (unlinked), the file size is 279793 bytes.
> > > >
> > > > --
> > > > Sebastien Marie
> > > >
> > > >
> > >
> > >
>


Re: Can't figure out what's taking up space on /

2021-08-03 Thread Otto Moerbeek
On Tue, Aug 03, 2021 at 12:39:54PM -0700, Greg Thomas wrote:

> I'm definitely suffering from filesystem corruption on root.  I had
> rebooted last night with no change.
> 
> I have no options for mounting root.
> 
> grits# cat /etc/fstab
> 16a27b4b4549ce04.b none swap sw
> 16a27b4b4549ce04.a / ffs rw 1 1
> 16a27b4b4549ce04.k /home ffs rw,nodev,nosuid 1 2
> 16a27b4b4549ce04.d /tmp ffs rw,nodev,nosuid 1 2
> 16a27b4b4549ce04.f /usr ffs rw,nodev 1 2
> 16a27b4b4549ce04.g /usr/X11R6 ffs rw,nodev 1 2
> 16a27b4b4549ce04.h /usr/local ffs rw,wxallowed,nodev 1 2
> 16a27b4b4549ce04.j /usr/obj ffs rw,nodev,nosuid 1 2
> 16a27b4b4549ce04.i /usr/src ffs rw,nodev,nosuid 1 2
> 16a27b4b4549ce04.e /var ffs rw,nodev,nosuid 1 2
> /dev/sd1c /backup ffs rw,nodev,nosuid 1 2
> 
> I need to upgrade so I can do that from scratch.  This is my backup server
> so the configuration is pretty simple.
> 
> Not sure fsck output helps here?
> 
> grits# fsck /dev/sd0a
> ** /dev/rsd0a (NO WRITE)
> ** Last Mounted on /
> ** Root file system
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> ** Phase 5 - Check Cyl groups
> 12852 files, 469195 used, 35516 free (44 frags, 4434 blocks, 0.0%
> fragmentation)
> 
> Anyway, I'll reinstall unless someone has more learning experiences for me.
> 
> And thank you to Paul for giving a quick explanation of the difference
> between df and du.
> 
> Thanks all!

fsck looks normal for a mounted filesystem.

but did you try following Paul's advice to find an open file that has
no directory entry? That is not corruption, but explains why more
storage is in use than du shows.

-Otto

> 
> 
> 
> On Tue, Aug 3, 2021 at 11:39 AM Ali Farzanrad 
> wrote:
> 
> > I also suspected that it is a filesystem corruption.
> > Do you have `async` mount option on your root?
> >
> > Sebastien Marie  wrote:
> > > On Tue, Aug 03, 2021 at 10:03:44AM +0200, Paul de Weerd wrote:
> > > > df shows you how much data you can write to an fs, while du shows the
> > > > disk usage of files it can find.  If it can't find a file (because
> > > > it's been deleted), it won't account for it.  But if it's been deleted
> > > > and still held open by some process, it would still consume disk
> > > > space.
> > > >
> > > > So it looks like a process has a file open on the root filesystem that
> > > > has been deleted.  You're looking for a root-owned process that is
> > > > (probably) long-running.  My guess the file is in /dev/ (that's my
> > > > crystal ball talking though).
> > > >
> > > > Easiest way out is generally to reboot - this stops all processes
> > > > (d0h), dus freeing up all the resources they had tied up, including
> > > > files that had been deleted from the filesystem.  But going through
> > > > your process list to see if you can spot something that may have done
> > > > this can be a good learning experience.  In general, base OpenBSD
> > > > daemons don't behave this way.
> > >
> > > I agree with Paul: you should have a running process which hold
> > > descriptor on unlinked file.
> > >
> > > fstat(1) could be used to see list of opened files, and specially
> > > unlinked files:
> > >
> > >  INUM   The inode number of the file.  It will be followed by an
> > asterisk
> > > (‘*’) if the inode is unlinked from disk.
> > >
> > >
> > > $ fstat | grep -F '* -'
> > > [...]
> > > semarie  chrome   537   25 /tmp   48* -rw---   rwp
> >  279793
> > > [...]
> > >
> > > here, chrome (pid 537) has descriptor 25 opened to a file on /tmp
> > > inode=48 (unlinked), the file size is 279793 bytes.
> > >
> > > --
> > > Sebastien Marie
> > >
> > >
> >
> >



scp -M sftp

2021-08-03 Thread Christian Weisgerber
Damien Miller:

> CVSROOT:  /cvs
> Module name:  src
> Changes by:   d...@cvs.openbsd.org    2021/08/02 17:38:27
> 
> Modified files:
>   usr.bin/ssh: scp.1 scp.c 
>   usr.bin/ssh/scp: Makefile 
> 
> Log message:
> support for using the SFTP protocol for file transfers in scp, via a
> new "-M sftp" option. Marked as experimental for now.
[...]

You want to test this but
* are tired of typing "-M sftp" all the time, and
* can't use alias scp='scp -M sftp' because you frequently use -3?

Here you go, this simple wrapper should take care of it:

--->
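#!/bin/sh

# Default to the experimental SFTP mode.
sftp='-M sftp'

# Scan the options without consuming them; drop the default
# when -3 or an explicit -M is given.
while getopts :12346ABCTdfpqrtvD:F:J:M:P:S:c:i:l:o: name; do
	case $name in
	[3M])	sftp='' ;;
	esac
done

# "$@" is untouched, so every original argument is passed through.
exec /usr/bin/scp $sftp "$@"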
<---
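
To try the option scan without actually running scp, the same loop can be
wrapped in a function. This is only a sketch: the name scp_mode_args is made
up, and the option string is copied unchanged from the wrapper above.

```shell
#!/bin/sh
# Hypothetical helper: print the mode arguments the wrapper would add.
scp_mode_args() {
	sftp='-M sftp'
	OPTIND=1	# restart option parsing on every call
	while getopts :12346ABCTdfpqrtvD:F:J:M:P:S:c:i:l:o: name; do
		case $name in
		[3M])	sftp='' ;;
		esac
	done
	echo "$sftp"
}
```

Calling scp_mode_args -r src host:dst prints "-M sftp", while a call with -3
or with an explicit -M prints an empty line.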

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: How to troubleshoot DHCP issues?

2021-08-03 Thread Mike
On 8/3/2021 11:57 AM, beebeet...@posteo.de wrote:
> The router works fine most of the time -- except that it stops
> working every day and a half, and I have to reset the modem
> for it to work again.

In my experience with my ISP (Comcast in the US), I note the following:

When the lladdr changes, the modem needs to be restarted in order for
the new lladdr to be seen.  If I don't restart the modem, I see the
symptoms you document.

My ISP gives out leases with a 3-day duration, so the leases renew every
day and a half.
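
For reference, the arithmetic behind that, as a sketch. It assumes the
RFC 2131 default timers (renew at half the lease, rebind at 7/8 of it); a
server can send different values, and lease_timers is just an illustrative
name.

```shell
#!/bin/sh
# Print the default renewal (T1) and rebinding (T2) times, in seconds,
# for a lease of the given length in seconds.
lease_timers() {
	lease=$1
	t1=$((lease / 2))	# T1 = 0.5 * lease
	t2=$((lease * 7 / 8))	# T2 = 0.875 * lease
	echo "$t1 $t2"
}

lease_timers $((3 * 24 * 3600))
```

For a 3-day lease (259200 seconds) this gives 129600 and 226800 seconds,
i.e. 36 and 63 hours.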

The "random" lladdr catches my eye.  But I don't know how frequently
that changes.  Could it change every time the lease is renewed?

My first suggestion might be to stay with a single lladdr for a while to
see if your setup works for more than a day and a half.

Once (if) you have that working baseline, then start experimenting with
random lladdrs.







Re: Crash when unplugging a UPS USB connection

2021-08-03 Thread Mike
On 7/12/2021 4:16 PM, Mike wrote:
> On 7/12/2021 3:12 PM, Mike Larkin wrote:
>> On Sun, Jul 11, 2021 at 04:11:39PM -0400, Mike wrote:
>>> I run NUT on OpenBSD to monitor a Cyberpower UPS.  The UPS plugs into
>>> the OpenBSD box via a USB connection.
>>>
>>> On OpenBSD 6.8, I had no problems, everything ran fine.  When the power
>>> went out, NUT saw that and reacted according to configuration.
>>>
>>> After I upgraded to OpenBSD 6.9 (a fresh install, not an in-place
>>> upgrade), when the power dropped, I'd be greeted with a blue crash screen.
>>>
>>> It seems that when the power drops, the UPS temporarily drops the USB
>>> connection, seemingly the equivalent of unplugging the USB connector.
>>>
>>> I am able to reproduce that 100% by booting up OpenBSD 6.9 with the UPS
>>> communications cable plugged into the USB port.  When I unplugged that
>>> USB connector, the crash occurs.
>>>
>>> This first occurred on my production box which is a Supermicro
>>> motherboard.  I can provide that dmesg if needed.
>>>
>>>
>>> Both OpenBSD 6.8 and current below are fresh installs on a test Lenovo
>>> laptop.
>>>
>>> On OpenBSD 6.8, when I plug in the UPS and unplug it, here is what I see
>>> on the console (dmesg is included):
>>>
>>
>> This crash happens to me as well when I unplug my upd(4). I'll try to find
>> what diff caused this.
>>
>> -ml
> 
> 
> Many thanks for the confirmation!
> 
> Mike.
> 
> 

This crash also occurs with the following two UPSs:

Cyberpower EC750G
Tripp-Lite OmniSmart1500LCDT

As before, to reproduce it:

1) fresh install of OpenBSD current
2) do the reboot after the install
3) plug in the UPS
4) unplug it


I can supply images of the crash screens for the two UPSs above if they
are needed.

Thanks!





Re: nmea/udcf recommendation

2021-08-03 Thread Theo de Raadt
Christian Weisgerber  wrote:

> Stuart Henderson:
> 
> > > I don't have any practical experience with nmea(4), but I'd like
> > > to draw attention to ldattach(8)'s -t option.  Unless your receiver
> > > offers a pulse per second signal, you are limited to a very jittery
> > > timestamp from the serial telegram, mirroring udcf's fundamental
> > > problem.
> > 
> > The problem isn't getting the pulses generated, it's getting
> > them hooked up to the computer and figuring out accurately
> > when they occurred.
> 
> That's why I mentioned ldattach -t.  We can timestamp on DCD or CTS
> transitions.  On a real serial port, those will trigger an interrupt.
> That shouldn't be too bad.

Correct.

> I see that we also have code for timestamping on ucom(4), but since
> USB devices cannot directly generate interrupts and are in fact
> polled by the host controller, that will give poor results.

Correct.



Re: Can't figure out what's taking up space on /

2021-08-03 Thread Greg Thomas
I'm definitely suffering from filesystem corruption on root.  I had
rebooted last night with no change.

I have no options for mounting root.

grits# cat /etc/fstab
16a27b4b4549ce04.b none swap sw
16a27b4b4549ce04.a / ffs rw 1 1
16a27b4b4549ce04.k /home ffs rw,nodev,nosuid 1 2
16a27b4b4549ce04.d /tmp ffs rw,nodev,nosuid 1 2
16a27b4b4549ce04.f /usr ffs rw,nodev 1 2
16a27b4b4549ce04.g /usr/X11R6 ffs rw,nodev 1 2
16a27b4b4549ce04.h /usr/local ffs rw,wxallowed,nodev 1 2
16a27b4b4549ce04.j /usr/obj ffs rw,nodev,nosuid 1 2
16a27b4b4549ce04.i /usr/src ffs rw,nodev,nosuid 1 2
16a27b4b4549ce04.e /var ffs rw,nodev,nosuid 1 2
/dev/sd1c /backup ffs rw,nodev,nosuid 1 2

I need to upgrade so I can do that from scratch.  This is my backup server
so the configuration is pretty simple.

Not sure fsck output helps here?

grits# fsck /dev/sd0a
** /dev/rsd0a (NO WRITE)
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
12852 files, 469195 used, 35516 free (44 frags, 4434 blocks, 0.0%
fragmentation)

Anyway, I'll reinstall unless someone has more learning experiences for me.

And thank you to Paul for giving a quick explanation of the difference
between df and du.

Thanks all!



On Tue, Aug 3, 2021 at 11:39 AM Ali Farzanrad 
wrote:

> I also suspected that it is a filesystem corruption.
> Do you have `async` mount option on your root?
>
> Sebastien Marie  wrote:
> > On Tue, Aug 03, 2021 at 10:03:44AM +0200, Paul de Weerd wrote:
> > > df shows you how much data you can write to an fs, while du shows the
> > > disk usage of files it can find.  If it can't find a file (because
> > > it's been deleted), it won't account for it.  But if it's been deleted
> > > and still held open by some process, it would still consume disk
> > > space.
> > >
> > > So it looks like a process has a file open on the root filesystem that
> > > has been deleted.  You're looking for a root-owned process that is
> > > (probably) long-running.  My guess the file is in /dev/ (that's my
> > > crystal ball talking though).
> > >
> > > Easiest way out is generally to reboot - this stops all processes
> > > (d0h), dus freeing up all the resources they had tied up, including
> > > files that had been deleted from the filesystem.  But going through
> > > your process list to see if you can spot something that may have done
> > > this can be a good learning experience.  In general, base OpenBSD
> > > daemons don't behave this way.
> >
> > I agree with Paul: you should have a running process which hold
> > descriptor on unlinked file.
> >
> > fstat(1) could be used to see list of opened files, and specially
> > unlinked files:
> >
> >  INUM   The inode number of the file.  It will be followed by an
> asterisk
> > (‘*’) if the inode is unlinked from disk.
> >
> >
> > $ fstat | grep -F '* -'
> > [...]
> > semarie  chrome   537   25 /tmp   48* -rw---   rwp
>  279793
> > [...]
> >
> > here, chrome (pid 537) has descriptor 25 opened to a file on /tmp
> > inode=48 (unlinked), the file size is 279793 bytes.
> >
> > --
> > Sebastien Marie
> >
> >
>
>


Re: nmea/udcf recommendation

2021-08-03 Thread Christian Weisgerber
Stuart Henderson:

> > I don't have any practical experience with nmea(4), but I'd like
> > to draw attention to ldattach(8)'s -t option.  Unless your receiver
> > offers a pulse per second signal, you are limited to a very jittery
> > timestamp from the serial telegram, mirroring udcf's fundamental
> > problem.
> 
> The problem isn't getting the pulses generated, it's getting
> them hooked up to the computer and figuring out accurately
> when they occurred.

That's why I mentioned ldattach -t.  We can timestamp on DCD or CTS
transitions.  On a real serial port, those will trigger an interrupt.
That shouldn't be too bad.

I see that we also have code for timestamping on ucom(4), but since
USB devices cannot directly generate interrupts and are in fact
polled by the host controller, that will give poor results.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: Can't figure out what's taking up space on /

2021-08-03 Thread Ali Farzanrad
I also suspect filesystem corruption.
Do you have the `async` mount option on your root?
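
A quick way to check, as a sketch: the helper name has_async is made up, and
it only looks at fstab-style lines on stdin (mount point in field 2, options
in field 4), not at what is actually mounted.

```shell
#!/bin/sh
# Succeed if the given mount point is listed with the async option.
has_async() {
	awk -v mp="$1" '
		$2 == mp && $4 ~ /(^|,)async(,|$)/ { found = 1 }
		END { exit !found }
	'
}
# Possible use:
#   has_async / < /etc/fstab && echo "root is mounted async"
```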

Sebastien Marie  wrote:
> On Tue, Aug 03, 2021 at 10:03:44AM +0200, Paul de Weerd wrote:
> > df shows you how much data you can write to an fs, while du shows the
> > disk usage of files it can find.  If it can't find a file (because
> > it's been deleted), it won't account for it.  But if it's been deleted
> > and still held open by some process, it would still consume disk
> > space.
> > 
> > So it looks like a process has a file open on the root filesystem that
> > has been deleted.  You're looking for a root-owned process that is
> > (probably) long-running.  My guess the file is in /dev/ (that's my
> > crystal ball talking though).
> > 
> > Easiest way out is generally to reboot - this stops all processes
> > (d0h), dus freeing up all the resources they had tied up, including
> > files that had been deleted from the filesystem.  But going through
> > your process list to see if you can spot something that may have done
> > this can be a good learning experience.  In general, base OpenBSD
> > daemons don't behave this way.
> 
> I agree with Paul: you should have a running process which hold
> descriptor on unlinked file.
> 
> fstat(1) could be used to see list of opened files, and specially
> unlinked files:
> 
>  INUM   The inode number of the file.  It will be followed by an asterisk
> (‘*’) if the inode is unlinked from disk.
> 
> 
> $ fstat | grep -F '* -'
> [...]
> semarie  chrome   537   25 /tmp   48* -rw---   rwp   279793
> [...]
> 
> here, chrome (pid 537) has descriptor 25 opened to a file on /tmp
> inode=48 (unlinked), the file size is 279793 bytes.
> 
> -- 
> Sebastien Marie
> 
> 



Re: WireGuard host crashes roughly every week

2021-08-03 Thread Matt P.
(Resending, as I forgot to include the mailing list itself)

> On Aug 1, 2021, at 3:37 AM, Stuart Henderson  wrote:
> 
> It is always good to include dmesg when reporting a problem.
> 
> An outline of the wireguard and other network config would be
> useful too. If you can give instructions to reproduce that would
> be ideal. If not then as much information about the setup as
> possible so we can try to reproduce.
> 
> Does anything funny show up in dmesg if you do "ifconfig wg0
> debug"? (replace/repeat wg0 if you have other wg interfaces).


Hi Stuart!

Your advice led me to discover that the issue happens only with the
"PersistentKeepalive = 25" option I had enabled on each wg-quick peer. It looks
like you could recreate it by making a few no-address peers with this option
enabled.

In /etc/wireguard/wg0.conf I have a config file for wg-quick:

> [Interface]
> PrivateKey = 
> ListenPort = 5  
> Address= 10.0.166.1/24
> SaveConfig = false
> MTU= 1400
> 
> [Peer]
> # ExamplePeer1
> PresharedKey= 
> PublicKey= 
> AllowedIPs= 10.0.166.2/32
> PersistentKeepalive = 25

... And so on.

The 'ifconfig wg0 debug' with PersistentKeepalive enabled leaves these messages
in the dmesg:

> wg0: Handshake for peer 6 did not complete after 5 seconds, retrying (try 18)
> wg0: Sending handshake initiation to peer 6
> wg0: Sending handshake initiation to peer 3
> wg0: Sending handshake initiation to peer 7
> wg0: Sending handshake initiation to peer 0
> wg0: Handshake for peer 2 did not complete after 5 seconds, retrying (try 18)
> wg0: Sending handshake initiation to peer 2
> wg0: Sending handshake initiation to peer 1
> wg0: Handshake for peer 4 did not complete after 5 seconds, retrying (try 14)
> wg0: Sending handshake initiation to peer 4
> wg0: Sending handshake initiation to peer 5
> wg0: Handshake for peer 6 did not complete after 5 seconds, retrying (try 19)
> wg0: Sending handshake initiation to peer 6
> wg0: Handshake for peer 3 did not complete after 5 seconds, retrying (try 2)
> wg0: Sending handshake initiation to peer 3
> wg0: Handshake for peer 2 did not complete after 5 seconds, retrying (try 19)
> wg0: Sending handshake initiation to peer 2
> wg0: Handshake for peer 0 did not complete after 5 seconds, retrying (try 2)
> wg0: Sending handshake initiation to peer 0
> wg0: Handshake for peer 7 did not complete after 5 seconds, retrying (try 2)
> wg0: Sending handshake initiation to peer 7
> wg0: Handshake for peer 5 did not complete after 5 seconds, retrying (try 2)
> wg0: Sending handshake initiation to peer 5
> wg0: Handshake for peer 4 did not complete after 5 seconds, retrying (try 15)
> wg0: Sending handshake initiation to peer 4
> wg0: Handshake for peer 1 did not complete after 5 seconds, retrying (try 2)
> wg0: Sending handshake initiation to peer 1

You can see the peers don't have pre-configured addresses as they are usually
phones and not connected. But with PersistentKeepalive it looks like WireGuard
is trying to connect to them, despite having no idea where to find them.

I commented out the PersistentKeepalive lines and the number of mbufs stays low
as it should. The VPN still works fine. Supposedly PersistentKeepalive
prevents a NAT from dropping your connection after 30 seconds without traffic,
which I've never seen happen, but I figured better safe than sorry.
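
To keep an eye on the mbuf count, something like this could be used. It is
a sketch that assumes the first line of `netstat -m` output has the form
"N mbufs in use:"; mbufs_in_use is a made-up helper name.

```shell
#!/bin/sh
# Read `netstat -m` output on stdin and print the number of mbufs in use.
mbufs_in_use() {
	awk '/mbufs in use/ { print $1; exit }'
}
# Possible use on the affected host:
#   netstat -m | mbufs_in_use
```

Running it every few minutes with and without the keepalive option should
show whether the count keeps climbing.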

With PersistentKeepalive disabled on the server (enabled on the client), if I
connect to the server and then disconnect, it begins trying to handshake the
missing partner again, but this time it _doesn't_ raise the mbufs.

> wg0: Receiving handshake initiation from peer 0
> wg0: Sending handshake response to peer 0
> wg0: Receiving keepalive packet from peer 0
> wg0: Sending keepalive packet to peer 0
> wg0: Receiving keepalive packet from peer 0
> wg0: Receiving keepalive packet from peer 0
> wg0: Receiving keepalive packet from peer 0
> wg0: Receiving keepalive packet from peer 0
> wg0: Retrying handshake with peer 0 because we stopped hearing back after 15 seconds
> wg0: Sending handshake initiation to peer 0
> wg0: Handshake for peer 0 did not complete after 5 seconds, retrying (try 2)
> wg0: Sending handshake initiation to peer 0
> wg0: Handshake for peer 0 did not complete after 5 seconds, retrying (try 3)
> wg0: Sending handshake initiation to peer 0
> wg0: Handshake for peer 0 did not complete after 5 seconds, retrying (try 4)
> wg0: Sending handshake initiation to peer 0
> wg0: Retrying handshake with peer 0 because we stopped hearing back after 15 seconds
> wg0: Handshake for peer 0 did not complete after 5 seconds, retrying (try 2)
> wg0: Sending handshake initiation to peer 0
> wg0: Handshake for peer 0 did not complete after 5 seconds, retrying (try 3)
> wg0: Sending handshake initiation to peer 0
> wg0: Handshake for peer 0 did not complete after 5 seconds, retrying (try 4)
> wg0: Sending handshake initiation to peer 0
> wg0: Handshake for peer 0 did not complete after 

How to troubleshoot DHCP issues?

2021-08-03 Thread beebeetles

Hi all,

Me again on some DHCP-related issues...

So I started using OpenBSD as my home router around two weeks ago,
running openBSD 6.9. It obtains its IP address from the ISP via
DHCP. The setup is pretty simple, just the following two lines in
my hostname.if file:

lladdr random
inet autoconf

The router works fine most of the time -- except that it stops
working every day and a half, and I have to reset the modem
for it to work again.

After doing what is in my knowledge to troubleshoot this issue,
I'm still clueless as to what might be causing the problem, thus
hoping to seek some help here.

Can anyone offer some suggestions on what I can do to nail down
the issue?

Below are some of the observations I've made so far:

- Doesn't matter whether I'm using dhclient or dhcpleased, same
  issue.

- When it stops working, tcpdump still shows outgoing packets,
  checksums all OK, but no incoming packets.

- `dhcpleasectl show interface ` shows that there is still
  one day before the lease expires.

- When this first happens, `arp -a` shows that the link layer
  address of the gateway is still in the ARP table. But of course
  it will expire after some time, and the router won't be able to
  obtain the link layer address of the gateway again after that.

- The `netstat -R` still shows the IP address of the gateway.

- My ISP would offer a few short leases at first, and then offer a
  two-day lease. This issue always seems to occur around halfway
  through the two-day lease period.

- I tried several interface cards with drivers including axen, ure,
  axe, bse. axen dies every 10-20 min, outputting some watchdog
  timeout error; ure has the same issue described here, but throws
  some rx/tx error to dmesg in addition; bse and axe don't seem
  to output any errors, but both have the issue described here.

- The issue doesn't occur when the IP address is statically
  assigned.

- Didn't experience this problem when I was running Linux on the
  same machine (raspberry pi 4B).

Best Regards



Re: Regarding Openbsd and zoom/hangouts etc

2021-08-03 Thread Yoshihiro Kawamata
Hi,

I have been able to join Jitsi and Zoom using Firefox on OpenBSD.

To join Zoom meetings with Firefox from OpenBSD, please change
"OpenBSD" to "FreeBSD" or any other OS that supports Zoom in
general.useragent.override of about:config.

Also:

Set kern.audio.record and kern.video.record to 1 using sysctl.

Set the permissions so that the device file /dev/video? for video can
be accessed by ordinary users.

By default, Firefox will crash when you try to share the screen,
because it is a violation of the pledge system call.

To prevent this, edit the file under /etc/firefox and disable the
pledge system call.  For details, see
/usr/local/share/doc/pkg-readmes/firefox.

I believe that you can use Zoom in a similar way with Chromium.

Regards,

Yoshihiro Kawamata
http://fuguita.org/



Re: nmea/udcf recommendation

2021-08-03 Thread Stuart Henderson
On 2021-08-02, Christian Weisgerber  wrote:
> I don't have any practical experience with nmea(4), but I'd like
> to draw attention to ldattach(8)'s -t option.  Unless your receiver
> offers a pulse per second signal, you are limited to a very jittery
> timestamp from the serial telegram, mirroring udcf's fundamental
> problem.  The last time I looked--admittedly it's been a few years--
> if you wanted to have a PPS on a serial port, you had to get some
> industrial GPS module and do your own soldering.  And you can't do
> it over USB.

The problem isn't getting the pulses generated, it's getting
them hooked up to the computer and figuring out accurately
when they occurred.

The old cheap method was using the elan cpu on Soekris 45xx
(https://www.usenix.org/system/files/login/articles/160-vandrunen.pdf,
http://phk.freebsd.dk/soekris/pps/), which had accurate hardware
timestamping of external signals (probably added for some particular
customer use case, not common in PC hardware).

The last gps-backed ntp server I built was an rpi with Linux and a
u-blox GPS HAT which fed a PPS signal via a GPIO pin. It was a bit of a
faff but not too bad (there are lots of guides - search "raspberry pi
ntp gps kernel pps"). That's probably the cheapest way with accuracy
acceptable for most purposes. But if budget allowed I'd just buy a
leontp, higher accuracy, less hassle.

OpenBSD really isn't the ideal OS for this. With HZ bumped it will be less
bad, but if you need accuracy you can do a lot better.




Re: Can't figure out what's taking up space on /

2021-08-03 Thread Sebastien Marie
On Tue, Aug 03, 2021 at 10:03:44AM +0200, Paul de Weerd wrote:
> df shows you how much data you can write to an fs, while du shows the
> disk usage of files it can find.  If it can't find a file (because
> it's been deleted), it won't account for it.  But if it's been deleted
> and still held open by some process, it would still consume disk
> space.
> 
> So it looks like a process has a file open on the root filesystem that
> has been deleted.  You're looking for a root-owned process that is
> (probably) long-running.  My guess the file is in /dev/ (that's my
> crystal ball talking though).
> 
> Easiest way out is generally to reboot - this stops all processes
> (d0h), dus freeing up all the resources they had tied up, including
> files that had been deleted from the filesystem.  But going through
> your process list to see if you can spot something that may have done
> this can be a good learning experience.  In general, base OpenBSD
> daemons don't behave this way.

I agree with Paul: you should have a running process which holds a
descriptor on an unlinked file.

fstat(1) can be used to see the list of open files, and especially
unlinked files:

 INUM   The inode number of the file.  It will be followed by an asterisk
(‘*’) if the inode is unlinked from disk.


$ fstat | grep -F '* -'
[...]
semarie  chrome   537   25 /tmp   48* -rw---   rwp   279793
[...]

here, chrome (pid 537) has descriptor 25 opened to a file on /tmp
inode=48 (unlinked), the file size is 279793 bytes.
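
To narrow that down to one filesystem, a small filter along these lines
might help. It is only a sketch: the function name unlinked_on is made up,
and it assumes the default fstat column layout (USER CMD PID FD MOUNT INUM
MODE R/W SZ|DV) and command names without spaces.

```shell
#!/bin/sh
# Read fstat(1) output on stdin and print "PID CMD SIZE" for every
# unlinked file (inode followed by '*') on the given mount point.
unlinked_on() {
	awk -v mnt="$1" '$5 == mnt && $6 ~ /\*$/ { print $3, $2, $NF }'
}
# Possible use for the root filesystem:
#   fstat | unlinked_on /
```

For the chrome line above, `unlinked_on /tmp` would print "537 chrome 279793".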

-- 
Sebastien Marie



Re: Can't figure out what's taking up space on /

2021-08-03 Thread Paul de Weerd
df shows you how much data you can write to an fs, while du shows the
disk usage of files it can find.  If it can't find a file (because
it's been deleted), it won't account for it.  But if it's been deleted
and still held open by some process, it would still consume disk
space.
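
A minimal sketch of that effect, assuming a POSIX sh and mktemp(1); the
scratch file and descriptor number 3 are arbitrary choices for illustration,
not anything taken from the affected system:

```shell
#!/bin/sh
# Hypothetical scratch file, used only for this demonstration.
demo_file=$(mktemp) || exit 1
exec 3>"$demo_file"	# keep descriptor 3 open on the file
rm -f "$demo_file"	# remove the directory entry

# The name is gone, so du can no longer find the file ...
if [ -e "$demo_file" ]; then name_gone=no; else name_gone=yes; fi

# ... but the descriptor still works, and whatever is written through
# it is counted by df until the descriptor is closed.
if echo "still taking up space" >&3; then fd_usable=yes; else fd_usable=no; fi

exec 3>&-	# closing the descriptor releases the blocks
echo "name gone: $name_gone, descriptor usable: $fd_usable"
```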

So it looks like a process has a file open on the root filesystem that
has been deleted.  You're looking for a root-owned process that is
(probably) long-running.  My guess is the file is in /dev/ (that's my
crystal ball talking though).

Easiest way out is generally to reboot - this stops all processes
(d0h), thus freeing up all the resources they had tied up, including
files that had been deleted from the filesystem.  But going through
your process list to see if you can spot something that may have done
this can be a good learning experience.  In general, base OpenBSD
daemons don't behave this way.

Cheers,

Paul 'WEiRD' de Weerd

On Tue, Aug 03, 2021 at 12:48:42AM -0700, Greg Thomas wrote:
| grits# df -h
| Filesystem     Size    Used   Avail Capacity  Mounted on
| /dev/sd0a      986M    936M    162K   100%    /
| /dev/sd0k     57.7G   23.7G   31.1G    43%    /home
| /dev/sd0d      3.9G   10.0K    3.7G     0%    /tmp
| /dev/sd0f      5.8G    1.1G    4.4G    21%    /usr
| /dev/sd0g      986M    234M    702M    25%    /usr/X11R6
| /dev/sd0h     16.8G   35.5M   15.9G     0%    /usr/local
| /dev/sd0j      5.8G    2.0K    5.5G     0%    /usr/obj
| /dev/sd0i      1.9G    2.0K    1.8G     0%    /usr/src
| /dev/sd0e     13.8G   18.8M   13.1G     0%    /var
| /dev/sd1c      440G    305G    113G    73%    /backup
| 
| grits# du -xsh /
| 186M    /
| 
| I just removed /bsd.sp to free up a little bit of space but I don't
| understand the discrepancy between df and du.  How do I troubleshoot
| further?
| 
| Thanks,
| Greg

-- 
>[<++>-]<+++.>+++[<-->-]<.>+++[<+
+++>-]<.>++[<>-]<+.--.[-]
 http://www.weirdnet.nl/ 



Can't figure out what's taking up space on /

2021-08-03 Thread Greg Thomas
grits# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd0a      986M    936M    162K   100%    /
/dev/sd0k     57.7G   23.7G   31.1G    43%    /home
/dev/sd0d      3.9G   10.0K    3.7G     0%    /tmp
/dev/sd0f      5.8G    1.1G    4.4G    21%    /usr
/dev/sd0g      986M    234M    702M    25%    /usr/X11R6
/dev/sd0h     16.8G   35.5M   15.9G     0%    /usr/local
/dev/sd0j      5.8G    2.0K    5.5G     0%    /usr/obj
/dev/sd0i      1.9G    2.0K    1.8G     0%    /usr/src
/dev/sd0e     13.8G   18.8M   13.1G     0%    /var
/dev/sd1c      440G    305G    113G    73%    /backup

grits# du -xsh /
186M    /

I just removed /bsd.sp to free up a little bit of space but I don't
understand the discrepancy between df and du.  How do I troubleshoot
further?

Thanks,
Greg


Re: nmea/udcf recommendation

2021-08-03 Thread Maurice Janssen
On Mon, Aug 02, 2021 at 06:38:32PM +0200, Jan Stary wrote:
>Hello,
>
>playing with ntpd a bit, I am looking for a working
>nmea or udcf sensor. Can people please recommend
>an easy to use device known to work?

I use a Garmin GPS 18x with ntpd.  Works fine, just make sure you flash
it with the latest firmware (my model had an older firmware from before
the 2019 week number rollover, which confused ntpd so it wouldn't accept
the time).

I also use a Meinberg C51 DCF receiver.  Not as accurate as GPS, but also
works fine.  This model is EOL.  I expect that the newer model (C600RS)
also works, but I've never used it.

Maurice