Re: 4k sector disk on APU2 problems

2021-03-02 Thread Raimo Niskanen
On Mon, Mar 01, 2021 at 09:41:31PM +0000, Stuart Henderson wrote:
> On 2021-03-01, Raimo Niskanen wrote:
> > Hi Misc!
> >
> > Unfortunately I do not have one clear question here, but I wonder if somebody
> > could shed some light on some problems I have encountered on my PC Engines APU2.
> >
> > It runs OpenBSD 6.7 from a 32 GB mSATA SSD disk, and I would like to change
> > the disk since it is a few years old now, so I bought a 128 GB Kingston
> > mSATA SSD and an mSATA -> SATA adapter, and put that combo in a USB 2
> > external disk adapter.
> >
> > The disk showed up as a 4k sector disk, and after installing OpenBSD 6.7
> > over USB over the mSATA-SATA adapter I plugged it in the internal mSATA
> > connector, and it did not boot.
> 
> This is a problem with some USB-SATA adapters. See e.g.
> https://www.klennet.com/notes/2018-04-14-usb-and-sector-size.aspx
> 

Just my bad luck, then...

> > Much fumbling later it seems that when the disk is connected to the
> > internal mSATA slot it is seen as a 512 bytes per sector disk.  I do not
> > know what the BIOS thinks of it (factory SeaBIOS 1.10.something).  When I
> > re-installed with the disk in the mSATA slot I got a bootable installation.
> > Both fdisk and disklabel now say the disk has 512 bytes per sector.
> > (fdisk says nothing, but for a 4k disk it should say it is a 4k disk.)
> >
> > My old 32 GB mSATA disk is readable over the mSATA-SATA-USB adapter
> > as a 512 bytes per sector disk.
> 
> You could try looking for a different adapter but at this point 
> I would probably install on the new drive (PXE boot or use another USB
> drive to boot the installer), then copy files back from the old drive.
> 
> > So I am just curious about how to handle this disk.  I can install to it
> > in the internal mSATA connector and read the old installation over the
> > mSATA-SATA-USB-adapter.  But one day when I want to install to a new disk
> > again, I will not be able to read from the disk in the mSATA-SATA-USB-adapter,
> > so the next re-installation looks unpromising.
> 
> backup/restore over the network via another machine perhaps?
> 
> > Some more specific questions:
> > * Would upgrading the BIOS be a good idea?
> 
> yes but it won't help with this problem.
> (https://github.com/pcengines/apu2-documentation/blob/master/docs/apu_CPU_boost.md)
> 
> > * Would upgrading to OpenBSD 6.8 improve the situation?
> 
> it won't.
> 
> > * How is the disk sector size determined, and can I affect that?
> 
> by the manufacturer.
> 

Thank you for the information!  Enlightening!

Since I can boot from the internal SD card as well, I can use a different
USB drive as dump/restore storage instead of an external machine.  This
USB enclosure sector size peculiarity only blocks me from copying directly
from the old installation to the new one in future re-installations.

But for current re-installation I can read the old disk from the USB
enclosure, since it apparently does not alter the sector size for the old
32 GB disk.
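
For anyone wondering why the installation made through the USB adapter did
not boot from the internal slot: partition tables and disklabels record
positions as sector numbers, so the same number points at a different byte
offset depending on whether the disk is presented with 4096-byte or
512-byte sectors.  A minimal sketch of that arithmetic in sh, assuming a
hypothetical partition start at sector 64 (not a value taken from this
thread):

```sh
#!/bin/sh
# byte_offset LBA SECTOR_SIZE
# Print the byte position that sector number LBA refers to when the
# disk is presented with SECTOR_SIZE-byte sectors.
byte_offset() {
	echo $(( $1 * $2 ))
}

lba=64	# hypothetical partition start, for illustration only
echo "via USB adapter (4096-byte sectors): $(byte_offset "$lba" 4096)"
echo "in mSATA slot   (512-byte sectors):  $(byte_offset "$lba" 512)"
```

The two results differ by a factor of eight, which is why a layout written
in one mode is not found where expected in the other.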

Cheers
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



4k sector disk on APU2 problems

2021-03-01 Thread Raimo Niskanen
Hi Misc!

Unfortunately I do not have one clear question here, but I wonder if somebody
could shed some light on some problems I have encountered on my PC Engines APU2.

It runs OpenBSD 6.7 from a 32 GB mSATA SSD disk, and I would like to change
the disk since it is a few years old now, so I bought a 128 GB Kingston
mSATA SSD and an mSATA -> SATA adapter, and put that combo in a USB 2
external disk adapter.

The disk showed up as a 4k sector disk, and after installing OpenBSD 6.7
over USB over the mSATA-SATA adapter I plugged it in the internal mSATA
connector, and it did not boot.

Much fumbling later it seems that when the disk is connected to the
internal mSATA slot it is seen as a 512 bytes per sector disk.  I do not
know what the BIOS thinks of it (factory SeaBIOS 1.10.something).  When I
re-installed with the disk in the mSATA slot I got a bootable installation.
Both fdisk and disklabel now say the disk has 512 bytes per sector.
(fdisk says nothing, but for a 4k disk it should say it is a 4k disk.)

My old 32 GB mSATA disk is readable over the mSATA-SATA-USB adapter
as a 512 bytes per sector disk.

Some time during my fumbling with the 120 GB disk in the mSATA slot, fdisk -v
claimed it could not read from sector 0, but the disklabel command could read
a disklabel.  I think the disklabel then claimed the disk was a 4k sector disk.

So I am just curious about how to handle this disk.  I can install to it
in the internal mSATA connector and read the old installation over the
mSATA-SATA-USB-adapter.  But one day when I want to install to a new disk
again, I will not be able to read from the disk in the mSATA-SATA-USB-adapter,
so the next re-installation looks unpromising.

Some more specific questions:
* Would upgrading the BIOS be a good idea?
* Would upgrading to OpenBSD 6.8 improve the situation?
* How is the disk sector size determined, and can I affect that?

Cheers!
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: PC Engines APU2 boot problem

2021-03-01 Thread Raimo Niskanen
On Wed, Feb 17, 2021 at 02:05:27PM +0100, Raimo Niskanen wrote:
> On Wed, Feb 17, 2021 at 12:36:57PM +0100, Stefan Sperling wrote:
> > On Wed, Feb 17, 2021 at 12:08:52PM +0100, Raimo Niskanen wrote:
> > > Hello misc!
> > > 
> > > I have a problem booting an APU2 from SD card and USB stick.
> > > It boots fine from the mSATA disk where I have the OpenBSD installation
> > > that I have upgraded several times using sysupgrade(8).
> > > 
> > > I have tried to write install67.fs and install68.img to an SD card and to
> > > a USB stick from a Linux machine using e.g.
> > > dd if=install67.fs of=/dev/sdc bs=1M
> > > 
> > > On the APU's serial console, I press [F10] to get a boot prompt, and then
> > > select the SD card or the USB stick.  The kernel is loaded and the last
> > > printout is "Entry point: 0x..." something.  The next line
> > > [ELF ... whatnot] does not come.  After a while the APU resets and boots
> > > again, or sometimes hangs.
> > 
> > Before loading a kernel the serial console needs to be enabled with:
> > 
> >   stty com0 115200
> >   set tty com0
> > 
> > On an installed system /etc/boot.conf is usually set up to do this
> > automatically but manual setup is still required when booting from
> > other media.
> 
> Oh, bummer!  Of course.  I hope it is such a stupid mistake!
> I will try when I get a new opportunity...
> 
> Thank you very much.

Confirmed.  It was nothing more than that silly beginner's mistake.
Thank you for the cluestick!
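
The two boot-prompt commands from Stefan's reply are the same lines an
installed system keeps in /etc/boot.conf.  A minimal sketch, assuming one
wants to generate that file under some mounted root; the helper name
write_boot_conf and its ROOT argument are made up for this example, and
whether a given install image can be modified this way is not covered in
this thread:

```sh
#!/bin/sh
# write_boot_conf ROOT
# Write ROOT/etc/boot.conf so the boot loader switches to the first
# serial port at 115200 baud before loading the kernel.
# (Helper name and ROOT argument are illustrative, not from the thread.)
write_boot_conf() {
	root=$1
	mkdir -p "$root/etc" || return 1
	printf '%s\n' 'stty com0 115200' 'set tty com0' > "$root/etc/boot.conf"
}
```

With the target root file system mounted on /mnt, it would be called as
write_boot_conf /mnt.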

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: PC Engines APU2 Leds control

2021-02-22 Thread Raimo Niskanen
I have now made a pull request to PC Engines' apu_gpio_lib GitHub
repository with an example of my LED server solution:

https://github.com/pcengines/apu_gpio_lib/pull/4

Maybe it should use unveil(2) and pledge(2), if I manage to figure out how
to use them...

/ Raimo Niskanen



On Wed, Feb 17, 2021 at 11:53:31AM +0100, Raimo Niskanen wrote:
> I solved this problem a while ago using
> https://github.com/pcengines/apu_gpio_lib
> since gpio(4), that they linked to, only seems to work for APU1.
> I have no such device mentioned in dmesg(8) on my APU2,
> and gpioctl(8) says all /dev/gpio? devices are not configured.
> 
> OpenBSD does not allow direct memory access in default securelevel(7),
> so I wrote a small daemon that I start from rc.securelevel(8) which
> reads one-byte commands from a fifo to control the LEDs.
> 
> In my case ifstated(8) writes to the fifo to show status.
> This solution works just fine for me.
> 
> Unfortunately the code is in a lousy state build-wise, so I need to clean
> it up and for example create a pull request for PCEngines' repository
> to add this daemon as an OpenBSD example.  Even if they would not
> accept a pull request it would be published on my GitHub account...
> 
> / Raimo Niskanen
> 
> 
> On Fri, May 08, 2020 at 09:43:38PM +0200, Sacha wrote:
> > Dear all,
> > 
> > I'm enjoying OpenBSD on PC Engines hardware called APU2:
> > https://www.pcengines.ch/apu2.htm
> > 
> > There are 3 LEDs, which could be very useful for delivering information to
> > the end users, but I never could control them with OpenBSD /o\
> > 
> > Is there any way to make it work?
> > 
> > On PCEngines forum I got the following answer:
> > 
> > >You cannot control the GPIOs on J20, because those are driven by
> > a NCT5104D and wbsio(4) only supports hardware monitoring.
> > >The LEDs OTOH are on GPIOs of the AMD FCH. I am not a hardware guy, and
> > OpenBSD seems to have a lot of drivers which attach - but probably none
> > for those GPIOs.
> > >If you want to dig deeper, there is AMD documentation for the FCH and
> > also a linux driver called "amd-fch-gpio"
> > 
> > >Update: There seems to be somebody, who worked on this a while ago on
> > OpenBSD: https://marc.info/?l=openbsd-tech&m=155355977613046
> > 
> > 
> > Sacha.
> 
> -- 
> 
> / Raimo Niskanen, Erlang/OTP, Ericsson AB

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: PC Engines APU2 boot problem

2021-02-17 Thread Raimo Niskanen
On Wed, Feb 17, 2021 at 12:36:57PM +0100, Stefan Sperling wrote:
> On Wed, Feb 17, 2021 at 12:08:52PM +0100, Raimo Niskanen wrote:
> > Hello misc!
> > 
> > I have a problem booting an APU2 from SD card and USB stick.
> > It boots fine from the mSATA disk where I have the OpenBSD installation
> > that I have upgraded several times using sysupgrade(8).
> > 
> > I have tried to write install67.fs and install68.img to an SD card and to
> > a USB stick from a Linux machine using e.g.
> > dd if=install67.fs of=/dev/sdc bs=1M
> > 
> > On the APU's serial console, I press [F10] to get a boot prompt, and then
> > select the SD card or the USB stick.  The kernel is loaded and the last
> > printout is "Entry point: 0x..." something.  The next line
> > [ELF ... whatnot] does not come.  After a while the APU resets and boots
> > again, or sometimes hangs.
> 
> Before loading a kernel the serial console needs to be enabled with:
> 
>   stty com0 115200
>   set tty com0
> 
> On an installed system /etc/boot.conf is usually set up to do this
> automatically but manual setup is still required when booting from
> other media.

Oh, bummer!  Of course.  I hope it is such a stupid mistake!
I will try when I get a new opportunity...

Thank you very much.

Cheers
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



PC Engines APU2 boot problem

2021-02-17 Thread Raimo Niskanen
Hello misc!

I have a problem booting an APU2 from SD card and USB stick.
It boots fine from the mSATA disk where I have the OpenBSD installation
that I have upgraded several times using sysupgrade(8).

I have tried to write install67.fs and install68.img to an SD card and to
a USB stick from a Linux machine using e.g.
dd if=install67.fs of=/dev/sdc bs=1M

On the APU's serial console, I press [F10] to get a boot prompt, and then
select the SD card or the USB stick.  The kernel is loaded and the last
printout is "Entry point: 0x..." something.  The next line
[ELF ... whatnot] does not come.  After a while the APU resets and boots
again, or sometimes hangs.

The BIOS is factory installed SeaBIOS 1.10... something.

Can I expect a BIOS upgrade (flashrom) to solve this?  I might have had 2
USB sticks in when booting, might that provoke a bug?

This machine should boot OpenBSD 6.7 from at least a USB stick, right?

Cheers
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: PC Engines APU2 Leds control

2021-02-17 Thread Raimo Niskanen
I solved this problem a while ago using
https://github.com/pcengines/apu_gpio_lib
since gpio(4), that they linked to, only seems to work for APU1.
I have no such device mentioned in dmesg(8) on my APU2,
and gpioctl(8) says all /dev/gpio? devices are not configured.

OpenBSD does not allow direct memory access in default securelevel(7),
so I wrote a small daemon that I start from rc.securelevel(8) which
reads one-byte commands from a fifo to control the LEDs.

In my case ifstated(8) writes to the fifo to show status.
This solution works just fine for me.
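
To illustrate only the "one byte per command" idea: the sketch below is
not the daemon from the pull request, which is built on apu_gpio_lib.
The command bytes are invented, echo stands in for real LED access, and
the name led_dispatch is made up.

```sh
#!/bin/sh
# led_dispatch: read one-byte commands from standard input and print
# the action each would trigger.  The loop ends at EOF or at a newline
# byte, since command substitution strips the newline and leaves an
# empty command.  The command bytes are invented for this sketch; a
# real daemon would drive the GPIOs instead of calling echo.
led_dispatch() {
	while cmd=$(dd bs=1 count=1 2>/dev/null) && [ -n "$cmd" ]; do
		case $cmd in
		1) echo "led1 on" ;;
		0) echo "led1 off" ;;
		*) echo "ignored: $cmd" ;;
		esac
	done
}
```

It could be tried with a fifo created by mkfifo: run led_dispatch with its
input redirected from the fifo in the background, then printf 1 into the
fifo from another shell.  The loop ends when the writer closes the fifo, so
a long-running daemon would reopen it in an outer loop.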

Unfortunately the code is in a lousy state build-wise, so I need to clean
it up and for example create a pull request for PCEngines' repository
to add this daemon as an OpenBSD example.  Even if they would not
accept a pull request it would be published on my GitHub account...

/ Raimo Niskanen


On Fri, May 08, 2020 at 09:43:38PM +0200, Sacha wrote:
> Dear all,
> 
> I'm enjoying OpenBSD on PC Engines hardware called APU2:
> https://www.pcengines.ch/apu2.htm
> 
> There are 3 LEDs, which could be very useful for delivering information to
> the end users, but I never could control them with OpenBSD /o\
> 
> Is there any way to make it work?
> 
> On PCEngines forum I got the following answer:
> 
> >You cannot control the GPIOs on J20, because those are driven by
> a NCT5104D and wbsio(4) only supports hardware monitoring.
> >The LEDs OTOH are on GPIOs of the AMD FCH. I am not a hardware guy, and
> OpenBSD seems to have a lot of drivers which attach - but probably none
> for those GPIOs.
> >If you want to dig deeper, there is AMD documentation for the FCH and
> also a linux driver called "amd-fch-gpio"
> 
> >Update: There seems to be somebody, who worked on this a while ago on
> OpenBSD: https://marc.info/?l=openbsd-tech&m=155355977613046
> 
> 
> Sacha.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-11-25 Thread Raimo Niskanen
I have upgraded the machines to 6.6 and the problem seems to be gone.
The machines now have been up for 10 and 16 days, which is a record.

Great!

/ Raimo Niskanen



On Fri, Nov 01, 2019 at 11:50:18AM +0100, Andreas Kusalananda Kähäri wrote:
> On Fri, Nov 01, 2019 at 11:38:06AM +0100, Raimo Niskanen wrote:
> > On Mon, Sep 09, 2019 at 05:44:32PM +0200, Raimo Niskanen wrote:
> > > On Mon, Sep 09, 2019 at 05:42:02PM +0200, Raimo Niskanen wrote:
> > > > On Wed, Jul 31, 2019 at 05:46:08PM +0200, Raimo Niskanen wrote:
> > > > > On Mon, Jul 29, 2019 at 01:20:58PM +0000, Stuart Henderson wrote:
> > > > > > On 2019-07-29, Raimo Niskanen wrote:
> > > > > > > A new hang, I tried to investigate:
> > > > > > >
> > > > > > > On July 19 the last log entry from my 'ps' log was from 14:55, which is
> > > > > > > also the time on the 'systat vmstat' screen when it froze.  Then the machine
> > > > > > > hums along but just after midnight at 00:42:01 the first "/bsd: process:
> > > > > > > table is full" entry appears.  That message repeats until I rebooted it
> > > > > > > today at July 29 10:48.
> > > > > > >
> > > > > > > I had a terminal with top running.  It was still updating.  It showed about
> > > > > > > 98% sys and 2% spin on one of 4 CPUs, the others 100% idle.  Then (after
> > > > > > > the process table had gotten full) it had 1282 idle processes and 1 on
> > > > > > > processor, which was 'top' itself.
> > > > > > > Memory: Real: 456M/1819M act/tot Free: 14G Cache: 676M Swap: 0K/16G.
> > > > > > >
> > > > > > > I had 8 shells under tmux ready for debugging.  'ls' worked.
> > > > > > > 'systat' on one hung.  'top' on another failed with "cannot fork".
> > > > > > > 'exec ps ajxww' printed two lines with /sbin/init and /sbin/slaac
> > > > > > > and then hung.  'exec reboot' did not succeed.  Neither did a short press
> > > > > > > of the power button, that at least caused a printout "stopping daemon
> > > > > > > nginx(failed)", but got no further.  I had to do a hard power off.
> > > > > > >
> > > > > > > My theory now is that our daily tests right before 14:55 started a process
> > > > > > > (this process is the top 'top' process with 10:14 execution time) that
> > > > > > > triggers a lock or other contention problem in the kernel which causes
> > > > > > > one CPU to spin in the system, and blocks processes from dying.
> > > > > > > About 10 hours later the process table gets full.
> > > > > > >
> > > > > > > Any, ANY ideas of how to proceed would be appreciated!
> > > > > > >
> > > > > > > Best Regards
> > > > > > 
> > > > > > Did you notice any odd waitchan's (WAIT in top output)?
> > > > > > 
> > > > > > Maybe set ddb.console=1 in sysctl.conf and reboot (if not already
> > > > > > set), then try to break into DDB during a hang and see how things look
> > > > > > in ps there. (Test breaking into DDB before a hang first so you know
> > > > > > that you can do it .. you can just "c" to continue).
> > > > > > 
> > > > > > There might also be clues in things like "sh malloc" or "sh all pools".
> > > > > > 
> > > > > > Perhaps you could also get clues from running a kernel built with
> > > > > > 'option WITNESS', you may get some messages in dmesg, or it adds commands
> > > > > > to ddb like "show locks", "show all locks", "show witness" (see ddb(4) for
> > > > > > details).
> > >

Re: Disabling laptop display & turning off suspend on lid close

2019-11-22 Thread Raimo Niskanen
On Fri, Nov 22, 2019 at 09:45:44AM +0100, Unicorn wrote:
> On Fri, 2019-11-22 at 09:28 +0100, Claus Assmann wrote:
> > On Fri, Nov 22, 2019, Unicorn wrote:
> > 
> > > Still would like to know how to turn the display off, have not figured
> > > that out yet ;)
> > 
> > man xset
> > 
> > Not sure if this is what you want (yes, it's ugly):
> > 
> > #!/bin/sh
> > if test $# -ge 1
> > then
> >   TO=$1
> > else
> >   TO=300
> > fi
> > xset s $TO
> > xset s blank
> > if test $# -lt 1
> > then
> > >   xset dpms 500 660 900
> > fi
> > 
> 
> Thank you for the suggestion!
> 
> Will using xset work without running X? I intended to not use X as I am
> just trying to set up a simple mailserver. :)
> 
> Best,
> 
> Unicorn

Have a look at wsconsctl.conf(5).  Might be relevant.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Patch suggestion for sysupgrade

2019-11-15 Thread Raimo Niskanen
On Fri, Nov 15, 2019 at 07:00:05AM +0100, NilsOla Nilsson wrote:
> I have upgraded a machine where /home was NFS-mounted,
> like this:
> - check that the / partition has space for the files
>   that will populate /home/_sysupgrade
> - unmount /home
> - comment out the /home line in /etc/fstab
> - upgrade with sysupgrade
> - restore the line in /etc/fstab
> - mount -a

That is another way to do it.

Though, for the last sysupgrade on amd64 6.5 -> 6.6 the _sysupgrade
directory used 443M, and relying on the / partition having room for that
feels a bit risky...  It _should_ work since the / partition
is typically 1G and has 833M available, but it feels uncomfortable.
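
The margin described above can be checked before running sysupgrade.  A
minimal sketch, assuming sizes in KiB as printed by df -k; the function
name has_room is made up, and the 443M figure is simply the number from
this message, not something sysupgrade itself checks:

```sh
#!/bin/sh
# has_room AVAIL_KB NEED_KB: succeed if AVAIL_KB is at least NEED_KB.
has_room() {
	[ "$1" -ge "$2" ]
}

# Available space on / in KiB: fourth column of the POSIX df output.
avail_kb=$(df -Pk / | awk 'NR == 2 { print $4 }')
need_kb=$((443 * 1024))	# size of _sysupgrade reported above
if has_room "$avail_kb" "$need_kb"; then
	echo "/ has room for the sets"
else
	echo "/ is too small for the sets"
fi
```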

/ Raimo


> 
> All this could be done remotely.
> 
> Note that I can log in to a user where the home
> directory is not NFS-mounted, in our case
> /local_home/
> 
> On Thu, Nov 14, 2019 at 03:01:18PM +0100, Raimo Niskanen wrote:
> > The use case for this patch is that in our lab network we have NFS
> > automounted /home/* directories, so using /home/_sysupgrade
> > for sysupgrade does not work.
> > 
> > With this patch it is easy to modify /usr/sbin/sysupgrade and change
> > just the line SETSDIR=/home/_sysupgrade to point to some other local file
> > system that is outside hier(7) for example /opt/_sysupgrade
> > or /srv/_sysupgrade.
> > 
> > Even using /var/_sysupgrade or /usr/_sysupgrade should work.  As far as
> > I can tell the sysupgrade directory only has to be on a local file system,
> > and not get overwritten by the base system install.
> > 
> > The change for mkdir -p ${SETSDIR} is to make the script more defensive about
> > the result of mkdir, e.g. in case the umask is wrong, or if the directory
> > containing the sysupgrade directory has got the wrong group, etc.
> > 
> > 
> > A follow-up to this patch, should it be accepted, could be to add an option
> > -d SysupgradeDir, but I do not know if that would be considered as a too odd
> > and error prone feature to merit an option.  Or?
> > 
> > The patch is on 6.6 stable.
> > 
> > Index: usr.sbin/sysupgrade/sysupgrade.sh
> > ===================================================================
> > RCS file: /cvs/src/usr.sbin/sysupgrade/sysupgrade.sh,v
> > retrieving revision 1.25
> > diff -u -u -r1.25 sysupgrade.sh
> > --- usr.sbin/sysupgrade/sysupgrade.sh	28 Sep 2019 17:30:07 -0000	1.25
> > +++ usr.sbin/sysupgrade/sysupgrade.sh	14 Nov 2019 13:27:34 -0000
> > @@ -119,6 +119,7 @@
> > URL=${MIRROR}/${NEXT_VERSION}/${ARCH}/
> >  fi
> >  
> > +[[ -e ${SETSDIR} ]] || mkdir -p ${SETSDIR}
> >  if [[ -e ${SETSDIR} ]]; then
> > eval $(stat -s ${SETSDIR})
> > [[ $st_uid -eq 0 ]] ||
> > @@ -127,8 +128,6 @@
> >  ug_err "${SETSDIR} needs to be owned by root:wheel"
> > [[ $st_mode -eq 040755 ]] || 
> > ug_err "${SETSDIR} is not a directory with permissions 0755"
> > -else
> > -   mkdir -p ${SETSDIR}
> >  fi
> >  
> >  cd ${SETSDIR}
> > @@ -185,7 +184,7 @@
> >  
> >  cat <<__EOT >/auto_upgrade.conf
> >  Location of sets = disk
> > -Pathname to the sets = /home/_sysupgrade/
> > +Pathname to the sets = ${SETSDIR}/
> >  Set name(s) = done
> >  Directory does not contain SHA256.sig. Continue without verification = yes
> >  __EOT
> > @@ -193,7 +192,7 @@
> >  if ! ${KEEP}; then
> > CLEAN=$(echo SHA256 ${SETS} | sed -e 's/ /,/g')
> > cat <<__EOT > /etc/rc.firsttime
> > -rm -f /home/_sysupgrade/{${CLEAN}}
> > +rm -f ${SETSDIR}/{${CLEAN}}
> >  __EOT
> >  fi
> > 
> > Best regards
> > --  
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> 
> -- 
> Nils Ola Nilsson, email nils...@abc.se, tel +46-70-374 69 89



-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Patch suggestion for sysupgrade

2019-11-15 Thread Raimo Niskanen
On Thu, Nov 14, 2019 at 04:59:23PM +, gil...@poolp.org wrote:
> A similar patch for this was sent to tech@ by Renaud Allard, you might want to
> go review the "sysupgrade: Allow to use another directory for data sets" thread
> and comment on it.

Thanks for the pointer!  I see in that thread that it is hard
to find a safe solution to this problem...

/ Raimo


> 
> 
> November 14, 2019 3:01 PM, "Raimo Niskanen" wrote:
> 
> > The use case for this patch is that in our lab network we have NFS
> > automounted /home/* directories, so using /home/_sysupgrade
> > for sysupgrade does not work.
> > 
> > With this patch it is easy to modify /usr/sbin/sysupgrade and change
> > just the line SETSDIR=/home/_sysupgrade to point to some other local file
> > system that is outside hier(7) for example /opt/_sysupgrade
> > or /srv/_sysupgrade.
> > 
> > Even using /var/_sysupgrade or /usr/_sysupgrade should work. As far as
> > I can tell the sysupgrade directory only has to be on a local file system,
> > and not get overwritten by the base system install.
> > 
> > The change for mkdir -p ${SETSDIR} is to make the script more defensive about
> > the result of mkdir, e.g. in case the umask is wrong, or if the directory
> > containing the sysupgrade directory has got the wrong group, etc.
> > 
> > A follow-up to this patch, should it be accepted, could be to add an option
> > -d SysupgradeDir, but I do not know if that would be considered as a too odd
> > and error prone feature to merit an option. Or?
> > 
> > The patch is on 6.6 stable.
> > 
> > Index: usr.sbin/sysupgrade/sysupgrade.sh
> > ===================================================================
> > RCS file: /cvs/src/usr.sbin/sysupgrade/sysupgrade.sh,v
> > retrieving revision 1.25
> > diff -u -u -r1.25 sysupgrade.sh
> > --- usr.sbin/sysupgrade/sysupgrade.sh	28 Sep 2019 17:30:07 -0000	1.25
> > +++ usr.sbin/sysupgrade/sysupgrade.sh	14 Nov 2019 13:27:34 -0000
> > @@ -119,6 +119,7 @@
> > URL=${MIRROR}/${NEXT_VERSION}/${ARCH}/
> > fi
> > 
> > +[[ -e ${SETSDIR} ]] || mkdir -p ${SETSDIR}
> > if [[ -e ${SETSDIR} ]]; then
> > eval $(stat -s ${SETSDIR})
> > [[ $st_uid -eq 0 ]] ||
> > @@ -127,8 +128,6 @@
> > ug_err "${SETSDIR} needs to be owned by root:wheel"
> > [[ $st_mode -eq 040755 ]] || 
> > ug_err "${SETSDIR} is not a directory with permissions 0755"
> > -else
> > - mkdir -p ${SETSDIR}
> > fi
> > 
> > cd ${SETSDIR}
> > @@ -185,7 +184,7 @@
> > 
> > cat <<__EOT >/auto_upgrade.conf
> > Location of sets = disk
> > -Pathname to the sets = /home/_sysupgrade/
> > +Pathname to the sets = ${SETSDIR}/
> > Set name(s) = done
> > Directory does not contain SHA256.sig. Continue without verification = yes
> > __EOT
> > @@ -193,7 +192,7 @@
> > if ! ${KEEP}; then
> > CLEAN=$(echo SHA256 ${SETS} | sed -e 's/ /,/g')
> > cat <<__EOT > /etc/rc.firsttime
> > -rm -f /home/_sysupgrade/{${CLEAN}}
> > +rm -f ${SETSDIR}/{${CLEAN}}
> > __EOT
> > fi
> > 
> > Best regards
> > -- 
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Patch suggestion for sysupgrade

2019-11-14 Thread Raimo Niskanen
The use case for this patch is that in our lab network we have NFS
automounted /home/* directories, so using /home/_sysupgrade
for sysupgrade does not work.

With this patch it is easy to modify /usr/sbin/sysupgrade and change
just the line SETSDIR=/home/_sysupgrade to point to some other local file
system that is outside hier(7), for example /opt/_sysupgrade
or /srv/_sysupgrade.

Even using /var/_sysupgrade or /usr/_sysupgrade should work.  As far as
I can tell the sysupgrade directory only has to be on a local file system,
and not get overwritten by the base system install.

The change moves mkdir -p ${SETSDIR} in front of the checks, so that a
freshly created directory is verified too.  This makes the script more
defensive about the result of mkdir, e.g. in case the umask is wrong, or
if the directory containing the sysupgrade directory has got the wrong
group, etc.
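
The ordering the patch introduces (create the directory if it is missing,
then always run the checks) can be sketched as a stand-alone function.
This is not the sysupgrade code: it uses ls -ld instead of stat -s so it
also runs outside OpenBSD, it checks only the mode and not the owner, and
the name ensure_sets_dir is made up for the sketch.

```sh
#!/bin/sh
# ensure_sets_dir DIR
# Create DIR if it does not exist, then verify in every case that it
# is a directory with mode 0755.  With the check placed after mkdir, a
# directory created under a restrictive umask is reported as well.
ensure_sets_dir() {
	dir=$1
	[ -e "$dir" ] || mkdir -p "$dir" || return 1
	# First ten characters of ls -ld are the file type and permission bits.
	mode=$(ls -ld "$dir" | cut -c1-10)
	if [ "$mode" != "drwxr-xr-x" ]; then
		echo "$dir is not a directory with permissions 0755" >&2
		return 1
	fi
}
```

Called under umask 077, the function creates the directory and then fails
the check, which is the case the old else branch let through unverified.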


A follow-up to this patch, should it be accepted, could be to add an option
-d SysupgradeDir, but I do not know if that would be considered too odd
and error-prone a feature to merit an option.  Or?

The patch is on 6.6 stable.

Index: usr.sbin/sysupgrade/sysupgrade.sh
===================================================================
RCS file: /cvs/src/usr.sbin/sysupgrade/sysupgrade.sh,v
retrieving revision 1.25
diff -u -u -r1.25 sysupgrade.sh
--- usr.sbin/sysupgrade/sysupgrade.sh	28 Sep 2019 17:30:07 -0000	1.25
+++ usr.sbin/sysupgrade/sysupgrade.sh	14 Nov 2019 13:27:34 -0000
@@ -119,6 +119,7 @@
 	URL=${MIRROR}/${NEXT_VERSION}/${ARCH}/
 fi
 
+[[ -e ${SETSDIR} ]] || mkdir -p ${SETSDIR}
 if [[ -e ${SETSDIR} ]]; then
 	eval $(stat -s ${SETSDIR})
 	[[ $st_uid -eq 0 ]] ||
@@ -127,8 +128,6 @@
 		ug_err "${SETSDIR} needs to be owned by root:wheel"
 	[[ $st_mode -eq 040755 ]] || 
 		ug_err "${SETSDIR} is not a directory with permissions 0755"
-else
-	mkdir -p ${SETSDIR}
 fi
 
 cd ${SETSDIR}
@@ -185,7 +184,7 @@
 
 cat <<__EOT >/auto_upgrade.conf
 Location of sets = disk
-Pathname to the sets = /home/_sysupgrade/
+Pathname to the sets = ${SETSDIR}/
 Set name(s) = done
 Directory does not contain SHA256.sig. Continue without verification = yes
 __EOT
@@ -193,7 +192,7 @@
 if ! ${KEEP}; then
 	CLEAN=$(echo SHA256 ${SETS} | sed -e 's/ /,/g')
 	cat <<__EOT > /etc/rc.firsttime
-rm -f /home/_sysupgrade/{${CLEAN}}
+rm -f ${SETSDIR}/{${CLEAN}}
 __EOT
 fi

Best regards
--  
/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-11-01 Thread Raimo Niskanen
On Mon, Sep 09, 2019 at 05:44:32PM +0200, Raimo Niskanen wrote:
> On Mon, Sep 09, 2019 at 05:42:02PM +0200, Raimo Niskanen wrote:
> > On Wed, Jul 31, 2019 at 05:46:08PM +0200, Raimo Niskanen wrote:
> > > On Mon, Jul 29, 2019 at 01:20:58PM +0000, Stuart Henderson wrote:
> > > > On 2019-07-29, Raimo Niskanen wrote:
> > > > > A new hang, I tried to investigate:
> > > > >
> > > > > On July 19 the last log entry from my 'ps' log was from 14:55, which is
> > > > > also the time on the 'systat vmstat' screen when it froze.  Then the machine
> > > > > hums along but just after midnight at 00:42:01 the first "/bsd: process:
> > > > > table is full" entry appears.  That message repeats until I rebooted it
> > > > > today at July 29 10:48.
> > > > >
> > > > > I had a terminal with top running.  It was still updating.  It showed about
> > > > > 98% sys and 2% spin on one of 4 CPUs, the others 100% idle.  Then (after
> > > > > the process table had gotten full) it had 1282 idle processes and 1 on
> > > > > processor, which was 'top' itself.
> > > > > Memory: Real: 456M/1819M act/tot Free: 14G Cache: 676M Swap: 0K/16G.
> > > > >
> > > > > I had 8 shells under tmux ready for debugging.  'ls' worked.
> > > > > 'systat' on one hung.  'top' on another failed with "cannot fork".
> > > > > 'exec ps ajxww' printed two lines with /sbin/init and /sbin/slaac
> > > > > and then hung.  'exec reboot' did not succeed.  Neither did a short press
> > > > > of the power button, that at least caused a printout "stopping daemon
> > > > > nginx(failed)", but got no further.  I had to do a hard power off.
> > > > >
> > > > > My theory now is that our daily tests right before 14:55 started a process
> > > > > (this process is the top 'top' process with 10:14 execution time) that
> > > > > triggers a lock or other contention problem in the kernel which causes
> > > > > one CPU to spin in the system, and blocks processes from dying.
> > > > > About 10 hours later the process table gets full.
> > > > >
> > > > > Any, ANY ideas of how to proceed would be appreciated!
> > > > >
> > > > > Best Regards
> > > > 
> > > > Did you notice any odd waitchan's (WAIT in top output)?
> > > > 
> > > > Maybe set ddb.console=1 in sysctl.conf and reboot (if not already
> > > > set), then try to break into DDB during a hang and see how things look
> > > > in ps there. (Test breaking into DDB before a hang first so you know
> > > > that you can do it .. you can just "c" to continue).
> > > > 
> > > > There might also be clues in things like "sh malloc" or "sh all pools".
> > > > 
> > > > Perhaps you could also get clues from running a kernel built with
> > > > 'option WITNESS', you may get some messages in dmesg, or it adds commands
> > > > to ddb like "show locks", "show all locks", "show witness" (see ddb(4) for
> > > > details).
> > > 
> > > I have enabled Witness, it went so-so.  We'll see what it catches.
> > > 
> > > I downloaded 6.5 amd64 src.tar.gz and sys.tar.gz, unpacked them,
> > > applied all patches for stable 001-006 and built a kernel with:
> > >   include "arch/amd64/conf/GENERIC"
> > >   option  MULTIPROCESSOR
> > >   option  MP_LOCKDEBUG
> > >   option  WITNESS
> > > 
> > > Then I activated in /etc/sysctl.conf:
> > >   ddb.console=1
> > >   kern.witness.locktrace=1
> > >   kern.witness.watch=3
> > > 
> > > For fun, I pressed Ctrl+Alt+Esc at the console, got a ddb> prompt and typed
> > > "show witness".  It printed lots of info, I scrolled down to the end, but
> > > during the printout there was an UVM fault:
> > > 
> > >   Spin locks:
> > >   /usr/src/sys/
> > >   :
> > >   bla bla bla
> > >   :
> > >   uvm_fault(0x81e03b50, 0x800022368360, 0, 1) -> e
> > >   kernel: pag

Re: How to debug hanging machines / proc: table is full

2019-09-09 Thread Raimo Niskanen
On Mon, Sep 09, 2019 at 05:42:02PM +0200, Raimo Niskanen wrote:
> On Wed, Jul 31, 2019 at 05:46:08PM +0200, Raimo Niskanen wrote:
> > On Mon, Jul 29, 2019 at 01:20:58PM +, Stuart Henderson wrote:
> > > On 2019-07-29, Raimo Niskanen  wrote:
> > > > A new hang, I tried to invstigate:
> > > >
> > > > At July 19 the last log entry from my 'ps' log was from 14:55, which is
> > > > also the time on the 'systat vmstat' screen when it froze.  Then the 
> > > > machine
> > > > hums along but just after midnight at 00:42:01 the first "/bsd: process:
> > > > table is full" entry appears.  That message repeats until I rebooted it
> > > > today at July 29 10:48.
> > > >
> > > > I had a terminal with top running.  It was still updating.  It showed 
> > > > about
> > > > 98% sys and 2% spin on one of 4 CPUs, the others 100% idle.  Then (after
> > > > the process table had gotten full) it had 1282 idle processes and 1 on
> > > > processor, which was 'top' itself.
> > > > Memory: Real: 456M/1819M act/tot Free: 14G Cache: 676M Swap: 0K/16G.
> > > >
> > > > I had 8 shells under tmux ready for debugging.  'ls worked.
> > > > 'systat' on one hung.  'top' on another failed with "cannot fork".
> > > > 'exec ps ajxww" printed two lines with /sbin/init and /sbin/slaac
> > > > and then hung.  'exec reboot' did not succeed.  Neither did a short power
> > > > button, that at least caused a printout "stopping daemon nginx(failed)",
> > > > but got no further.  I had to do a hard power off. 
> > > >
> > > > My theory now is that our daily tests right before 14:55 started a process
> > > > (this process is the top 'top' process with 10:14 execution time) that
> > > > triggers a lock or other contention problem in the kernel which causes
> > > > one CPU to spin in the system, and blocks processes from dying.
> > > > About 10 hours later the process table gets full.
> > > >
> > > > Any, ANY ideas of how to proceed would be appreciated!
> > > >
> > > > Best Regards
> > > 
> > > Did you notice any odd waitchan's (WAIT in top output)?
> > > 
> > > Maybe set ddb.console=1 in sysctl.conf and reboot (if not already
> > > set), then try to break into DDB during a hang and see how things look
> > > in ps there. (Test breaking into DDB before a hang first so you know
> > > that you can do it .. you can just "c" to continue).
> > > 
> > > There might also be clues in things like "sh malloc" or "sh all pools".
> > > 
> > > Perhaps you could also get clues from running a kernel built with
> > > 'option WITNESS', you may get some messages in dmesg, or it adds commands
> > > to ddb like "show locks", "show all locks", "show witness" (see ddb(4) for
> > > details).
> > 
> > I have enabled Witness, it went so-so.  We'll see what it catches.
> > 
> > I downloaded 6.5 amd64 src.tar.gz and sys.tar.gz, unpacked them,
> > applied all patches for stable 001-006 and built a kernel with:
> >   include "arch/amd64/conf/GENERIC"
> > >   option  MULTIPROCESSOR
> > >   option  MP_LOCKDEBUG
> > >   option  WITNESS
> > 
> > Then I activated in /etc/sysctl.conf:
> >   ddb.console=1
> >   kern.witness.locktrace=1
> >   kern.witness.watch=3
> > 
> > For fun, I pressed Ctrl+Alt+Esc at the console, got a ddb> prompt and typed
> > "show witness".  It printed lots of info, I scrolled down to the end, but
> > during the printout there was an UVM fault:
> > 
> >   Spin locks:
> >   /usr/src/sys/
> >   :
> >   bla bla bla
> >   :
> >   uvm_fault(0x81e03b50, 0x800022368360, 0, 1) -> e
> >   kernel: page fault trap, code=0
> >   Faulted in DDB: continuing...
> > 
> > Then I typed "cont" and it panicked.
> > If anybody want details I took a picture.
> > 
> > Have I combined too many debugging options, or is this sh*t that happens?
> > 
> > Nevertheless, now the machine is running again, with Witness...
> > 
> > I'll be back.
> 
> I have encountered some kind of stop, oddly enought not a panic - it
> just sat in ddb and I missed it for a week (or more).  Then I did not
> remember what I had planned to do so I "improvised" X-| , bu

Re: How to debug hanging machines / proc: table is full

2019-09-09 Thread Raimo Niskanen
On Wed, Jul 31, 2019 at 05:46:08PM +0200, Raimo Niskanen wrote:
> On Mon, Jul 29, 2019 at 01:20:58PM +, Stuart Henderson wrote:
> > On 2019-07-29, Raimo Niskanen  wrote:
> > > A new hang, I tried to invstigate:
> > >
> > > At July 19 the last log entry from my 'ps' log was from 14:55, which is
> > > also the time on the 'systat vmstat' screen when it froze.  Then the machine
> > > hums along but just after midnight at 00:42:01 the first "/bsd: process:
> > > table is full" entry appears.  That message repeats until I rebooted it
> > > today at July 29 10:48.
> > >
> > > I had a terminal with top running.  It was still updating.  It showed about
> > > 98% sys and 2% spin on one of 4 CPUs, the others 100% idle.  Then (after
> > > the process table had gotten full) it had 1282 idle processes and 1 on
> > > processor, which was 'top' itself.
> > > Memory: Real: 456M/1819M act/tot Free: 14G Cache: 676M Swap: 0K/16G.
> > >
> > > I had 8 shells under tmux ready for debugging.  'ls worked.
> > > 'systat' on one hung.  'top' on another failed with "cannot fork".
> > > 'exec ps ajxww" printed two lines with /sbin/init and /sbin/slaac
> > > and then hung.  'exec reboot' did not succeed.  Neither did a short power
> > > button, that at least caused a printout "stopping daemon nginx(failed)",
> > > but got no further.  I had to do a hard power off. 
> > >
> > > My theory now is that our daily tests right before 14:55 started a process
> > > (this process is the top 'top' process with 10:14 execution time) that
> > > triggers a lock or other contention problem in the kernel which causes
> > > one CPU to spin in the system, and blocks processes from dying.
> > > About 10 hours later the process table gets full.
> > >
> > > Any, ANY ideas of how to proceed would be appreciated!
> > >
> > > Best Regards
> > 
> > Did you notice any odd waitchan's (WAIT in top output)?
> > 
> > Maybe set ddb.console=1 in sysctl.conf and reboot (if not already
> > set), then try to break into DDB during a hang and see how things look
> > in ps there. (Test breaking into DDB before a hang first so you know
> > that you can do it .. you can just "c" to continue).
> > 
> > There might also be clues in things like "sh malloc" or "sh all pools".
> > 
> > Perhaps you could also get clues from running a kernel built with
> > 'option WITNESS', you may get some messages in dmesg, or it adds commands
> > to ddb like "show locks", "show all locks", "show witness" (see ddb(4) for
> > details).
> 
> I have enabled Witness, it went so-so.  We'll see what it catches.
> 
> I downloaded 6.5 amd64 src.tar.gz and sys.tar.gz, unpacked them,
> applied all patches for stable 001-006 and built a kernel with:
>   include "arch/amd64/conf/GENERIC"
>   option  MULTIPROCESSOR
>   option  MP_LOCKDEBUG
>   option  WITNESS
> 
> Then I activated in /etc/sysctl.conf:
>   ddb.console=1
>   kern.witness.locktrace=1
>   kern.witness.watch=3
> 
> For fun, I pressed Ctrl+Alt+Esc at the console, got a ddb> prompt and typed
> "show witness".  It printed lots of info, I scrolled down to the end, but
> during the printout there was an UVM fault:
> 
>   Spin locks:
>   /usr/src/sys/
>   :
>   bla bla bla
>   :
>   uvm_fault(0x81e03b50, 0x800022368360, 0, 1) -> e
>   kernel: page fault trap, code=0
>   Faulted in DDB: continuing...
> 
> Then I typed "cont" and it panicked.
> If anybody want details I took a picture.
> 
> Have I combined too many debugging options, or is this sh*t that happens?
> 
> Nevertheless, now the machine is running again, with Witness...
> 
> I'll be back.

I have encountered some kind of stop, oddly enough not a panic - it
just sat in ddb and I missed it for a week (or more).  Then I did not
remember what I had planned to do so I "improvised" X-| , but anyway:

ddb{0}> ps
shows about 350 processes from cron, half of them in state netlock, half
in state piperd.  Then I have my test processes beam.smp: 6 in netlock, 6
in piperd, about 70 in fsleep, 3 in poll, 3 in select, 4 in kqread.
Then about 100 more ordinary looking processes...

ddb{0}> trace
db_enter()...
softclock(0)...
softintr_dispatch(0)...
Xsoftclock(0,0,1388,)...
acpicpu_idle()...
sched_idle(81ceff0)...
end trace frame: 0x0, count: -6

ddb{0}> show locks
exclusive kernel_lock _lock r = 0 (0x81e37b10) locked @
/usr/src/sys/arch/amd64/amd64/softintr.c:87
#0  witness_lock+0x41f
#1  softintr_dispatc+0x56
#2  Xsoftclock+0x1f
#3  acpicpu_idle+0x271
#4  sched_idle+0x235
#5  proc_trampoline+0x1c

ddb{0}> show nfsnode
size 5476515891122113356 flag 0 vnode 0xd080c7d8 accstamp 1099511681152

(I think the size looks strange)

Then I tried show map and got a protection fault trap, gave up and
rebooted.

That was it!  Next time I will try:
  trace
  ps
  show malloc
  show all pools
  show locks
  show all locks
unless anyone has got more or better suggestions...

Best Regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-08-01 Thread Raimo Niskanen
On Wed, Jul 31, 2019 at 04:20:12PM +, Visa Hankala wrote:
> On Wed, Jul 31, 2019 at 05:46:08PM +0200, Raimo Niskanen wrote:
> > I have enabled Witness, it went so-so.  We'll see what it catches.
> > 
> > I downloaded 6.5 amd64 src.tar.gz and sys.tar.gz, unpacked them,
> > applied all patches for stable 001-006 and built a kernel with:
> >   include "arch/amd64/conf/GENERIC"
> > >   option  MULTIPROCESSOR
> > >   option  MP_LOCKDEBUG
> > >   option  WITNESS
> > 
> > Then I activated in /etc/sysctl.conf:
> >   ddb.console=1
> >   kern.witness.locktrace=1
> >   kern.witness.watch=3
> > 
> > For fun, I pressed Ctrl+Alt+Esc at the console, got a ddb> prompt and typed
> > "show witness".  It printed lots of info, I scrolled down to the end, but
> > during the printout there was an UVM fault:
> > 
> >   Spin locks:
> >   /usr/src/sys/
> >   :
> >   bla bla bla
> >   :
> >   uvm_fault(0x81e03b50, 0x800022368360, 0, 1) -> e
> >   kernel: page fault trap, code=0
> >   Faulted in DDB: continuing...
> 
> The output of "show witness" is unlikely to be useful in your case.
> It is more of a tool for debugging witness. You can ignore it.
> However, "show all locks" might display interesting information
> after a witness-related panic.

Ok, great!

It is just that an uvm_fault during show witness felt like a bad thing...

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-07-31 Thread Raimo Niskanen
On Mon, Jul 29, 2019 at 01:20:58PM +, Stuart Henderson wrote:
> On 2019-07-29, Raimo Niskanen  wrote:
> > A new hang, I tried to invstigate:
> >
> > At July 19 the last log entry from my 'ps' log was from 14:55, which is
> > also the time on the 'systat vmstat' screen when it froze.  Then the machine
> > hums along but just after midnight at 00:42:01 the first "/bsd: process:
> > table is full" entry appears.  That message repeats until I rebooted it
> > today at July 29 10:48.
> >
> > I had a terminal with top running.  It was still updating.  It showed about
> > 98% sys and 2% spin on one of 4 CPUs, the others 100% idle.  Then (after
> > the process table had gotten full) it had 1282 idle processes and 1 on
> > processor, which was 'top' itself.
> > Memory: Real: 456M/1819M act/tot Free: 14G Cache: 676M Swap: 0K/16G.
> >
> > I had 8 shells under tmux ready for debugging.  'ls worked.
> > 'systat' on one hung.  'top' on another failed with "cannot fork".
> > 'exec ps ajxww" printed two lines with /sbin/init and /sbin/slaac
> > and then hung.  'exec reboot' did not succeed.  Neither did a short power
> > button, that at least caused a printout "stopping daemon nginx(failed)",
> > but got no further.  I had to do a hard power off. 
> >
> > My theory now is that our daily tests right before 14:55 started a process
> > (this process is the top 'top' process with 10:14 execution time) that
> > triggers a lock or other contention problem in the kernel which causes
> > one CPU to spin in the system, and blocks processes from dying.
> > About 10 hours later the process table gets full.
> >
> > Any, ANY ideas of how to proceed would be appreciated!
> >
> > Best Regards
> 
> Did you notice any odd waitchan's (WAIT in top output)?
> 
> Maybe set ddb.console=1 in sysctl.conf and reboot (if not already
> set), then try to break into DDB during a hang and see how things look
> in ps there. (Test breaking into DDB before a hang first so you know
> that you can do it .. you can just "c" to continue).
> 
> There might also be clues in things like "sh malloc" or "sh all pools".
> 
> Perhaps you could also get clues from running a kernel built with
> 'option WITNESS', you may get some messages in dmesg, or it adds commands
> to ddb like "show locks", "show all locks", "show witness" (see ddb(4) for
> details).

I have enabled Witness, it went so-so.  We'll see what it catches.

I downloaded 6.5 amd64 src.tar.gz and sys.tar.gz, unpacked them,
applied all patches for stable 001-006 and built a kernel with:
  include "arch/amd64/conf/GENERIC"
  option  MULTIPROCESSOR
  option  MP_LOCKDEBUG
  option  WITNESS

Then I activated in /etc/sysctl.conf:
  ddb.console=1
  kern.witness.locktrace=1
  kern.witness.watch=3

For fun, I pressed Ctrl+Alt+Esc at the console, got a ddb> prompt and typed
"show witness".  It printed lots of info, I scrolled down to the end, but
during the printout there was an UVM fault:

  Spin locks:
  /usr/src/sys/
  :
  bla bla bla
  :
  uvm_fault(0x81e03b50, 0x800022368360, 0, 1) -> e
  kernel: page fault trap, code=0
  Faulted in DDB: continuing...

Then I typed "cont" and it panicked.
If anybody wants details I took a picture.

Have I combined too many debugging options, or is this sh*t that happens?

Nevertheless, now the machine is running again, with Witness...

I'll be back.


> 
> Can you provoke a hang by running this process manually?

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-07-30 Thread Raimo Niskanen
On Mon, Jul 29, 2019 at 01:20:58PM +, Stuart Henderson wrote:
> On 2019-07-29, Raimo Niskanen  wrote:
> > A new hang, I tried to invstigate:
> >
> > At July 19 the last log entry from my 'ps' log was from 14:55, which is
> > also the time on the 'systat vmstat' screen when it froze.  Then the machine
> > hums along but just after midnight at 00:42:01 the first "/bsd: process:
> > table is full" entry appears.  That message repeats until I rebooted it
> > today at July 29 10:48.
> >
> > I had a terminal with top running.  It was still updating.  It showed about
> > 98% sys and 2% spin on one of 4 CPUs, the others 100% idle.  Then (after
> > the process table had gotten full) it had 1282 idle processes and 1 on
> > processor, which was 'top' itself.
> > Memory: Real: 456M/1819M act/tot Free: 14G Cache: 676M Swap: 0K/16G.
> >
> > I had 8 shells under tmux ready for debugging.  'ls worked.
> > 'systat' on one hung.  'top' on another failed with "cannot fork".
> > 'exec ps ajxww" printed two lines with /sbin/init and /sbin/slaac
> > and then hung.  'exec reboot' did not succeed.  Neither did a short power
> > button, that at least caused a printout "stopping daemon nginx(failed)",
> > but got no further.  I had to do a hard power off. 
> >
> > My theory now is that our daily tests right before 14:55 started a process
> > (this process is the top 'top' process with 10:14 execution time) that
> > triggers a lock or other contention problem in the kernel which causes
> > one CPU to spin in the system, and blocks processes from dying.
> > About 10 hours later the process table gets full.
> >
> > Any, ANY ideas of how to proceed would be appreciated!
> >
> > Best Regards
> 
> Did you notice any odd waitchan's (WAIT in top output)?

I do not think so:
  select (for the possibly triggering process), - (for 'top'), kqread, netlock,
  bpf, wait, piperd.

> 
> Maybe set ddb.console=1 in sysctl.conf and reboot (if not already
> set), then try to break into DDB during a hang and see how things look
> in ps there. (Test breaking into DDB before a hang first so you know
> that you can do it .. you can just "c" to continue).
> 
> There might also be clues in things like "sh malloc" or "sh all pools".

Sounds like fun - will try that!

> 
> Perhaps you could also get clues from running a kernel built with
> 'option WITNESS', you may get some messages in dmesg, or it adds commands
> to ddb like "show locks", "show all locks", "show witness" (see ddb(4) for
> details).

Maybe later.  I have gotten used to not compiling my kernel...

> 
> Can you provoke a hang by running this process manually?

Might be worth a try to repeat the suspected test case many times.
I will try.

Thanks for the hints!
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-07-29 Thread Raimo Niskanen
On Mon, Jul 15, 2019 at 08:55:35AM +0200, Raimo Niskanen wrote:
> On Tue, Jul 02, 2019 at 05:13:43PM +, Stuart Henderson wrote:
> > On 2019-07-02, Raimo Niskanen  wrote:
> > > Hi misc@!
> > >
> > > If anyone has got some tips about how to debug two hanging machines we have
> > > in our test lab I am eager to learn.
> > >
> > > The machines run 6.5 amd64 and are patched up to 005_libssl using M:Tier's
> > > openup.  Other than that they are rather different, one small Zotac
> > > ZBox-AD02 with AMD E-350 at 1.6 GHz, and one rack mounted Dell PowerEdge
> > > R230 with Intel Xeon E3-1220.
> > >
> > > The overall symptoms are that it is possible to switch screens using
> > > Alt+Ctrl+F1..Fn, but when logging in as root the greeting prints but no
> > > prompt.  Alt+Ctrl+Del does not work.  The power button does not work.  I
> > > have to long press the power button to force power off.
> > >
> > > This happens during our nightly tests, that are quite resource intesive.
> > >
> > > In /var/log/messages I find suspicious entries "/bsd: proc: table is full"
> > > possibly before the machines become inresponsive, but these entries appear
> > > many more times before that point.  And after this "table is full" message
> > > there are many syslog entries; on one machine smartd constantly complains about
> > > an unreadable (pending) sector and atascsi_passthru_done timeout, and on
> > > the other the kernel complains about a probed monitor but no|invalid EDID.
> > >
> > > So it seems the machine is out of some resource and fails to spawn a login
> > > shell.  Any clues to how I can find more details and a remedy?  I suspect a
> > > full process table, but wonder how to detect and|or avoid that.
> > >
> > > I have considered having systat running on a console screen but do not know
> > > which systat display that might tell me anything.
> > >
> > > Best regards
> > 
> > "/bsd: proc: table is full" means that the process table is full, but it doesn't
> > tell you what caused this.
> > 
> > The process table size is controlled by kern.maxproc, it is possible
> > that the default is insufficient for your needs, but it's also possible
> > that there was a build-up of processes that didn't exit due to another
> > problem on the system.
> > 
> > I would leave top(1) running on the system, and also save "ps ax" output
> > regularly, then look at that output in the run-up to a failure, to see
> > if that gives clues.
> > 
> 
> It seems that the full process table is a secondary symptom, and that there
> is something else that happens on the machines a few hours before the
> process table fills...
> 
> On one machine I hade left "systat pigs" running, and the last thing it
> showed was about 90% for softnet and the rest , IIRC.
> 
> I have now corrected a presumably unrelated error in our nightly tests that
> occured just before the freeze.  The test started a child process that was
> abandoned, and when it noticed its controlling socket close it started to
> write an error log.  Previously that froze sometimes and a few hours later
> the process table got full.  Now the child process is not abandoned, and
> I have not seen the freeze since...
> 
> Still chasing ghosts, this can simply not be over yet.

A new hang, which I tried to investigate:

On July 19 the last entry in my 'ps' log was from 14:55, which is
also the time on the 'systat vmstat' screen when it froze.  Then the machine
hums along but just after midnight at 00:42:01 the first "/bsd: proc:
table is full" entry appears.  That message repeats until I rebooted it
today, July 29, at 10:48.

I had a terminal with top running.  It was still updating.  It showed about
98% sys and 2% spin on one of 4 CPUs, the others 100% idle.  Then (after
the process table had gotten full) it had 1282 idle processes and 1 on
processor, which was 'top' itself.
Memory: Real: 456M/1819M act/tot Free: 14G Cache: 676M Swap: 0K/16G.

I had 8 shells under tmux ready for debugging.  'ls' worked.
'systat' on one hung.  'top' on another failed with "cannot fork".
'exec ps ajxww' printed two lines with /sbin/init and /sbin/slaac
and then hung.  'exec reboot' did not succeed.  Neither did a short press
of the power button, which at least caused a printout "stopping daemon
nginx(failed)", but got no further.  I had to do a hard power off.

My theory now is that our daily tests right before 14:55 started a process
(this process is the top 'top' process with 10:14 execution time) that
triggers a lock or other contention problem in the kernel which causes
one CPU to spin in the system, and blocks processes from dying.
About 10 hours later the process table gets full.

Any, ANY ideas of how to proceed would be appreciated!

Best Regards
-- 
Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-07-15 Thread Raimo Niskanen
On Tue, Jul 02, 2019 at 05:13:43PM +, Stuart Henderson wrote:
> On 2019-07-02, Raimo Niskanen  wrote:
> > Hi misc@!
> >
> > If anyone has got some tips about how to debug two hanging machines we have
> > in our test lab I am eager to learn.
> >
> > The machines runs 6.5, amd64 and are patched up to 005_libssl using M:Tier's
> > openup.  Other than that they are rather different, one small Zotac
> > ZBox-AD02 with AMD E-350 at 1.6 GHz, and one rack mounted Dell PowerEdge
> > R230 with Intel Xeon E3-1220.
> >
> > The overall symptoms are that it is possible to switch screens using
> > Alt+Ctrl+F1..Fn, but when logging in as root the greeting prints but no
> > prompt.  Alt+Ctrl+Del does not work.  The power button does not work.  I
> > have to long press the power button to force power off.
> >
> > This happens during our nightly tests, that are quite resource intesive.
> >
> > In /var/log/messages I find suspicious entries "/bsd: proc: table is full"
> > possibly before the machines become inresponsive, but these entries appear
> > many more times before that point.  And after this "table is full" message
> > there are many syslog entries; on one machine smartd constantly complains about
> > an unreadable (pending) sector and atascsi_passthru_done timeout, and on
> > the other the kernel complains about a probed monitor but no|invalid EDID.
> >
> > So it seems the machine is out of some resource and fails to spawn a login
> > shell.  Any clues to how I can find more details and a remedy?  I suspect a
> > full process table, but wonder how to detect and|or avoid that.
> >
> > I have considered having systat running on a console screen but do not know
> > which systat display that might tell me anything.
> >
> > Best regards
> 
> "/bsd: proc: table is full" means that the process table is full, but it doesn't
> tell you what caused this.
> 
> The process table size is controlled by kern.maxproc, it is possible
> that the default is insufficient for your needs, but it's also possible
> that there was a build-up of processes that didn't exit due to another
> problem on the system.
> 
> I would leave top(1) running on the system, and also save "ps ax" output
> regularly, then look at that output in the run-up to a failure, to see
> if that gives clues.
> 

It seems that the full process table is a secondary symptom, and that there
is something else that happens on the machines a few hours before the
process table fills...
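
To tell a too-small kern.maxproc apart from a slow build-up of stuck
processes, the logged process count can be compared against the limit.
A minimal sketch only: the helper name proc_table_pct is made up for this
note, and the usage line assumes OpenBSD's sysctl(8) and that the ps(1)
header line is an acceptable off-by-one.

```shell
# Print how many percent of the process table is in use.
#   $1 = current number of processes
#   $2 = table size (on OpenBSD: the value of kern.maxproc)
# Integer arithmetic only, so the result is rounded down.
proc_table_pct() {
    echo $(( $1 * 100 / $2 ))
}
```

Usage on the affected machine would be something like
  proc_table_pct "$(ps ax | wc -l)" "$(sysctl -n kern.maxproc)"
logged next to each 'ps' snapshot.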

On one machine I had left "systat pigs" running, and the last thing it
showed was about 90% for softnet and the rest , IIRC.

I have now corrected a presumably unrelated error in our nightly tests that
occurred just before the freeze.  The test started a child process that was
abandoned, and when it noticed its controlling socket close it started to
write an error log.  Previously that froze sometimes and a few hours later
the process table got full.  Now the child process is not abandoned, and
I have not seen the freeze since...

Still chasing ghosts, this can simply not be over yet.

Best Regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



smartd - Currently unreadable (pending) sectors

2019-07-12 Thread Raimo Niskanen
Hi all.

I just wanted to share a small success story, partly for the archives.

I have an OpenBSD 6.5 amd64 server whose smartd has been whining about:
  smartd[666]: Device: /dev/sd0c, 1 Currently unreadable (pending) sectors

smartctl -a /dev/sd0c showed attribute 197 Current_Pending_Sector to be 1.

The disk is part of a softraid mirror sd2, so I offlined the disk
and rebuilt the mirror, thusly:
  bioctl -O /dev/sd0a sd2
  bioctl -R /dev/sd0a sd2

After a few hours of parity rebuild smartd has stopped spamming
/var/log/messages, and smartctl -a shows Current_Pending_Sector to be 0.
The parity rebuild, as expected, must have re-written the broken sector and
the disk reallocated the sector.
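
For the archives, a sketch of how the pending-sector check could be
scripted.  It assumes the usual ten-column attribute table that
"smartctl -A" prints, with the raw value in the last column; the function
name pending_sectors is made up for this example.

```shell
# Read "smartctl -A" style output on stdin and print the raw value of
# SMART attribute 197 (Current_Pending_Sector).
# Assumed columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
pending_sectors() {
    awk '$1 == 197 && $2 == "Current_Pending_Sector" { print $10 }'
}
```

Usage would be
  smartctl -A /dev/sd0c | pending_sectors
before and after the rebuild.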

Sysadmin is happy!

Except that smartctl -a informs me that the disk predicts its power-on
lifetime to be 382 days, which apparently is not good.  The other disk
in the mirror says nothing of this, so I have a very early warning
about getting a new disk...  (sd0a's Reallocated_Sector_Ct is 148
while sd0b's is 0)

Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-07-09 Thread Raimo Niskanen
On Tue, Jul 09, 2019 at 10:33:46AM -0400, Kenneth Gober wrote:
> On Tue, Jul 2, 2019 at 10:06 AM Raimo Niskanen <
> raimo+open...@erix.ericsson.se> wrote:
> 
> > In /var/log/messages I find suspicious entries "/bsd: proc: table is full"
> > possibly before the machines become inresponsive, but these entries appear
> > many more times before that point.  And after this "table is full" message
> > there are many syslog entries; on one machine smartd constantly complains about
> > an unreadable (pending) sector and atascsi_passthru_done timeout, and on
> > the other the kernel complains about a probed monitor but no|invalid EDID.
> >
> 
> In addition to Stuart's suggestion to leave top(1) running, and periodically
> save "ps ax" output, it might also be a good idea to start up a bunch of
> nested shells and just leave them running.  This will reserve a bunch of
> process table slots, which you will be able to use via "exec", the idea
> being that if you can't fork new processes, you can at least use exec to
> replace an existing ksh process with something else.  This will hopefully
> give you some limited ability to run a few post-mortem diagnostic commands
> before you run out of reserved process table slots.
> 
> -ken

That's a nice one.  Thank you!

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Feeding DHCP leases into unbound

2019-07-04 Thread Raimo Niskanen
On Thu, Jun 22, 2017 at 11:47:03AM +0200, Andreas Kusalananda Kähäri wrote:
> Hi,
> 
> I have unbound(8) and dhcpd(8) running on a router (OpenBSD 6.1-stable).
> dhcpd currently hands out fixed addresses to my clients, but I'd like
> these to be allocated dynamically from the common pool, while at the
> same time being resolvable.
> 
> Is there an existing solution for feeding the IP-addresses of the leases
> that dhcpd hands out into the unbound configuration and reload it, or
> would I have to write a script that parses the lease declarations in
> /var/db/dhcpd.leases?

I have scripted it the other way around.  I have comments in /etc/hosts
containing the MAC addresses for the hosts and then generate dhcpd and
unbound configurations from that.  dhcpd offers IP address based on MAC.
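
A rough sketch of the idea, not the actual script.  It assumes each
/etc/hosts line of interest looks like
  192.168.1.10 printer # mac=00:11:22:33:44:55
(the "# mac=" comment convention and both function names are made up
here), and it leaves out domain handling and the reload step.

```shell
# Read hosts(5)-style lines on stdin and print one dhcpd.conf host
# declaration per line that carries a "# mac=..." comment.
hosts_to_dhcpd() {
    awk '
        $1 !~ /^#/ && match($0, /# *mac=[0-9A-Fa-f:]+/) {
            mac = substr($0, RSTART, RLENGTH)
            sub(/^# *mac=/, "", mac)
            printf "host %s { hardware ethernet %s; fixed-address %s; }\n", $2, mac, $1
        }'
}

# Same input; print one unbound.conf local-data line per tagged host.
hosts_to_unbound() {
    awk '
        $1 !~ /^#/ && /# *mac=/ {
            printf "local-data: \"%s. IN A %s\"\n", $2, $1
        }'
}
```

The two outputs would then be included from dhcpd.conf and unbound.conf
respectively, e.g. "hosts_to_dhcpd < /etc/hosts > some-include-file".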

> 
> I know about dnsmasq in ports which I believe serves as both DHCP and
> DNS server, but I'd rather use the software in the base system if at all
> possible.
> 
> Regards,
> Kusalananda

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: How to debug hanging machines / proc: table is full

2019-07-03 Thread Raimo Niskanen
On Tue, Jul 02, 2019 at 05:13:43PM +, Stuart Henderson wrote:
> On 2019-07-02, Raimo Niskanen  wrote:
> > Hi misc@!
> >
> > If anyone has got some tips about how to debug two hanging machines we have
> > in our test lab I am eager to learn.
> >
> > The machines runs 6.5, amd64 and are patched up to 005_libssl using M:Tier's
> > openup.  Other than that they are rather different, one small Zotac
> > ZBox-AD02 with AMD E-350 at 1.6 GHz, and one rack mounted Dell PowerEdge
> > R230 with Intel Xeon E3-1220.
> >
> > The overall symptoms are that it is possible to switch screens using
> > Alt+Ctrl+F1..Fn, but when logging in as root the greeting prints but no
> > prompt.  Alt+Ctrl+Del does not work.  The power button does not work.  I
> > have to long press the power button to force power off.
> >
> > This happens during our nightly tests, that are quite resource intesive.
> >
> > In /var/log/messages I find suspicious entries "/bsd: proc: table is full"
> > possibly before the machines become inresponsive, but these entries appear
> > many more times before that point.  And after this "table is full" message
> > there are many syslog entries; on one machine smartd constantly complains about
> > an unreadable (pending) sector and atascsi_passthru_done timeout, and on
> > the other the kernel complains about a probed monitor but no|invalid EDID.
> >
> > So it seems the machine is out of some resource and fails to spawn a login
> > shell.  Any clues to how I can find more details and a remedy?  I suspect a
> > full process table, but wonder how to detect and|or avoid that.
> >
> > I have considered having systat running on a console screen but do not know
> > which systat display that might tell me anything.
> >
> > Best regards
> 
> "/bsd: proc: table is full" means that the process table is full, but it doesn't
> tell you what caused this.
> 
> The process table size is controlled by kern.maxproc, it is possible
> that the default is insufficient for your needs, but it's also possible
> that there was a build-up of processes that didn't exit due to another
> problem on the system.
> 
> I would leave top(1) running on the system, and also save "ps ax" output
> regularly, then look at that output in the run-up to a failure, to see
> if that gives clues.
> 

Great!  I will do that...
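
A minimal sketch of such a logger, assuming a plain sh and that appending
to one growing file is acceptable; the function name snapshot_ps, the log
path and the one-minute interval below are arbitrary choices, not
anything from the thread.

```shell
# Append one timestamped "ps ax" snapshot to the log file given as $1.
snapshot_ps() {
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        ps ax
    } >> "$1"
}
```

It could be left running in a spare terminal as
  while :; do snapshot_ps /var/log/ps-snapshots.log; sleep 60; done
so that the last snapshot before a hang shows what was running.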
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



How to debug hanging machines / proc: table is full

2019-07-02 Thread Raimo Niskanen
Hi misc@!

If anyone has got some tips about how to debug two hanging machines we have
in our test lab I am eager to learn.

The machines run 6.5 amd64 and are patched up to 005_libssl using M:Tier's
openup.  Other than that they are rather different, one small Zotac
ZBox-AD02 with AMD E-350 at 1.6 GHz, and one rack mounted Dell PowerEdge
R230 with Intel Xeon E3-1220.

The overall symptoms are that it is possible to switch screens using
Alt+Ctrl+F1..Fn, but when logging in as root the greeting prints but no
prompt.  Alt+Ctrl+Del does not work.  The power button does not work.  I
have to long press the power button to force power off.

This happens during our nightly tests, which are quite resource intensive.

In /var/log/messages I find suspicious entries "/bsd: proc: table is full"
possibly before the machines become unresponsive, but these entries appear
many more times before that point.  And after this "table is full" message
there are many syslog entries; on one machine smartd constantly complains about
an unreadable (pending) sector and atascsi_passthru_done timeout, and on
the other the kernel complains about a probed monitor but no|invalid EDID.

So it seems the machine is out of some resource and fails to spawn a login
shell.  Any clues to how I can find more details and a remedy?  I suspect a
full process table, but wonder how to detect and|or avoid that.

I have considered having systat running on a console screen but do not know
which systat display might tell me anything.

Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: 6.5 pkg_add "Fatal error: Can't write session into tmp directory"

2019-07-01 Thread Raimo Niskanen
On Sun, Jun 30, 2019 at 01:18:15PM -0700, Jonathan Thornburg wrote:
> I have 6.5/i386 installed on a PC Engines alix board (hostname 'sodium'),
> acting as a home firewall and router.  I'd like to install some packages
> on the firewall to make system administration easier.  So... I downloaded
> the appropriate 6./i386 packages from a nearby OpenBSD mirror, ssh-ed them
> to /tmp on the firewall, and then (logged into the firewall as root) tried
> to  pkg_add  them.  Alas, pkg_add failed with an error message about being
> unable to write into a temp directory:
> 
>   sodium# pkg_add -vv tcsh-6.20.00p1-static.tgz
>   Fatal error: Can't write session into tmp directory
>    at /usr/libdata/perl5/OpenBSD/PackageRepository.pm line 1025.
>   sodium#
> 
> I've checked that the firewall has adequate free memory & swap space,
> that all the obviously-relevant filesystems are mounted read-write and
> have free inodes and disk space, and that 'touch foo' can create a new
> file in each of /tmp, /var/tmp, and /usr/tmp.
> 
> Is there something obvious I've overlooked here?  A Fine Man Page I should
> be rereading before I start hacking debug prints into the pkg_add (perl)
> source code?
> 
> Further information (cut-and-pasted from ssh session on the firewall):
> 
>   sodium# uname -a
>   OpenBSD sodium.bkis-orchard.net 6.5 GENERIC#1 i386
>   sodium# df -hi
>   Filesystem     Size    Used   Avail Capacity iused   ifree  %iused  Mounted on
>   /dev/wd0a      378M   47.7M    311M    13%    1771   47379     4%   /
>   mfs:54350     62.9M    2.0M   57.7M     3%       8    8182     0%   /tmp
>   /dev/wd0e      677M   15.1M    628M     2%     352   87710     0%   /var
>   /dev/wd0f      1.5G    698M    734M    49%   16248  191622     8%   /usr
>   mfs:42325     62.9M    2.0K   59.7M     0%       1    8189     0%   /usr/tmp

Am I reading the numbers correctly that /tmp and /usr/tmp are two different
memory file systems of maximum size 62.9M?  If so, I wonder what pkg_add is
trying to write into /tmp, it might be way more than just some metadata...

/ Raimo Niskanen


>   /dev/wd0g      516M    138M    352M    28%    8980   58602    13%   /usr/X11R6
>   /dev/wd0h      1.7G    218K    1.6G     0%     110  233744     0%   /usr/local
>   /dev/wd0j      5.1G    2.0K    4.8G     0%       1  701565     0%   /usr/obj
>   /dev/wd0i      1.3G    2.0K    1.3G     0%       1  181885     0%   /usr/src
>   sodium# cat /etc/fstab
>   5fd63b50b0c6cb1d.a /            ffs rw,softdep,noatime                  1 1
>   5fd63b50b0c6cb1d.d /tmp         mfs rw,async,nodev,nosuid,-s=64m        0 0
>   5fd63b50b0c6cb1d.e /var         ffs rw,softdep,noatime,nodev,nosuid     1 2
>   5fd63b50b0c6cb1d.f /usr         ffs rw,softdep,noatime,nodev            1 2
>   5fd63b50b0c6cb1d.d /usr/tmp     mfs rw,async,nodev,nosuid,-s=64m        0 0
>   5fd63b50b0c6cb1d.g /usr/X11R6   ffs rw,softdep,noatime,nodev            1 2
>   5fd63b50b0c6cb1d.h /usr/local   ffs rw,softdep,noatime,wxallowed,nodev  1 2
>   5fd63b50b0c6cb1d.j /usr/obj     ffs rw,softdep,noatime,nodev,nosuid     1 2
>   5fd63b50b0c6cb1d.i /usr/src     ffs rw,softdep,noatime,nodev,nosuid     1 2
>   sodium# top|head
>   load averages:  0.08,  0.02,  0.01    sodium.bkis-orchard.net 13:12:00
>   52 processes: 1 running, 50 idle, 1 on processor  up 14 days,  5:21
>   CPU:  0.1% user,  0.0% nice,  0.3% sys,  0.0% spin,  0.3% intr, 99.3% idle
>   Memory: Real: 35M/110M act/tot Free: 127M Cache: 46M Swap: 0K/548M
>
>     PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
>   59735 root      10    0    0K   19M sleep     bored    44:53  0.44% softnet
>   65312 root     -22    0    0K   19M sleep     -       339.9H  0.00% idle0
>   57981 root      10    0    0K   19M sleep     bored     7:56  0.00% sensors
>   39371 _unbound   2    0   12M   10M sleep     kqread    1:33  0.00% unbound
>   sodium# cd /tmp
>   sodium# ls -l
>   total 4144
>   drwxrwxrwt  2 root  wheel  512 Jun 16 07:51 .ICE-unix
>   drwxrwxrwt  2 root  wheel  512 Jun 16 07:51 .X11-unix
>   -rw-r--r--  1 root  wheel  1499861 Jun 30 12:31 lynx-2.8.9rel1.tgz
>   drwxr-xr-x  2 root  wheel  512 Jun 16 07:51 sndio
>   -rw-r--r--  1 root  wheel   564428 Jun 30 12:31 tcsh-6.20.00p1-static.tgz
>   drwxrwxrwt  2 root  wheel  512 Jun 30 12:33 vi.recover
>   sodium#
>   sodium# pkg_info
>   sodium# 
>   sodium# which pkg_add
>   /usr/sbin/pkg_add
>   sodium# pkg_add -vv tcsh-6.20.00p1-static.tgz
>   Fatal error: Can't write session into tmp directory
>    at /usr/libdata/perl5/OpenBSD/PackageRepository.pm line 1025.
>   sodium# env
>   _=/usr/bin/env
>   LOGNAME=root
>   PWD=/tmp
>   HOME=/root
>   OLDPWD=/tmp
>   SSH_TTY=/dev/ttyp0
>   TOP=

Re: how to know the progressive state of dd

2018-06-28 Thread Raimo Niskanen
On Mon, Jun 25, 2018 at 06:07:23PM -0600, Todd C. Miller wrote:
> As someone else mentioned you would use pkill on OpenBSD.
> 
> However, you will also need to use SIGINFO, not SIGUSR1, to get
> dd's status.  BSD systems have traditionally used SIGINFO for this
> purpose.  Linux lacks SIGINFO so there is no consistent signal for
> this kind of a thing there.
> 
>  - todd

... and do not send random signals to all processes.  Find some way to
target the right signal to the right process.  For example from a shell
script starting a dd background process use kill $! which will send a
signal to the most recent background command.
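
A minimal sketch of that idea, assuming a script that starts dd itself.
The helper name send_status_signal and the device and file names in the
commented usage are placeholders, not anything from this thread.

```shell
#!/bin/sh
# Sketch: signal only the background job this script started.
# send_status_signal is an illustrative helper name.

# $1 = pid of the background job
# $2 = signal name (defaults to INFO, the signal suggested for dd on
#      BSD systems earlier in the thread)
send_status_signal() {
    kill -"${2:-INFO}" "$1"
}

# Example use on OpenBSD (not executed here; /dev/rsd1c and disk.img
# are hypothetical names):
#   dd if=/dev/rsd1c of=disk.img bs=1m &
#   dd_pid=$!                          # pid of the most recent background command
#   while kill -0 "$dd_pid" 2>/dev/null; do
#       send_status_signal "$dd_pid"   # dd reports its progress on stderr
#       sleep 10
#   done
```

Because the pid comes from $!, no other process can receive the signal
by accident, which is the point of avoiding a pattern match on "dd".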


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: fdisk MBR contains more than one OpenBSD partition!

2018-05-14 Thread Raimo Niskanen
On Wed, May 09, 2018 at 12:33:40PM +, Rudolf Sykora wrote:
> > So please describe in more detail what kind of backup you want.
> 
> I just want to regularly rsync /home to the "backup" partition
> with some history (along the lines of
> 
> https://netfuture.ch/2013/08/simple-versioned-timemachine-like-backup-using-rsync/
> ).
> 
> This partition (or part of it) will later also be backed up to some
> other machine.
> 
> The partition will be mounted read-only most of the time; only for
> back-up it will be remounted.

So far a regular OpenBSD disklabel partition with OpenBSD filesystem
fits the bill.

> 
> I would prefer that the backup partition be readable / mountable from
> other machines. That's why I tried a separate MBR partition rather
> than an OpenBSD disklabel one.

And there it got peculiar.

What "other machines"?  What OSes?  What filesystem?  How do you envision
a separate MBR partition will help you with this?

> 
> Ruda

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: fdisk MBR contains more than one OpenBSD partition!

2018-05-09 Thread Raimo Niskanen
On Wed, May 09, 2018 at 09:06:24AM +, Rudolf Sykora wrote:
> Hello misc,
> 
> I wanted to use a MBR partition for backup purposes,
> so I (almost) created (using fdisk) another OpenBSD MBR (A6)
> partition, but then I got the message
> 
> MBR contains more than one OpenBSD partition!
> Write MBR anyway? [n]
> 
> So am I doing it wrong?

Well.  Yes.

The BSDs have a disk label of their own, and OpenBSD keeps its
disklabel inside the MBR's OpenBSD partition, when MBR is used.  So
there is supposed to be only one OpenBSD partition containing the BSD
disklabel describing the OpenBSD view of the disk's partitioning.

If you have more than one it might work, if all parts of the system select
the same OpenBSD MBR partition, and only warn about the second.
But only that one MBR partition, with its BSD disklabel, will be used.

I have heard of variants where you set one MBR partition at a time to A6
and the other to something else, which is messy.

And it is not intended to operate that way.

You could use one OpenBSD MBR partition and in the BSD disklabel allocate a
big partition of type RAID.  Then use that partition in softraid as RAID 0
or CONCAT - they might allow using a single chunk.  Or as CRYPTO with a
dummy encryption key.

On the new softraid disk you create an MBR OpenBSD partition and so on...

See softraid(4), bioctl(8) and
https://www.openbsd.org/faq/faq14.html#softraid

Whether that is a good suggestion depends very much on what kind of backup
you have in mind.  There are probably many other more BSD:ish ways to do it
than you think.

So please describe in more detail what kind of backup you want.

> 
> Thanks for comments!
> 
> Ruda

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: 6.2 song?

2018-03-20 Thread Raimo Niskanen
On Thu, Mar 15, 2018 at 07:48:45AM +, Maurice McCarthy wrote:
> On 15/03/18 01:38, Stuart Henderson wrote:
> > On 2018-03-15, jungle boogie <jungleboog...@gmail.com> wrote:
> 
> > 
> > it doesn't say which December.
> > 
> > (and I don't really see why 14/3/2018 would be "pi day"...)
> > 
> > 
> 
> Pi = 3.14 ... 

Pi =~ 3.14

> 
> Personally I think it should be 22nd July, in Britain.
> 
> Pi =~ 22/7

Actually, 22/7 is closer to Pi than 3.14, by about 3e-4.
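
For anyone who wants to check that figure, a small sketch using awk.
The function name pi_errors is just an illustrative choice.

```shell
#!/bin/sh
# Sketch: compare how far 22/7 and 3.14 are from pi.
# pi_errors is an illustrative name.
pi_errors() {
    awk 'BEGIN {
        pi = atan2(0, -1)          # pi from the arctangent
        e1 = 22/7 - pi             # 22/7 is slightly too large
        e2 = pi - 3.14             # 3.14 is slightly too small
        printf "22/7 error: %.6f\n", e1
        printf "3.14 error: %.6f\n", e2
        printf "difference: %.6f\n", e2 - e1
    }'
}
```

Calling pi_errors prints errors of 0.001264 and 0.001593, so 22/7 is
closer by 0.000328, i.e. about 3e-4.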

> 
> :)
> 
>  

Since Sweden joined the EU, date formats have become a mess here.
Before that ISO 8601 was winning; now we have 3 different date markings
on food depending on which type of food it is or whatnot.

So I celebrate both Pi dates, just to be sure :-)


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: http_proxy for rc.firsttime after Upgrade

2018-01-29 Thread Raimo Niskanen
On Mon, Jan 29, 2018 at 09:48:08AM +, Stuart Henderson wrote:
> On 2018-01-23, Raimo Niskanen <raimo+open...@erix.ericsson.se> wrote:
> > On Mon, Jan 22, 2018 at 08:22:34PM -0500, trondd wrote:
> >> On Mon, January 22, 2018 2:36 am, Raimo Niskanen wrote:
> >> > On Fri, Jan 19, 2018 at 10:47:15AM -0500, trondd wrote:
> >> >> On Fri, January 19, 2018 4:29 am, Raimo Niskanen wrote:
> >> >> > I have some machines behind a squid proxy and have set the http_proxy
> >> >> > and ftp_proxy environment variables both in /etc/profile and in
> >> >> > /etc/login.conf for the default login class.  This works well.
> >> >> >
> >> >> > But after an upgrade when rc.firsttime calls fw_update and checks for
> >> >> > binary patches the proxy is not used, so I have to wait for that to
> >> >> > time out or break it with Ctrl-C and call fw_update manually.
> >> >> >
> >> >> > So I just wonder if anybody has an idea of how to set the http_proxy
> >> >> > and ftp_proxy environment variables so they are picked up by
> >> >> > rc.firsttime?
> >> >>
> >> >> I submitted a patch for this:
> >> >> https://marc.info/?l=openbsd-tech&m=151260860105270&w=2
> >> >
> >> > That sure looks like an improvement!  But should maybe $http_proxy be
> >> > placed between single quotes?
> >> >
> >> > Unfortunately I fetch the sets into /var/OpenBSD/`machine` and
> >> > verify them before rebooting into /bsd62.rd, so it would not work
> >> > for me...
> >> 
> >> Ah, I see. Yeah, I only accounted for the obvious case when a net
> >> install was done.
> >> 
> >> Having thought about it again, an easier solution will be to write your
> >> http_proxy export to /etc/rc.firsttime before rebooting into bsd.rd.  If
> >> you have your update process scripted already, it's an easy additional
> >> line.  The installer only appends commands so anything you have in
> >> rc.firsttime will be preserved.
> >
> > In my case it would work if rc.firsttime sourced /etc/profile, but I do not
> > know if that is a generally good idea...,
> 
> I think this is probably not a good idea, profile may not be squeaky-clean
> or it might not work correctly before system startup.

I see your point.

But maybe transferring http_proxy from the installer to rc.firsttime is a
good feature...  Does not help me, though.

> 
> >   in particular since I can not
> > find any way to set environment variables for /etc/rc to pass to system and
> > package daemons.
> 
> It doesn't help for rc.firsttime, but the canonical way to do this is via
> a class in login.conf - rc.d(8) automatically handles this:
> 
>    "daemon_class is a special read-only variable.  It is set to "daemon"
>    unless there is a login class configured in login.conf(5) with the same
>    name as the rc.d script itself, in which case it will be set to that
>    login class.  This allows setting many initial process properties, for
>    example environment variables, scheduling priority, and process limits
>    such as maximum memory use and number of files."
> 
> For example:
> 
> daemonname:setenv=FOO=bar:tc=daemon:
> 
> If you need a : in the definition, use \c.
> 
> daemonname:setenv=PATH=/usr/local/bin\c/usr/bin:tc=daemon:

That is a neat feature!  I have already set http_proxy, https_proxy
and ftp_proxy for the 'default' login class.

Great to know for the future that variables can be set per daemon name!
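
Since a proxy URL is full of colons, here is a small sketch of how such
an entry could be generated with the \c escaping described above.  The
helper names, "mydaemon" and the proxy URL are placeholders of my own,
not anything from rc.d(8) or login.conf(5).

```shell
#!/bin/sh
# Sketch: build a login.conf(5) setenv entry for a per-daemon class.
# login_conf_escape and daemon_class_entry are illustrative helpers.

# Replace every ':' with '\c', since ':' separates login.conf fields.
login_conf_escape() {
    printf '%s\n' "$1" | sed 's/:/\\c/g'
}

# $1 = rc.d script name, $2 = variable name, $3 = value
daemon_class_entry() {
    printf '%s:setenv=%s=%s:tc=daemon:\n' "$1" "$2" "$(login_conf_escape "$3")"
}

# Example:
#   daemon_class_entry mydaemon http_proxy http://proxy.example.org:3128/
# prints a line that could be added to /etc/login.conf:
#   mydaemon:setenv=http_proxy=http\c//proxy.example.org\c3128/:tc=daemon:
```

This only handles colons; a value containing a comma would need more
care, since setenv takes a comma-separated list.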

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: http_proxy for rc.firsttime after Upgrade

2018-01-23 Thread Raimo Niskanen
On Mon, Jan 22, 2018 at 08:22:34PM -0500, trondd wrote:
> On Mon, January 22, 2018 2:36 am, Raimo Niskanen wrote:
> > On Fri, Jan 19, 2018 at 10:47:15AM -0500, trondd wrote:
> >> On Fri, January 19, 2018 4:29 am, Raimo Niskanen wrote:
> >> > Hello list!
> >> >
> >> > I have some machines behind a squid proxy and have set the http_proxy
> >> > and ftp_proxy environment variables both in /etc/profile and in
> >> > /etc/login.conf for the default login class.  This works well.
> >> >
> >> > But after an upgrade when rc.firsttime calls fw_update and checks for
> >> > binary patches the proxy is not used, so I have to wait for that to
> >> > time out or break it with Ctrl-C and call fw_update manually.
> >> >
> >> > So I just wonder if anybody has an idea of how to set the http_proxy
> >> > and ftp_proxy environment variables so they are picked up by
> >> > rc.firsttime?
> >> >
> >> > Best regards
> >> > --
> >> >
> >> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >> >
> >>
> >> I submitted a patch for this:
> >> https://marc.info/?l=openbsd-tech&m=151260860105270&w=2
> >
> > That sure looks like an improvement!  But should maybe $http_proxy be
> > placed between single quotes?
> >
> > Unfortunately I fetch the sets into /var/OpenBSD/`machine` and verify them
> > before rebooting into /bsd62.rd, so it would not work for me...
> >
> >>
> >> In the meantime, before reboot, you can edit the rc.firsttime script
> >> after installation.
> >
> > I'll try that trick next time.  Thank you!
> >
> >>
> >> Tim.
> >
> > --
> >
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >
> 
> Ah, I see.  Yeah, I only accounted for the obvious case when a net install
> was done.
> 
> Having thought about it again, an easier solution will be to write your
> http_proxy export to /etc/rc.firsttime before rebooting into bsd.rd.  If
> you have your update process scripted already, it's an easy additional
> line.  The installer only appends commands so anything you have in
> rc.firsttime will be preserved.
> 
> Tim.
> 

In my case it would work if rc.firsttime sourced /etc/profile, but I do not
know if that is a generally good idea..., in particular since I can not
find any way to set environment variables for /etc/rc to pass to system and
package daemons.

Is this a feature or a missing feature???
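
For the record, Tim's suggestion quoted above (write the export into
/etc/rc.firsttime before rebooting into bsd.rd) could look roughly like
this in an upgrade script.  The helper name add_proxy_export and the
proxy URL are placeholders, and the sketch assumes the value contains
no single quote.

```shell
#!/bin/sh
# Sketch: queue proxy settings for the first boot after an upgrade.
# add_proxy_export is an illustrative helper name.

# $1 = file to append to (normally /etc/rc.firsttime)
# $2 = variable name
# $3 = value (assumed not to contain a single quote)
add_proxy_export() {
    printf "export %s='%s'\n" "$2" "$3" >> "$1"
}

# Before rebooting into bsd.rd, something like:
#   add_proxy_export /etc/rc.firsttime http_proxy http://proxy.example.org:3128/
#   add_proxy_export /etc/rc.firsttime ftp_proxy  http://proxy.example.org:3128/
```

Since the installer only appends to the file, these lines would end up
ahead of the fw_update call it adds.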

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Boot problem OpenBSD 6.2 amd64 softraid keydisk

2018-01-23 Thread Raimo Niskanen
On Mon, Jan 22, 2018 at 11:16:29AM +0100, Stefan Sperling wrote:
> On Mon, Jan 22, 2018 at 10:08:30AM +0100, Raimo Niskanen wrote:
> > Hello misc@!
> > 
> > I just wanted to share a problem and a solution that I encountered.  Just
> > posting to maybe help someone else in the future, and perhaps a developer
> > feels that improving a particular error message could be important enough.
> > 
> > My goal was to create an installation with a fully encrypted hard drive
> > using a keydisk, and at first reboot into the installed system I got this:
> > 
> > Booting from hard disk...
> > Using drive 0, partition 3.
> > Loading..
> > probing: pc0 com0 com1 mem[638K 3582M 496M a20=on]
> > disk: hd0+ hd1+ hd2 sr0*
> > >> OpenBSD/amd64 BOOT 3.33
> > unknown KDF type 2
> > open(sr0a:/etc/boot.conf): Operation not permitted
> > boot>
> > 
> > The error message "unknown KDF type 2" is the one that maybe could
> > be improved...
> 
> This error message has already been improved in -current by sunil@ in r1.3 of
> http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/lib/libsa/softraid.c
> The message now says "keydisk not found".

Excellent!

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: http_proxy for rc.firsttime after Upgrade

2018-01-22 Thread Raimo Niskanen
On Fri, Jan 19, 2018 at 10:47:15AM -0500, trondd wrote:
> On Fri, January 19, 2018 4:29 am, Raimo Niskanen wrote:
> > Hello list!
> >
> > I have some machines behind a squid proxy and have set the http_proxy and
> > ftp_proxy environment variables both in /etc/profile and in
> > /etc/login.conf
> > for the default login class.  This works well.
> >
> > But after an upgrade when rc.firsttime calls fw_update and checks for
> > binary patches the proxy is not used, so I have to wait for that to time
> > out or break it with Ctrl-C and call fw_update manually.
> >
> > So I just wonder if anybody has an idea of how to set the http_proxy and
> > ftp_proxy environment variables so they are picked up by rc.firsttime?
> >
> > Best regards
> > --
> >
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >
> 
> I submitted a patch for this:
> https://marc.info/?l=openbsd-tech&m=151260860105270&w=2

That sure looks like an improvement!  But should maybe $http_proxy be
placed between single quotes?

Unfortunately I fetch the sets into /var/OpenBSD/`machine` and verify them
before rebooting into /bsd62.rd, so it would not work for me...

> 
> In the meantime, before reboot, you can edit the rc.firsttime script after
> installation.

I'll try that trick next time.  Thank you!

> 
> Tim.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Boot problem OpenBSD 6.2 amd64 softraid keydisk

2018-01-22 Thread Raimo Niskanen
Hello misc@!

I just wanted to share a problem and a solution that I encountered.  Just
posting to maybe help someone else in the future, and perhaps a developer
feels that improving a particular error message could be important enough.

My goal was to create an installation with a fully encrypted hard drive
using a keydisk, and at first reboot into the installed system I got this:

Booting from hard disk...
Using drive 0, partition 3.
Loading..
probing: pc0 com0 com1 mem[638K 3582M 496M a20=on]
disk: hd0+ hd1+ hd2 sr0*
>> OpenBSD/amd64 BOOT 3.33
unknown KDF type 2
open(sr0a:/etc/boot.conf): Operation not permitted
boot>

The error message "unknown KDF type 2" is the one that maybe could
be improved...

The mistake was that I used a 16 GB USB keydisk and kept an 8 GB
MSDOS section at the start of the disk, then the OpenBSD section
with an 'a' 8 GB 4.2BSD partition and a 'd' 1 2048 sectors RAID partition
for the keydisk.  You see where this is heading...

The keydisk partition was simply out of reach for the boot(8, amd64)
program.  The boot command "machine diskinfo" gave a hint since the disk
geometry there had fewer cylinders than what fdisk(8) had said, i.e. it said
(if I recall correctly) C,H,S=1024,255,65, i.e. the infamous 8.4 GB limit,
while in fdisk the disk appeared to have about 1900 cylinders.
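
The arithmetic behind that limit, as a sketch.  It assumes 512-byte
sectors and the conventional 63 sectors per track (the geometry above
is quoted from memory); chs_bytes is an illustrative name.

```shell
#!/bin/sh
# Sketch: number of bytes addressable through a BIOS CHS geometry.
# chs_bytes is an illustrative name; 512-byte sectors are assumed.

# $1 = cylinders, $2 = heads, $3 = sectors per track
chs_bytes() {
    echo $(( $1 * $2 * $3 * 512 ))
}

chs_bytes 1024 255 63    # prints 8422686720, about 8.4 GB
```

With the same heads and sectors, 1900 cylinders gives 15628032000
bytes, which fits a 16 GB stick, so anything placed beyond the first
1024 cylinders is out of reach of a boot loader limited to that
geometry.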

So I moved the OpenBSD section to the start of the disk, the keydisk
partition to the start of the OpenBSD section, and the MSDOS section
at the end of the disk, and the installation booted.

Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



http_proxy for rc.firsttime after Upgrade

2018-01-19 Thread Raimo Niskanen
Hello list!

I have some machines behind a squid proxy and have set the http_proxy and
ftp_proxy environment variables both in /etc/profile and in /etc/login.conf
for the default login class.  This works well.

But after an upgrade when rc.firsttime calls fw_update and checks for
binary patches the proxy is not used, so I have to wait for that to time
out or break it with Ctrl-C and call fw_update manually.

So I just wonder if anybody has an idea of how to set the http_proxy and
ftp_proxy environment variables so they are picked up by rc.firsttime?

Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Writing "ones" instead of "zeroes" when wiping disk

2018-01-12 Thread Raimo Niskanen
On Thu, Jan 11, 2018 at 11:16:28AM -0600, L. V. Lammert wrote:
> On Thu, 11 Jan 2018, STeve Andre' wrote:
> 
> > Don't bother.   Wiping the disk twice is enough.   If you are storing state
> > secrets melt the disk.
> >
> An anvil & big hammer also works well and gives some exercise in the
> process.

Or a screwdriver and a pair of pliers if you want less exercise.


> 
>   Lee

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Wifi Ierrs

2018-01-11 Thread Raimo Niskanen
On Thu, Jan 11, 2018 at 01:43:11PM +0100, Stefan Sperling wrote:
> On Thu, Jan 11, 2018 at 10:51:32AM +0100, Raimo Niskanen wrote:
> > Hello misc!
> > 
> > I have a PC Engines Alix 2d13 with an Atheros AR9280 running WPA2-PSK,
> > and see a lot of input errors over WiFi.  netstat -ivn shows:
> > 
> > Name    Mtu   Network     Address         Ipkts   Ierrs    Opkts Oerrs Colls
> > athn0   1500  172.17/16   172.17.0.1    1160154 4029261  1485342 61906     0
> > 
> > I have tried calling "netstat -W athn0" with 10 seconds intervals and get
> > typically over such intervals:
> > 
> > 170 input unencrypted packets with wep/wpa config discarded
> > 12 input packets with mismatched channel
> > 8 input packets with mismatched ssid
> > 2 input frames below block ack window start
> > 
> > So is this normal for a congested neighbourhood (6-story apartment house
> > - lots of APs around in the house on the 2.4 GHz band), or can anybody
> > think of a setting to tweak?  The router runs OpenBSD 6.2 stable (patched).
> > 
> > Best regards
> > -- 
> > 
> > / Raimo Niskanen
> > 
> 
> These are the numbers for my AP at home:
> 
> $ netstat -nI athn0 
> Name    Mtu   Network     Address              Ipkts  Ierrs    Opkts Oerrs Colls
> athn0   1500  <Link>      04:f0:21:17:3c:6a  2235626 123714  3743974 43802     0
> 
> The wifi network is usable but relatively slow.
> 
> This same card worked perfectly fine on a clean wifi channel up in
> the Canadian mountains where there was virtually no interference.
> Up there I got about 3MB/s transfer rates if I recall correctly.
> 
> After some code inspection I've done recently I came to the conclusion
> that this problem might be due to the fact that our driver does not
> run the regular calibration routines which other OS drivers use.
> If someone looked into that it might help fix the known issues we
> have with these devices. There is calibration code in our driver
> already but most of it is not being called yet. And what's there now
> needs to be cross-checked with other OSs since there are probably bugs.

Ok.  Interesting to know, but unfortunately way out of my competence
domain...  I hope for others to pick up this ball.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Wifi Ierrs

2018-01-11 Thread Raimo Niskanen
Hello misc!

I have a PC Engines Alix 2d13 with an Atheros AR9280 running WPA2-PSK,
and see a lot of input errors over WiFi.  netstat -ivn shows:

Name    Mtu   Network     Address         Ipkts   Ierrs    Opkts Oerrs Colls
athn0   1500  172.17/16   172.17.0.1    1160154 4029261  1485342 61906     0

I have tried calling "netstat -W athn0" with 10 seconds intervals and get
typically over such intervals:

170 input unencrypted packets with wep/wpa config discarded
12 input packets with mismatched channel
8 input packets with mismatched ssid
2 input frames below block ack window start

So is this normal for a congested neighbourhood (6-story apartment house
- lots of APs around in the house on the 2.4 GHz band), or can anybody think of
a setting to tweak?  The router runs OpenBSD 6.2 stable (patched).

Best regards
-- 

/ Raimo Niskanen



Re: Read sysctl from file

2017-07-25 Thread Raimo Niskanen
On Tue, Jul 25, 2017 at 09:32:33AM +0300, Mihai Popescu wrote:
> > As I see it everybody has agreed upon that and some are now just making
> > suggestions on how to solve the OP's problem, that do not involve adding -p
> > to OpenBSD's sysctl. So I think that was uncalled for.
> 
> Not everybody! Man, you talk like a black suit manager here.

Maybe I am ;-)

But I saw nobody in the thread that still advocated that sysctl -p should
be added to OpenBSD.  So that was what I saw was agreed upon by everybody
(in the thread).  Therefore it was not necessary to once again point out
that sysctl -p will never be added to OpenBSD.
Because it will not.
Never.
Already said that.

> 
> > I just do not get that.
> 
> Yes, you obviously don't. It has been explained that the CONCEPT of -p
> is WRONG in OpenBSD area and maybe other areas, too. IF you can grasp
> that, then think why the hell would someone try to implement this and
> find a solution for the OP?

Now that is a different, and valid argument.  To tell someone that
implementing a substitute for sysctl -p is a bad idea because that would
send the wrong message (no message) to the Ansible folks.

But that was not the response the implementer got.

> 
> I think one of the reasons that OpenBSD avoided to become useless
> swiss army knife of OSes is exactly that resistance to implement crap
> "just because ...".

Bla bla bla.  Heard it before.  Agree completely.  Have said it myself
many times.  Nothing new.  And that was not the subject.

Sorry, maybe it was the subject, but very indirectly.

As I see it, the message that helping someone solve a problem in a way that
encourages other OSes' bad decisions is a bad strategy did not get through
the usual misc@ communication style of "go f*ck yourself, you know nothing".

There are better ways to send that message than what was used in this thread.
For example by writing it up front.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Read sysctl from file

2017-07-24 Thread Raimo Niskanen
On Fri, Jul 21, 2017 at 05:30:32PM -0600, Theo de Raadt wrote:
> > > On Jul 21, 2017, at 3:42 PM, li...@wrant.com wrote:
> > >
> > > Fri, 21 Jul 2017 12:33:31 -0700 Peter Faiman <peterfai...@gmail.com>
> > >> # ./sysctl -p example.conf
> > >> Peter
> > >
> > > Hi Peter, ansibles,
> > >
> > > No guarantee systems controls stay affixed, wrapper tools comply got it?
> > 
> > The point of sysctl -p is reloading from a file. So that you put controls in
> > the file and load that file, exactly as happens in system startup. The whole
> > point is to ensure consistency with system startup. True, securelevel throws
> > a bit of a wrench in that, but this works for all other settings.
> 
> We don't have -p.
> 
> It is an addition made by a foreign system which barely uses sysctl,
> and has been acting for years like they will be removing support.
> 
> THERE IS NO SUPPORT FOR -p.
> 
> It is unlikely to happen.

As I see it everybody has agreed upon that and some are now just making
suggestions on how to solve the OP's problem, that do not involve adding -p
to OpenBSD's sysctl.  So I think that was uncalled for.
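
For readers wondering what a suggestion of that kind might look like:
a minimal sketch of a wrapper that feeds a file to plain sysctl one
line at a time.  This is my own illustration, not something posted
verbatim in the thread; apply_sysctl_file is a made-up name, and the
optional second argument exists only so the sketch can be dry-run.

```shell
#!/bin/sh
# Sketch: apply name=value lines from a file with plain sysctl,
# without any -p flag.  apply_sysctl_file is an illustrative name.

# $1 = file with one name=value per line ('#' starts a comment)
# $2 = command to run per line (defaults to sysctl; "echo sysctl"
#      gives a dry run)
apply_sysctl_file() {
    _cmd=${2:-sysctl}
    sed -e 's/#.*$//' -e 's/[[:space:]]*$//' "$1" |
    while read -r _line; do
        # skip lines that are empty after stripping comments
        [ -n "$_line" ] && $_cmd "$_line"
    done
    return 0
}

# Dry run:
#   apply_sysctl_file /etc/sysctl.conf "echo sysctl"
```

As noted elsewhere in the thread, a variable such as securelevel may
refuse a change after boot, so a wrapper like this cannot guarantee the
same state as a reboot.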

> 
> Let's just stop this.  You just aren't capable of listening to what
> is being said.  Also, you are ridiculously rude.

I just do not get that.  I think Peter has listened to what was said and
that others are rude to him for no (very little) reason.

Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Read sysctl from file

2017-07-24 Thread Raimo Niskanen
On Fri, Jul 21, 2017 at 05:40:04PM -0600, Theo de Raadt wrote:
> Peter, please leave.  People around here don't need to read your
> insults.
>  

Peter, you do not have to leave.  Theo says that all the time.

I did not read your posts as particularly insulting to anyone and understand
why you feel you ought to defend yourself for getting maybe deliberately
misunderstood.

Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Restoring /altroot

2017-07-18 Thread Raimo Niskanen
On Mon, Jul 17, 2017 at 08:18:25AM -0400, Nick Holland wrote:
> On 07/17/17 05:50, Raimo Niskanen wrote:
> > On Fri, Jul 14, 2017 at 10:46:14PM -0400, Nick Holland wrote:
> >> On 07/14/17 09:00, Raimo Niskanen wrote:
> >> > Hi misc@.
> >> > 
> >> > I wonder how to restore from an /altroot backup?
> >> > 
> >> > (I missed that pax -r happily writes absolute paths and wrote over
> >> >  /etc from a backup file of another machine)
> >> > 
> >> > 
> >> > Is it to dd(1) back all but the first 16 blocks - the reverse of what
> >> > daily(8) does?  Is that all that is needed?
> >> 
> >> don't...
> >> 
> >> > (I failed to skip the first 16 blocks, and I used the block devices
> >> >  instead of the character devices.  The result was a vegetable, and I
> >> >  would like to understand which of my mistakes were fatal.)
> 
> probably worth answering why this failed...
> 1) The first 16 blocks are where the disklabel is hiding on the first
> partition (usually, 'a').  Blindly copy over a disklabel from the wrong
> disk, you will blow away your current disklabel.  BEST case (both disks
> have the exact same layout), you just changed the DUID of your target
> disk.
> 
> 2) writing to sd0a/wd0a instead of rsd0a/rwd0a just drops the data in
> the wrong place.  This error probably saved your disklabel, so it's a
> good error to combine with the first.  Didn't help anything, but kept
> the damage from being worse.

Strange.  It seems I got a wiped disklabel.
Anyway not worth digging into anymore.
Thank you for the insight!

> 
> >> yeah, that's why.  It CAN work, but ... it is the hard way and it's
> >> error prone.
> >> 
> >> better way: let's say sd1k is your /altroot...
> >> 
> >> # mount /dev/sd1k /altroot
> >> 
> >> now...it's just a normal file system on a normal place.  Copy out
> >> whatever you want.  umount it when done, please.
> >> 
> >> Nick.
> > 
> > Yes, thank you!  That is the safe way.  In this case I wanted to get rid
> > of all files that my pax fumbling had put there, so I wanted to clear the
> > root filesystem and copy back all from /altroot.  But then I also would
> > have to run installboot on the restored root filesystem, right?
> > 
> > Is that the right(tm) way to do it?
> 
> If you copy files from any backup back to root, yes, you will need to
> re-run installboot.  This has to be done any time /boot could have moved
> to a new physical spot on the disk.
> 
> If you really want to blow things completely away, give consideration to
> doing an "upgrade" (to either what you were running or most recent
> release, or even -current), then restoring your /etc/ directory, and
> re-running sysmerge afterwards (if you change versions).
> 
> Nick.

A maybe some day useful trick indeed.  Thank you!

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: syspatch glitch

2017-07-18 Thread Raimo Niskanen
On Mon, Jul 17, 2017 at 02:20:00PM +0200, Antoine Jacoutot wrote:
> On Mon, Jul 17, 2017 at 12:04:19PM +0200, Raimo Niskanen wrote:
> > It seems syspatch looks at the current machine capabilities instead of
> > which kernel is running when it decides on if /bsd is /bsd.sp or /bsd.mp.
> 
> Hi.
> 
> > I tried to install OpenBSD 6.1 to a USB connected CF card that later will
> > run in an alix2d13 that has got one core, but I did the installation from
> > a laptop with two cores.  Both i386.
> > 
> > Then I moved /bsd to /bsd.mp and /bsd.sp to /bsd since the installer had
> > detected that the install machine should run /bsd.mp.
> > 
> > After that I ran syspatch, still on the laptop, and it failed on patch 002
> > with as I remember tar complaining on not being able to find /bsd.sp.
> 
> If you run syspatch on the laptop then what you call the running kernel is the
> one that booted (i.e. the one on the laptop). That's perfectly normal and as
> you saw this is what the installer does as well.
> 
> > installation, and after that it seems both /bsd (.mp) and /bsd.sp are
> > patched, so I can hopefully change the kernels just before putting the CF
> > card in the Alix instead, so no harm done.
> > 
> > But is it by design that syspatch looks at the running machine instead of
> > the running kernel?  I would have expected it the other way around...
> 
> Why would you expect that?
> The installation was done on an MP system. The running machine and running
> kernel are the same in your setup.

Well, what I tried to do was to swap kernels to run the SP kernel on my MP
machine and then run openup(syspatch), assuming that the running kernel is
what determines how to patch the kernels.
That is what I expected most likely to work...

> 
> What you want to do instead is run syspatch from rc.firsttime on your Alix.

In this case I wanted to upgrade the system as far as possible before booting
it on the Alix, i.e. pkg_add and openup(syspatch).  6.1 has got a number of
kernel patches by now so it would be nice to start with them applied.

> Kernel handling is tricky because we need to handle 2 different kernels and
> kernel is usually the thing people like to fuck with...

It seems both kernels are patched, so as long as I know how it works
I can live with that.

> 
> -- 
> Antoine

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Restoring /altroot

2017-07-17 Thread Raimo Niskanen
On Fri, Jul 14, 2017 at 10:46:14PM -0400, Nick Holland wrote:
> On 07/14/17 09:00, Raimo Niskanen wrote:
> > Hi misc@.
> > 
> > I wonder how to restore from an /altroot backup?
> > 
> > (I missed that pax -r happily writes absolute paths and wrote over
> >  /etc from a backup file of another machine)
> > 
> > 
> > Is it to dd(1) back all but the first 16 blocks - the reverse of what
> > daily(8) does?  Is that all that is needed?
> 
> don't...
> 
> > (I failed to skip the first 16 blocks, and I used the block devices instead
> >  of the character devices.  The result was a vegetable, and I would like to
> >  understand which of my mistakes were fatal.)
> 
> yeah, that's why.  It CAN work, but ... it is the hard way and it's
> error prone.
> 
> better way: let's say sd1k is your /altroot...
> 
> # mount /dev/sd1k /altroot
> 
> now...it's just a normal file system on a normal place.  Copy out
> whatever you want.  umount it when done, please.
> 
> Nick.

Yes, thank you!  That is the safe way.  In this case I wanted to get rid
of all files that my pax fumbling had put there, so I wanted to clear the
root filesystem and copy back all from /altroot.  But then I also would
have to run installboot on the restored root filesystem, right?

Is that the right(tm) way to do it?

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



syspatch glitch

2017-07-17 Thread Raimo Niskanen
It seems syspatch looks at the current machine capabilities instead of
which kernel is running when it decides on if /bsd is /bsd.sp or /bsd.mp.

I tried to install OpenBSD 6.1 to a USB connected CF card that later will
run in an alix2d13 that has got one core, but I did the installation from
a laptop with two cores.  Both i386.

Then I moved /bsd to /bsd.mp and /bsd.sp to /bsd since the installer had
detected that the install machine should run /bsd.mp.

After that I ran syspatch, still on the laptop, and it failed on patch 002
with as I remember tar complaining on not being able to find /bsd.sp.

Restoring /bsd to /bsd.sp and /bsd.mp to /bsd allowed me to syspatch the
installation, and after that it seems both /bsd (.mp) and /bsd.sp are
patched, so I can hopefully change the kernels just before putting the CF
card in the Alix instead, so no harm done.

But is it by design that syspatch looks at the running machine instead of
the running kernel?  I would have expected it the other way around...



By the way.  Syspatch and openup really makes keeping a system updated a
breeze - thank you very much for these tools, everyone involved!

Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Restoring /altroot

2017-07-14 Thread Raimo Niskanen
Hi misc@.

I wonder how to restore from an /altroot backup?

(I missed that pax -r happily writes absolute paths and wrote over
 /etc from a backup file of another machine)


Is it to dd(1) back all but the first 16 blocks - the reverse of what
daily(8) does?  Is that all that is needed?

(I failed to skip the first 16 blocks, and I used the block devices instead
 of the character devices.  The result was a vegetable, and I would like to
 understand which of my mistakes were fatal.)


Best regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: File Server with OpenBSD?

2017-03-08 Thread Raimo Niskanen
On Tue, Mar 07, 2017 at 05:55:08PM +0100, Solène Rapenne wrote:
> Le 2017-03-07 17:29, Roderick a écrit :
> For data integrity, you may use sysutils/bitrot to check for data 
> integrity (bit rot).

mtree(8) with -K sha1digest might be enough, and is in the base
system.

> With OpenBSD, you won't get snapshots, on-the-fly compression etc...
> 
> Don't forget backups, that's the most important thing for your file server 
> :-)

Oh yes!

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: rcctl hickups on OpenBSD 6.0?

2017-02-16 Thread Raimo Niskanen
: :
> > Since I had run 'domainname ' and ypbind by hand it had set
> > /var/yp/binding and therefore 'rcctl enable ypbind' concludes that there is
> > no need for an entry in /etc/rc.conf.local because the quirked default value
> > is already ''.
> > 
> > I am pretty certain that the reason that ypbind did not get started from
> > /etc/rc when /etc/defaultdomain contained a domain name and /var/yp/binding
> > was set is that /etc/rc sources /etc/rc.d/rc.subr and runs _rc_parse_conf
> > before /var is mounted so /etc/rc thinks ypbind_flags=NO.  After /var has
> > been mounted ypbind_flags= and therefore 'rcctl ls failed' lists ypbind,
> > which sure enough is not started when it should have been.
> > 
> > Nasty glitch...
> > 
> > I do not know how it should be fixed, but if I had enabled ypbind through
> > rcctl from the start I would have gotten an entry in /etc/rc.conf.local and
> > everything would have just worked.
> > 
> > However, the quirked value for ypbind gets wrong for /etc/rc which I think
> > is kind of a bug...
> 
> Ahahaha, that's an awesome "issue".
> I'll look at fixing this asap.

Excellent!

You could also get this "issue" if you have run ypbind, disable it with
rcctl without removing the YP domainname and /var/yp/binding/, and then
enable it again.

: :
> > Ok.  That figures!  I had read /etc/rc.conf and concluded that the default
> > value for nfsd_flags was NO.
> 
> I mean the default flags when nfsd is enabled.

Yes.  The fact that nfsd_flags= means enabled with default flags which
may be found elsewhere takes me some time to get used to. (I think
nfsd_flags=DEFAULT would have surprised me less; also, using
nfsd_flags=' ' for enabled with no flags is a bit ugly, albeit rarely needed.)

> 
> > rc.subr(8) explains that rc.subr global defaults are overridden by
> > /etc/rc.d/ script defaults that are overridden by /etc/rc.conf.local values.
> > But /etc/rc.conf defaults are not mentioned here.  I feel a bit confused...
> > 
> > But 'rcctl get ' will tell me the truth (except for ypbind_flags
> > in /etc/rc ;-).  Thank you for enlightening me!
> 
> Yes that was one of the reasons rcctl was born; so you can know the status and
> flags of your daemons without having to look into several files.

Maybe make a pointer in /etc/rc.conf to rcctl(8) since historically
the defaults were found there.

Thank you for your prompt response!

> 
> Thanks.
> 
> -- 
> Antoine

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: rcctl hickups on OpenBSD 6.0?

2017-02-16 Thread Raimo Niskanen
On Thu, Feb 16, 2017 at 10:43:43AM +0100, Antoine Jacoutot wrote:
> On Thu, Feb 16, 2017 at 08:46:45AM +0100, Raimo Niskanen wrote:
> > Hello Misc@
> > 
> > I tried to activate ypbind via rcctl:
> > rcctl enable ypbind
> > and it did not write "ypbind_flags=" into /etc/rc.conf.local.
> 
> 
> Can't reproduce here.
> # rcctl enable ypbind ; grep yp /etc/rc.conf.local
>  
> ypbind_flags=
> 
> > I had run ypbind so it should start according to the documentation since
> > there is a domain file in /var/yp/binding/ but when booting the machine
> > ypbind did not start and there was no printout from /etc/rc about starting
> > it.  "rcctl ls failed" did print ypbind.
> 
> If 'rcctl ls failed' outputs ypbind, then it means ypbind_flags *is* in
> rc.conf.local or something is really bogus...

It is... See below.

> 
> > 
> > I tried to debug rcctl with little success.  Looking at the script it seems
> > to me that it checks /etc/rc.conf and /etc/rc.conf.local and should write a
> > line "ypbind_flags=" into /etc/rc.conf.local since the default in
> > /etc/rc.conf is "ypbind_flags=NO".  But ktrace:ing it indicates that it
> > also checks domainname and /var/yp/binding so it is smarter than it looks.
> 
> Wait what? rcctl certainly does not check for these.

It certainly does.  I have found it now!  Well, rcctl does not check for
these, but it relies on
. /etc/rc.d/rc.subr
_rc_parse_conf

And in /etc/rc.d/rc.subr the function _rc_parse_conf calls _rc_quirks
which checks `domainname` and /var/yp/binding and if they are set
ypbind_flags becomes ''.

So does /etc/rc, but misses; read on!

Since I had run 'domainname ' and ypbind by hand it had set
/var/yp/binding and therefore 'rcctl enable ypbind' concludes that there is
no need for an entry in /etc/rc.conf.local because the quirked default value
is already ''.

I am pretty certain that the reason that ypbind did not get started from
/etc/rc when /etc/defaultdomain contained a domain name and /var/yp/binding
was set is that /etc/rc sources /etc/rc.d/rc.subr and runs _rc_parse_conf
before /var is mounted so /etc/rc thinks ypbind_flags=NO.  After /var has
been mounted ypbind_flags= and therefore 'rcctl ls failed' lists ypbind,
which sure enough is not started when it should have been.

Nasty glitch...

I do not know how it should be fixed, but if I had enabled ypbind through
rcctl from the start I would have gotten an entry in /etc/rc.conf.local and
everything would have just worked.

However, the quirked value for ypbind gets wrong for /etc/rc which I think
is kind of a bug...

> 
> > Unfortunately /etc/rc starts ypbind like any other daemon so ypbind_flags
> > has to be != NO and therefore it is not started.
> > 
> > So there seems to be some misunderstanding between /etc/rc and rcctl about
> > exactly when ypbind is enabled or not.
> > 
> > The workaround is easy enough (manually editing /etc/rc.conf.local), so no
> > big issue.
> > 
> > Also, I tried to set nfsd flags:
> > rcctl enable nfsd
> > rcctl set nfsd flags -tun 4
> > or
> > rcctl set nfsd flags "-tun 4"
> > but it did not work (nfsd_flags=)
> > rcctl set nfsd flags -tu
> > did work, though.
> > 
> > Known problems?
> 
> It's not a problem, "-tun 4" are the default flags.
> Check the output of 'rcctl get nfsd flags'.

Ok.  That figures!  I had read /etc/rc.conf and concluded that the default
value for nfsd_flags was NO.

rc.subr(8) explains that rc.subr global defaults are overridden by
/etc/rc.d/ script defaults that are overridden by /etc/rc.conf.local values.
But /etc/rc.conf defaults are not mentioned here.  I feel a bit confused...

But 'rcctl get ' will tell me the truth (except for ypbind_flags
in /etc/rc ;-).  Thank you for enlightening me!


> 
> -- 
> Antoine

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



rcctl hickups on OpenBSD 6.0?

2017-02-16 Thread Raimo Niskanen
Hello Misc@

I tried to activate ypbind via rcctl:
rcctl enable ypbind
and it did not write "ypbind_flags=" into /etc/rc.conf.local.

I had run ypbind so it should start according to the documentation since
there is a domain file in /var/yp/binding/ but when booting the machine
ypbind did not start and there was no printout from /etc/rc about starting
it.  "rcctl ls failed" did print ypbind.

I tried to debug rcctl with little success.  Looking at the script it seems
to me that it checks /etc/rc.conf and /etc/rc.conf.local and should write a
line "ypbind_flags=" into /etc/rc.conf.local since the default in
/etc/rc.conf is "ypbind_flags=NO".  But ktrace:ing it indicates that it
also checks domainname and /var/yp/binding so it is smarter than it looks.

Unfortunately /etc/rc starts ypbind like any other daemon so ypbind_flags
has to be != NO and therefore it is not started.

So there seems to be some misunderstanding between /etc/rc and rcctl about
exactly when ypbind is enabled or not.

The workaround is easy enough (manually editing /etc/rc.conf.local), so no
big issue.

Also, I tried to set nfsd flags:
rcctl enable nfsd
rcctl set nfsd flags -tun 4
or
rcctl set nfsd flags "-tun 4"
but it did not work (nfsd_flags=)
rcctl set nfsd flags -tu
did work, though.

Known problems?
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Dell R230 install, EFI did not work with OpenBSD 6.0

2017-02-16 Thread Raimo Niskanen
acpiprt11 at acpi0: bus -1 (RP08)
acpiprt12 at acpi0: bus 3 (RP09)
acpiprt13 at acpi0: bus -1 (RP10)
acpiprt14 at acpi0: bus 4 (RP11)
acpiprt15 at acpi0: bus -1 (RP12)
acpiprt16 at acpi0: bus -1 (RP13)
acpiprt17 at acpi0: bus -1 (RP14)
acpiprt18 at acpi0: bus -1 (RP15)
acpiprt19 at acpi0: bus -1 (RP16)
acpiprt20 at acpi0: bus -1 (RP17)
acpiprt21 at acpi0: bus -1 (RP18)
acpiprt22 at acpi0: bus -1 (RP19)
acpiprt23 at acpi0: bus -1 (RP20)
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpicpu2 at acpi0: C1(@1 halt!)
acpicpu3 at acpi0: C1(@1 halt!)
"ACPI000D" at acpi0 not configured
"INT3F0D" at acpi0 not configured
"PNP0501" at acpi0 not configured
"PNP0501" at acpi0 not configured
"IPI0001" at acpi0 not configured
acpibtn0 at acpi0: SLPB
"PNP0C14" at acpi0 not configured
"PNP0C33" at acpi0 not configured
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD1F
ipmi at mainbus0 not configured
memory map conflict 0xe00fd000/0x1000
memory map conflict 0xfe00/0x11000
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 vendor "Intel", unknown product 0x1918 rev
0x07
ppb0 at pci0 dev 1 function 0 "Intel Core 6G PCIE" rev 0x07: msi
pci1 at ppb0 bus 1
ppb1 at pci0 dev 1 function 1 "Intel Core 6G PCIE" rev 0x07: msi
pci2 at ppb1 bus 2
xhci0 at pci0 dev 20 function 0 "Intel 100 Series xHCI" rev 0x31: msi
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
pchtemp0 at pci0 dev 20 function 2 "Intel 100 Series Thermal" rev 0x31
"Intel 100 Series MEI" rev 0x31 at pci0 dev 22 function 0 not configured
vendor "Intel", unknown product 0xa13b (class communications subclass
miscellaneous, rev 0x31) at pci0 dev 22 function 1 not configured
ahci0 at pci0 dev 23 function 0 "Intel 100 Series AHCI" rev 0x31: msi, AHCI
1.3.1
ahci0: port 0: 6.0Gb/s
ahci0: port 1: 6.0Gb/s
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 0 lun 0: <ATA, ST1000NM0055-1V4, > SCSI3 0/direct
fixed naa.5000c5009267a5c7
sd0: 953869MB, 512 bytes/sector, 1953525168 sectors
sd1 at scsibus1 targ 1 lun 0: <ATA, ST1000NM0055-1V4, > SCSI3 0/direct
fixed naa.5000c500926799c8
sd1: 953869MB, 512 bytes/sector, 1953525168 sectors
ppb2 at pci0 dev 29 function 0 "Intel 100 Series PCIE" rev 0xf1: msi
pci3 at ppb2 bus 3
3:0:0: mem address conflict 0xfffc/0x4
3:0:1: mem address conflict 0xfffc/0x4
bge0 at pci3 dev 0 function 0 "Broadcom BCM5720" rev 0x00, BCM5720 A0
(0x572), APE firmware NCSI 1.3.16.0: msi, address 10:98:36:a9:c7:35
brgphy0 at bge0 phy 1: BCM5720C 10/100/1000baseT PHY, rev. 0
bge1 at pci3 dev 0 function 1 "Broadcom BCM5720" rev 0x00, BCM5720 A0
(0x572), APE firmware NCSI 1.3.16.0: msi, address 10:98:36:a9:c7:36
brgphy1 at bge1 phy 2: BCM5720C 10/100/1000baseT PHY, rev. 0
ppb3 at pci0 dev 29 function 2 "Intel 100 Series PCIE" rev 0xf1: msi
pci4 at ppb3 bus 4
ppb4 at pci4 dev 0 function 0 "Renesas SH7758 PCIE Switch" rev 0x00
pci5 at ppb4 bus 5
ppb5 at pci5 dev 0 function 0 "Renesas SH7758 PCIE Switch" rev 0x00
pci6 at ppb5 bus 6
ppb6 at pci6 dev 0 function 0 "Renesas SH7758 PCIE-PCI" rev 0x00
pci7 at ppb6 bus 7
vga1 at pci7 dev 0 function 0 "Matrox MGA G200eR" rev 0x01
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pcib0 at pci0 dev 31 function 0 vendor "Intel", unknown product 0xa149 rev
0x31
"Intel 100 Series PMC" rev 0x31 at pci0 dev 31 function 2 not configured
ichiic0 at pci0 dev 31 function 4 "Intel 100 Series SMBus" rev 0x31: apic 2
int 16
iic0 at ichiic0
sdtemp0 at iic0 addr 0x19: tse2004gb2
sdtemp1 at iic0 addr 0x1b: tse2004gb2
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
uhub1 at uhub0 port 3 "no manufacturer Gadget USB HUB" rev 2.00/0.00 addr 2
uhub2 at uhub0 port 4 "ATEN International product 0x8021" rev 1.10/1.00
addr 3
uhidev0 at uhub2 port 1 configuration 1 interface 0 "ATEN CS1708A V1.4.131"
rev 1.10/1.00 addr 4
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 variable keys, 6 key codes
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub2 port 1 configuration 1 interface 1 "ATEN CS1708A V1.4.131"
rev 1.10/1.00 addr 4
uhidev1: iclass 3/1
ums0 at uhidev1: 5 buttons, Z dir
wsmouse0 at ums0 mux 0
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
sd2 at scsibus3 targ 1 lun 0: <OPENBSD, SR RAID 1, 006> SCSI2 0/direct
fixed
sd2: 953866MB, 512 bytes/sector, 1953519473 sectors
root on sd2a (05b1917da05f48e3.a) swap on sd2b dump on sd2b

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Memory alignment

2017-02-06 Thread Raimo Niskanen
On Fri, Feb 03, 2017 at 04:07:25PM +0100, Boudewijn Dijkstra wrote:
> Op Sat, 28 Jan 2017 06:26:16 +0100 schreef Damian McGuckin  
> <dami...@esi.com.au>:
> > What is the recommended most portable way to force memory alignment for  
> > a datum of any type, assuming one has a pointer say
> >
> > char *x
> >
> > I currently use something like
> >
> > char *xany = aligntonext(x, sizeof(long))
> >
> > where I use my own function 'aligntonext' which is defined below and I  
> > also assume that a 'long' will be the natural word-size of the machine  
> > and that any datum things just needs to align to this boundary. That  
> > said, if the second argument is say 4k, the function will align its  
> > result to a 4k boundary.
> >
> > I was wondering if there is an optimal, better, more acceptable, or more  
> > portable, way.
> >
> 
> Easy and very portable:
> 
> void *
> aligntonext(void *x, size_t size)
> {
>   return (void *)((((uintptr_t)x + size - 1u) / size) * size);
> }
> 
> Whether it is optimal depends on compiler optimization.

Isn't this the stuff macros are made of:

# define aligntonext(ptr, size) \
((void*)((((uintptr_t)(ptr) + (size) - 1u) / (size)) * (size)))

Or

# define aligntonext(ptr, bits) \
((void*)((((uintptr_t)(ptr) + (1<<(bits)) - 1u) >> (bits)) << (bits)))

Note that the second argument is evaluated three times in both variants...
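
A function avoids the multiple-evaluation problem.  Here is a minimal
sketch, not code from this thread: the names align_up and align_up_pow2
are made up for illustration, the second variant assumes that size is a
power of two, and neither guards against overflow near the top of the
address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round ptr up to the next multiple of size (any non-zero size). */
static inline void *
align_up(void *ptr, size_t size)
{
	assert(size != 0);
	uintptr_t p = (uintptr_t)ptr;
	return (void *)(((p + size - 1u) / size) * size);
}

/* Same result, but only valid when size is a power of two. */
static inline void *
align_up_pow2(void *ptr, size_t size)
{
	assert(size != 0 && (size & (size - 1u)) == 0);
	uintptr_t mask = (uintptr_t)size - 1u;
	return (void *)(((uintptr_t)ptr + mask) & ~mask);
}
```

With these, align_up_pow2(x, 4096) rounds x up to the next 4k boundary,
and a pointer that is already aligned comes back unchanged.  Each
argument is evaluated exactly once.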

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: A couple of password pointers to avoid failed login(1) via cu(1)

2017-01-18 Thread Raimo Niskanen
On Wed, Jan 18, 2017 at 12:37:49PM +0100, Alexander Hall wrote:
> On January 18, 2017 10:32:29 AM GMT+01:00, minek van <minek...@mail.com>
> wrote:
> 
> > 
> 
> Because the simple suggestion below was too easy?
> 
> >>
> >> Or simply:
> >> openssl rand -base64 

Hard to beat, can even be remembered!

Slight "improvement" (to go around some password restrictions and to
use characters that do not move around the keyboard that much
between locales):

openssl rand -base64  | tr +/ .,

> >>
> >> --
> >> Christian "naddy" Weisgerber
> >na...@mips.inka.de
> 
> /Alexander

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Failure to get unbound to talk to nsd on the same server

2016-10-11 Thread Raimo Niskanen
Please give more details on which dig commands you used on which machine(s)
and paste their exact results.  Otherwise it is hard to tell since your setup
seems about right.  Does pf get in your way?

And -p port to dig selects a non-default port.

Anything interesting in your system logs on the DNS server?

Try to tcpdump on 127.0.0.1 port 53 and see if you have traffic there
between unbound and nsd.

Good luck!

/ Raimo Niskanen



On Mon, Oct 10, 2016 at 11:42:16PM +0200, Johan Mellberg wrote:
> Hi all,
> 
> I am setting up a fresh OpenBSD 6.0 server in a KVM VM to serve my
> home network with DNS. I have a custom zone (only for LAN use) set up
> and previously used BIND successfully (but that VM crashed and its
> disk was hosed...) both as authoritative and caching/resolving.
> 
> So now I am trying to learn to set up NSD to be authoritative for my
> small zone and Unbound to serve the LAN with all other queries. But
> there is a problem:
> 
> 1. Unbound successfully responds to queries and provides lookup to the
> LAN machines for "the internet".
> 2. NSD successfully responds to queries for the custom zone.
> 3. But I cannot get Unbound to get a reply from NSD...
> 
> I have tried multiple combinations of ports and interface bindings and
> I suspect that I am missing something simple here. Currently I have
> set NSD to listen on 127.0.0.1 and Unbound listens on 192.168.x.91 -
> so there should not be a conflict. In fact it works fine if I use dig
> @localhost  and dig @192.168.x.91 
> respectively, but the second version only provides an answer-less
> response if asked for a LAN hostname.
> 
> Unbound is set to ask localhost for the stub zones, forward and reverse.
> 
> And, yes, I could of course use Unbound to serve my local zone and
> drop NSD - but that would be giving up... It's supposed to work from
> all I read! :-)
> 
> I have also tried having NSD listen on 127.0.0.1@5353, and telling
> unbound to use that as the stub-address, while then having Unbound
> listen on 127.0.0.1 as well as 192.168.x.91 to be able to set
> 127.0.0.1 as the nameserver in /etc/resolv.conf. Same result except I
> can't test NSD with dig as it can't use an alternative port.
> 
> A possibly related question: I can't seem to be able to use
> shortnames. The domain part should be picked up from the host name as
> given in /etc/myname, but that does not seem to work as I expect, I
> always have to provide the FQDN. Again something I have missed
> perhaps?
> 
> Anyway, I am staring blindly at the config files now and really need
> help figuring it out. I have removed all that is commented, otherwise
> it's the default except for changes of course.
> 
> Thanks for any clue bats coming my way...
> /Johan
> 
> * resolv.conf
> lookup file bind
> nameserver 192.168.x.91
> 
> # cat /etc/myname
> dns03.my.domain
> 
> # cat /etc/hosts
> 127.0.0.1   localhost
> ::1 localhost
> 192.168.x.91   dns03.my.domain dns03
> 
> # cat /var/unbound/etc/unbound.conf
> # $OpenBSD: unbound.conf,v 1.7 2016/03/30 01:41:25 sthen Exp $
> 
> server:
> interface: 192.168.x.91
> interface: ::1
> do-not-query-localhost: no
> 
> access-control: 192.168.x.64/24 allow
> access-control: 127.0.0.0/8 allow
> access-control: 0.0.0.0/0 refuse
> access-control: ::0/0 refuse
> access-control: ::1 allow
> 
> hide-identity: yes
> hide-version: yes
> 
> # Uncomment to enable DNSSEC validation.
> #
> auto-trust-anchor-file: "/var/unbound/db/root.key"
> 
> root-hints: /var/unbound/etc/root.hints
> 
> remote-control:
> control-enable: yes
> control-use-cert: no
> control-interface: /var/run/unbound.sock
> 
> stub-zone:
> name: "my.domain"
> stub-addr: 127.0.0.1
> stub-zone:
> name: "x.168.192.in-addr.arpa"
> stub-addr: 127.0.0.1
> 
> # cat /var/nsd/etc/nsd.conf
> # $OpenBSD: nsd.conf,v 1.11 2015/04/12 11:49:39 sthen Exp $
> 
> server:
> hide-version: yes
> verbosity: 1
>     database: "" # disable database
> 
> ## bind to a specific address/port
> ip-address: 127.0.0.1
> 
> remote-control:
> control-enable: yes
> 
> zone:
> name: "my.domain"
> zonefile: "master/my.domain"
> zone:
> name: "x.168.192.in-addr.arpa"
> zonefile: "master/192.168.x.rev"

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: DHCP over vr(4) on bridge(4) through vether(4) no working?

2016-10-09 Thread Raimo Niskanen
On Sun, Oct 09, 2016 at 02:29:12PM +0200, Eric Huiban wrote:
> Raimo Niskanen wrote:
> 
> > I did a bridge configuration according to the FAQ with bridge0 containing
> > athn0, vr1 and vether0.  vether0 got the IP address configuration that
> > athn0 had before, dhcpd was reconfigured to serve vr0 and vether0 and that
> > worked just fine.  DHCP over athn0 passes through bridge0 and vether0 to
> > dhcpd as well as directly from vr0 to dhcpd.
> > 
> > But DHCP over vr1 through bridge0 and vether0 does not work.  I had to
> > configure a static address on the access point to get any further.
> > 
> > I know that DHCP over vr0 that dhcpd serves directly works, and I know that
> > it works when dhcpd serves athn0 directly, plus it works when dhcpd serves
> > athn0 through bridge0 and vether0.
> 
> did you try to add something like this in your pf.conf for "debug" :
> 
> set skip on { lo0, vr1, athn0 }

Thanks for the tip but I think I have figured this out anyway, and other
packets than DHCP packets pass the firewall.  Plus vr1 and athn0 are
configured identically (they are both in group 'lan' and neither of them is
mentioned by name; only the group name is used in pf.conf, so there should
not be any difference between them) and DHCP through athn0 works.

But I will keep the tip in mind for future use.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: DHCP over vr(4) on bridge(4) through vether(4) no working?

2016-10-09 Thread Raimo Niskanen
On Fri, Oct 07, 2016 at 11:07:43AM +0200, Raimo Niskanen wrote:
> On Fri, Oct 07, 2016 at 10:42:40AM +0200, LÉVAI Dániel wrote:
> > Raimo Niskanen @ 2016-10-07T09:46:06 +0200:
> > > Hello misc@
> > > 
> > > I have a home router where it seems that DHCP over vr(4) on bridge(4)
> > > through vether(4) does not work.
> > > 
> > [...]
> > > Any hints on how to proceed?
> > 
> > Just a shot in the dark, but maybe:
> > 
> > http://marc.info/?l=openbsd-misc&m=147462832805431&w=2
> > http://undeadly.org/cgi?action=article&sid=20160725144108
> 
> Nice shot, but a close miss.  I have vr0-bridge0-vether0 and no dhclient
> running on either vr0 or vether0.  The client runs on vr2.  Also I see
> no log entries in /var/log/daemon from dhcpd about getting a DHCPDISCOVER
> and sending a DHCPOFFER, which I get when the request comes in over
> athn0-bridge0-vether0...  So it is the incoming that does not arrive.

I have to back away from that statement.  Now I am convinced it is the same bug!

And it seems to be enough to have a dhclient running on the same machine as
the bridge, or on the same interface type.

I have dhclient running on vr2 and bridge0 contains vr1, athn0 and vether0.

Some more tcpdumping shows that the DHCPDISCOVER comes in on vr1 and is not
distributed to any other bridge member.  But when a DHCPDISCOVER comes in on
athn0 it is distributed to vr1 and vether0.  dhcpd listens on vether0 but
the reply to DHCPDISCOVER is not delivered through vether0 and the bridge.
It shows up on athn0 directly and is not distributed to the other bridge
members.

So dhcpd and the bridge do some monkey business, possibly assisted by
dhclient working on an interface not in the bridge.

I think these all concern the same problem:
http://marc.info/?l=openbsd-misc&m=147462934705670&w=2
http://marc.info/?l=openbsd-bugs&m=147291369828477&w=2
http://marc.info/?l=openbsd-tech&m=147333147600814&w=2
so the devs are probably working on a solution.

My current workaround is to have dhcpd listen to vr0, vr1 and athn0, and
give out different address ranges on the different interfaces.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Large datasize - how to limit physical memory?

2016-10-09 Thread Raimo Niskanen
On Fri, Oct 07, 2016 at 11:47:17AM -0400, Ted Unangst wrote:
> Raimo Niskanen wrote:
> > And the manual page is wrong in claiming that ulimit -m takes effect when
> > the system gets low on memory?
> > 
> > So the only memory limit that is enforced is ulimit -d?
> 
> yeah. i'll fix the manual. thanks for noticing.
> 
> > Bummer.
> > 
> > What I guess we (VM tricksters) would really want is MAP_NORESERVE...
> 
> that's not very hard to add. uvm has a concept of maxprot, which is the
> maximum protections one can add to a page. userland doesn't really get any
> control over this however. there could be a flag that leaves maxprot as none,
> and then we wouldn't need to count that as memory.
 
That would be super!

We (Erlang VM) currently try to use MAP_NORESERVE (and PROT_NONE) to
allocate a big address range and later remap some of it as PROT_READ |
PROT_WRITE when memory is needed.  The address range is used to be able to
quickly identify which kind of memory it is.
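
The reservation trick can be sketched roughly as below.  This is an
illustration under stated assumptions, not the Erlang VM's actual code:
the function names are hypothetical, it uses mprotect(2) to open up part
of the range where the real code may well use a second mmap(2), and
MAP_NORESERVE is treated purely as a hint since it may be defined but
ignored.

```c
#define _DEFAULT_SOURCE	/* exposes MAP_ANON on glibc in strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Fall back to no flag at all on systems that lack MAP_NORESERVE. */
#ifndef MAP_NORESERVE
#define MAP_NORESERVE 0
#endif

/* Reserve len bytes of address space that cannot be touched yet. */
static void *
reserve_range(size_t len)
{
	void *p = mmap(NULL, len, PROT_NONE,
	    MAP_PRIVATE | MAP_ANON | MAP_NORESERVE, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/*
 * Make part of a reserved range readable and writable.
 * addr must be page aligned.  Returns 0 on success, -1 on error.
 */
static int
commit_range(void *addr, size_t len)
{
	return mprotect(addr, len, PROT_READ | PROT_WRITE);
}

/* The quick "which kind of memory is this" test: one range check. */
static int
in_range(const void *base, size_t len, const void *p)
{
	assert(base != NULL);
	uintptr_t b = (uintptr_t)base, a = (uintptr_t)p;
	return a >= b && a - b < len;
}
```

Whether the PROT_NONE part of such a mapping counts against the
datasize limit is exactly the open question in this thread, so a large
reserve_range() call may still fail under a small ulimit -d.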

The current situation when MAP_NORESERVE is defined but ignored is
confusing and I hoped that PROT_NONE would be enough to make it behave as
MAP_NORESERVE, but to make MAP_NORESERVE work as intended would be much
better!

A big thanks if MAP_NORESERVE should get implemented!
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: DHCP over vr(4) on bridge(4) through vether(4) no working?

2016-10-07 Thread Raimo Niskanen
On Fri, Oct 07, 2016 at 10:42:40AM +0200, LÉVAI Dániel wrote:
> Raimo Niskanen @ 2016-10-07T09:46:06 +0200:
> > Hello misc@
> > 
> > I have a home router where it seems that DHCP over vr(4) on bridge(4)
> > through vether(4) does not work.
> > 
> [...]
> > Any hints on how to proceed?
> 
> Just a shot in the dark, but maybe:
> 
> http://marc.info/?l=openbsd-misc&m=147462832805431&w=2
> http://undeadly.org/cgi?action=article&sid=20160725144108

Nice shot, but a close miss.  I have vr0-bridge0-vether0 and no dhclient
running on either vr0 or vether0.  The client runs on vr2.  Also I see
no log entries in /var/log/daemon from dhcpd about getting a DHCPDISCOVER
and sending a DHCPOFFER, which I get when the request comes in over
athn0-bridge0-vether0...  So it is the incoming that does not arrive.

> 
> 
> Daniel

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



DHCP over vr(4) on bridge(4) through vether(4) no working?

2016-10-07 Thread Raimo Niskanen
Hello misc@

I have a home router where it seems that DHCP over vr(4) on bridge(4)
through vether(4) does not work.

Sorry about the lack of hard details, it was late last night, I was
tired and need to figure out what details to look into now...

The home router is an ALIX 2d13 which has 3 vr(4) interfaces on board and
one athn(4) on mini-PCI.  WAN is vr2, LAN is vr1 and WLAN is athn0.

I had a setup working with dhcpd serving constant addresses based on MAC
address to the LAN on vr0 and athn0 with one address range for vr0 and one
for athn0.

Now I need to start using the 5 GHz Wifi band so I asked the athn0 with
"ifconfig athn0 chan" which channels it supported, tried to configure a 5
GHz one and got an IOCTL error so I guess it was not actually supported.

Then I bought an ASUS EA-N66 access point that can do 5 GHz and connected
it to vr1.

I did a bridge configuration according to the FAQ with bridge0 containing
athn0, vr1 and vether0.  vether0 got the IP address configuration that
athn0 had before, dhcpd was reconfigured to serve vr0 and vether0 and that
worked just fine.  DHCP over athn0 passes through bridge0 and vether0 to
dhcpd as well as directly from vr0 to dhcpd.

But DHCP over vr1 through bridge0 and vether0 does not work.  I had to
configure a static address on the access point to get any further.

I know that DHCP over vr0 that dhcpd serves directly works, and I know that
it works when dhcpd serves athn0 directly, plus it works when dhcpd serves
athn0 through bridge0 and vether0.

I tcpdumped vr1 (but now I was getting really tired) and saw 0.0.0.0.bootpc
-> 255.255.255.255.bootps packets that I think are DHCP broadcasts from a
client.  But when tcpdumping on vether0 I think I did not see them (lots of
chatter), but possibly other strange packets.  When letting a client
connect over athn0, on the other hand, I think I saw these broadcast
packets on vether0, and got log entries in /var/log/daemon about dhcpd
giving out a lease.  Not so when letting a client connect over the access
point and vr0.

So my theory is that either I have missed some stupid little flag, or there
is a bug in vr(4) when it passes packets over a bridge(4) to a vether(4) so
encapsulation is misinterpreted and the IP broadcasts do not arrive in
recognizable shape...

Any hints on how to proceed?

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Large datasize - how to limit physical memory?

2016-10-07 Thread Raimo Niskanen
On Thu, Oct 06, 2016 at 11:39:49AM -0400, Ted Unangst wrote:
> Raimo Niskanen wrote:
> > On Wed, Oct 05, 2016 at 12:34:42PM -0400, Ted Unangst wrote:
> > > If somebody writes a C program that demonstrates the problem, I'm happy to
> > > take a look. I'm not installing erlang.
> > 
> > It has been ages since I wrote a C program from scratch, but here goes:
> 
> Thanks. That wasn't so bad, was it? :)

No, I might even have kind of liked it ;-)

> 
> > And the symptom would be that the ulimit -m limit is not immediately
> > enforced.  The question is if that is a problem?  Or rather if I can use
> > the ulimit -m limit to prevent a process from taking all memory since I
> > need to set a large ulimit -d size to do clever address comparison tricks
> > in the Erlang VM.
> 
> Ah, indeed. So ulimit -m doesn't do anything any more. I'm not sure when it
> stopped, but the man page reflects ancient history. Sorry about that.
> Unfortunately, it's not easy to make PROT_NONE stop counting. After all, it
> may have been mapped read/write, and modified, then mapped none, but we can't
> discard the page.

So a program may count on the content persisting after that maneuver...
Ugh!

I am quite happy with PROT_NONE not counting, but are you saying that if
you were to start counting PROT_READ|PROT_WRITE you would also have to count
PROT_NONE, which would make the trick of allocating a large PROT_NONE block
just for its address space unusable?

And the manual page is wrong in claiming that ulimit -m takes effect when
the system gets low on memory?

So the only memory limit that is enforced is ulimit -d?

Bummer.

What I guess we (VM tricksters) would really want is MAP_NORESERVE...

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Large datasize - how to limit physical memory?

2016-10-06 Thread Raimo Niskanen
On Wed, Oct 05, 2016 at 12:34:42PM -0400, Ted Unangst wrote:
> If somebody writes a C program that demonstrates the problem, I'm happy to
> take a look. I'm not installing erlang.

It has been ages since I wrote a C program from scratch, but here goes:

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int main() {
    size_t len = 0;
    void *p1, *p2;

    printf("Pid: %ld\n", (long) getpid());

    /* 12 GB PROT_NONE [anon] */
    p1 = mmap(NULL, ((size_t)12) << 30, PROT_NONE, MAP_ANON, -1, 0);

    /* 200 MB read/write [anon] */
    p2 = mmap(NULL, ((size_t)200) << 20, PROT_READ|PROT_WRITE, MAP_ANON, -1, 0);

    printf("p1: %p, p2: %p\n", p1, p2);

    /* Wait for a line on stdin so the mappings can be inspected */
    fgetln(stdin, &len);
}


$ ulimit -a
time(cpu-seconds)    unlimited
file(blocks)         unlimited
coredump(blocks)     unlimited
data(kbytes)         16777216
stack(kbytes)        8192
lockedmem(kbytes)    2612782
memory(kbytes)       1
nofiles(descriptors) 1024
processes            1024
$ ./a.out
Pid: 56334
p1: 0x1bcb4718, p2: 0x1bca5f10


# procmap 56334 
:
1BCA5F10 204800K read/write  [ anon ]
:
1BCB4718 12582912K [ anon ]
: 


And the symptom would be that the ulimit -m limit is not immediately
enforced.  The question is whether that is a problem.  Or rather, whether I
can use the ulimit -m limit to prevent a process from taking all memory,
since I need to set a large ulimit -d size to do clever address comparison
tricks in the Erlang VM.


/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Large datasize - how to limit physical memory?

2016-10-05 Thread Raimo Niskanen
On Wed, Oct 05, 2016 at 03:36:20PM +0200, Otto Moerbeek wrote:
> On Wed, Oct 05, 2016 at 03:28:06PM +0200, Raimo Niskanen wrote:
> 
> > On Mon, Oct 03, 2016 at 04:13:58PM +0200, Otto Moerbeek wrote:
> > > On Mon, Oct 03, 2016 at 02:56:05PM +0200, Raimo Niskanen wrote:
> > > 
> > > > On Fri, Sep 30, 2016 at 01:02:10PM +0200, Otto Moerbeek wrote:
> > > > > On Fri, Sep 30, 2016 at 09:10:21AM +0200, Raimo Niskanen wrote:
> > > > > 
> > > > > > On Wed, Sep 28, 2016 at 09:19:51AM +0200, Raimo Niskanen wrote:
> > > > > > > Dear misc@
> > > > > > > 
> > > > > > > I have searched the archives and read the documentation of 
> > > > > > > login.conf(5),
> > > > > > > ksh(1):ulimit and can not find how to limit the amount of 
> > > > > > > physical memory a
> > > > > > > process may use.
> > > > > > > 
> > > > > > > I have the following limits where I have set down ulimit -m and 
> > > > > > > ulimit -l
> > > > > > > to 1 kbytes in an attempt to limit the process I spawn which 
> > > > > > > is
> > > > > > > the Erlang VM.
> > > > > > > 
> > > > > > > $ ulimit -a
> > > > > > > time(cpu-seconds)unlimited
> > > > > > > file(blocks) unlimited
> > > > > > > coredump(blocks) unlimited
> > > > > > > data(kbytes) 33554432
> > > > > > > stack(kbytes)8192
> > > > > > > lockedmem(kbytes)1
> > > > > > > memory(kbytes)   1
> > > > > > > nofiles(descriptors) 1024
> > > > > > > processes1024
> > > > > > > 
> > > > > > > Note that the machine has got 8 GB of physical memory and 8 GB of 
> > > > > > > swap and
> > > > > > > that I have set datasize=infinity in /etc/login.conf. I got
> > > > > > > datasize=33554432 which seems to be the same as 
> > > > > > > kern.shminfo.shmmax.
> > > > > > > The datasize is twice the physical memory + swap.
> > > > > > > 
> > > > > > > Then I start the Erlang VM and tell it to allocate an address 
> > > > > > > block of 3
> > > > > > > MByte for future use where it will store all literal data in the 
> > > > > > > same block
> > > > > > > (this is a garbage collector optimization).  Not much of this 
> > > > > > > data is
> > > > > > > actually used.
> > > > > > > 
> > > > > > >  68196 beam CALL  
> > > > > > > mmap(0,0x75300,0,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
> > > > > > >  68196 beam RET   mmap 11871265173504/0xacbfe8b3000
> > > > > > > 
> > > > > > > Note the protection flags on the block.  No access is allowed.  
> > > > > > > This trick
> > > > > > > works just fine; here is what top says:
> > > > > > > 
> > > > > > > load averages:  0.15,  0.13,  0.09 frerin.otp.ericsson.se 
> > > > > > > 08:49:46
> > > > > > > 48 processes: 47 idle, 1 on processor 
> > > > > > > up 13:49
> > > > > > > CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% 
> > > > > > > interrupt,  100% idle
> > > > > > > CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% 
> > > > > > > interrupt,  100% idle
> > > > > > > Memory: Real: 43M/636M act/tot Free: 7028M Cache: 508M Swap: 
> > > > > > > 0K/8155M
> > > > > > > 
> > > > > > >   PID USERNAME PRI NICE  SIZE   RES STATE WAIT  TIME
> > > > > > > CPU COMMAND
> > > > > > > 68196 raimo  20   29G   15M sleep poll  0:00  
> > > > > > > 1.42% beam
> > > > > > > 
> > > > > > > So I have a process with a data size of 29 GB on a machine with 
> > > > > > > 16 GB
> > > > > > > memory + swap.  I have also tried to start an additional Erlang 
> > > 

Re: Large datasize - how to limit physical memory?

2016-10-05 Thread Raimo Niskanen
On Mon, Oct 03, 2016 at 04:13:58PM +0200, Otto Moerbeek wrote:
> On Mon, Oct 03, 2016 at 02:56:05PM +0200, Raimo Niskanen wrote:
> 
> > On Fri, Sep 30, 2016 at 01:02:10PM +0200, Otto Moerbeek wrote:
> > > On Fri, Sep 30, 2016 at 09:10:21AM +0200, Raimo Niskanen wrote:
> > > 
> > > > On Wed, Sep 28, 2016 at 09:19:51AM +0200, Raimo Niskanen wrote:
> > > > > Dear misc@
> > > > > 
> > > > > I have searched the archives and read the documentation of 
> > > > > login.conf(5),
> > > > > ksh(1):ulimit and can not find how to limit the amount of physical 
> > > > > memory a
> > > > > process may use.
> > > > > 
> > > > > I have the following limits where I have set down ulimit -m and 
> > > > > ulimit -l
> > > > > to 1 kbytes in an attempt to limit the process I spawn which is
> > > > > the Erlang VM.
> > > > > 
> > > > > $ ulimit -a
> > > > > time(cpu-seconds)unlimited
> > > > > file(blocks) unlimited
> > > > > coredump(blocks) unlimited
> > > > > data(kbytes) 33554432
> > > > > stack(kbytes)8192
> > > > > lockedmem(kbytes)1
> > > > > memory(kbytes)   1
> > > > > nofiles(descriptors) 1024
> > > > > processes1024
> > > > > 
> > > > > Note that the machine has got 8 GB of physical memory and 8 GB of 
> > > > > swap and
> > > > > that I have set datasize=infinity in /etc/login.conf. I got
> > > > > datasize=33554432 which seems to be the same as kern.shminfo.shmmax.
> > > > > The datasize is twice the physical memory + swap.
> > > > > 
> > > > > Then I start the Erlang VM and tell it to allocate an address block 
> > > > > of 3
> > > > > MByte for future use where it will store all literal data in the same 
> > > > > block
> > > > > (this is a garbage collector optimization).  Not much of this data is
> > > > > actually used.
> > > > > 
> > > > >  68196 beam CALL  
> > > > > mmap(0,0x75300,0,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
> > > > >  68196 beam RET   mmap 11871265173504/0xacbfe8b3000
> > > > > 
> > > > > Note the protection flags on the block.  No access is allowed.  This 
> > > > > trick
> > > > > works just fine; here is what top says:
> > > > > 
> > > > > load averages:  0.15,  0.13,  0.09 frerin.otp.ericsson.se 
> > > > > 08:49:46
> > > > > 48 processes: 47 idle, 1 on processor up 
> > > > > 13:49
> > > > > CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  
> > > > > 100% idle
> > > > > CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  
> > > > > 100% idle
> > > > > Memory: Real: 43M/636M act/tot Free: 7028M Cache: 508M Swap: 0K/8155M
> > > > > 
> > > > >   PID USERNAME PRI NICE  SIZE   RES STATE WAIT  TIMECPU 
> > > > > COMMAND
> > > > > 68196 raimo  20   29G   15M sleep poll  0:00  1.42% 
> > > > > beam
> > > > > 
> > > > > So I have a process with a data size of 29 GB on a machine with 16 GB
> > > > > memory + swap.  I have also tried to start an additional Erlang VM 
> > > > > that
> > > > > also allocates 29 GB of virtual memory which also works.
> > > > > 
> > > > > That this is allowed is just fine for me - this trick of allocating a
> > > > > "large enough" PROT_NONE memory to get one address range for some 
> > > > > special
> > > > > data type is very useful for the Erlang VM.  But I wonder how to 
> > > > > limit the
> > > > > actual memory use?  Setting down ulimit -m and ulimit -l to 1 
> > > > > kbytes
> > > > > did not prevent this process from getting 15 MByte of "RES" memory...
> > > > > 
> > > > > Is there some way to limit the actual amount of memory for a process 
> > > > > when I
> > > > > need to set up the datasize to allow for large unused virtual memory
> > > > > blocks?
> > 

Re: Large datasize - how to limit physical memory?

2016-10-03 Thread Raimo Niskanen
On Fri, Sep 30, 2016 at 01:02:10PM +0200, Otto Moerbeek wrote:
> On Fri, Sep 30, 2016 at 09:10:21AM +0200, Raimo Niskanen wrote:
> 
> > On Wed, Sep 28, 2016 at 09:19:51AM +0200, Raimo Niskanen wrote:
> > > Dear misc@
> > > 
> > > I have searched the archives and read the documentation of login.conf(5),
> > > ksh(1):ulimit and can not find how to limit the amount of physical memory 
> > > a
> > > process may use.
> > > 
> > > I have the following limits where I have set down ulimit -m and ulimit -l
> > > to 1 kbytes in an attempt to limit the process I spawn which is
> > > the Erlang VM.
> > > 
> > > $ ulimit -a
> > > time(cpu-seconds)unlimited
> > > file(blocks) unlimited
> > > coredump(blocks) unlimited
> > > data(kbytes) 33554432
> > > stack(kbytes)8192
> > > lockedmem(kbytes)1
> > > memory(kbytes)   1
> > > nofiles(descriptors) 1024
> > > processes1024
> > > 
> > > Note that the machine has got 8 GB of physical memory and 8 GB of swap and
> > > that I have set datasize=infinity in /etc/login.conf. I got
> > > datasize=33554432 which seems to be the same as kern.shminfo.shmmax.
> > > The datasize is twice the physical memory + swap.
> > > 
> > > Then I start the Erlang VM and tell it to allocate an address block of 
> > > 3
> > > MByte for future use where it will store all literal data in the same 
> > > block
> > > (this is a garbage collector optimization).  Not much of this data is
> > > actually used.
> > > 
> > >  68196 beam CALL  
> > > mmap(0,0x75300,0,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
> > >  68196 beam RET   mmap 11871265173504/0xacbfe8b3000
> > > 
> > > Note the protection flags on the block.  No access is allowed.  This trick
> > > works just fine; here is what top says:
> > > 
> > > load averages:  0.15,  0.13,  0.09 frerin.otp.ericsson.se 08:49:46
> > > 48 processes: 47 idle, 1 on processor up 13:49
> > > CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% 
> > > idle
> > > CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% 
> > > idle
> > > Memory: Real: 43M/636M act/tot Free: 7028M Cache: 508M Swap: 0K/8155M
> > > 
> > >   PID USERNAME PRI NICE  SIZE   RES STATE WAIT  TIMECPU 
> > > COMMAND
> > > 68196 raimo  20   29G   15M sleep poll  0:00  1.42% beam
> > > 
> > > So I have a process with a data size of 29 GB on a machine with 16 GB
> > > memory + swap.  I have also tried to start an additional Erlang VM that
> > > also allocates 29 GB of virtual memory which also works.
> > > 
> > > That this is allowed is just fine for me - this trick of allocating a
> > > "large enough" PROT_NONE memory to get one address range for some special
> > > data type is very useful for the Erlang VM.  But I wonder how to limit the
> > > actual memory use?  Setting down ulimit -m and ulimit -l to 1 kbytes
> > > did not prevent this process from getting 15 MByte of "RES" memory...
> > > 
> > > Is there some way to limit the actual amount of memory for a process when 
> > > I
> > > need to set up the datasize to allow for large unused virtual memory
> > > blocks?
> > 
> > I have found clues in getrlimit,setrlimit(2):
> > 
> >  RLIMIT_DATA The maximum size (in bytes) of the data segment for a
> >  process; this includes memory allocated via malloc(3)
> >  and all other anonymous memory mapped via mmap(2).
> > :
> >  RLIMIT_RSS  The maximum size (in bytes) to which a process's
> >  resident set size may grow.  This imposes a limit
> >  on the amount of physical memory to be given to a
> >  process; if memory is tight, the system will prefer
> >  to take memory from processes that are exceeding
> >  their declared resident set size.
> > 
> > Now I try to figure out the implications of this...  If I set the data size
> > so the sum of the data sizes for all processes in the system is larger than
> > physical memory + swap, then any process may allocate the last block of
> > memory in the system so a more important process later wil

Re: Large datasize - how to limit physical memory?

2016-10-03 Thread Raimo Niskanen
On Fri, Sep 30, 2016 at 01:10:45PM +0200, Otto Moerbeek wrote:
> On Fri, Sep 30, 2016 at 01:02:10PM +0200, Otto Moerbeek wrote:
> 
> 
> > > > Note that the machine has got 8 GB of physical memory and 8 GB of swap 
> > > > and
> > > > that I have set datasize=infinity in /etc/login.conf. I got
> > > > datasize=33554432 which seems to be the same as kern.shminfo.shmmax.
> 
> The number you are looking for is MAXDSIZ, which is 32G on amd64,

Ok.  A different entity with the same value.  Thank you!
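
As a quick sanity check of the units, the sketch below converts the
data(kbytes) value 33554432 shown by ulimit -a into bytes and GiB.  The
helper names are made up for illustration.

```c
#include <stdint.h>

/* Convert a limit in kbytes (as printed by ulimit -a) to bytes. */
uint64_t
kbytes_to_bytes(uint64_t kbytes)
{
	return kbytes * 1024;
}

/* Number of whole GiB contained in a byte count. */
uint64_t
bytes_to_gib(uint64_t bytes)
{
	return bytes >> 30;
}
```

bytes_to_gib(kbytes_to_bytes(33554432)) gives 32, which agrees with the
32G quoted for MAXDSIZ on amd64.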


> 
>   -Otto

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Large datasize - how to limit physical memory?

2016-09-30 Thread Raimo Niskanen
On Wed, Sep 28, 2016 at 09:19:51AM +0200, Raimo Niskanen wrote:
> Dear misc@
> 
> I have searched the archives and read the documentation of login.conf(5),
> ksh(1):ulimit and can not find how to limit the amount of physical memory a
> process may use.
> 
> I have the following limits where I have set down ulimit -m and ulimit -l
> to 1 kbytes in an attempt to limit the process I spawn which is
> the Erlang VM.
> 
> $ ulimit -a
> time(cpu-seconds)unlimited
> file(blocks) unlimited
> coredump(blocks) unlimited
> data(kbytes) 33554432
> stack(kbytes)8192
> lockedmem(kbytes)1
> memory(kbytes)   1
> nofiles(descriptors) 1024
> processes1024
> 
> Note that the machine has got 8 GB of physical memory and 8 GB of swap and
> that I have set datasize=infinity in /etc/login.conf. I got
> datasize=33554432 which seems to be the same as kern.shminfo.shmmax.
> The datasize is twice the physical memory + swap.
> 
> Then I start the Erlang VM and tell it to allocate an address block of 3
> MByte for future use where it will store all literal data in the same block
> (this is a garbage collector optimization).  Not much of this data is
> actually used.
> 
>  68196 beam CALL  
> mmap(0,0x75300,0,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
>  68196 beam RET   mmap 11871265173504/0xacbfe8b3000
> 
> Note the protection flags on the block.  No access is allowed.  This trick
> works just fine; here is what top says:
> 
> load averages:  0.15,  0.13,  0.09 frerin.otp.ericsson.se 08:49:46
> 48 processes: 47 idle, 1 on processor up 13:49
> CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Memory: Real: 43M/636M act/tot Free: 7028M Cache: 508M Swap: 0K/8155M
> 
>   PID USERNAME PRI NICE  SIZE   RES STATE WAIT  TIMECPU COMMAND
> 68196 raimo  20   29G   15M sleep poll  0:00  1.42% beam
> 
> So I have a process with a data size of 29 GB on a machine with 16 GB
> memory + swap.  I have also tried to start an additional Erlang VM that
> also allocates 29 GB of virtual memory which also works.
> 
> That this is allowed is just fine for me - this trick of allocating a
> "large enough" PROT_NONE memory to get one address range for some special
> data type is very useful for the Erlang VM.  But I wonder how to limit the
> actual memory use?  Setting down ulimit -m and ulimit -l to 1 kbytes
> did not prevent this process from getting 15 MByte of "RES" memory...
> 
> Is there some way to limit the actual amount of memory for a process when I
> need to set up the datasize to allow for large unused virtual memory
> blocks?

I have found clues in getrlimit,setrlimit(2):

 RLIMIT_DATA The maximum size (in bytes) of the data segment for a
 process; this includes memory allocated via malloc(3)
 and all other anonymous memory mapped via mmap(2).
:
 RLIMIT_RSS  The maximum size (in bytes) to which a process's
 resident set size may grow.  This imposes a limit
 on the amount of physical memory to be given to a
 process; if memory is tight, the system will prefer
 to take memory from processes that are exceeding
 their declared resident set size.

Now I try to figure out the implications of this...  If I set the data size
so the sum of the data sizes for all processes in the system is larger than
physical memory + swap, then any process may allocate the last block of
memory in the system so a more important process later will fail to
allocate?

And the memoryuse limit is rather toothless since there is no immediate
check of this limit.  When the system gets low on memory, is all that
happens that processes exceeding their memoryuse limit will probably get
blocks swapped out?

If this is correct then programs that for efficiency reasons allocate
large address ranges of which most is rarely used are hard to control
safely with this resource limit model, or programs that use this behaviour
must be considered ill-behaved with this resource limit model...

Or have I misunderstood something?
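
For reference, a minimal sketch of setting such a limit from within a
program, using only getrlimit(2) and setrlimit(2).  The helper name
set_soft_limit is made up for illustration, and whether the kernel actually
enforces a given resource (RLIMIT_DATA versus RLIMIT_RSS) is the open
question here, not something this code can answer.

```c
#include <sys/resource.h>

/*
 * Set the soft limit of `resource` (e.g. RLIMIT_DATA) to `bytes`,
 * clamped to the hard limit.  Returns 0 on success, -1 on error.
 * (set_soft_limit is a hypothetical helper name.)
 */
int
set_soft_limit(int resource, rlim_t bytes)
{
	struct rlimit rl;

	if (getrlimit(resource, &rl) == -1)
		return -1;
	/* A soft limit may never exceed the hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
		bytes = rl.rlim_max;
	rl.rlim_cur = bytes;
	return setrlimit(resource, &rl);
}
```

A call such as set_soft_limit(RLIMIT_DATA, (rlim_t)1 << 30) before
exec'ing the program to be constrained would correspond to ulimit -d
1048576 in ksh(1).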


> 
> dmesg | head -4:
> OpenBSD 6.0 (GENERIC.MP) #1: Fri Sep 23 08:53:49 CEST 2016
> 
> r...@stable-60-amd64.mtier.org:/binpatchng/work-binpatch60-amd64/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 8283037696 (7899MB)
> avail mem = 8027529216 (7655MB)
> 

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Large datasize - how to limit physical memory?

2016-09-28 Thread Raimo Niskanen
Dear misc@

I have searched the archives and read the documentation of login.conf(5),
ksh(1):ulimit and can not find how to limit the amount of physical memory a
process may use.

I have the following limits where I have set down ulimit -m and ulimit -l
to 1 kbytes in an attempt to limit the process I spawn which is
the Erlang VM.

$ ulimit -a
time(cpu-seconds)    unlimited
file(blocks)         unlimited
coredump(blocks)     unlimited
data(kbytes)         33554432
stack(kbytes)        8192
lockedmem(kbytes)    1
memory(kbytes)       1
nofiles(descriptors) 1024
processes            1024

Note that the machine has got 8 GB of physical memory and 8 GB of swap and
that I have set datasize=infinity in /etc/login.conf. I got
datasize=33554432 which seems to be the same as kern.shminfo.shmmax.
The datasize is twice the physical memory + swap.

Then I start the Erlang VM and tell it to allocate an address block of 30000
MByte for future use where it will store all literal data in the same block
(this is a garbage collector optimization).  Not much of this data is
actually used.

 68196 beam CALL  mmap(0,0x753000000,0,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
 68196 beam RET   mmap 11871265173504/0xacbfe8b3000

Note the protection flags on the block.  No access is allowed.  This trick
works just fine; here is what top says:

load averages:  0.15,  0.13,  0.09 frerin.otp.ericsson.se 08:49:46
48 processes: 47 idle, 1 on processor up 13:49
CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Memory: Real: 43M/636M act/tot Free: 7028M Cache: 508M Swap: 0K/8155M

  PID USERNAME PRI NICE  SIZE   RES STATE WAIT  TIME    CPU COMMAND
68196 raimo  20   29G   15M sleep poll  0:00  1.42% beam

So I have a process with a data size of 29 GB on a machine with 16 GB
memory + swap.  I have also tried to start an additional Erlang VM that
also allocates 29 GB of virtual memory which also works.

That this is allowed is just fine for me - this trick of allocating a
"large enough" PROT_NONE memory to get one address range for some special
data type is very useful for the Erlang VM.  But I wonder how to limit the
actual memory use?  Setting down ulimit -m and ulimit -l to 1 kbytes
did not prevent this process from getting 15 MByte of "RES" memory...

Is there some way to limit the actual amount of memory for a process when I
need to set up the datasize to allow for large unused virtual memory
blocks?

dmesg | head -4:
OpenBSD 6.0 (GENERIC.MP) #1: Fri Sep 23 08:53:49 CEST 2016

r...@stable-60-amd64.mtier.org:/binpatchng/work-binpatch60-amd64/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8283037696 (7899MB)
avail mem = 8027529216 (7655MB)

Best Regards
-- 
/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Erlang 19.0

2016-09-21 Thread Raimo Niskanen
On Wed, Sep 21, 2016 at 08:25:54AM +0000, Bogdan Andu wrote:
> Hello,
> I have 2 OpenBSD amd64 machines 5.4. and 5.7 which I cannot upgrade them now
> I want to compile erlang 19 on these machine and the compilation fails:
> The compilation process is:
> 1. apply successfully 6.0 ports/lang/erlang/19/patches to the standard tree
> 2. autoreconf-2.69
> 3. gmake
> .. and the error...
> === Entering application hipe
> gmake[3]: Entering directory `/home/andu/otp_src_19.0/lib/hipe/misc'
>  ERLC ../ebin/hipe_consttab.beam
> erts_mmap: Failed to create super carrier of size 1024 MB
> gmake[3]: *** [../ebin/hipe_consttab.beam] Error 1
> gmake[3]: Leaving directory `/home/andu/otp_src_19.0/lib/hipe/misc'
> gmake[2]: *** [opt] Error 2
> gmake[2]: Leaving directory `/home/andu/otp_src_19.0/lib/hipe'
> gmake[1]: *** [opt] Error 2
> gmake[1]: Leaving directory `/home/andu/otp_src_19.0/lib'
> gmake: *** [secondary_bootstrap_build] Error 2
> 
> This means mmap cannot allocate 1024 MB of memory?
> I wanted to disable hipe but hipe is compiled regardless
> if --disable-hipe switch is passed to configure or not

Yes. A few basic things from hipe are still compiled.

> However I want this mmap test to pass because may be other
> components use these features of mmap and I consider this a test by itself.

I guess you need to increase datasize in /etc/login.conf for the compiling
user and that you simply run out of memory when compiling.
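
To test that guess, something like the sketch below could compare the
requested size against the soft datasize limit.  It assumes the failing
allocation is charged to RLIMIT_DATA, it ignores memory the process already
uses, and fits_data_limit is a made-up helper name.

```c
#include <sys/resource.h>

/*
 * Return 1 if `want` bytes is no larger than the soft RLIMIT_DATA
 * limit, 0 if it is larger, and -1 if the limit cannot be read.
 * (fits_data_limit is a hypothetical helper name.)
 */
int
fits_data_limit(rlim_t want)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_DATA, &rl) == -1)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;
	return want <= rl.rlim_cur;
}
```

For the 1024 MB super carrier in the error message the call would be
fits_data_limit((rlim_t)1024 << 20); a result of 0 would support the
datasize theory.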

> 
> 6.0 has erlang-19 so the compilation was successful.
> what can I do to succesffully compile erlang-19 on 5.4 and 5.7 amd64?
> Thank you,
> 
> Bogdan

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Erlang 19.0

2016-09-21 Thread Raimo Niskanen
...and does not this question belong better at
erlang-questions(at)erlang(dot)org?

On Wed, Sep 21, 2016 at 08:25:54AM +0000, Bogdan Andu wrote:
> Hello,
> I have 2 OpenBSD amd64 machines 5.4. and 5.7 which I cannot upgrade them now
> I want to compile erlang 19 on these machine and the compilation fails:
> The compilation process is:
> 1. apply successfully 6.0 ports/lang/erlang/19/patches to the standard tree
> 2. autoreconf-2.69
> 3. gmake
> .. and the error...
> === Entering application hipe
> gmake[3]: Entering directory `/home/andu/otp_src_19.0/lib/hipe/misc'
>  ERLC ../ebin/hipe_consttab.beam
> erts_mmap: Failed to create super carrier of size 1024 MB
> gmake[3]: *** [../ebin/hipe_consttab.beam] Error 1
> gmake[3]: Leaving directory `/home/andu/otp_src_19.0/lib/hipe/misc'
> gmake[2]: *** [opt] Error 2
> gmake[2]: Leaving directory `/home/andu/otp_src_19.0/lib/hipe'
> gmake[1]: *** [opt] Error 2
> gmake[1]: Leaving directory `/home/andu/otp_src_19.0/lib'
> gmake: *** [secondary_bootstrap_build] Error 2
> 
> This means mmap cannot allocate 1024 MB of memory?
> I wanted to disable hipe but hipe is compiled regardless
> if --disable-hipe switch is passed to configure or not
> However I want this mmap test to pass because may be other
> components use these features of mmap and I consider this a test by itself.
> 
> 6.0 has erlang-19 so the compilation was successful.
> what can I do to succesffully compile erlang-19 on 5.4 and 5.7 amd64?
> Thank you,
> 
> Bogdan

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: openbsd.org, openssh.com server(s) down

2016-03-15 Thread Raimo Niskanen
On Tue, Mar 15, 2016 at 09:36:33AM -0400, Matt Schwartz wrote:
> Seems like there might be an outage. I cannot reach either openbsd.org or
> openssh.com.
> 
> On Mar 15, 2016 9:32 AM, "Rudolf Sykora" wrote:
> >
> > Hello,
> >
> > is it only I who cannot connect to either
> > of openbsd.org and openssh.com, or
> > is the server down?

Not just for you nor me:

  http://www.isup.me/www.openbsd.org

> >
> > Thanks
> > Ruda

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Unable to sufficiently clean up softraid metadata

2015-12-02 Thread Raimo Niskanen
On Wed, Dec 02, 2015 at 10:44:44AM +0100, Patrik Lundin wrote:
> On Tue, Dec 01, 2015 at 05:53:22PM -0800, Nathan Wheeler wrote:
> > I have a similar sort of setup during installs and I clear out the
> > first 10m before setting up the CRYPTO disk and it works for me. I
> > don't think you're zeroing out enough at the beginning of the disk.
> > 
> > dd if=/dev/zero of=/dev/rsd0c bs=10m count=1
> > 
> 
> Following your tip i tried the following series of commands which failed in 
> the
> same way:
> ===
> # dd if=/dev/zero of=/dev/rsd0d
> # dd if=/dev/zero of=/dev/rsd0a bs=1m count=1
> # dd if=/dev/zero of=/dev/rsd0c bs=10m count=1
> ===
> 
> I then tried the following variant which also failed:
> ===
> # dd if=/dev/zero of=/dev/rsd0d
> # dd if=/dev/zero of=/dev/rsd0a bs=10m count=1
> # dd if=/dev/zero of=/dev/rsd0c bs=1m count=1
> ===
> 
> Finally i tried this just to cover all possibilities, which also failed:
> ===
> # dd if=/dev/zero of=/dev/rsd0d
> # dd if=/dev/zero of=/dev/rsd0a bs=10m count=1
> # dd if=/dev/zero of=/dev/rsd0c bs=10m count=1
> ===
> 
> I might have been unclear on this point (and I am not sure if this is
> how you are doing it) but the above commands are executed on the running
> system before rebooting into the installer. Could it be that the kernel
> writes out some in-memory softraid information to the disk before
> rebooting?

If you are zeroing the char devices under the feet of a running OS I would
not dare to guess what happens.  Can you try to zero the key disk and the
first 1MB of the RAID partition from bsd.rd instead?

> 
> I have noticed that just leaving a "dd if=/dev/zero of=/dev/rsd0c bs=1m"
> "long enough" will work, but it feels too brittle, and my optimal
> situation would be that the system is able to operate after the above
> commands are run, only having an impact after a reboot or power outage,
> which the unbounded dd does not achieve (this might not be an achievable
> goal at all of course).
> 
> -- 
> Patrik Lundin

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Is OpenBSD dump/restore compatible with Linux?

2015-11-30 Thread Raimo Niskanen
On Sat, Nov 28, 2015 at 06:09:45PM -0500, Alan Corey wrote:
> More of my running out of BIOS-accessible space saga.
> 
> My Linux partitions with a few gigs of Android development stuff are
> intact on my laptop, I just need to rearrange the boot partition (I
> think).  I've got empty partitions set aside for installing Linux on
> my desktop machine.  Can I do a dump from OpenBSD over my lan to a
> Linux restore (I've now got an Arch Linux live cd)?  Or should I do
> the restore to OpenBSD on the desktop writing to the Linux partitions?
>  I don't know about permissions and ownerships of files, specifically.

The BSD utilities dump/restore are very file system specific so I would not
use a dump file from e.g. OpenBSD to restore to e.g. FreeBSD.  I was almost
sure that dump/restore did not exist on Linux...

If you want to keep all properties of filesystem objects dump/restore is
the way to go, but you should only use dump and restore on the same
filesystem type.  Of course you can store the dump file anywhere in between.

The tools tar/cpio/pax all have rather standard archive formats that are
fairly portable between Unix filesystems.  There are ugly details though,
e.g. how long filenames are handled in some tar vs. gtar archive formats,
and also how some tools handle extended attributes.

/ Raimo Niskanen


> 
> I set that up differently.  I used 2 cylinders to make a grub-only
> partition at the start of the Linux space.  I'm hoping to load grub as
> an option from the XP bootloader then having it boot Debian and
> probably Arch.  If I get it working I'll set my laptop up the same way
> and clone it back.  I could use rsync or make tarballs but I think
> dump/restore is probably best.
> 
> Oh, the grub in ports/sysutils seems to be the original, not grub2
> which development has also stopped on.  The original got to be too
> patched together, so grub2 is a complete rewrite.  stage1 and stage2
> is old stuff.  A Grub partition needs to be at least 12 megs.
> 
> -- 
> Credit is the root of all evil.  - AB1JX

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Hitting the bootable cylinder limit?

2015-11-23 Thread Raimo Niskanen
Hi Alan.

So you are still using Lilo - I thought Grub had taken over :/

I would try to use 32 GB FAT32L, 1 GB Linux /boot, 64 GB OpenBSD,
and the rest of the Linux filesystems including / in Extended DOS.
This puts Linux /boot and OpenBSD wd0a within, in this case, 35 GB
of the start of the disk.
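
To see how far from the disk start a partition begins, the C/H/S columns
of the fdisk output quoted below can be converted to LBA and bytes.  This
is a minimal sketch assuming the geometry fdisk reports (255 heads, 63
sectors per track) and 512-byte sectors; the names are made up for
illustration.

```c
#include <stdint.h>

/* Geometry as reported by fdisk: heads per cylinder, sectors per track. */
struct geometry {
	uint32_t heads;
	uint32_t sectors;
};

/* Classic CHS to LBA conversion; sector numbers start at 1. */
uint64_t
chs_to_lba(struct geometry g, uint32_t c, uint32_t h, uint32_t s)
{
	return ((uint64_t)c * g.heads + h) * g.sectors + (s - 1);
}

/* Byte offset of an LBA, assuming 512-byte sectors. */
uint64_t
lba_to_bytes(uint64_t lba)
{
	return lba * 512;
}
```

With that geometry the Linux partition starting at cylinder 12366 maps to
LBA 198659790, the same start value fdisk prints, which is 101713812480
bytes (roughly 95 GiB) from the start of the disk.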

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



On Mon, Nov 23, 2015 at 11:52:10AM -0500, Alan Corey wrote:
> It seems like there should be a better way to detect this other than
> trial and error.  I put a new 1 TB drive in my laptop (Seagate
> ST1000LM024) about a month ago.  Being aware there was such a limit I
> made small boot partitions at the beginning of the drive (I thought):
> 32 GB Windows, 64 GB OpenBSD, 32 GB Linux.  As predicted everything
> worked at first, then installing MeTV keys made my Linux unbootable
> with an error from Lilo about the key file being corrupt and I suspect
> it's related to this limit.  The original position of the file was
> probably OK, the new file got made in an unreachable position.
>
> So I've probably got some storage-only partitions that won't boot, but
> I want to avoid the same thing happening when I put a 1 TB drive
> (Seagate ST31000340AS) in my laptop machine (Dell Optiplex GX270) because I
> really would like Linux working somewhere since I want to play with
> Android stuff.  I need to be able to build kernels for my phones and
> use Android Studio.
>
> So on the laptop:
> Disk: wd0 geometry: 121601/255/63 [1953525168 Sectors]
> Offset: 0 Signature: 0xAA55
>          Starting         Ending         LBA Info:
>  #: id      C   H   S -      C   H   S [       start:        size ]
> -------------------------------------------------------------------------------
> *0: 0C      0   1   1 -   4079 254  63 [          63:    65545137 ] Win95 FAT32L
>  1: A6   4080   0   1 -  12365 254  63 [    65545200:   133114590 ] OpenBSD
>  2: 83  12366   0   1 -  16444 254  63 [   198659790:    65529135 ] Linux files*
>  3: 05  16445   0  62 - 121600 254  63 [   264188986:  1689331079 ] Extended DOS
> Offset: 264188986 Signature: 0xAA55
>          Starting         Ending         LBA Info:
>  #: id      C   H   S -      C   H   S [       start:        size ]
> -------------------------------------------------------------------------------
>  0: 0B  16445   1   1 -  20524 254  63 [   264188988:    65545137 ] Win95 FAT-32
>  1: 05  20525   0   1 -  24604 254  63 [   329734125:    65545200 ] Extended DOS
>  2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
>  3: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
> Offset: 329734125 Signature: 0xAA55
>          Starting         Ending         LBA Info:
>  #: id      C   H   S -      C   H   S [       start:        size ]
> -------------------------------------------------------------------------------
>  0: 0B  20525   1   1 -  24604 254  63 [   329734188:    65545137 ] Win95 FAT-32
>  1: 05  24605   0   1 -  25114 254  63 [   395279325:     8193150 ] Extended DOS
>  2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
>  3: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
> Offset: 395279325 Signature: 0xAA55
>          Starting         Ending         LBA Info:
>  #: id      C   H   S -      C   H   S [       start:        size ]
> -------------------------------------------------------------------------------
>  0: 82  24605   1   1 -  25114 254  63 [   395279388:     8193087 ] Linux swap
>  1: 05  25115   0   1 -  88856  76  52 [   403472475:  1024004005 ] Extended DOS
>  2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
>  3: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
> Offset: 403472475 Signature: 0xAA55
>          Starting         Ending         LBA Info:
>  #: id      C   H   S -      C   H   S [       start:        size ]
> -------------------------------------------------------------------------------
>  0: A6  25115  63  37 -  88856  76  52 [   403476480:      102400 ] OpenBSD
>  1: 05  88857   0   1 - 121600 254  63 [  1427487705:   526032360 ] Extended DOS
>  2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
>  3: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
> Offset: 1427487705 Signature: 0xAA55
>          Starting         Ending         LBA Info:
>  #: id      C   H   S -      C   H   S [       start:        size ]
> -------------------------------------------------------------------------------
>  0: 83  88857   1   1 - 121600 254  63 [  1427487768:   526032297 ] Linux files*
>  1: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
>  2: 00      0   0   0 -      0   0

Re: OpenBSD Tablet-ish

2015-10-20 Thread Raimo Niskanen
On Fri, Feb 20, 2015 at 05:44:55PM -0600, Adam Thompson wrote:
> On 2015-02-20 05:22 PM, Robert wrote:
> > After a quick check on lenovo.com, the Yoga 2 (10) seems to be interesting. 
> > Incl. LTE it's about 350 EUR.
> > But I can't find any indications on the web that someone installed any 
> > alternative OS on it. I'm also not sure if it matters if you buy the 
> > Android or Windows version - the hardware seems to be the same.
> 
> I would buy the Windows version, at least you know that version is 
> guaranteed to run Windows, and you could always run whatever you wanted 
> (e.g. OpenBSD) inside VirtualBox if you were desperate.  The Android 
> versions are more likely to have custom IPL (pre-boot) environments 
> instead of even UEFI - but that's a guess.
> 
> > And the Bay Trail graphics are not supported yet:
> > http://marc.info/?l=openbsd-misc&m=142106787528337
> >
> > Ok... so let's wait. Maybe there will be some money this year to throw at 
> > an experiment ;)
> 
> Oh, if you want to spend ridiculous amounts of money, I guarantee 
> Fujitsu will have *something* that both boots OpenBSD and meets your 
> form-factor requirements, but it'll probably cost €1000+. Specifically, 
> the Q704 and Q572 both might work.  Fujitsu also has convertibles that 
> are quite light & small now.  I can't find anything that confirms 
> whether they support legacy boot; again, the older and more 
> corporate-oriented the device, the more likely it will.  The newer and 
> cheaper and more consumer-oriented the device, the less likely it'll be 
> able to boot OpenBSD.

I would like to find something that could replace my "Smart"phone,
so any experience with running OpenBSD on a modern small device would
be interesting.  Say 8..10 inch screen.  If there is a virtual keyboard
in X that can be used with a touchscreen pointing device, that might do,
or else some kind of keyboard.  Anything close to pocket (handbag) portable.

To replace a Smartphone it would need some networking; either supported WiFi
against a "Dumb"phone hotspot, or a 3G/4G modem, maybe over USB.

Thoughts, anyone?  I am looking at Fujitsu STYLISTIC Q335 and the likes.
Lenovo Yoga 2 was mentioned above but that seems to use Bluetooth for the
keyboard.  Acer Aspire Switch 10 and Asus Transformer Book T90 Chi might be
interesting.

Best Regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Remove removed utilities?

2015-10-19 Thread Raimo Niskanen
Hello misc@

I just noticed from the detailed changelog 5.7->5.8:
http://www.openbsd.org/plus58.html
that e.g tcopy, tip and lmccontrol were removed, but after upgrading from
5.7 to 5.8 I still have /usr/bin/tip, /usr/bin/tcopy and /sbin/lmccontrol
in the filesystem, with old dates.

The upgrade guide:
http://www.openbsd.org/faq/upgrade58.html
listed files to remove concerning the removed sudo, but nothing about
the utilities above.

There are also more old files hanging around, e.g:
/usr/bin/perl5.20.1
/usr/bin/perlthanks
and in /usr/lib there are old versions of libtls, libssl, libkvm, libedit,
libcrypt, libc, etc...
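
One way to spot such leftovers is by modification time.  The following is
only a sketch: list_stale is a made-up helper, and the cut-off date used for
the reference file is an assumption that would have to be picked to fall
shortly before the new release was built.  It only lists candidates and
removes nothing.

```sh
# Print regular files under the given directories that are not newer
# than a reference file.
list_stale() {
    ref=$1
    shift
    find "$@" -type f ! -newer "$ref" -print
}

# Hypothetical usage (the date is an assumption, adjust it to the release):
#   touch -t 201508010000 /tmp/ref
#   list_stale /tmp/ref /usr/bin /sbin /usr/lib
```

Files that the upgrade replaced carry newer dates and drop out of the list.
Whatever remains is either a leftover or something that was never part of
the base sets, so the output still needs a manual look.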

I vaguely remember reading something about old libraries remaining after an
upgrade, but can not find it now, at least not up front in the FAQ.

So my question is; should these old utilities be removed after upgrading to
5.8?  And what about old libraries?
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Remove removed utilities?

2015-10-19 Thread Raimo Niskanen
On Mon, Oct 19, 2015 at 08:12:31AM -0400, Nick Holland wrote:
> On 10/19/15 04:24, Raimo Niskanen wrote:
> > Hello misc@
> > 
> > I just noticed from the detailed changelog 5.7->5.8:
> > http://www.openbsd.org/plus58.html
> > that e.g tcopy, tip and lmccontrol were removed, but after upgrading from
> > 5.7 to 5.8 I still have /usr/bin/tip, /usr/bin/tcopy and /sbin/lmccontrol
> > in the filesystem, with old dates.
> > 
> > The upgrade guide:
> > http://www.openbsd.org/faq/upgrade58.html
> > listed files to remove concerning the removed sudo, but nothing about
> > the utilities above.
> > 
> > There are also more old files hanging around, e.g:
> > /usr/bin/perl5.20.1
> > /usr/bin/perlthanks
> > and in /usr/lib there are old versions of libtls, libssl, libkvm, libedit,
> > libcrypt, libc, etc...
> > 
> > I vaguely remember reading something about old libraries remaining after an
> > upgrade, but can not find it now, at least not up front in the FAQ.
> > 
> > So my question is; should these old utilities be removed after upgrading to
> > 5.8?  And what about old libraries?
> 
> We used to have a little disclaimer that upgrading was not equivalent to
> wiping and reloading with the new version; yes, stuff will get left
> behind.  It was decided the upgrade docs were too big and scary so that
> was one of the things removed to shorten it up (see older versions, like
> upgrade55.html, for example).
> 
> Things that are out-right replaced (i.e., sudo) should be actively
> deleted.  Even if it still works after upgrade, some day it is going to
> break, and you should be pushed to use the new application (or the
> package of the old application).  Things like tip?  what's the point?
>   case 1) It still works.  No harm done.
>   case 2) It no longer works.  useless file on system...so what?
> Either way...no harm.

I understand the policy.  Sounds just fine to me.

> 
> Library files are far more "interesting"...depending on the upgrade,
> they still may work with old packages still on the system.  Delete them,
> you have broken the old packages before you got them upgraded, usually
> no big deal, but sometimes, may really annoy you.
> 
> Again... who cares?  Yes, after ten upgrades, old libraries can start to
> add up, but on a modern system, you will go through a lot of upgrades
> before you can save a GB of data deleting old stuff.  Just not worth the
> trouble.
> 
> Quoting upgrade55.html:
> "However, the results are not intended to precisely match the results of
> a wipe-and-reload installation. Old library files in particular are not
> removed in the upgrade process, as they may be required by older
> applications that may or may not be upgraded at this time. If you REALLY
> wish to get rid of all these old files, you are probably better off
> reinstalling from scratch."

Aaah...  There it was!

> 
> If OCD is causing you to twitch at seeing the old files, reinstall...or
> use this as a therapy.

Therapy is working...  I feel soothed.
Thank you for the explanation!

/ Raimo

> 
> Nick.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Passwd cipher for YP

2015-10-15 Thread Raimo Niskanen
On Wed, Oct 14, 2015 at 09:10:55PM -0600, Devin Reade wrote:
> --On Wednesday, October 14, 2015 08:51:06 AM -0600 Theo de Raadt 
> <dera...@cvs.openbsd.org> wrote:
> 
> >> Do you have any other tips on how to handle logins in a mixed OS YP
> >> network?
> >
> > These days, I would recommend using YP in fewer places.  I wrote the
> > code, but even I don't use it.  Each time I make changes that need testing
> > in a YP environment, my test group has shrunk again...
> 
> I suspect that the best bet for general interop will be an LDAP-based
> infrastructure.  You may need to verify that all OSes can use a
> common subset of a valid schema, as well as probably needing a minimal
> PKI for SSL. If NFS is in the picture, watch for NFS version compatibility
> and username mapping idiosyncrasies (search for idmapd).

Thank you for the hints.  NFS is indeed in the picture.

> 
> Devin

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Passwd cipher for YP

2015-10-15 Thread Raimo Niskanen
On Wed, Oct 14, 2015 at 08:51:06AM -0600, Theo de Raadt wrote:
> > I just found out that ypcipher=old is no longer supported in login.conf.
> 
> That is correct.
> 
> We have deprecated and removed the legacy ciphers.  Passing such simple
> hashes over ethernet in 2015 is not best practice.
> 
> > Since I have a mixed platform lab network using YP (FreeBSD servers) I am
> > curious if anyone has some experience of how portable blowfish is as a
> > cipher for YP passwords.
> 
> Don't know if they are compatible.  Blowfish itself has had a few
> generations.  There was the original in 2001 or so, soon followed by a
> fix in 2002(?).  Then a few years ago a Linux version of blowfish was
> found to have a bug in rare configurations, but to keep everyone safe
> we all adopted some small changes and made a newer version -- $2b$

I made some tests.  A $2b$ password generated on OpenBSD 5.8 works for
FreeBSD 10.1, but not for SLES 10 nor Ubuntu 14.04.

> 
> > FreeBSD man pages say that they support it.  I also have lots of old and new
> > linux clients and just a few OpenBSD clients in the network.  Linux as usual
> > shines being badly documented so I can not find out if any of those support
> > blowfish.  Therefore I ask this list if anyone knows about this?  
> > 
> > Are there more password ciphers planned for the future e.g sha256 and 
> > sha512?
> 
> No, we will not be adding those.
> 
> Those simple hashes do not provide the future-proof, high-cost-to-crack
> features of bcrypt, which has made it successful as industry staple.
> The dumb hashes even arrived years after bcrypt, seems likely the result
> of choosing ideas "not invented by openbsd"

Ouch!  And I have not seen any other upcoming ciphers mentioned.  These seem
to be state of the art in the Linux world :/


> 
> > Do you have any other tips on how to handle logins in a mixed OS YP network?
> 
> These days, I would recommend using YP in fewer places.  I wrote the
> code, but even I don't use it.  Each time I make changes that need testing
> in a YP environment, my test group has shrunk again...

I guess I will have to look into LDAP, then.

Thank you for your clear answers!


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Passwd cipher for YP

2015-10-14 Thread Raimo Niskanen
Hi misc@

I just found out that ypcipher=old is no longer supported in login.conf.

Since I have a mixed platform lab network using YP (FreeBSD servers) I am
curious if anyone has some experience of how portable blowfish is as a
cipher for YP passwords.

FreeBSD man pages say that they support it.  I also have lots of old and new
linux clients and just a few OpenBSD clients in the network.  Linux as usual
shines being badly documented so I can not find out if any of those support
blowfish.  Therefore I ask this list if anyone knows about this?  

Are there more password ciphers planned for the future e.g sha256 and sha512?

Do you have any other tips on how to handle logins in a mixed OS YP network?
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Passwd cipher for YP

2015-10-14 Thread Raimo Niskanen
Some answers from myself after experimenting and finding Wikipedia :/

On Wed, Oct 14, 2015 at 02:36:09PM +0200, Raimo Niskanen wrote:
> Hi misc@
> 
> I just found out that ypcipher=old is no longer supported in login.conf.
> 
> Since I have a mixed platform lab network using YP (FreeBSD servers) I am
> curious if anyone has some experience of how portable blowfish is as a
> cipher for YP passwords.
> 
> FreeBSD man pages say that they support it.  I also have lots of old and new
> linux clients and just a few OpenBSD clients in the network.  Linux as usual
> shines being badly documented so I can not find out if any of those support
> blowfish.  Therefore I ask this list if anyone knows about this?  

FreeBSD and OpenBSD have Blowfish in common.
FreeBSD and recent Linux'es have SHA-2 (SHA-256 and SHA-512) in common.
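
When checking which scheme a given passwd map entry uses, the prefix of the
hash string is usually enough.  A minimal sketch, assuming the common
crypt(3) prefix convention ($2a$/$2b$/$2y$ for Blowfish/bcrypt, $1$ for MD5,
$5$ for SHA-256, $6$ for SHA-512); crypt_scheme is a made-up name, and
anything without a known prefix is simply reported as unknown:

```sh
# Guess the hashing scheme of a crypt(3)-style string from its prefix.
crypt_scheme() {
    case $1 in
        '$2a$'*|'$2b$'*|'$2y$'*) echo bcrypt ;;
        '$1$'*)                  echo md5 ;;
        '$5$'*)                  echo sha256 ;;
        '$6$'*)                  echo sha512 ;;
        *)                       echo unknown ;;
    esac
}
```

This only tells which scheme an entry was written with.  Whether a given
client can verify it still has to be tested per OS.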

> 
> Are there more password ciphers planned for the future e.g sha256 and sha512?
> 
> Do you have any other tips on how to handle logins in a mixed OS YP network?
> -- 
> 
> / Raimo Niskanen, Erlang/OTP, Ericsson AB

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: "dd if=/dev/srandom of=/dev/wd0e bs=1024 count=1" WIPES my wd0 disklabel. Is this intended, bug, how come, how workaround ??? Incl reproduction script+console output+dmesg

2015-10-08 Thread Raimo Niskanen
On Thu, Oct 08, 2015 at 12:50:59AM +0800, Mikael wrote:
:
> *Impression:*
> Based on what Benny and I think someone else said, I got an approximative
> impression something like that the whole disklabelling system is actually
> designed with the intention that every disklabel is required to
> 
> 1) Have an "a" partition that
> 2) Starts on sector 64 and continues at least up to and including sector 79
> and

You are assuming in your example that the OpenBSD part of the disk starts at
sector 64 on a 512-byte sector disk.  That does not have to be true.
For an i386 or amd64 system, what matters is where the OpenBSD fdisk partition
starts.  From that point, 8 kByte is reserved for e.g. boot code and the
BSD disklabel.

> 2) Be of the 4.2BSD or RAID type,

The utilities using these avoid using the first 8 kByte, so that is just
fine.

In another recent thread someone linked to a document about softraid
key disks, which clearly stated that to back up and restore such a key
partition you should skip the first 8 kByte of the partition,
e.g. by using dd bs=8192 skip=1 for backup and dd bs=8192 seek=1 for restore.
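
Spelled out, the two dd invocations could be wrapped like this.  It is a
sketch only: the function names are made up, and conv=notrunc is an addition
so the same commands can be tried safely on a regular image file before
pointing them at a real partition.

```sh
# Back up everything after the first 8 kByte of a partition (or image file).
backup_after_header() {
    # $1 = source partition or image, $2 = backup file
    dd if="$1" of="$2" bs=8192 skip=1 2>/dev/null
}

# Write a backup back, leaving the first 8 kByte of the target untouched.
restore_after_header() {
    # $1 = backup file, $2 = target partition or image
    # conv=notrunc only matters when the target is a regular file.
    dd if="$1" of="$2" bs=8192 seek=1 conv=notrunc 2>/dev/null
}
```

With bs=8192, skip=1 drops exactly one input block on the way out and seek=1
skips one output block on the way back, so the reserved area that holds the
disklabel is never read into the backup and never overwritten by the restore.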

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: disklabel fs types, where can I find the whole list of supported types?

2015-10-05 Thread Raimo Niskanen
On Mon, Oct 05, 2015 at 04:18:02AM -0400, Eric Furman wrote:
> Its been explained to you already.
> You're just being a troll now.

I, like the OP, can not find that explanation so far in this thread.
All that has been explained is about fdisk's MBR partitions, which is not what
the OP asked about.

In the disklabel(8) tool, using e.g. the -E switch to get to the interactive
editor, you have the commands 'a' and 'm' in which you may enter a filesystem
type for a partition.  The default is "4.2BSD" for partitions a and d upwards,
and "swap" for partition b.  You can also use the filesystem type "RAID"
for a partition to be used by softraid(4), as documented in softraid(4).

The question is: what are the allowed values for this "filesystem type" in
the disklabel(8) tool and where are they documented?

> 
> On Mon, Oct 5, 2015, at 03:53 AM, Mikael wrote:
> > Right, I am fully aware of that (i.e. that you can type in MBR partition
> > type as HEX code in the fdisk tool) - please correct me if I'm wrong, but
> > that is specific to the FDISK (and the MBR partition table only), and the
> > BSD disklabel and hence what you're working with in the disklabel tool,
> > is
> > separate altogether from that;
> > 
> > My question was (third time now), which FS types are available in the
> > disklabel tool?
> > 
> > Is it "4.2BSD", "swap", "RAID" and "unknown" only, or are there any more?
> > 
> > 
> > 2015-10-05 15:41 GMT+08:00 Dusan Sukovic <dusan.suko...@gmail.com>:
> > 
> > > Yes, but beside ffs HEX id inside fdisk prompt you have also ffs partition
> > > id values in plain English..
> > >
> > > Regards,
> > >
> > > Dusan
> > >
> > > 2015-10-05 9:28 GMT+02:00 Mikael <mikael.tr...@gmail.com>:
> > >
> > > >
> > > > And, the disklabel filesystem type is requested as a string (unlike the
> > > > fdisk partition type which is an 8-bit unsigned integer typically 
> > > > entered
> > > > in hex) and hence you need to know which options are available:

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: disklabel fs types, where can I find the whole list of supported types?

2015-10-05 Thread Raimo Niskanen
On Mon, Oct 05, 2015 at 10:57:51AM +0059, Jason McIntyre wrote:
> On Mon, Oct 05, 2015 at 11:14:09AM +0200, Ingo Schwarze wrote:
> > 
> > > On Mon, Oct 5, 2015, at 03:53 AM, Mikael wrote:
> > 
> > >> which FS types are available in the disklabel tool?
> > 
> > The list is in the header file /usr/include/sys/disklabel.h,
> >   static char *fstypenames[]
> > 
> > I don't think this is documented, not even in readlabelfs(3) or
> > in disklabel(5).
> > 
> 
> we're not talking about the list in fstab(5)?
> jmc

Since neither "4.2BSD", "4.1BSD" nor "RAID" is in that list
I would guess not.
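
To look at the list Ingo points to without reading the whole header,
something like the following could pull the quoted names out of the
fstypenames[] array.  This is a sketch: list_fstypes is a made-up name, and
it assumes the array is laid out with one or more quoted strings per line
after the declaration line and ends at a line containing "};".

```sh
# Print the quoted strings listed in the fstypenames[] array of a header.
list_fstypes() {
    awk '
        /fstypenames\[\]/ { in_list = 1; next }   # declaration line found
        in_list && /};/   { exit }                # end of the array
        in_list {
            # print every "quoted" string on this line, without the quotes
            while (match($0, /"[^"]*"/)) {
                print substr($0, RSTART + 1, RLENGTH - 2)
                $0 = substr($0, RSTART + RLENGTH)
            }
        }
    ' "${1:-/usr/include/sys/disklabel.h}"
}
```

Called without an argument it reads /usr/include/sys/disklabel.h, the file
named above.  A different path can be given to check another source tree.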

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: OpenBSD sendfile

2015-10-02 Thread Raimo Niskanen
On Fri, Oct 02, 2015 at 08:19:28AM +, Bogdan Andu wrote:
> Running linux in production is not an option, for me at least.
> 
> I was surprised too. They put it recently into deps tree.
> 
> Is disabled at run time, but is required 
> at compile time..
> I have scrambled the Makefiles and rebar.configs 
> and rebar.config.scripts and got rid of sendfile, and it compiles and runs fine on 
> OpenBSD (amd64/5.7)
> So, basically I have a non-sendfile-Yaws tree. Hurray!

Report that to the Yaws project.  They should not make themselves dependent
on features that are optional for Erlang and system dependent.

> 
> Bogdan
>  
> 
> 
>  On Friday, October 2, 2015 10:38 AM, Stuart Henderson 
> <s...@spacehopper.org> wrote:
>
> 
>  On 2015-09-30, Bogdan Andu <bo...@yahoo.com> wrote:
> > If one needs this linux-like crap, sendfile, and cannot disable it, how is 
> > he supposed to handle it?
> 
> Run it on linux?
> 
> I'm surprised Yaws needs it though, from what it says on their website
> it looks optional.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



EDID checksum is invalid

2015-09-17 Thread Raimo Niskanen
t pciide1 channel 0 drive 0: 
wd0: 1-sector PIO, LBA48, 76319MB, 156301488 sectors
wd0(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 6
ichiic0 at pci0 dev 31 function 3 "Intel 82801GB SMBus" rev 0x02: apic 2
int 19
iic0 at ichiic0
spdmem0 at iic0 addr 0x50: 2GB DDR2 SDRAM non-parity PC2-5300CL5 SO-DIMM
usb1 at uhci0: USB revision 1.0
uhub1 at usb1 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb3 at uhci2: USB revision 1.0
uhub3 at usb3 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb4 at uhci3: USB revision 1.0
uhub4 at usb4 "Intel UHCI root hub" rev 1.00/1.00 addr 1
isa0 at ichpcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com2 at isa0 port 0x3e8/8 irq 5: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
uhidev0 at uhub2 port 2 configuration 1 interface 0 "ATEN UC-10KM V1.3.124"
rev 1.10/1.00 addr 2
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 variable keys, 6 key codes, country code 3
wskbd1 at ukbd0 mux 1
wskbd1: connecting to wsdisplay0
uhidev1 at uhub2 port 2 configuration 1 interface 1 "ATEN UC-10KM V1.3.124"
rev 1.10/1.00 addr 2
uhidev1: iclass 3/1
ums0 at uhidev1: 5 buttons, Z dir
wsmouse0 at ums0 mux 0
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets
root on wd0a (ac2cb057523bcf3b.a) swap on wd0b dump on wd0b
error: [drm:pid12111:drm_edid_block_valid] *ERROR* EDID checksum is
invalid, remainder is 116
Raw EDID:

00 ff ff ff ff ff ff 00  2e cd 08 43 07 01 01 01 
0a 0f 01 03 08 22 1b 78  2a 6e a6 a1 54 4c 99 26 
19 4f 54 bf ef 80 81 40  71 4f 81 4a 81 99 01 01 
01 01 01 01 01 01 30 2a  00 98 51 00 2a 40 30 70 
13 00 52 0e 11 00 00 1e  00 00 00 ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
error: [drm:pid12111:drm_edid_block_valid] *ERROR* EDID checksum is
invalid, remainder is 130
Raw EDID:

00 ff ff ff ff ff ff 00  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
error: [drm:pid12111:drm_edid_block_valid] *ERROR* EDID checksum is
invalid, remainder is 54
Raw EDID:

00 ff ff ff ff ff ff 00  2e cd 08 43 07 01 01 01 
0a 0f 01 03 08 22 1b 78  2a 6e a6 a1 54 4c 99 26 
33 ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff 
:
:
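
For anyone wanting to check such a dump by hand: an EDID block is 128 bytes
and is valid when all bytes sum to 0 modulo 256.  As far as I understand the
drm code, the "remainder" in the message is that sum.  A minimal sketch that
reads hex bytes from stdin (edid_remainder is a made-up name):

```sh
# Sum hex bytes read from stdin modulo 256.  A valid 128-byte EDID block
# sums to 0; anything else is the remainder reported in the error.
edid_remainder() {
    sum=0
    for byte in $(cat); do
        sum=$(( ($sum + 0x$byte) % 256 ))
    done
    echo "$sum"
}
```

Feeding it the second dump above (00, six ff, 00, then 120 more ff) gives
130, which matches the remainder reported for that block.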



-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Recommended Industrial PCs?

2015-08-27 Thread Raimo Niskanen
On Wed, Aug 26, 2015 at 09:11:22PM +0200, Martin Haufschild wrote:
> Hello,
> 
> can someone recommend me an Industrial PC (IPC) to use with OpenBSD? I 
> would like to have a lot of hardware supported from this IPC by OpenBSD.
> 
> Regards
> Martin

Have a look at the PC Engines machines:
  http://pc-engines.ch

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: GROUP CHANGED

2015-06-15 Thread Raimo Niskanen
On Mon, Jun 15, 2015 at 09:53:56AM +0900, Joel Rees wrote:
> My memories of Debiandora are fading slightly, but, ...
:
> ... I think the numeric id for wheel group in Linux is not 0.

At least on Ubuntu 12.04 there is no wheel group and the numeric id for the
root group is 0.


 
> Which is relevant to the OP's misplaced concerns.
> 
> (Not to mention the topic of power grabs.)

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Install file sets from msdos fs

2015-06-02 Thread Raimo Niskanen
On Mon, Jun 01, 2015 at 07:23:07PM +0200, ludovic coues wrote:
 2015-06-01 16:53 GMT+02:00 Raimo Niskanen raimo+open...@erix.ericsson.se:
  Hello misc.
 
  Yesterday I upgraded a laptop (i386) from 5.6 snapshot to 5.7.  This laptop
  has no CD reader so I copied 5.7/i386 directory to an msdos formatted USB
  stick on a Windows 7 machine and adjusted all filenames manually according
  to the TRANS.TBL files.
 
  I tested the USB stick before upgrading and found some oddities regarding
  long vs short filenames, something like this:
  $ ls
  BSD BSD.RD ...
 
  $ ls BSD.RD
  BSD.RD
 
  $ ls bsd.rd
  BSD.RD
 
  So it seems the filenames are case insensitive and are listed in capitals.
 
  The upgrade went fine, the sets were installed, but with these glitches:
  * The file INSTALL.i386 was not found but that could be ignored
  * The file SHA256.sig was not found but that could be ignored - skipping
the verification.
 
  I had verified the SHA256.sig after download, so no harm done.
 
  Note that both files did exist and could be listed at least when mounting
  msdos with the -l option and with the names the installer claimed could not
  be found.  And the toplevel directory had filenames that should force the
  usage of long filenames.  I am also pretty sure the filenames had lowercase
  suffixes when viewed on the Windows 7 machine.
 
  I suspect the installer lists the files and compares filenames by itself
  and therefore the filenames do not match.  If it looked the files up by
  their explicit names I guess it would find them.
 
  I tried to mount the msdos filesystem myself (with long filenames) and
  use the installer option to install sets from a mounted filesystem,
  but then it could not find any sets at all.  What worked was to install
  from unmounted filesystem telling the installer which partition the sets
  were on and then it found all file sets but not the two files above.
 
  It is great that it worked, but installing sets from a msdos filesystem can
  be improved.  I think it is a useful way around having no CD reader.
 
  Best Regards
  --
 
  / Raimo Niskanen, Erlang/OTP, Ericsson AB
 
 
 For my upgrade from 5.6 to 5.7 on a laptop without CD reader, I got
 bsd.rd and put it at /bsd.57.rd on the machine to upgrade.
 
 Then on the prompt at early boot, I typed boot bsd.57.rd instead of
 waiting for the machine to boot. Just like instructed in the upgrade
 page of the FAQ.
 This works really well.

Well, yes, that was also how I booted the 5.7 bsd.rd kernel.

My post was about the next step: installing/upgrading the file sets
from an msdos filesystem once booted into bsd.rd, which only half-worked.

I found out that the bsd.rd kernel in 5.7 assembled my encrypted root
softraid disk automatically!  A great improvement - good work!

/ Raimo Niskanen


 
 -- 
 
 Cordialement, Coues Ludovic
 +336 148 743 42

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Install file sets from msdos fs

2015-06-01 Thread Raimo Niskanen
Hello misc.

Yesterday I upgraded a laptop (i386) from 5.6 snapshot to 5.7.  This laptop
has no CD reader so I copied 5.7/i386 directory to an msdos formatted USB
stick on a Windows 7 machine and adjusted all filenames manually according
to the TRANS.TBL files.

I tested the USB stick before upgrading and found some oddities regarding
long vs short filenames, something like this:
$ ls
BSD BSD.RD ...

$ ls BSD.RD
BSD.RD

$ ls bsd.rd
BSD.RD

So it seems the filenames are case insensitive and are listed in capitals.

The upgrade went fine, the sets were installed, but with these glitches:
* The file INSTALL.i386 was not found but that could be ignored
* The file SHA256.sig was not found but that could be ignored - skipping
  the verification.

I had verified the SHA256.sig after download, so no harm done.

Note that both files did exist and could be listed at least when mounting
msdos with the -l option and with the names the installer claimed could not
be found.  And the toplevel directory had filenames that should force the
usage of long filenames.  I am also pretty sure the filenames had lowercase
suffixes when viewed on the Windows 7 machine.

I suspect the installer lists the files and compares filenames by itself
and therefore the filenames do not match.  If it looked the files up by
their explicit names I guess it would find them.
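
To illustrate the suspected mismatch (this is not the installer's code, only
a sketch with a made-up function name): a plain string comparison of a
listed name such as SHA256.SIG against the expected SHA256.sig fails, while
a comparison that folds case first would succeed.

```sh
# Look for a file in a directory, ignoring case, by listing the directory
# and comparing lowercased names.  Prints the name as it is stored.
find_nocase() {
    dir=$1
    want=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    for path in "$dir"/*; do
        [ -e "$path" ] || continue        # empty directory: glob did not match
        name=${path##*/}
        lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        if [ "$lower" = "$want" ]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}
```

On a filesystem that lists everything in capitals, find_nocase /mnt SHA256.sig
(the mount point is hypothetical) would still report the file, which is the
behaviour one would want when sets are read from an msdos stick.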

I tried to mount the msdos filesystem myself (with long filenames) and
use the installer option to install sets from a mounted filesystem,
but then it could not find any sets at all.  What worked was to install
from unmounted filesystem telling the installer which partition the sets
were on and then it found all file sets but not the two files above.

It is great that it worked, but installing sets from a msdos filesystem can
be improved.  I think it is a useful way around having no CD reader.

Best Regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Patch 009 fails on BASE-5.6 amd64

2015-03-11 Thread Raimo Niskanen
On Sat, Mar 07, 2015 at 08:33:07PM -0500, Ted Unangst wrote:
> Andrew Lester wrote:
> > Hi All,
> > 
> > I’ve just performed a fresh install of OpenBSD 5.6-BASE (not an upgrade) 
> > using the purchased disc set, and have been applying the patches in order, 
> > and all have been successful. However, the httpd patch (#009) has failed, 
> > and I ended up with several “rej” files. This system could not be more 
> > vanilla, no additional software was installed. The X packages and games 
> > were not installed. I have taken the output of the signify command which 
> > provides information about the specific failures, as well as the four rej 
> > files that were generated and put them into a tar archive if anybody would 
> > like to see the specifics of the failure. Here is a shared link from 
> > Dropbox:
> > https://www.dropbox.com/s/yk9olnqdeeru7m6/009_httpd_failures.tar?dl=0
> > 
> > Has anybody else encountered this issue? I am holding off on applying the 
> > additional patches in the mean time. Please let me know if there is any 
> > additional information I can provide.
> 
> The CDs shipped with a slightly different source tree, that missed a few
> changes to the httpd directory. You can use cvs to update the httpd, either to
> OPENBSD_5_6_BASE and apply the patch, or to OPENBSD_5_6 which is simpler.

Or use a downloaded source tree and apply the patch..

 
> I should add a note to the web page; we didn't discover this until some time
> after the patch. Thanks for reminding me.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Installing OpenBSD 5.6 using a USB Flash drive

2015-02-19 Thread Raimo Niskanen
On Wed, Feb 18, 2015 at 05:30:29PM +0100, Alexander Hall wrote:
 On February 18, 2015 11:43:56 AM CET, Markus Kolb open...@tower-net.de 
 wrote:
 Am 2015-02-17 17:27, schrieb A Y:
  dmesg|grep ^.d0 returns only sd0
  sysctl hw.disknames returns sd0 and rd0
  
  my machine is a 10.1 inch netbook Lenovo E10-30 running Intel Celeron
 
  N2830
  Dual Core 64 bit. Do you think I should have used amd64 installation 
  instead
  of i386?
 
 Will depend mostly on your available RAM.
 i386 is 32 bit.
 
 Either way, I see no reason not to run amd64 on that processor.

Won't i386 use less memory making it more efficient up to about 2 GB of RAM
which this machine has?

Of course, if RAM would be added you would regret not having installed
amd64...

 
 /Alexander 
 
 
 See https://en.wikipedia.org/wiki/RAM_limit#32-bit_x86_RAM_limit

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: OpenBSD usb cannot be read on Windows

2015-02-18 Thread Raimo Niskanen
On Wed, Feb 18, 2015 at 09:37:31AM +, A Y wrote:
 I used the following command under OpenBSD 5.6:
 
 #dd if=/location/install56.fs of=/dev/rsd1c bs=1m
 
 When I try to reformat it under Windows, it formats only 240 M. So is it
 possible to format it under OpenBSD so that I can get the full size (16G)
 back?

Zero out the MBR from OpenBSD.
# dd if=/dev/zero of=/dev/rsd1c bs=512 count=1

Then format it from Windows.
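
The source for the zeroing has to be /dev/zero; reading from /dev/null
returns end-of-file at once, so nothing would be written.  A sketch that can
be tried on a regular image file first (zero_first_sector is a made-up name,
and conv=notrunc is only there so an image file is not truncated):

```sh
# Overwrite the first 512-byte sector (where the MBR lives) with zeros.
# conv=notrunc keeps a regular image file from being truncated; it is
# harmless on a raw device.
zero_first_sector() {
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc 2>/dev/null
}

# Hypothetical usage on the stick from this thread:
#   zero_first_sector /dev/rsd1c
```

Only the first sector is touched, which is enough to make the old partition
table go away so that Windows offers to initialise the whole device again.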

 
  Date: Wed, 18 Feb 2015 11:17:31 +0200
  Subject: Re: OpenBSD usb cannot be read on Windows
  From: pr...@kivisoo.ee
  To: afyous...@hotmail.com
  CC: misc@openbsd.org
 
   I used the "dd" command to make a bootable USB drive. The USB is 16G.
   After I
   am done with the installation, I want to use the USB under Windows for
   other
   purposes. Windows reads only 240 M.
   How can I recover the 16G on the USB?
  
  
  Reformat it.
 
  Priit

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Installing OpenBSD 5.6 using a USB Flash drive

2015-02-17 Thread Raimo Niskanen
On Tue, Feb 17, 2015 at 10:36:20AM +, A Y wrote:
 Hi all,
 I used the following command to create a USB flash drive installation media
 (with all file sets included):
 # dd if=/location/install56.fs of=/dev/rsd0c bs=1m
 The USB flash drive was created successfully.
 The boot process from the USB was done. However, when we came to installing
 file sets, the following prompt was displayed:
 Location of sets? (disk http or 'done') [http]
 Now, what can I do to direct the installation process to look for the file
 sets in the USB flash drive?
 The documentation says:
 Once the install kernel is booted, you have several options of where to get
 the install file sets:
 CD-ROM, HTTP, Local disk partition, NFS (no mention to USB)
 As advised, I did the following from the shell:
 cd /dev
 sh MAKEDEV sd1
 mkdir /mnt1
 mount /dev/sd1a /mnt1
 But I got the following error:
 Device not configured
 Thank you

Strange.  I think 'disk' should be among the possible set locations.

What kind of machine is this?

Use the shell for some diagnostics.
Check your dmesg.  Does the install kernel (bsd.rd) detect the flash drive?
Check what sysctl hw.disknames says.
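
As a rough aid for the last check, here is a small sketch that turns the
hw.disknames value into one disk name per line. The helper name list_disks is
made up for this example, and the sample value in the usage note is invented;
it assumes the usual comma-separated name:duid layout. It uses only shell
built-ins, so it does not depend on which tools the install ramdisk carries.

```shell
#!/bin/sh
# list_disks (hypothetical helper): print one disk name per line from a
# hw.disknames value such as "hw.disknames=sd0:<duid>,rd0:<duid>".
list_disks() {
    names=${1#hw.disknames=}   # drop the "hw.disknames=" prefix if present
    old_ifs=$IFS
    IFS=,                      # entries are separated by commas
    for entry in $names; do
        # Keep only the part before the first colon (the device name).
        printf '%s\n' "${entry%%:*}"
    done
    IFS=$old_ifs
}
```

Usage would be something like: list_disks "$(sysctl hw.disknames)". If the
flash drive only shows up as the boot device and no second sd entry appears,
the install kernel has probably not attached it.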

It seems the USB disk is not detected even though the BIOS and boot(8) manage
to boot the kernel.  If so, there might be BIOS options that can help, e.g.
setting the disks to AHCI mode, depending on what kind of machine this is.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Installing OpenBSD 5.6 using a USB Flash drive

2015-02-17 Thread Raimo Niskanen
On Tue, Feb 17, 2015 at 12:51:41PM +0100, Raimo Niskanen wrote:
 On Tue, Feb 17, 2015 at 10:36:20AM +, A Y wrote:
  Hi all,
  I used the following command to create a USB flash drive installation media
  (with all file sets included):
  # dd if=/location/install56.fs of=/dev/rsd0c bs=1m
  The USB flash drive was created successfully.
  The boot process from the USB was done. However, when we came to installing
  file sets, the following prompt was displayed:
  Location of sets? (disk http or 'done') [http]
  Now, what can I do to direct the installation process to look for the file
  sets in the USB flash drive?
  The documentation says:
  Once the install kernel is booted, you have several options of where to get
  the install file sets:
  CD-ROM, HTTP, Local disk partition, NFS (no mention to USB)
  As advised, I did the following from the shell:
  cd /dev
  sh MAKEDEV sd1
  mkdir /mnt1
  mount /dev/sd1a /mnt1
  But I got the following error:
  Device not configured
  Thank you
 
 Strange.  I think 'disk' should be among the possible set locations.

Oops!  I did not see that 'disk' actually was among the possible set
locations.  Have you tried that?

 
 What kind of machine is this?
 
 Use the shell for some diagnostics.
 Check your dmesg.  Does the install kernel (bsd.rd) detect the flash drive?
 Check what sysctl hw.disknames says.
 
 It seems the USB disk is not detected even though the BIOS and boot(8) manage
 to boot the kernel.  If so, there might be BIOS options that can help, e.g.
 setting the disks to AHCI mode, depending on what kind of machine this is.
 
 -- 
 
 / Raimo Niskanen, Erlang/OTP, Ericsson AB

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: DUMP: fopen on /dev/tty fails

2015-01-07 Thread Raimo Niskanen
On Mon, Jan 05, 2015 at 12:19:13PM +0100, Otto Moerbeek wrote:
 On Mon, Jan 05, 2015 at 11:33:13AM +0100, Jan Stary wrote:
 
  On Jan 05 10:58:02, o...@drijf.net wrote:
   On Mon, Jan 05, 2015 at 10:19:54AM +0100, Jan Stary wrote:
   
This is a daily mail from my Alix router.
I do a dump in daily.local (see below)
and most of the time it works just fine.
Occasionally though, the DUMP fails, saying

   DUMP: End of tape detected
   DUMP: Volume 1 completed at: Mon Jan  5 01:30:44 2015
   DUMP: Volume 1 took 0:00:07
   DUMP: Volume 1 transfer rate: 2101 KB/s
   DUMP: Change Volumes: Mount volume #2
   DUMP: fopen on /dev/tty fails: Device not configured
   DUMP: The ENTIRE dump is aborted.

That puzzles me, as I dump to stdout,
redirecting to a file (see below).

(I vaguely remember that the reason I switched from
dump -f file.dump ... to dump -f - ... > file.dump
was that I was advised here by a developer about
the tape legacy of dump, but I forgot what exactly
was the problem then and can't find it in the archives.)

Why would dump -f - ... > file.dump think
that it reached an end of tape?
   
   Because dump is a bit dumb. You need to use -a, see man page.
  
  But I do, see the code below.
 
 Hmm indeed, then it's my guess you are running out of disk. The
 numbers do not seem to warrant that, though. So I have no real clue. 
 Or did you play with tunefs -m ?
 
   -Otto

How about using the -n flag to find out what dump wants by being logged on
as a user in group 'operator' while dump runs?


/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Patch 009_httpd.patch did not apply cleanly

2014-12-01 Thread Raimo Niskanen
On Sat, Nov 29, 2014 at 05:44:17PM -0500, Ted Unangst wrote:
 On Sat, Nov 29, 2014 at 16:08, Libertas wrote:
  On 11/27/2014 07:38 AM, Raimo Niskanen wrote:
  I have also learned to use the -C flag to patch...
  
  Have we ever considered changing the suggested shell commands in the
  patches to ensure that the patch will apply cleanly before trying? We
  could wrap the actual patch command in an if-block with a 'patch -C' condition.
 
 There are few circumstances in which that would matter. The
 expectation is that the patch should apply.

I do not remember from the output of patch if it printed at the bottom that
some hunks were rejected.  I kind of remember to have to scroll back quite
far to find that out - which I did because the compilation failed.

So if my vague memories are incorrect and patch indeed warns at the end of
the run that some hunks in some file(s) were rejected, then all is well.
Otherwise it might be nice to have patch ... || echo Warning in the patch
instructions, or a warning after all files from patch.
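
What I have in mind is roughly the following sketch. The function name
warn_on_failure is made up for this example and is not part of any patch
instruction; it relies only on patch exiting non-zero when something was
written to a reject file.

```shell
#!/bin/sh
# warn_on_failure (hypothetical helper): run the given command and, if it
# exits non-zero, print a warning on stderr so the failure is the last thing
# on the screen. The command's exit status is passed through unchanged.
warn_on_failure() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "Warning: '$*' exited with status $status; look for *.rej files" >&2
    fi
    return "$status"
}
```

It would be used as: warn_on_failure patch -p0 < 009_httpd.patch. The
redirection applies to the function call, so patch still reads the diff from
standard input.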

 
 Now, you are always welcome to run patch -C on your systems, but
 otherwise it complicates the instructions and we'd prefer to keep them
 simple.

Yes. Observing the result from patch should be sufficient since it leaves
backup and reject files behind so you can analyze and revert if you want.

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: Patch 009_httpd.patch did not apply cleanly

2014-11-27 Thread Raimo Niskanen
On Tue, Nov 25, 2014 at 11:45:26AM -0500, trondd wrote:
 I had noticed the same thing.  The src tarball on the CD is different from
 the tarball on the mirrors.  I had taken a quick look and it was just
 whitespace differences that I saw.
 
 Tim.

I have investigated more now, and it sure seems as if the 5.6 CD src.tar.gz
does not have the same content as the download site's 5.6 src.tar.gz
(besides sys.tar.gz, of course).  Some parts of patch 009 (on httpd)
were already present in my source tree which is the CD src.tar.gz.

But on the downloadable 5.6 src.tar.gz patch 009 did apply cleanly.

That suggests that the CD src.tar.gz is a slightly later src tree than the
downloadable src.tar.gz.  I have not compared both trees in full,
only ./usr.sbin/httpd.

And yes, I have checked the signatures and SHA sums for both tarballs
against the pub key installed by the 5.6 CD set.

I have also learned to use the -C flag to patch...

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Patch 009_httpd.patch did not apply cleanly

2014-11-25 Thread Raimo Niskanen
Hi.

I applied patch 009_httpd.patch extracted from a downloaded 5.6.tar.gz
according to inline instructions on a source tree from the 5.6 release CD,
and approximately 6 hunks in 3 files were rejected.

This was late last evening and I will try again any day now to produce proper
logs and double check what source tree I tried to patch, but if this is a 
simple mistake from the patch writer's side I think the information above
might be sufficient...

Best Regards
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



Re: uscom/ucom hardware question [was: OpenBSD 5.6 Released]

2014-11-06 Thread Raimo Niskanen
On Mon, Nov 03, 2014 at 05:17:40PM +0100, Raimo Niskanen wrote:
 On Sat, Nov 01, 2014 at 10:32:52PM +0100, ropers wrote:
   o New uscom(4) driver for simple USB serial adapters.
  
  This reminds me of something I've been meaning to ask for some time:
  
  * Has anyone here used a USB-only laptop with a USB-to-serial adapter
  as a serial console? (You know, instead of hardware that has a native
  RS-232 port?)
 
 Yes.
 
  * Does uscom(4) make this any easier/is it more compatible than ucom(4)?
  * If I buy a random USB-to-serial dongle, is it likely that it'll work
  with either uscom(4) or ucom(4)? If not, does anyone have any hardware
  recommendations, i.e. what do you use?
 
 I have got a cable labeled ST-Lab USB-SERIAL-4 purchased from Dustin in
 Sweden, and it works well.  I do not remember what device driver attaches
 to it, and I have not tried any other cables, so I cannot say whether
 some cables fail to work due to e.g. Windows-only drivers...

Now I just bought an Aten UC-232A-B and it seems to contain the same chip:
uplcom0 at uhub1 port 2 Prolific Technology Inc. USB-Serial Controller D 
rev 1.10/3.00 addr 2
ucom0 at uplcom0
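
To answer the "which driver attaches" question for any cable, a small sketch
like this can pick the parent driver of each ucom device out of dmesg. The
function name ucom_parent_drivers is made up for this example, and it assumes
the "ucomN at driverN" line format shown above, with a driver name made of
lowercase letters only.

```shell
#!/bin/sh
# ucom_parent_drivers (hypothetical helper): read dmesg text on stdin and
# print the name of each driver that a ucom device is attached to.
ucom_parent_drivers() {
    # Match lines like "ucom0 at uplcom0" and keep the driver name without
    # its unit number; sort -u drops duplicates from repeated attaches.
    sed -n 's/^ucom[0-9][0-9]* at \([a-z][a-z]*\)[0-9][0-9]*.*$/\1/p' | sort -u
}
```

Run as: dmesg | ucom_parent_drivers. For the two lines above it prints
uplcom, i.e. the cable is handled by uplcom(4).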

 
  
  Thanks for any input
  
  regards,
  ropers
 
 -- 
 
 / Raimo Niskanen, Erlang/OTP, Ericsson AB

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


