Re: disklabel partition auto allocation problem

2021-06-10 Thread Otto Moerbeek
On Wed, Jun 09, 2021 at 12:40:13PM -0400, electronmuontau neutrino wrote:

> disklabel in OpenBSD 6.9 doesn't seem to be allocating partition sizes
> correctly according to the actual size of my OpenBSD partition.  I dual
> booted my ThinkPad X1 Carbon 5th gen laptop with Windows 10 and OpenBSD.  I
> allocated about half the disk space to OpenBSD.  When I installed OpenBSD,
> it allocated partitions as if my disk size was > 2.5 GB instead of >= 10 GB
> as shown in the disklabel man page.  It allocated 2GB to /, 256M to swap,
> 3G to /usr, 2G to /home and apparently did not allocate the rest of the
> free space.  I've included the output of disklabel, fdisk and dmesg below.
> I haven't tried installing OpenBSD 6.8 to see if it does the same thing.  I
> believe auto allocation worked fine in OpenBSD 6.7.  If I'm not including
> any info that might help diagnose this problem, if it really is a problem,
> please let me know.
> 
> 
> # disklabel sd0
> 
> # /dev/rsd0c:
> type: SCSI
> disk: SCSI disk
> label: SAMSUNG MZVLB1T0
> duid: 9a51a841a90239b3
> flags:
> bytes/sector: 512
> sectors/track: 63
> tracks/cylinder: 255
> sectors/cylinder: 16065
> cylinders: 124519
> total sectors: 2000409264
> boundstart: 1002668032
> boundend: 1998360576
> drivedata: 0
> 
> 16 partitions:
> #size   offset  fstype [fsize bsize   cpg]
>   a:  4194304   1002668032  4.2BSD   2048 16384 12960 # /
>   b:   524288   1006862336  swap                    # none
>   c:   2000409264            0  unused
>   d:  6291456   1007386624  4.2BSD   2048 16384 12960 # /usr
>   e:  4194304   1013678080  4.2BSD   2048 16384 12960 # /home
>   i:   532480 2048   MSDOS
>   j:32768   534528 unknown
>   k:   1002100736   567296   MSDOS
>   l:  2048000   1998360576 unknown
> 
> 
> 
> # fdisk sd0
> 
> Disk: sd0   Usable LBA: 34 to 2000409230 [2000409264 Sectors]
>#: type [   start: size ]
> 
>0: EFI Sys  [2048:   532480 ]
>1: e3c9e316-0b5c-4db8-817d-f92df00215ae [  534528:32768 ]
>2: FAT12[  567296:   1002100736 ]
>3: OpenBSD  [  1002668032:995692544 ]
>4: Win Recovery [  1998360576:  2048000 ]


Hi,

I created a vnd with this layout:
$ doas dd bs=512 count=1 seek=2000409264 of=image if=/dev/null 
$ doas vnconfig vnd0 image
$ doas fdisk -ig vnd0
$ doas fdisk -e vnd0

... add partitions, using A5 (FreeBSD) for the "unknown" above

$ doas fdisk vnd0
Disk: vnd0   Usable LBA: 64 to 2000409200 [2000409264 Sectors]
   #: type [   start: size ]

   0: EFI Sys  [2048:   532480 ]
   1: FreeBSD  [  534528:32768 ]
   2: FAT12[  567296:   1002100736 ]
   3: OpenBSD  [  1002668032:995692544 ]
   4: FreeBSD  [  1998360576:  2048000 ]

If I run disklabel -A I get the expected outcome, leaving the
"foreign" partitions intact:

$ doas disklabel -A vnd0 
# /dev/rvnd0c:
type: vnd
disk: vnd device
label: fictitious
duid: 
flags:
bytes/sector: 512
sectors/track: 100
tracks/cylinder: 1
sectors/cylinder: 100
cylinders: 20004092
total sectors: 2000409264
boundstart: 1002668032
boundend: 1998360576
drivedata: 0 

16 partitions:
#size   offset  fstype [fsize bsize   cpg]
  a:  2097152   1002668032  4.2BSD   2048 16384 1 # /
  b:  8414472   1004765184  swap
  c:   2000409264            0  unused
  d:  8388576   1013179680  4.2BSD   2048 16384 1 # /tmp
  e: 24168960   1021568256  4.2BSD   2048 16384 1 # /var
  f: 12582912   1045737216  4.2BSD   2048 16384 1 # /usr
  g:  2097152   1058320128  4.2BSD   2048 16384 1 # /usr/X11R6
  h: 41943040   1060417280  4.2BSD   2048 16384 1 # /usr/local
  i:   532480 2048   MSDOS
  j:32768   534528 unknown
  k:   1002100736   567296   MSDOS
  l:  2048000   1998360576 unknown
  m:  4194304   1102360320  4.2BSD   2048 16384 1 # /usr/src
  n: 12582912   1106554624  4.2BSD   2048 16384 1 # /usr/obj
  o:629145600   1119137536  4.2BSD   4096 32768 1 # /home
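As a sanity check, the sparse image created with dd above corresponds to the same disk size as the original report: total sectors times the 512-byte sector size.

```shell
# 2000409264 sectors of 512 bytes each, as in the disklabel/fdisk output
echo $((2000409264 * 512))   # 1024209543168 bytes (~954 GiB)
```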


Re: Pf tables and ruleset optimizations

2021-05-31 Thread Otto Moerbeek
On Mon, May 31, 2021 at 10:32:56AM +0200, Heinrich Rebehn wrote:

> Hi list,
> 
> My /etc/pf.conf contains a table which is initialized from a file:
> 
> table <myservers> file "/root/pf/tables/myservers"
> 
> This table is not referred to in pf.conf, but in an anchor which is loaded
> later on.
> I found out that even when the anchor is loaded, the table does not exist.

See the "persist" keyword in pf.conf.

-Otto

> 
> # pfctl -t myservers -T show
> pfctl: Table does not exist
> # pfctl -sT
> private
> rtun0
> rtun1
> trusted
> 
> If I load pf with "# pfctl -o none -f /etc/pf.conf", the table appears. If I 
> use
> 
> set ruleset-optimization none
> 
> it doesn’t.
> 
> Is this expected behavior?
> 
> Also rcctl(8) does not allow setting flags for pf
> 
> # rcctl set pf flags "-o none"
> rcctl: "pf" is a special variable, cannot "set flags"
> 
> Workarounds would be setting the flag in /etc/rc.conf.local or adding "pfctl -o
> none -f /etc/pf.conf" to rc.local
> 
> Any thoughts?
> 
> -Heinrich
> 
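With the persist keyword, the table survives even while no loaded ruleset references it. A hedged sketch of the pf.conf line, using the file path from the message above (the table name matches the later `pfctl -t myservers` call):

```
table <myservers> persist file "/root/pf/tables/myservers"
```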



Re: .profile not being loaded (ksh) when opening shell in X

2021-04-27 Thread Otto Moerbeek
On Tue, Apr 27, 2021 at 12:19:36PM +, tetrahe...@danwin1210.me wrote:

> On Tue, Apr 27, 2021 at 09:37:05AM +0200, Alexandre Ratchov wrote:
> > If you're using a display manager (xenodm or whatever), you have to
> > include your .profile in your session login script (the X equivalent of
> > the shell's ~/.profile concept), so the environment (and other global
> > login settings) from your .profile become visible to all X programs,
> > not only xterm. For instance put:
> > 
> > . ~/.profile
> > 
> > at the beginning of your ~/.xsession
> > 
> > If you're using xinit(1), your ~/.profile is already loaded by
> > the login shell.
> 
> That seems the right way to go, if the other suggested solution of defining
> ENV doesn't do the trick.
> 

Note that ENV processing is only done for interactive shells.
Traditionally, ENV would point to a ~/.kshrc file that contains init
commands only relevant for interactive use. See the ksh man page for
details.

-Otto
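A minimal sketch of the traditional setup described above (the filenames are the conventional ones, not mandated):

```
# ~/.profile — read by login shells; ENV makes interactive ksh
# sessions source ~/.kshrc as well
export ENV="$HOME/.kshrc"
```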



Re: default Offset to 1MB boundaries for improved SSD (and Raid Virtual Disk) partition alignment

2021-04-21 Thread Otto Moerbeek
On Wed, Apr 21, 2021 at 09:56:59AM +0100, Tom Smyth wrote:

> Hello Otto, Christian,
> 
> I was relying on that paper for the pictures of the alignment issue,
> 
> VMFS (the vmware file system), since version 5 of vmware, has allocation
> units of 1MB each
> 
> https://kb.vmware.com/s/article/2137120
> 
> my understanding is that SSDs   have a similar allocation unit setup of 1MB,
> 
> and that aligning your file system to 1MB would improve performance
> 
> 
> |OpenBSD Filesystem --|  FFS-Filesystem
> |VMDK Virtual Disk file for Guest |  OpenBSD-Guest-Disk0.vmdk
> |vmware datastore--  |   1MB allocation
> |Logical Storage Device / RAID---|
> |SSD or DISK storage --|1MB allocation  unit (on some SSDs)
> 
> Figure 2 of the following paper shows this:
> https://www.usenix.org/legacy/event/usenix09/tech/full_papers/rajimwale/rajimwale.pdf
> as your writes start to cross another underlying block boundary you
> see a degradation of performance;
> the largest impact is on a write of 1MB (misaligned) across 2 blocks,

The maximum unit OpenBSD writes in one go is 64k, so the issue is not
that relevant: only 1 in 16 blocks would potentially cross a boundary.

You are free to set up your disks in a way that suits you, but in
general I don't think we should enforce 1MB alignment of the start of a
partition and/or its size because *some* *might* get a benefit.

-Otto

> but it repeats as you increase the number of MB in a transaction, though
> the % overhead
> reduces for each additional 1MB in the transaction.
> 
> If there is no downside to allocating / offsetting filesystems on 1MB
> boundaries,
> can we do that by default to reduce wear on SSDs and improve performance
> in virtualized environments with large allocation units on whatever storage
> subsystem they are running?
> 
> Thanks for your time
> 
> Tom Smyth
> 
> 
> 
> 
> On Wed, 21 Apr 2021 at 08:49, Otto Moerbeek  wrote:
> >
> > On Wed, Apr 21, 2021 at 08:20:10AM +0100, Tom Smyth wrote:
> >
> > > Hi Christian,
> > >
> > > if you were to have a 1MB file or a database that needed to read 1MB
> > > of data, if the partitions are not aligned then
> > > your underlying storage system needs to load 2 chunks or write 2
> > > chunks for 1 MB of data written.
> > >
> > > So *worst case* you would double the workload for the storage hardware
> > > (SSD or hardware RAID with large chunks) for each transaction;
> > > on writing to SSDs, if you are not aligned, one could *worst case*
> > > double the write / wear rate.
> > >
> > > The improvement would be less for accessing small files and writing
> > > small files
> > > (as they would need to be across 2 chunks)
> > >
> > > The following paper explains it (better than I do):
> > > https://www.vmware.com/pdf/esx3_partition_align.pdf
> > >
> > > if the cost is 1-8MB at the start of the disk (assuming partitions are
> > > sized so that they don't lose the offset of 2048 sectors)
> > > I think it is worth pursuing. (again I only have experience on amd64
> > > /i386 hardware)
> >
> > Doing a quick scan through the pdf I only see talk about 64k boundaries.
> >
> > FFS(2) will split up any partition into multiple cylinder groups. Each
> > cylinder group starts with a superblock copy, inode tables and other
> > metadata before the data blocks of that cylinder group. Having the
> > start of a partition at a 1MB boundary does not get you those data blocks
> > at a specific boundary. So I think your reasoning does not apply to FFS(2).
> >
> > It might make sense to move the start to offset 128 for big
> > partitions, so you align with the 64k boundary mentioned in the pdf;
> > the block size is already 64k (for big partitions).
> >
> > -Otto
> >
> > >
> > > Thanks
> > > Tom Smyth
> > >
> > > On Tue, 20 Apr 2021 at 22:52, Christian Weisgerber  
> > > wrote:
> > > >
> > > > Tom Smyth:
> > > >
> > > > > just installing today's snapshot and the default offset on amd64 is 64,
> > > > >  (as it has been for as long as I can remember)
> > > >
> > > > It was changed from 63 in 2010.
> > > >
> > > > > Is it worth while updating the defaults so that OpenBSD partition
> > > > > layout will be optimal for SSD or other Virtualized RAID environments
> > > > > with 1MB  Chunks,
> > > >
> > > > What are you trying to o

Re: default Offset to 1MB boundaries for improved SSD (and Raid Virtual Disk) partition alignment

2021-04-21 Thread Otto Moerbeek
On Wed, Apr 21, 2021 at 08:20:10AM +0100, Tom Smyth wrote:

> Hi Christian,
> 
> if you were to have a 1MB file or a database that needed to read 1MB
> of data, if the partitions are not aligned then
> your underlying storage system needs to load 2 chunks or write 2
> chunks for 1 MB of data written.
> 
> So *worst case* you would double the workload for the storage hardware
> (SSD or hardware RAID with large chunks) for each transaction;
> on writing to SSDs, if you are not aligned, one could *worst case*
> double the write / wear rate.
> 
> The improvement would be less for accessing small files and writing
> small files
> (as they would need to be across 2 chunks)
> 
> The following paper explains it (better than I do):
> https://www.vmware.com/pdf/esx3_partition_align.pdf
> 
> if the cost is 1-8MB at the start of the disk (assuming partitions are
> sized so that they don't lose the offset of 2048 sectors)
> I think it is worth pursuing. (again I only have experience on amd64
> /i386 hardware)

Doing a quick scan through the pdf I only see talk about 64k boundaries.

FFS(2) will split up any partition into multiple cylinder groups. Each
cylinder group starts with a superblock copy, inode tables and other
metadata before the data blocks of that cylinder group. Having the
start of a partition at a 1MB boundary does not get you those data blocks
at a specific boundary. So I think your reasoning does not apply to FFS(2).

It might make sense to move the start to offset 128 for big
partitions, so you align with the 64k boundary mentioned in the pdf;
the block size is already 64k (for big partitions).

-Otto

> 
> Thanks
> Tom Smyth
> 
> On Tue, 20 Apr 2021 at 22:52, Christian Weisgerber  wrote:
> >
> > Tom Smyth:
> >
> > > just installing today's snapshot and the default offset on amd64 is 64,
> > >  (as it has been for as long as I can remember)
> >
> > It was changed from 63 in 2010.
> >
> > > Is it worth while updating the defaults so that OpenBSD partition
> > > layout will be optimal for SSD or other Virtualized RAID environments
> > > with 1MB  Chunks,
> >
> > What are you trying to optimize with this?  FFS2 file systems reserve
> > 64 kB at the start of a partition, and after that it's filesystem
> > blocks, which are 16/32/64 kB, depending on the size of the filesystem.
> > I can barely see an argument for aligning large partitions at 128
> > sectors, but what purpose would larger multiples serve?
> >
> > > Is there a down side  to moving the default offset to 2048 ?
> >
> > Not really.  It wastes a bit of space, but that is rather insignificant
> > for today's disk sizes.
> >
> > --
> > Christian "naddy" Weisgerber  na...@mips.inka.de
> >
> 
> 
> -- 
> Kindest regards,
> Tom Smyth.
> 
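The boundary checks discussed in this thread are simple modular arithmetic. A small sketch, using the OpenBSD partition start from the first message (1002668032 sectors) and the two boundaries mentioned above, 64k (128 sectors) and 1MB (2048 sectors):

```shell
# check a partition's starting sector against 64k (128-sector) and
# 1MB (2048-sector) boundaries; start value taken from the disklabel
# output earlier in this digest
start=1002668032
for unit in 128 2048; do
    if [ $((start % unit)) -eq 0 ]; then
        echo "start is aligned to the $unit-sector boundary"
    else
        echo "start is NOT aligned to the $unit-sector boundary"
    fi
done
```

For this particular start value both checks pass, since 1002668032 = 2048 * 489584.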



Re: Another potential awk or xargs bug?

2021-04-16 Thread Otto Moerbeek
On Fri, Apr 16, 2021 at 02:26:27AM -0700, Jordan Geoghegan wrote:

> 
> 
> On 4/15/21 7:49 AM, Otto Moerbeek wrote:
> > On Thu, Apr 15, 2021 at 04:29:17PM +0200, Christian Weisgerber wrote:
> >
> >> Jordan Geoghegan:
> >>
> >>> --- /tmp/bad.txt  Wed Apr 14 21:06:51 2021
> >>> +++ /tmp/good.txt  Wed Apr 14 21:06:41 2021
> >> I'll note that no characters have been lost between the two files.
> >> Only the order is different.
> >>
> >>> The only thing that changed between these runs was me using either xargs 
> >>> -P 1 or -P 2.
> >> What do you expect?  You run two processes in parallel that write
> >> to the same file.  Obviously their output will be interspersed in
> >> unpredictable order.
> >>
> >> You seem to imagine that awk's output is line-buffered.  But when
> >> it writes to a pipe or file, its output is block-buffered.  This
> >> is default stdio behavior.  Output is written in block-size increments
> >> (16 kB in practice) without regard to lines.  So, yes, you can end
> >> up with a fragment from a line written by process #1, followed by
> >> lines from process #2, followed by the remainder of the line from
> >> #1, etc.
> >>
> >> -- 
> >> Christian "naddy" Weisgerber  na...@mips.inka.de
> >>
> > Right, a fflush() call after the printf makes the issue go away, but
> > only since awk is being nice and issues a single write call for that
> > single printf. Since awk afaik does not give such a guarantee, it is
> > better to have each parallel invocation write to a separate file and
> > then cat them together after all the awk runs are done.
> >
> > -Otto
> 
> Hello Christian and Otto,
> 
> Thank you for setting me straight. The block vs line buffering issue should 
> have been obvious to me. What got me confused was that this solution worked 
> well, for a long time - until it didn't. One would assume that it would 
> consistently mangle output...

Buffering issues depend on the (size of) the data being written. I
think it is pretty consistent: if the bug appears, it always does so in
the same way.

> 
> While fflush does seem to fix the issue, I wanted to explore your suggestion 
> Otto of writing to a temporary file from within awk.
> 
> Is something like the following a sane approach to safely generating 
> temporary files from within awk?:
> 
> BEGIN{ cmd = "mktemp -q /tmp/workdir/tmp.XXX" ; if( ( cmd | getline 
> result ) > 0 ) TMPFILE = result ; else exit 1 }
> 
> Unless I'm missing something obvious, It seems there is no way to capture 
> both the stdout and return code of an external command from within awk. My 
> workaround solution to error check the call to mktemp here is to abort if 
> mktemp returns no data. Is this sane?
> 
> Regards,
> 
> Jordan

I think that would work, but maybe it is nicer to wrap the code in a
shell script that generates the tmp file names, passes the names to awk
and then does the catting of the result files in the shell script? To
run the cat command you need to know the names of the files anyway.

-Otto
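The shell-wrapper approach suggested above can be sketched like this. It is a minimal illustration, not the poster's actual pipeline: the file names and the toy awk program are made up. Because each awk instance owns its output file, stdio block buffering can never interleave lines from different workers.

```shell
# each parallel awk invocation writes to its own temp file; the files
# are concatenated only after all workers have finished
dir=$(mktemp -d) || exit 1
for n in 1 2; do
    awk -v n="$n" 'BEGIN { for (i = 1; i <= 3; i++) print "worker", n, "line", i }' \
        > "$dir/out.$n" &
done
wait                      # all workers done; files are complete
cat "$dir"/out.*          # deterministic order: out.1, then out.2
rm -r "$dir"
```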



Re: Another potential awk or xargs bug?

2021-04-15 Thread Otto Moerbeek
On Thu, Apr 15, 2021 at 04:29:17PM +0200, Christian Weisgerber wrote:

> Jordan Geoghegan:
> 
> > --- /tmp/bad.txt  Wed Apr 14 21:06:51 2021
> > +++ /tmp/good.txt  Wed Apr 14 21:06:41 2021
> 
> I'll note that no characters have been lost between the two files.
> Only the order is different.
> 
> > The only thing that changed between these runs was me using either xargs -P 
> > 1 or -P 2.
> 
> What do you expect?  You run two processes in parallel that write
> to the same file.  Obviously their output will be interspersed in
> unpredictable order.
> 
> You seem to imagine that awk's output is line-buffered.  But when
> it writes to a pipe or file, its output is block-buffered.  This
> is default stdio behavior.  Output is written in block-size increments
> (16 kB in practice) without regard to lines.  So, yes, you can end
> up with a fragment from a line written by process #1, followed by
> lines from process #2, followed by the remainder of the line from
> #1, etc.
> 
> -- 
> Christian "naddy" Weisgerber  na...@mips.inka.de
> 

Right, a fflush() call after the printf makes the issue go away, but
only since awk is being nice and issues a single write call for that
single printf. Since awk afaik does not give such a guarantee, it is
better to have each parallel invocation write to a separate file and
then cat them together after all the awk runs are done.

-Otto



Re: Last shutdown date of old OpenBSD machine

2021-04-15 Thread Otto Moerbeek
On Thu, Apr 15, 2021 at 11:42:14AM +0200, Ales Tepina wrote:

> Hi!
> 
> I have a really old machine (it has DIN keyboard connector) with OpenBSD 
> installed on it that was used as a router and its been sitting 
> in the basement for quite a few years. I would like to find out the date 
> when the machine was last shutdown.
> 
> What would be the best way to go about looking for that info?
> 
> I have two options as far as i can see but have not tried any of them to
> avoid messing up the date of last boot/shutdown:
> 1. Boot the machine and check the log files in /var/log
> 2. Attach the disk drive to another machine and mount the partition and
>   also check the info on some files
> 
> Also, one important caveat. There is a good chance i won't be able to
> guess the password anymore. I think i know what it is, but i'm not sure
> since it was so long ago.
> Therefore booting into single user mode is probably the only choice for
> option 1.
> 
> Thank you for your suggestions.
> 
> Br, Ales
> 

check last(1). Can be used with option 1 and 2 above.

-Otto



Re: Non-default partitions and upgrades

2021-04-12 Thread Otto Moerbeek
On Mon, Apr 12, 2021 at 08:08:12PM -0700, Paul Pace wrote:

> Hello!
> 
> I generally try and run things as a project recommends, but I am wondering
> about running different additional partitions (e.g., add /var/www) or
> changing partition letter (e.g., move /var to the end for convenient VPS
> expansion).
> 
> I know it isn't the biggest thing in the world, but would this ever have an
> impact on running version upgrades?
> 
> Thank you,
> 
> Paul
> 

That would work, unless you do crazy things. The upgrade script
mounts the filesystems using the fstab on the system to be upgraded.

-Otto
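For the /var/www example, the extra partition is just another fstab line that the upgrade script picks up when it mounts the filesystems. A hedged sketch; the duid and partition letter here are hypothetical, not from the message:

```
# /etc/fstab — hypothetical additional mount for a dedicated /var/www
9a51a841a90239b3.m /var/www ffs rw,nodev,nosuid 1 2
```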



Re: Go programs only using one CPU core

2021-03-27 Thread Otto Moerbeek
On Sat, Mar 27, 2021 at 06:44:52PM +1100, john slee wrote:

> Hi,
> 
> > On 2021-03-26, Richard Ulmer  wrote:
> > > The `go` directive starts a new goroutine, which I would expect to be
> > > put into its own process here. However, using htop(1) I can see that
> > > only one of my two cores gets load. Running the same program on Linux,
> > > two cores are utilized.
> 
> That's not how the Go runtime works, I think?
> 
> You shouldn't expect to see a 1:1 mapping of goroutines:OS processes.
> 
> Quoting Russ Cox on the golang-nuts list:
> 
>   "This is a popular split but hardly the only definition
>   of those terms. One reason we use the name goroutine
>   is to avoid preconceptions about what those terms mean.
>   For many people threads also connotes management by
>   the operating system, while goroutines are managed first
>   by the Go runtime"
> 
> More here:
> 
> https://medium.com/the-polyglot-programmer/what-are-goroutines-and-how-do-they-actually-work-f2a734f6f991
> 
> Are you actually seeing a problem (an actual problem, not "I can only see
> one line for my app in "top") specific to OpenBSD?
> 
> John

The actual problem is that htop is buggy on OpenBSD. It is much better
to use the native tools; they are more actively maintained.

-Otto



Re: Swap partition should equal exactly RAM size, for crash dump+savecore(8) to always work on crash?

2021-03-14 Thread Otto Moerbeek
On Sun, Mar 14, 2021 at 10:53:22AM +, Joseph Mayer wrote:

> On Sunday, 14 March 2021 08:46, Otto Moerbeek  wrote:
> > On Sun, Mar 14, 2021 at 01:17:05AM +, Joseph Mayer wrote:
> >
> > > Hi,
> > > Apologies if I missed any earlier clarification on the mailing list of
> > > this question:
> > > What should the size of my swap partition be exactly, at least, for it
> > > to guaranteedly be big enough to contain a whole kernel crash dump, if
> > > the kernel crashes?
> > > I would presume the exact size of the RAM, or are there headings that add
> > > some bytes or kilobytes, or some further annotations that may take how
> > > much, a gigabyte extra?
> > > Thanks,
> > > Joseph
> >
> > A crash dump needs a bit more than physical RAM. If you use the
> > autoallocator when creating a disklabel, it uses max 2 * physmem + 256M,
> > to have room for two crash dumps. See
> > src/sbin/disklabel/editor.c:editor_allocspace().
> >
> > -Otto
> 
> Hi Otto,
> 
> Thank you very much for your response.
> 
> Just curious, when would a dump partition ever contain two crash dumps,
> would this be in case the subsequent reboot would crash before reaching
> savecore(8)?

Ugh, I was wrong: */var* is sized to be able to contain two crash dumps;
swap is set to physmem + 256MB.

> (Then followup on an old feature request: If crash dumping could be
> done to swap files would be great. To my best awareness this is not
> supported today.
> 
> Actually for machines that ordinarily don't actually use swap memory
> anyhow as in all memory used always fits in RAM, crash dumping is the
> only reason to have a swap partition today.)
> 
> Joseph

Dumping to a swap file is much more complex than dumping to "real"
swap, as you would need to be able to interpret filesystems.

-Otto
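The corrected sizing rule can be sketched numerically. The physmem figure below is an example value, not from the thread, and sizing /var as two full dump areas is an illustrative reading of "room for two crash dumps":

```shell
# sketch of the auto-allocation rule from this thread (values in MB)
physmem=4096                     # example: a 4GB machine
swap=$((physmem + 256))          # swap: physmem + 256M
var=$((2 * (physmem + 256)))     # /var: room for two crash dumps
echo "swap=${swap}M var_dump_space=${var}M"
```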



Re: Swap partition should equal exactly RAM size, for crash dump+savecore(8) to always work on crash?

2021-03-13 Thread Otto Moerbeek
On Sun, Mar 14, 2021 at 01:17:05AM +, Joseph Mayer wrote:

> Hi,
> 
> Apologies if I missed any earlier clarification on the mailing list of
> this question:
> 
> What should the size of my swap partition be exactly, at least, for it
> to guaranteedly be big enough to contain a whole kernel crash dump, if
> the kernel crashes?
> 
> I would presume the exact size of the RAM, or are there headings that add
> some bytes or kilobytes, or some further annotations that may take how
> much, a gigabyte extra?
> 
> Thanks,
> Joseph
> 

A crash dump needs a bit more than physical RAM. If you use the
autoallocator when creating a disklabel, it uses max 2 * physmem + 256M,
to have room for two crash dumps. See
src/sbin/disklabel/editor.c:editor_allocspace().

-Otto



malloc cache changes

2021-03-09 Thread Otto Moerbeek
Hi,

you might be interested in

https://marc.info/?l=openbsd-tech&m=161527759429295&w=2

Followup on tech@ please,

-Otto


   



Re: update to docs/faq for partition sizes ?

2021-02-15 Thread Otto Moerbeek
On Mon, Feb 15, 2021 at 01:57:23AM -0800, harold felton wrote:

> howdee,
> 
> this is totally not critical, but i will go ahead and ask:
> 
> assuming that i have added the following to /etc/mk.conf...
> DEBUG=-g
> KEEPKERNELS=yes
> SUDO=doas
> WARNINGS=-Wall
> 
> how big do i need the /usr/obj partition to be ?

DEBUG=-g makes things way bigger. KEEPKERNELS also leaves more
cruft afaik.

You are not compiling with standard options, so how should we know?

-Otto
> 
> i do not remember specifically adjusting the /usr/xxx partitions
> when i was initially installing this system - but i clearly changed
> something...
> because the answer for 'df -h' is as follows:
> Filesystem SizeUsed   Avail Capacity  Mounted on
> /dev/sd0a  986M121M816M13%/
> /dev/sd0k 18.8G1.5G   16.3G 8%/home
> /dev/sd0d  3.9G   10.0K3.7G 0%/tmp
> /dev/sd0f  5.8G2.8G2.7G51%/usr
> /dev/sd0g  986M236M701M25%/usr/X11R6
> /dev/sd0h 15.7G286K   14.9G 0%/usr/local
> /dev/sd0j  5.8G4.9G636M89%/usr/obj
> /dev/sd0i  1.9G1.2G618M67%/usr/src
> /dev/sd0e 11.6G8.8M   11.0G 0%/var
> /dev/sd0l  1.9G344K1.8G 0%/var/log
> /dev/sd0m  9.7G1.3M9.2G 0%/var/www
> /dev/sd0n 27.1G2.0K   25.8G 0%/xtra
> 
> i will include the dmesg below, altho i doubt it is critical other than
> to note that the size of my sd0 is 120Gb...
> 
> i was also going to ask about a bunch of warnings that i saw scroll by
> "dwarf2 only supports one compilation unit" or something...
> i didnt think that these warning were important since i noticed that freebsd
> had already dealt with them around 2016-2018 when compiling golang...
> (see: https://github.com/golang/go/issues/14705)
> 
> anyways - i was going to try and learn how to use the debugger, etc...
> and thought it would be useful to have all the symbols and code available...
> i got thru the https://man.openbsd.org/release step-2 and was trying to
> do the 'make build' of base when i ran-out-of-space...
> 
> i have a feeling that i would have been fine only compiling non-debug,
> but figured this question might be an faq-type answer that i could ask...
> 
> ok - heres my current dmesg (was running -current from a few days back
> until i had just-compiled the -current system i ran out of space on)
> 
> fyi - this is a pcengines apu4d4 - if that helps...
> tia, h.
> 
> 
> 
> OpenBSD 6.9-beta (GENERIC.MP) #0: Sat Feb 13 09:41:26 PST 2021
> hfeltonad...@fw.hfelton.net:/sys/arch/amd64/compile/GENERIC.MP
> real mem = 4259868672 (4062MB)
> avail mem = 4115427328 (3924MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xcfe8a040 (13 entries)
> bios0: vendor coreboot version "v4.12.0.6" date 10/29/2020
> bios0: PC Engines apu4
> acpi0 at bios0: ACPI 6.0
> acpi0: sleep states S0 S1 S4 S5
> acpi0: tables DSDT FACP SSDT MCFG TPM2 APIC HEST SSDT SSDT DRTM HPET
> acpi0: wakeup devices PBR4(S4) PBR5(S4) PBR6(S4) PBR7(S4) PBR8(S4) UOH1(S3)
> UOH2(S3) UOH3(S3) UOH4(S3) UOH5(S3) UOH6(S3) XHC0(S4)
> acpitimer0 at acpi0: 3579545 Hz, 32 bits
> acpimcfg0 at acpi0
> acpimcfg0: addr 0xf800, bus 0-64
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: AMD GX-412TC SOC, 998.29 MHz, 16-30-01
> cpu0:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
> cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB
> 64b/line 16-way L2 cache
> cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
> cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, IBE
> cpu1 at mainbus0: apid 1 (application processor)
> cpu1: AMD GX-412TC SOC, 998.13 MHz, 16-30-01
> cpu1:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
> cpu1: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB
> 64b/line 16-way L2 cache
> cpu1: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
> cpu1: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: 

Re: sysupgrade failure logs

2021-02-14 Thread Otto Moerbeek
On Sun, Feb 14, 2021 at 06:39:15PM -0500, Judah Kocher wrote:

> I had this nicely formatted when I sent it, but it seems to have been
> reformatted elsewhere in transit. Hopefully this helps but if not I will
> leave it be.
> 
> On 2/14/21 6:27 PM, Judah Kocher wrote:
> > Thanks to each of you for your replies,
> 
> > Lesson 1: always get machines with remote console access. It wil save
> > the day some day and help in diagnosing issues.
> Having remote console access would be sweet, but unfortunately that goes far
> beyond the hobbyist price point I currently have to work with.

No idea what you're paying, but there are companies that do provide
a (serial) console on all their offerings, like ARP Networks or
openbsd.amsterdam.

-Otto



Re: sysupgrade failure logs

2021-02-14 Thread Otto Moerbeek
On Sun, Feb 14, 2021 at 12:02:07PM -0500, Judah Kocher wrote:

> Hello folks,
> 
> I am having an issue with sysupgrade and I have had trouble finding the
> source of the problem so I hope someone here might be able and willing to
> point me in the right direction.
> 
> I have 6 small systems running OpenBSD -current and I have a basic script
> which upgrades to the latest snapshot weekly. The systems are all relatively
> similar. Three are the exact same piece of hardware, two are slightly
> different, and one is a VM configured to match the first three as closely as
> possible with virtual hardware.
> 
> The script checks the current kernel version, (e.g. "GENERIC.MP#302") logs
> it, runs sysupgrade, and after the reboot it checks the kernel version
> again. If it is different it logs it as a "success" and if it is still the
> same it logs it as a failure.
> 
> All 6 systems were configured using the same autoinstall configuration and
> the upgrade script is identical on each unit. However, two of the three
> identical units always fail. When I remote into either system and manually
> run the upgrade script it also fails. I was able to get onsite with one of
> them where I connected a monitor and keyboard and manually ran the script to
> observe the results but oddly enough it succeeded so I learned nothing
> actionable. However it continues to fail the weekly upgrade. I have
> confirmed that the script permissions are identical on the working and
> nonworking units.
> 
> The 4 units that successfully upgrade leave a mail message with a log of the
> upgrade process. However I have been unable to find any record or log on the
> systems that are failing to help me figure out why this isn't working. The
> only difference I can identify between the systems is that
> "auto_upgrade.conf" and "bsd.upgrade" are both present in "/" on the two
> systems that fail, but are properly removed on the 4 that succeed.
> 
> I would appreciate any suggestions of what else I can try or check to figure
> out what is causing this issue.
> 
> Thanks
> 
> Judah
> 

Lesson 1: always get machines with remote console access. It wil save
the day some day and help in diagnosing issues.

On the system that succeeded when you were watching on the console,
did automaic sysupgardes started to work after that?

In general, my guess would be boot.conf contents that prevent the
automatic upgrade from working. Or maybe you have very old bootloaders
on the failing machines.

BTW, the kernel build number (#NNN) cannot be used to identify a kernel.

-Otto
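Since the build number alone is ambiguous, a success check could compare the full kern.version string instead. A minimal sketch, assuming the script keeps the pre-upgrade string around; this is an illustration, not the poster's actual script, and the sample version strings are made up:

```shell
# Compare full kernel version strings (which include the build date)
# rather than the reused "#NNN" build number.
kernel_changed() {
    # $1 = version string before the upgrade, $2 = after
    if [ "$1" != "$2" ]; then
        echo success
    else
        echo failure
    fi
}

# In a real script both values would come from: sysctl -n kern.version | head -1
before='OpenBSD 6.9 (GENERIC.MP) #3: Mon May 17 12:00:00 MDT 2021'
after='OpenBSD 6.9 (GENERIC.MP) #3: Fri Jun 11 09:00:00 MDT 2021'
kernel_changed "$before" "$after"
```

Two kernels can share a build number, but the embedded build timestamp differs, so the full string is a safer discriminator.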



Re: Installation overwritten... Accidental disklabel and newfs

2021-02-10 Thread Otto Moerbeek
On Wed, Feb 10, 2021 at 07:11:53PM +, Ed Gray wrote:

> Hi Otto,
> 
> Thanks for your reply. This is what I see on a shell from bsd.rd when I try
> to access the first SATA HDD.
> 
> # disklabel sd0
> disklabel: /dev/rsd0: no such file or directory
> 
> # disklabel sd0c
>  disklabel: /dev/rsd0c: no such file or directory
> 
> Same for rsd0 and rsd0c.
> 
> The device nodes don't exist until the install or upgrade program detects
> the disk and creates them.
> 
> Likewise for wd0 as although outdated for ahci disks.
> 
> Dmesg identifies the disk as:
> sd0 at scsibus0 targ0 lun0 ATA ST1000DM003...
> sd0 953869mb 
> 
> This is why I had to run the install program and accidentally went too far.

A cd /dev; ./MAKEDEV sd0 would have been enough to continue. 

-Otto


> 
> It would be helpful to be able to use disklabel and other tools such as
> newfs, growfs without running through the installer.
> 
> In my case I forgot that the installer continues automatically with the
> next command and also used the wrong switch to disklabel.
> 
> It's a good thing I take backups seriously nowadays.
> 
> Regards
> Ed Gray
> 
> On Wed, 10 Feb 2021, 3:52 pm Otto Moerbeek,  wrote:
> 
> > On Wed, Feb 10, 2021 at 03:35:06PM +, Ed Gray wrote:
> >
> > > Okay, thanks Stuart.
> > >
> > > I have left testdisk running a deep scan and will see if it finds my
> > /var.
> > > I know I'll still have to mount the partitions and I don't know if an
> > fsck
> > > would be able to fix any damage done by newfs.
> > >
> > > I think at this point I'm better off starting again as like others I've
> > > done many upgrades. It's probably not worth trying to fix for the sake of
> > > getting a few configuration files and settings back and maybe some files
> > I
> > > have elsewhere.
> > >
> > > I would be interested in finding out a way to access my SATA HDD (sd0)
> > with
> > > disklabel and other tools on the ramdisk without first running the
> > install
> > > or upgrade programs.
> >
> > If you start a shell at the initial prompt of a bsd.rd boot, you get a
> > fine selection of commands that are useful for recovery.
> >
> > -Otto
> >
> > >
> > > Regards
> > > Ed Gray
> > >
> > > On Wed, 10 Feb 2021, 8:33 am Stuart Henderson, 
> > wrote:
> > >
> > > > On 2021-02-09, Ed Gray  wrote:
> > > > > I have backups and will probably not have lost anything important
> > but I
> > > > > just wondered if anyone had any suggestions as to whether this is
> > fixable
> > > > > and what steps to take before I give up and re-install? I followed a
> > > > how-to
> > > > > I found which suggested using scan_ffs to rebuild my disklabel but
> > it's
> > > > > finding some of the volumes and not all of them.
> > > >
> > > > If you were able to recover /var, check in /var/backups where you will
> > > > hopefully find some disklabel.* files.
> > > >
> > > > scan_ffs does not support FFS2, previously used only for large
> > > > filesystems but on newer installations now used for all filesystems.
> > > >
> > > >
> > > >
> >



Re: Installation overwritten... Accidental disklabel and newfs

2021-02-10 Thread Otto Moerbeek
On Wed, Feb 10, 2021 at 03:35:06PM +, Ed Gray wrote:

> Okay, thanks Stuart.
> 
> I have left testdisk running a deep scan and will see if it finds my /var.
> I know I'll still have to mount the partitions and I don't know if an fsck
> would be able to fix any damage done by newfs.
> 
> I think at this point I'm better off starting again as like others I've
> done many upgrades. It's probably not worth trying to fix for the sake of
> getting a few configuration files and settings back and maybe some files I
> have elsewhere.
> 
> I would be interested in finding out a way to access my SATA HDD (sd0) with
> disklabel and other tools on the ramdisk without first running the install
> or upgrade programs.

If you start a shell at the initial prompt of a bsd.rd boot, you get a
fine selection of commands that are useful for recovery.

-Otto

> 
> Regards
> Ed Gray
> 
> On Wed, 10 Feb 2021, 8:33 am Stuart Henderson,  wrote:
> 
> > On 2021-02-09, Ed Gray  wrote:
> > > I have backups and will probably not have lost anything important but I
> > > just wondered if anyone had any suggestions as to whether this is fixable
> > > and what steps to take before I give up and re-install? I followed a
> > how-to
> > > I found which suggested using scan_ffs to rebuild my disklabel but it's
> > > finding some of the volumes and not all of them.
> >
> > If you were able to recover /var, check in /var/backups where you will
> > hopefully find some disklabel.* files.
> >
> > scan_ffs does not support FFS2, previously used only for large
> > filesystems but on newer installations now used for all filesystems.
> >
> >
> >



Re: Unknown process modifying routing table

2021-02-06 Thread Otto Moerbeek
On Sat, Feb 06, 2021 at 12:18:40PM +, James wrote:

> I've disabled my VPN on the machine as well as dhclient, connecting via a
> fixed static IP address and DNS servers. My routing table is still being
> modified by PID 0 (which I assume to be the kernel) every 30 minutes or so.
> Ntpd is also disabled.
> 
> I have also caught my machine communicating to one the of the IPs via TCP
> and have a pcap dump from wireshark. No actual data was sent other than a
> TCP timestamp.
> 
> > If your default route is a VPN,
> > please show how you establish the VPN to be your default route.
> > 
> The default route is established manually in a script that is run after the
> VPN starts. Essentially it does the following:
> 
>     route add $VPN_HOST $DEFAULT_GW
> 
>     route change default $VPN_HOST
> 
> 
> I do not believe the VPN to be the cause of this problem.
> 
> 
> Any tips on debugging the kernel to track the cause of these route changes
> would be greatly appreciated.
> 
> 
> Thanks,
> 

The kernel uses the routing table to store things like PMTU discovery
data and ARP entries.

-Otto



Re: OpenBSD (memory management) performance issues

2021-01-28 Thread Otto Moerbeek
On Thu, Jan 28, 2021 at 03:25:46PM +, Marek Klein wrote:

> > On Wed, Jan 27, 2021 at 08:39:46AM +0100, Otto Moerbeek wrote:
> > 
> > > On Tue, Jan 26, 2021 at 04:08:40PM +, Marek Klein wrote:
> > >
> > > > Hi,
> > > >
> > > > We are working on an appliance like product that is based on OpenBSD.
> > > > Recently we found out that our performance critical C++ program is
> > > > ~2.5 times slower on OpenBSD compared to Ubuntu 20.04.
> > > >
> > > > The program basically just reads data from stdin, does some
> > > > transformation of the data, and returns the result on stdout, thus
> > > > the program does not perform any further I/O operations nor interacts
> > > > with other programs. We extensively use the C++ standard library string
> > > > class for manipulation of data.
> > > >
> > > > We started searching for the reason, and eliminated I/O as a factor.
> > > > During some experiments we found out that one, perhaps not the only
> > > > one, factor is OpenBSD's memory management. To test this assumption we
> > > > wrote a simple program that allocates and frees memory in a loop.
> > > > Something like:
> > > >
> > > > for (...) {
> > > >   void *buffer = malloc(...);
> > > >   ...
> > > >   free(buffer);
> > > > }
> > > >
> > > > We compiled it on OpenBSD with clang
> > > > $ /usr/bin/c++ --version
> > > > OpenBSD clang version 10.0.1
> > > > Target: amd64-unknown-openbsd6.8
> > > > Thread model: posix
> > > > InstalledDir: /usr/bin
> > > >
> > > > using options '-O3 -DNDEBUG -std=gnu++11' and ran it without memory
> > > > junking.
> > > >
> > > > $ time MALLOC_OPTIONS=jj ./memory_allocs --cycles 123456789 --size
> > 1024
> > > >
> > > > real0m27.218s
> > > > user0m27.220s
> > > > sys 0m0.020s
> > > >
> > > > We compiled the same program on Ubuntu 20.04 with g++
> > > > $ /usr/bin/c++ --version
> > > > c++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > > >
> > > > using the same options '-O3 -DNDEBUG -std=gnu++11'
> > > >
> > > > $ time ./memory_allocs --cycles 123456789 --size 1024
> > > >
> > > > real0m1,920s
> > > > user0m1,915s
> > > > sys 0m0,004s
> > > >
> > > > Both systems were tested in the same virtualized environment (VSphere),
> > > > thus we can assume the "hardware" is the same.
> > > >
> > > > Given the virtual environment, the tests might not be scientifically
> > > > the best choice, but they serve the observation well enough. We
> > > > actually ruled out virtualization as a cause in other tests.
> > >
> > > Short story: the slowness is because you get more security.
> > >
> > > Somewhat longer story: depending on the size of the allocation, actual
> > > unmaps take place on free. This will always catch use-after-free. For
> > > smaller allocations, caching takes place; sadly you did not tell us
> > > how big the total of your allocations is, so I cannot predict whether
> > > enlarging the cache will help you.
> > >
> > > Now the difference is quite big, so I would like to know what you are
> > > doing exactly in your test program.  Please provide the full test
> > > program so I can take a look.
> > >
> > > >
> > > > What other options are there we could try in order to speed the memory
> > > > management up?
> > >
> > > Some hints: allocate/free less, use better algorithms that do not
> > > allocate as much.  With C++, make sure your code uses moves of objects
> > > instead of copies whenever possible. Use reserve() wisely. If all else
> > > fails you might go for custom allocators, but you will lose security
> > > features.
> > >
> > >   -Otto
> > >
> > > >
> > > > Also are there any other known areas, for CPU bound processing, where
> > > > OpenBSD performs worse than other "common" platforms?
> > > >
> > > > Cheers,
> > > > Marek
> > > >
> > >
> > 
> > To reply to myself.
> > 
> > Be VERY careful when drawing conclusions from these kinds of test
> > programs. To demonstr

Re: OpenBSD (memory management) performance issues

2021-01-27 Thread Otto Moerbeek
On Wed, Jan 27, 2021 at 08:39:46AM +0100, Otto Moerbeek wrote:

> On Tue, Jan 26, 2021 at 04:08:40PM +, Marek Klein wrote:
> 
> > Hi,
> > 
> > We are working on an appliance like product that is based on OpenBSD.
> > Recently we found out that our performance critical C++ program is
> > ~2.5 times slower on OpenBSD compared to Ubuntu 20.04.
> > 
> > The program basically just reads data from stdin, does some
> > transformation of the data, and returns the result on stdout, thus
> > the program does not perform any further I/O operations nor interacts
> > with other programs. We extensively use the C++ standard library string
> > class for manipulation of data.
> > 
> > We started searching for the reason, and eliminated I/O as a factor.
> > During some experiments we found out that one, perhaps not the only
> > one, factor is OpenBSD's memory management. To test this assumption we
> > wrote a simple program that allocates and frees memory in a loop.
> > Something like:
> > 
> > for (...) {
> >   void *buffer = malloc(...);
> >   ...
> >   free(buffer);
> > }
> > 
> > We compiled it on OpenBSD with clang
> > $ /usr/bin/c++ --version
> > OpenBSD clang version 10.0.1
> > Target: amd64-unknown-openbsd6.8
> > Thread model: posix
> > InstalledDir: /usr/bin
> > 
> > using options '-O3 -DNDEBUG -std=gnu++11' and ran it without memory
> > junking.
> > 
> > $ time MALLOC_OPTIONS=jj ./memory_allocs --cycles 123456789 --size 1024
> > 
> > real0m27.218s
> > user0m27.220s
> > sys 0m0.020s
> > 
> > We compiled the same program on Ubuntu 20.04 with g++
> > $ /usr/bin/c++ --version
> > c++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > 
> > using the same options '-O3 -DNDEBUG -std=gnu++11'
> > 
> > $ time ./memory_allocs --cycles 123456789 --size 1024
> > 
> > real0m1,920s
> > user0m1,915s
> > sys 0m0,004s
> > 
> > Both systems were tested in the same virtualized environment (VSphere),
> > thus we can assume the "hardware" is the same.
> > 
> > Given the virtual environment, the tests might not be scientifically
> > the best choice, but they serve the observation well enough. We
> > actually ruled out virtualization as a cause in other tests.
> 
> Short story: the slowness is because you get more security.
> 
> Somewhat longer story: depending on the size of the allocation, actual
> unmaps take place on free. This will always catch use-after-free. For
> smaller allocations, caching takes place; sadly you did not tell us
> how big the total of your allocations is, so I cannot predict whether
> enlarging the cache will help you.
> 
> Now the difference is quite big, so I would like to know what you are
> doing exactly in your test program.  Please provide the full test
> program so I can take a look.
> 
> > 
> > What other options are there we could try in order to speed the memory
> > management up?
> 
> Some hints: allocate/free less, use better algorithms that do not
> allocate as much.  With C++, make sure your code uses moves of objects
> instead of copies whenever possible. Use reserve() wisely. If all else
> fails you might go for custom allocators, but you will lose security
> features.
> 
>   -Otto
> 
> > 
> > Also are there any other known areas, for CPU bound processing, where
> > OpenBSD performs worse than other "common" platforms? 
> > 
> > Cheers,
> > Marek
> > 
> 

To reply to myself.

Be VERY careful when drawing conclusions from these kinds of test
programs. To demonstrate, the loop in the test program below gets
compiled out by some compilers with some settings. 

So again, please provide your test program.

-Otto

#include <err.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
size_t count, sz, i;
char *p;
const char *errstr;

count = strtonum(argv[1], 0, LONG_MAX, &errstr);
if (errstr) 
errx(1, "%s: %s", argv[1], errstr);
sz = strtonum(argv[2], 0, LONG_MAX, &errstr);
if (errstr) 
errx(1, "%s: %s", argv[2], errstr);

printf("Run with %zu %zu\n", count, sz);

for (i = 0; i < count; i++) {
p = malloc(sz);
if (p == NULL)
err(1, NULL);
*p = 1;
free(p);
}
}





Re: OpenBSD (memory management) performance issues

2021-01-26 Thread Otto Moerbeek
On Tue, Jan 26, 2021 at 04:08:40PM +, Marek Klein wrote:

> Hi,
> 
> We are working on an appliance like product that is based on OpenBSD.
> Recently we found out that our performance critical C++ program is
> ~2.5 times slower on OpenBSD compared to Ubuntu 20.04.
> 
> The program basically just reads data from stdin, does some
> transformation of the data, and returns the result on stdout, thus
> the program does not perform any further I/O operations nor interacts
> with other programs. We extensively use the C++ standard library string
> class for manipulation of data.
> 
> We started searching for the reason, and eliminated I/O as a factor.
> During some experiments we found out that one, perhaps not the only
> one, factor is OpenBSD's memory management. To test this assumption we
> wrote a simple program that allocates and frees memory in a loop.
> Something like:
> 
> for (...) {
>   void *buffer = malloc(...);
>   ...
>   free(buffer);
> }
> 
> We compiled it on OpenBSD with clang
> $ /usr/bin/c++ --version
> OpenBSD clang version 10.0.1
> Target: amd64-unknown-openbsd6.8
> Thread model: posix
> InstalledDir: /usr/bin
> 
> using options '-O3 -DNDEBUG -std=gnu++11' and ran it without memory
> junking.
> 
> $ time MALLOC_OPTIONS=jj ./memory_allocs --cycles 123456789 --size 1024
> 
> real  0m27.218s
> user  0m27.220s
> sys   0m0.020s
> 
> We compiled the same program on Ubuntu 20.04 with g++
> $ /usr/bin/c++ --version
> c++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> 
> using the same options '-O3 -DNDEBUG -std=gnu++11'
> 
> $ time ./memory_allocs --cycles 123456789 --size 1024
> 
> real  0m1,920s
> user  0m1,915s
> sys   0m0,004s
> 
> Both systems were tested in the same virtualized environment (VSphere),
> thus we can assume the "hardware" is the same.
> 
> Given the virtual environment, the tests might not be scientifically
> the best choice, but they serve the observation well enough. We
> actually ruled out virtualization as a cause in other tests.

Short story: the slowness is because you get more security.

Somewhat longer story: depending on the size of the allocation, actual
unmaps take place on free. This will always catch use-after-free. For
smaller allocations, caching takes place; sadly you did not tell us
how big the total of your allocations is, so I cannot predict whether
enlarging the cache will help you.

Now the difference is quite big, so I would like to know what you are
doing exactly in your test program.  Please provide the full test
program so I can take a look.

> 
> What other options are there we could try in order to speed the memory
> management up?

Some hints: allocate/free less, use better algorithms that do not
allocate as much.  With C++, make sure your code uses moves of objects
instead of copies whenever possible. Use reserve() wisely. If all else
fails you might go for custom allocators, but you will lose security
features.

-Otto

> 
> Also are there any other known areas, for CPU bound processing, where
> OpenBSD performs worse than other "common" platforms? 
> 
> Cheers,
> Marek
> 



Re: Provide a code example

2021-01-24 Thread Otto Moerbeek
On Sun, Jan 24, 2021 at 09:12:16AM +0100, Ivan wrote:

> Can you send a C code example of obtaining sysctl vm.loadavg via the
> sysctl(2) function?
> -- 
> Regards,
> Ivan
> 

You have the source: grepping for VM_LOADAVG turns up 

usr.bin/top/machine.c:  static int sysload_mib[] = {CTL_VM, VM_LOADAVG};

amongst others.

-Otto



Re: Understanding memory statistics

2021-01-23 Thread Otto Moerbeek
On Sat, Jan 23, 2021 at 04:40:06AM +, Anindya Mukherjee wrote:

> Thanks for the explanation! I noticed during my earlier investigation
> that the source of the data size is struct vmspace::vm_dused. This is
> updated mostly in uvm_mapanon and uvm_map. The second function seems to
> be a more general case. I think during my file mapping the second
> function is called and the part where vm_dused is updated is skipped.
> I'm setting up a VM on my machine with a debug kernel (this should be an
> interesting exercise) to do some exploratory kernel debugging, just to
> understand the process. So far this seems to make sense. I never doubted
> the system is doing the "right" thing :)
> 
> I also had some questions about the page and the buffer caches. So far I
> gathered the following facts (please correct me if I am wrong):
> 
> OpenBSD has separate page and buffer caches, i.e., no UBC.
> Q: Is this done for security reasons or due to some complication? Just
> curious.

"Page cache" is a misnomer here: it does not exist as a separate cache.

UBC is the concept that file I/O via mmapped files and the read/write
system calls use the same mechanism for caching files in memory.
One of the consequences is that if a file is both mmapped and being
written to via write(2), that change is also immediately visible in
the mmapped view, and vice versa.

We do not have that unification. A long time ago an attempt was made
(you can still find it in CVS), but it was reverted since it caused
all kinds of problems the author wasn't able to solve in reasonable
time. So my guess would be that we do not have it because of its
complexity: nobody did and finished the work.

So the mechanism we have is a buffer cache (which in the end uses
pages) used by the read and write system calls, plus the mmapping
mechanism used for both file mappings and anonymous mappings
(mappings not corresponding to any file, like program data).

> 
> The Inactive pages seem to back the page cache. I ran my file mapping
> code a few times mapping/releasing a large file (about 300 MB) with
> systat running in the uvm view, and saw the page counts for Active and
> Inactive swing back and forth, keeping the total fixed.
> 
> Then I ran md5 on another 100 MB file, and this time the Cache number in
> top grew by about 100 MB, with some brief disk activity (I'm on SSD so
> things are zippy). I next ran my file mapping program on it. This time
> the Active pages grew by about 100 MB, raising the total by the same
> amount. When the program ended, those pages moved to Inactive, keeping
> the total fixed. There was no disk activity during this and Cache
> remained unchanged.
> 
> This seems to indicate that the data for the new file was copied from
> the buffer cache to the page cache during the mapping, and both copies
> were maintained.

Mmapped file activity would indeed show in Active or Inactive pages
(which one depends on how much access is done to the pages), while
I/O via read or write shows up in Cache. But I would not know off the
top of my head whether pages in the buffer cache from a file are
explicitly copied to new pages when the same file is mmapped, or just
cause some changes at the page admin level so that the mapping refers
to pages that are already in memory via the buffer cache. Thinking
about it, that would involve some form of unification, so likely some
mem-to-mem copying is going on. It shows my experience is mostly in
userland memory management.

-Otto

> 
> Regards,
> Anindya
> 
> From: Otto Moerbeek 
> Sent: January 22, 2021 12:01 AM
> To: Anindya Mukherjee 
> Cc: misc@openbsd.org 
> Subject: Re: Understanding memory statistics 
>  
> On Thu, Jan 21, 2021 at 10:38:59PM +, Anindya Mukherjee wrote:
> 
> > Hi,
> > 
> > Just to follow up, I was playing with allocating memory from a test
> > program in various ways in order to produce a situation when SIZE is
> > less than RES. The following program causes this to happen. If I mmap a
> > large file, the SIZE remains tiny, indicating that the mapped region is
> > not counted as part of text + data + stack. Then when I go ahead and
> > touch all the memory, SIZE remains tiny but RES grows to the size of the
> > file. Very interesting.
> 
> So SIZE does not include mappings backed by a file system object, but
> RES does. RES only grows once the pages are touched; this is demand
> paging in action (anon pages act the same way).
> 
> Nice. I already suspected it would be something like that, but never took
> the time to find out by experimenting or code study.
> 
> Now the next question is whether SIZE *should* include non-anonymous
> pages. getrlimit(2) explicitly says RLIMIT_DATA (which is limiting
> SIZE) only includes anonymous data. So that hints SIZE indeed should not
>

Re: Understanding memory statistics

2021-01-22 Thread Otto Moerbeek
On Thu, Jan 21, 2021 at 10:38:59PM +, Anindya Mukherjee wrote:

> Hi,
> 
> Just to follow up, I was playing with allocating memory from a test
> program in various ways in order to produce a situation when SIZE is
> less than RES. The following program causes this to happen. If I mmap a
> large file, the SIZE remains tiny, indicating that the mapped region is
> not counted as part of text + data + stack. Then when I go ahead and
> touch all the memory, SIZE remains tiny but RES grows to the size of the
> file. Very interesting.

So SIZE does not include mappings backed by a file system object, but
RES does. RES only grows once the pages are touched; this is demand
paging in action (anon pages act the same way).

Nice. I already suspected it would be something like that, but never
took the time to find out by experimenting or code study.

Now the next question is whether SIZE *should* include non-anonymous
pages. getrlimit(2) explicitly says RLIMIT_DATA (which is limiting
SIZE) only includes anonymous data. So that hints SIZE indeed should
not include those non-anonymous pages.

To back this up:

ps(1) lists several size related stats:

DescKeywFunctionValue
Datadsizdsize   p_vm_dsize
Resident 1  rss p_rssizep_vm_rssize
Resident 2  rsz rssize  p_vm_rssize
Stack   ssizssize   p_vm_ssize
Text (code) tsiztsize   p_vm_tsize
Virtual vsizvsize   p_vm_dsize + p_vm_ssize + _vm_tsize


top(1) uses the equivalent of vsiz for SIZE and rss for RES. So this
is consistent with your observations.

I note that the rss vs rsz distinction ps(1) mentions does not
actually seem to be implemented in ps(1).

BTW: the proper way to get the size is by opening the file and using fstat(2).

-Otto


> 
> Quick and very dirty code below:
> 
> /* This demonstrates why the SIZE column in OpenBSD top can be less than
>  * the RES column. This is because mmapped areas of virtual memory are
>  * not counted as text, data, or size, but counted as part of the
>  * resident pages, when touched. The program maps a (preferably large)
>  * file and then waits for the user to examine the process memory
>  * statistics.
>  */
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/mman.h>
> 
> int main(int argc, char **argv)
> {
>   char ch;
>   char *pch;
>   void *result;
>   FILE *fp;
>   int fd;
>   int i;
>   size_t mapSize;
>   int current = 0;
>   const int increment = 10;
>   double percent;
>   double mapRatio;
> 
>   if (argc < 2)
>   {
>   printf("No file name supplied.\n");
>   exit(1);
>   }
> 
>   printf("About to mmap. Press Enter... ");
>   getchar();
>   fp = fopen(argv[1], "r");
>   if (fp == NULL)
>   {
>   perror(NULL);
>   exit(1);
>   }
>   fd = fileno(fp);
> 
>   if (fseek(fp, 0, SEEK_END) == -1)
>   {
>   perror(NULL);
>   exit(1);
>   }
>   mapSize = ftell(fp);
>   if (mapSize == -1)
>   {
>   perror(NULL);
>   exit(1);
>   }
>   if (fseek(fp, 0, SEEK_SET) == -1)
>   {
>   perror(NULL);
>   exit(1);
>   }
>   result = mmap(NULL, mapSize, PROT_READ, MAP_PRIVATE, fd, 0);
>   if(close(fd) == -1)
>   {
>   perror(NULL);
>   exit(1);
>   }
>   if (result == MAP_FAILED)
>   {
>   perror(NULL);
>   exit(1);
>   }
>   printf("%zu bytes mmapped at %p. Press Enter... ", mapSize, result);
>   getchar();
> 
>   pch = (char *)result;
>   printf("Touching mapped memory... ");
>   mapRatio = 100.0 / mapSize;
>   for (i = 0; i < mapSize; i++)
>   {
>   ch = pch[i];
>   percent = (i + 1) * mapRatio;
>   if (current < percent)
>   {
>   while (current < percent)
>   current+= increment;
>   if (current > percent)
>   current -=increment;
>   if (current < 100)
>   {
>   printf("%d%%... ", current);
>   fflush(stdout);
>       current+= increment;
>   }
>   }
>   }
>   printf("100%%\nRead done. Press Enter... ");
>   getchar();
>   if(munmap(result, 

Re: rm: fts_read: No such file or directory

2021-01-13 Thread Otto Moerbeek
On Wed, Jan 13, 2021 at 09:46:27PM +0100, Paul de Weerd wrote:

> Hi all,
> 
> While doing some clean-up on my backup filesystem (which extensively
> uses hardlinks), I came across the error in Subject:
> 
>   rm: fts_read: No such file or directory
> 
> Traversing the hierarchy I was trying to remove, I get similar
> fts_read errors when I `ls` in certain places, but a repeated rm runs
> to completion fine (the tree is gone afterwards).
> 
> There's nothing in dmesg suggesting filesystem corruption, the
> filesystem unmounts and remounts cleanly, I'm running a forced fsck
> now which says "** File system is already clean".  It's a rather large
> filesystem with many inodes in use, so it'll take some time to
> complete.  Also, it's on a softraid crypto device, if that matters:
> 
> sd2: 5231654MB, 512 bytes/sector, 10714427745 sectors
> 
> Reading fts_read(3) wasn't really enlightening as to why a directory
> that's supposedly there, wouldn't be there anymore.  (note that I
> wasn't running another rm in the same tree in parallel when I got
> these errors - I did try to force the error by doing just that, but
> that went through without a single error).
> 
> Could there be some TOCTOU issue here somewhere?  Or some cache
> misbehaviour?  Or is it really dying hardware?

My first bet would be some form of corruption. Flipped bits in e.g.
directories while operating normally cannot be seen via the
clean/unclean flag in the superblock. That one only records whether
the filesystem was unmounted before reboot, shutdown or crash.

The forced fsck might reveal more.

-Otto


> 
> Paul 'WEiRD' de Weerd
> 
> OpenBSD 6.8-current (GENERIC.MP) #267: Sat Jan  9 19:23:55 MST 2021
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 34311208960 (32721MB)
> avail mem = 33256046592 (31715MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xe6690 (57 entries)
> bios0: vendor Dell Inc. version "2.10.0" date 05/24/2018
> bios0: Dell Inc. PowerEdge R210 II
> acpi0 at bios0: ACPI 4.0
> acpi0: sleep states S0 S4 S5
> acpi0: tables DSDT FACP SPMI DMAR ASF! HPET APIC MCFG BOOT SSDT ASPT SSDT 
> SSDT SPCR HEST ERST BERT EINJ
> acpi0: wakeup devices P0P1(S4) GLAN(S0) EHC1(S4) EHC2(S4) XHC_(S4) RP01(S5) 
> PXSX(S4) RP02(S5) PXSX(S4) RP03(S5) PXSX(S4) RP04(S5) PXSX(S4) RP05(S5) 
> PXSX(S4) RP06(S5) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpihpet0 at acpi0: 14318179 Hz
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.91 MHz, 06-2a-07
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.1, IBE
> cpu1 at mainbus0: apid 1 (application processor)
> cpu1: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.58 MHz, 06-2a-07
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 1, core 0, package 0
> cpu2 at mainbus0: apid 2 (application processor)
> cpu2: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.58 MHz, 06-2a-07
> cpu2: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 1, package 0
> cpu3 at mainbus0: apid 3 (application processor)
> cpu3: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.58 MHz, 06-2a-07
> cpu3: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 1, core 1, package 0
> cpu4 at mainbus0: apid 4 (application processor)
> cpu4: Intel(R) Xeon(R) CPU E31260L @ 

Re: Understanding memory statistics

2021-01-10 Thread Otto Moerbeek
On Sun, Jan 10, 2021 at 09:34:49PM +, Anindya Mukherjee wrote:

> Hi, I'm trying to understand the various numbers reported for memory
> usage from top, vmstat, and systat. I'm running OpenBSD 6.8 on a Dell
> Optiplex 7040 with an i7 6700, 3.4 Ghz and 32 GB RAM. The GPU is an
> Intel HD Graphics 530, integrated. Everything is running smoothly. For
> my own edification, I have a few questions. I searched the mailing lists
> for similar questions in the past, and found some, but they did not
> fully satisfy my curiosity.
> 
> dmesg reports:
> real mem = 34201006080 (32616MB)
> avail mem = 33149427712 (31613MB)
> I think the difference is due to the GPU reserving some memory.

That might be; I think it at least includes memory used by the kernel
for its code and static data.

> Q: Is there a way to view the total amount of video memory, the amount
> currently being used, and the GPU usage?

AFAIK not. Some BIOSes have settings for the amount of video memory
used (if it is shared with main memory).

> 
> When I run top, it reports the following memory usage:
> Memory: Real: 1497M/4672M act/tot Free: 26G Cache: 2236M Swap: 0K/11G
> If I sum up the RES numbers for all the processes, it is close to the
> act number = 1497 M (this is mostly due to Firefox). I read that the
> cache number is included in tot, but even if I subtract cache and act
> from tot there is 939 MB left.
> Q: What is this 939 MB being used for, assuming the above makes sense?

inactive pages?

> Q: What is the cache number indicating exactly?

Memory used for file system caching.

> 
> If I sum up tot + free * 1024 I get 31296 MB, which less than the 31613
> MB of available memory reported by dmesg. I initially assumed that the
> difference might be kernel wired memory. However the uvm view of systat
> shows 7514 wired pages = approx 30 MB which is very small.
> Q: What is the remaining memory being used for?

I think you are looking at dynamic allocations done by the kernel.
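
As a quick sanity check on the wired-page arithmetic quoted above, assuming the usual 4 KiB page size on amd64:

```shell
# 7514 wired pages at 4096 bytes/page, expressed in MiB
echo $((7514 * 4096 / 1048576))   # prints 29
```

which matches the "approx 30 MB" figure in the question.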

> Q: What is in kernel wired memory? In particular, is the file system
> cache in kernel wired memory or in the cache number?

Kernel wired means data pages allocated by the kernel that will not be
paged out. File system cache memory will also not be paged out (when
evicting those pages they are discarded if clean, or written back to
the file if dirty), but the file system cache pages are not included
in the wired count.

> In the man page for systat(1) the active memory is described as being
> used by processes active in the last 20 seconds (recently), while the
> total is for all processes. These are the same two numbers as act and
> tot in top, and act = avm as reported by vmstat. This confused me
> because adding up the RES sizes of all the processes I get nowhere near
> to tot (even after subtracting cache).

Accounting of shared pages is hard and ambiguous. To illustrate: if
you switch on S in top, you'll see a bunch of kernel space processes,
all at SIZE 0 and with the same RES size. They all share the same
(kernel) memory.
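
The RES summation described in the question can be approximated with a one-liner. This is only a rough sketch: the exact `ps` field names vary slightly between systems, and shared pages are counted once per process, so the total overstates actual physical use, which is exactly the ambiguity at issue here.

```shell
# Sum the resident-set sizes (RSS, in KiB) of every process, print MiB
ps -axo rss= | awk '{ sum += $1 } END { printf "%d MiB\n", sum / 1024 }'
```
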

> 
> There is another thing that confused me in the top output. At first I
> assumed that SIZE is the total virtual memory size of the process
> (allocated), while RES is the resident size. For example, this is so on
> Linux and hence in that case by definition SIZE should always be greater
> than RES. However here in many cases SIZE < RES.

I am unsure how that is caused. It is possibly a shared pages thing.

> 
> I read in the man page for top that SIZE is actually text + data + stack
> for the process. However this did not clear up my confusion or
> misunderstanding. Perhaps something to do with shared memory not being
> counted?
> Q: How can SIZE be less that RES? An example showing how this could
> happen would be really helpful.

I suggest doing some experimenting and code analysis, and sharing your
findings.

> 
> Q: Finally, where can I find documentation about the classification for
> memory pages (active, inactive, wired, etc.)? I suspect some digging
> around in the source in order, but could use some pointers.

The start would be man uvm_init. But the rest is code.

> 
> I hope these make sense and are not too pedantic. Looking forward to
> comments from the experts, thanks!
> 
> Anindya Mukherjee
> 

-Otto



Re: cmp(1) '-s' flag ignoring byte offset argument?

2021-01-09 Thread Otto Moerbeek
On Sat, Jan 09, 2021 at 12:05:31AM -0800, William Ahern wrote:

> On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote:
> > Hey folks,
> > 
> > I've noticed some surprising behaviour from cmp(1) when using the '-s'
> > flag.
> > 
> > It appears that cmp -s is ignoring the byte offset arguments I'm giving
> > it.
> 
> > Not sure what to make of this, I noticed this same behaviour on
> > DragonflyBSD and FreeBSD, so maybe I'm just missing something obvious.
> > This certainly caused some frustration before I figured out what was going
> > on.
> 
> The bug seems to be in the short-circuit optimization for regular files[1]:
> 
> void
>   c_regular(int fd1, char *file1, off_t skip1, off_t len1,
>   int fd2, char *file2, off_t skip2, off_t len2)
>   {
>   u_char ch, *p1, *p2;
>   off_t byte, length, line;
>   int dfound;
>   
>   if (sflag && len1 != len2)
>   exit(1);
>   
>   if (skip1 > len1)
>   eofmsg(file1);
>   len1 -= skip1;
>   if (skip2 > len2)
>   eofmsg(file2);
>   len2 -= skip2;
> 
> The short-circuit should probably be moved below the subsequent chunk of
> code (i.e. below `len2 -= skip2`). The eofmsg function already obeys sflag,
> so it'll be quiet.[2] Doing this works for me. See patch at end of message.
> 
> Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I
> can confirm FreeBSD still has the problem. (DragonflyBSD has nearly
> identical code.) But that implementation duplicates the short-circuit, along
> with the bug of not accounting for skip1 and skip2, in cmp.c as part of
> implementing the -z flag[5]:
> 
>   if (special)
>   c_special(fd1, file1, skip1, fd2, file2, skip2);
>   else {
>   if (zflag && sb1.st_size != sb2.st_size) {
>   if (!sflag)
>   (void) printf("%s %s differ: size\n",
>   file1, file2);
>   exit(DIFF_EXIT);
>   }
>   c_regular(fd1, file1, skip1, sb1.st_size,
>   fd2, file2, skip2, sb2.st_size);
>   }
>   exit(0);
> 
> It appears that the June 20, 2000 fix to the short-circuit in regular.c
> wasn't recognized during the July 14, 2000 -z feature addition.[6][7]
> 
> [1] https://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c?rev=1.12
> [2] https://cvsweb.openbsd.org/src/usr.bin/cmp/misc.c?rev=1.7
> [3] 
> https://gitweb.dragonflybsd.org/dragonfly.git/blob/4d4f84f:/usr.bin/cmp/regular.c
> [4] https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=344551
> [5] 
> https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?revision=344551=markup#l193
> [6] 
> https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=61883=markup
> [7] 
> https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?view=markup=63157
> 
> --- regular.c 6 Feb 2015 23:21:59 -   1.12
> +++ regular.c 9 Jan 2021 07:51:13 -
> @@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk
>   off_t byte, length, line;
>   int dfound;
>  
> - if (sflag && len1 != len2)
> - exit(1);
> -
>   if (skip1 > len1)
>   eofmsg(file1);
>   len1 -= skip1;
>   if (skip2 > len2)
>   eofmsg(file2);
>   len2 -= skip2;
> +
> + if (sflag && len1 != len2)
> + exit(1);
>  
>   length = MINIMUM(len1, len2);
>   if (length > SIZE_MAX) {
> 

I came to the same diff independently. In the meantime it has been committed.

-Otto



Re: cmp(1) '-s' flag ignoring byte offset argument?

2021-01-08 Thread Otto Moerbeek
On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote:

> Hey folks,
> 
> I've noticed some surprising behaviour from cmp(1) when using the '-s' flag.
> 
> It appears that cmp -s is ignoring the byte offset arguments I'm giving it.
> 
> I don't want to waste time babbling, so here's an example snippet to show 
> what I'm talking about:
> 
> #!/bin/sh
> 
> echo 'my line' > /tmp/1.txt
> echo 'my other line' >> /tmp/1.txt
> echo 'same same' >> /tmp/1.txt
> 
> echo 'my differnt line' > /tmp/2.txt
> echo 'my other different line' >> /tmp/2.txt
> echo 'same same' >> /tmp/2.txt
> 
> # Determine byte offsets (we only want to compare lines >= 3)
> offset1="$(head -2 /tmp/1.txt | wc -c)"
> offset2="$(head -2 /tmp/2.txt | wc -c)"
> 
> # Compare files and show exit code
> cmp /tmp/1.txt /tmp/2.txt "$offset1" "$offset2"
> printf '\nReturn code = %s\n' "$?"
> 
> cmp -s /tmp/1.txt /tmp/2.txt "$offset1" "$offset2"
> printf '\nReturn code with "-s" = %s\n' "$?"
> 
> As you can see, 'cmp -s' returns an exit code of '1', unlike cmp without the 
> '-s' which returns '0'.
> 
> Not sure what to make of this, I noticed this same behaviour on DragonflyBSD 
> and FreeBSD, so maybe I'm just missing something obvious. This certainly 
> caused some frustration before I figured out what was going on.
> 
> Regards,
> 
> Jordan
> 

This is a bug. It has been there since the beginning, according to
http://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c

FreeBSD has it fixed, NetBSD not.

-Otto

Index: regular.c
===
RCS file: /cvs/src/usr.bin/cmp/regular.c,v
retrieving revision 1.12
diff -u -p -r1.12 regular.c
--- regular.c   6 Feb 2015 23:21:59 -   1.12
+++ regular.c   9 Jan 2021 06:53:20 -
@@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk
off_t byte, length, line;
int dfound;
 
-   if (sflag && len1 != len2)
-   exit(1);
-
if (skip1 > len1)
eofmsg(file1);
len1 -= skip1;
if (skip2 > len2)
eofmsg(file2);
len2 -= skip2;
+
+   if (sflag && len1 != len2)
+   exit(1);
 
length = MINIMUM(len1, len2);
if (length > SIZE_MAX) {



Re: misc panics

2020-12-28 Thread Otto Moerbeek
On Mon, Dec 28, 2020 at 10:25:08AM +0100, Bastien Durel wrote:

> Le lundi 28 décembre 2020 à 09:17 +, Stuart Henderson a écrit :
> > > So hardware failure confirmed :/ Do you think I can change the RAM
> > > or
> > > it's more likely a CPU/Chipset failure ?
> > > 
> > > Thanks,
> > > 
> > 
> > If you have multiple sticks of RAM, try removing some.
> I have only one

Reseating it is worth a try.

-Otto



Re: seasons greetings and a network question

2020-12-20 Thread Otto Moerbeek
On Sun, Dec 20, 2020 at 10:49:49AM +, Laura Smith wrote:

> 
> 
> ‐‐‐ Original Message ‐‐‐
> On Sunday, 20 December 2020 10:28, Peter J. Philipp  wrote:
> 
> 
> > The story is, that I log time to lives (TTL) with a setsockopt() on my 
> > logging
> > DNS server. Whenever mail.openbsd.org sends a mail it does not ask its cache
> > but does a dns query every time. This is a great beacon on the Internet (at
> > least for me) it tells something of the dedication openbsd has for 
> > delivering
> > the mail from the mailing lists.
> >
> >
> 
> Erm 
> 
> Doesn't mail.openbsd.org, alongside most other openbsd servers, originate 
> from Theo's basement ?

No. Assumptions etc

-Otto

> 
> If so, it tells you nothing "beacon" about the state of the internet.  There 
> are many, many, many better projects out there that you can monitor if you 
> wish to have a true "beacon" view of what's going on on the internet.
> 
> Nor does it tell you anything of openbsd's dedication to anything apart from 
> its obstinance in insisting that Theo's basement is the best place to host 
> servers and refuse to answer questions from the community in that respect 
> (see discussions when Theo came begging for money for electricity or whatever 
> it was a few years back).
> 
> Still, I'm sure Theo will be happy for you to blow his trumpet for him. ;-)
> 



Re: Potential dig bug?

2020-12-17 Thread Otto Moerbeek
On Thu, Dec 17, 2020 at 11:10:59AM +0100, Otto Moerbeek wrote:

> On Thu, Dec 17, 2020 at 12:27:00AM -0800, Jordan Geoghegan wrote:
> 
> > 
> > 
> > On 12/16/20 11:19 PM, Otto Moerbeek wrote:
> > > On Wed, Dec 16, 2020 at 02:37:19PM -0800, Jordan Geoghegan wrote:
> > > 
> > > > Hi folks,
> > > > 
> > > > I've found some surprising behaviour in the 'dig' utility. I've noticed 
> > > > that
> > > > dig doesn't seem to support link local IPv6 addresses. I've got unbound
> > > > listening on a link local IPv6 address on my router and all queries 
> > > > seem to
> > > > be working. I'm advertising this DNS info with rad, and I confirmed with
> > > > tcpdump that my devices such as iPhones, Macs, Windows, Linux desktops 
> > > > etc
> > > > are all properly querying my unbound server over IPv6.
> > > > 
> > > > dhclient doesn't seem to allow you to specify an IPv6 address in it's
> > > > 'supersede'  options, so I manually edited my OpenBSD desktops 
> > > > resolv.conf
> > > > to specify the IPv6 unbound server first. Again, I confirmed with 
> > > > tcpdump
> > > > that my desktop was properly querying the unbound server over IPv6 (ie
> > > > Firefox, ping, ssh etc all resolved domains using this server).
> > > > 
> > > > I used 'dig' to make a query, and I noticed it was ignoring my link 
> > > > local
> > > > IPv6 nameserver in my resolv.conf. I'll save you guys the long form Ted 
> > > > talk
> > > > here and just make my point:
> > > > 
> > > > $ cat resolv.conf
> > > >     nameserver fe80::f29f:c2ff:fe17:b8b2%em0
> > > >     nameserver 2606:4700:4700::
> > > >     lookup file bind
> > > >     family inet6 inet4
> > > > 
> > > > $ dig google.ca
> > > >     [snip]
> > > >     ;; Query time: 12 msec
> > > >     ;; SERVER: 2606:4700:4700::#53(2606:4700:4700::)
> > > >     [snip]
> > > > 
> > > > There's a bit of a delay as it waits for a time out, and then it falls 
> > > > back
> > > > to the cloudflare IPv6 server.
> > > > 
> > > > I tried specifying the server with '@' as well as specifying source
> > > > IP/interface with '-I' to no avail. It seems dig really doesn't like the
> > > > 'fe80::%em0' notation, as  '@' and '-I' worked fine when used without a
> > > > link-local address.
> > > > 
> > > > Is this a bug or a feature? Am I just doing something stupid? Any 
> > > > insight
> > > > would be appreciated.
> > > I think it is a bug and I can reproduce. Will investigate deeper later.
> > > 
> > >   -Otto
> > > 
> > 
> > Hi Otto,
> > 
> > Thanks for looking into this! I took Bodie's advice and tested nslookup and
> > host, and they both seem to have the same behaviour as dig.
> > 
> > Regards,
> > 
> > Jordan
> > 
> 
> That is no big surprise, as they are essentially the same program
> with a different user interface, all built from the same source.
> 
>   -Otto
> 

Fix below, further discussion on tech@

-Otto

Index: dig.c
===
RCS file: /cvs/src/usr.bin/dig/dig.c,v
retrieving revision 1.18
diff -u -p -r1.18 dig.c
--- dig.c   15 Sep 2020 11:47:42 -  1.18
+++ dig.c   17 Dec 2020 11:06:49 -
@@ -1358,7 +1358,7 @@ dash_option(char *option, char *next, di
} else
srcport = 0;
if (have_ipv6 && inet_pton(AF_INET6, value, &in6) == 1)
-   isc_sockaddr_fromin6(&bind_address, &in6, srcport);
+   isc_sockaddr_fromin6(&bind_address, &in6, srcport, 0);
else if (have_ipv4 && inet_pton(AF_INET, value, &in4) == 1)
isc_sockaddr_fromin(&bind_address, &in4, srcport);
else
Index: dighost.c
===
RCS file: /cvs/src/usr.bin/dig/dighost.c,v
retrieving revision 1.34
diff -u -p -r1.34 dighost.c
--- dighost.c   15 Sep 2020 11:47:42 -  1.34
+++ dighost.c   17 Dec 2020 11:06:49 -
@@ -540,7 +540,7 @@ get_addresses(const char *hostname, in_p
struct sockaddr_in6 *sin6;
sin6 = (struct sockaddr_in6 *)tmpai->ai_addr;
isc_sockaddr_fromin6(&addrs[i], &sin6->sin6_addr,
-dstport);

Re: Potential dig bug?

2020-12-17 Thread Otto Moerbeek
On Thu, Dec 17, 2020 at 12:27:00AM -0800, Jordan Geoghegan wrote:

> 
> 
> On 12/16/20 11:19 PM, Otto Moerbeek wrote:
> > On Wed, Dec 16, 2020 at 02:37:19PM -0800, Jordan Geoghegan wrote:
> > 
> > > Hi folks,
> > > 
> > > I've found some surprising behaviour in the 'dig' utility. I've noticed 
> > > that
> > > dig doesn't seem to support link local IPv6 addresses. I've got unbound
> > > listening on a link local IPv6 address on my router and all queries seem 
> > > to
> > > be working. I'm advertising this DNS info with rad, and I confirmed with
> > > tcpdump that my devices such as iPhones, Macs, Windows, Linux desktops etc
> > > are all properly querying my unbound server over IPv6.
> > > 
> > > dhclient doesn't seem to allow you to specify an IPv6 address in it's
> > > 'supersede'  options, so I manually edited my OpenBSD desktops resolv.conf
> > > to specify the IPv6 unbound server first. Again, I confirmed with tcpdump
> > > that my desktop was properly querying the unbound server over IPv6 (ie
> > > Firefox, ping, ssh etc all resolved domains using this server).
> > > 
> > > I used 'dig' to make a query, and I noticed it was ignoring my link local
> > > IPv6 nameserver in my resolv.conf. I'll save you guys the long form Ted 
> > > talk
> > > here and just make my point:
> > > 
> > > $ cat resolv.conf
> > >     nameserver fe80::f29f:c2ff:fe17:b8b2%em0
> > >     nameserver 2606:4700:4700::
> > >     lookup file bind
> > >     family inet6 inet4
> > > 
> > > $ dig google.ca
> > >     [snip]
> > >     ;; Query time: 12 msec
> > >     ;; SERVER: 2606:4700:4700::#53(2606:4700:4700::)
> > >     [snip]
> > > 
> > > There's a bit of a delay as it waits for a time out, and then it falls 
> > > back
> > > to the cloudflare IPv6 server.
> > > 
> > > I tried specifying the server with '@' as well as specifying source
> > > IP/interface with '-I' to no avail. It seems dig really doesn't like the
> > > 'fe80::%em0' notation, as  '@' and '-I' worked fine when used without a
> > > link-local address.
> > > 
> > > Is this a bug or a feature? Am I just doing something stupid? Any insight
> > > would be appreciated.
> > I think it is a bug and I can reproduce. Will investigate deeper later.
> > 
> > -Otto
> > 
> 
> Hi Otto,
> 
> Thanks for looking into this! I took Bodie's advice and tested nslookup and
> host, and they both seem to have the same behaviour as dig.
> 
> Regards,
> 
> Jordan
> 

That is no big surprise, as they are essentially the same program
with a different user interface, all built from the same source.

-Otto



Re: Potential dig bug?

2020-12-16 Thread Otto Moerbeek
On Wed, Dec 16, 2020 at 02:37:19PM -0800, Jordan Geoghegan wrote:

> Hi folks,
> 
> I've found some surprising behaviour in the 'dig' utility. I've noticed that
> dig doesn't seem to support link local IPv6 addresses. I've got unbound
> listening on a link local IPv6 address on my router and all queries seem to
> be working. I'm advertising this DNS info with rad, and I confirmed with
> tcpdump that my devices such as iPhones, Macs, Windows, Linux desktops etc
> are all properly querying my unbound server over IPv6.
> 
> dhclient doesn't seem to allow you to specify an IPv6 address in it's
> 'supersede'  options, so I manually edited my OpenBSD desktops resolv.conf
> to specify the IPv6 unbound server first. Again, I confirmed with tcpdump
> that my desktop was properly querying the unbound server over IPv6 (ie
> Firefox, ping, ssh etc all resolved domains using this server).
> 
> I used 'dig' to make a query, and I noticed it was ignoring my link local
> IPv6 nameserver in my resolv.conf. I'll save you guys the long form Ted talk
> here and just make my point:
> 
> $ cat resolv.conf
>    nameserver fe80::f29f:c2ff:fe17:b8b2%em0
>    nameserver 2606:4700:4700::
>    lookup file bind
>    family inet6 inet4
> 
> $ dig google.ca
>    [snip]
>    ;; Query time: 12 msec
>    ;; SERVER: 2606:4700:4700::#53(2606:4700:4700::)
>    [snip]
> 
> There's a bit of a delay as it waits for a time out, and then it falls back
> to the cloudflare IPv6 server.
> 
> I tried specifying the server with '@' as well as specifying source
> IP/interface with '-I' to no avail. It seems dig really doesn't like the
> 'fe80::%em0' notation, as  '@' and '-I' worked fine when used without a
> link-local address.
> 
> Is this a bug or a feature? Am I just doing something stupid? Any insight
> would be appreciated.

I think it is a bug and I can reproduce. Will investigate deeper later.

-Otto



ntpd and no RTC

2020-12-06 Thread Otto Moerbeek
Hi,

As seen in another thread, there are some questions about how ntpd
works if no real-time clock is available or its battery is dead. This
poses problems since ntpd needs DNS, and if the time is not right,
DNSSEC validation might fail.

The goal is to work in as many cases as possible, but also to stop
attempts quickly if ntpd sees it's not going to work out, to avoid
unneeded delays while booting: ntpd only backgrounds if it succeeds in
setting the time or realises it is not going to work.

So the hardest case is a machine without proper RTC that only uses a
resolver running on itself.


It goes more or less like this:

1. ntpd checks if DNS is working by doing a probe.
2. If that fails it does a CD (Checking Disabled) probe. This disables
DNSSEC validation for that query and should get an answer even if the
time is wrong.
3. If that fails too, it gives up.
4. Otherwise it resolves the needed names (potentially with the CD
flag) and gets the time via constraints and NTP packets.
5. If the time is to be moved forward, it does so.
6. In all other cases it does not set the time and backgrounds.
7. After the time is synced, it will no longer do CD queries and will
re-query the names.
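
Condensed into pseudocode (a paraphrase of the steps above, not ntpd's actual control flow):

```
settime:
    if dns_probe() succeeds:
        queries = validating
    else if cd_probe() succeeds:      # Checking Disabled: skip DNSSEC
        queries = CD
    else:
        give up; background immediately
    resolve server and constraint names (with queries mode)
    fetch time via constraints and NTP packets
    if new_time > current_time:
        set clock forward
    background                        # whether or not the time was set
    once synced: drop CD mode and re-query the names
```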

I (and others) spent quite some time getting that right, and it works
in many cases. But there are some machines that might not work,
especially slow machines. I consider RTC-less machines broken, so we
are not going to introduce extra delays to comfort those and hinder
other cases. You might get such a slow machine to work by introducing
a sleep in specific places, e.g. after network startup or resolver
startup.

But in many cases it is much easier to just add an extra resolver (not
running on the machine in question) to resolv.conf.

-Otto



Re: clock not set on boot

2020-12-05 Thread Otto Moerbeek
On Sun, Dec 06, 2020 at 08:39:36AM +0100, Otto Moerbeek wrote:

> On Sun, Dec 06, 2020 at 08:33:33AM +0100, Otto Moerbeek wrote:
> 
> > On Sat, Dec 05, 2020 at 07:42:48PM -0700, Theo de Raadt wrote:
> > 
> > > Andy Goblins  wrote:
> > > 
> > > > Does ntpd need DNS to set the time? Because my reslov.conf points to
> > > > 127.0.0.1 and unbound needs the time before it will work properly.
> > > 
> > > A problem of your own creation.
> > > 
> > 
> > We do attempt to work even in this situation, by doing CD (checking
> > disabled) queries if needed. But in some cases this does not work,
> > especially if the net and/or unbound is slow to start. 
> > 
> > e.g. I have one machine that needs a !sleep 1 in its hostname.if
> > file, since the network interface is too slow to start up and ntpd
> > gives up doing the initial settime.
> > 
> > Add ntpd_flags=-vv and show me the /var/log/daemon lines.
> > 
> > -Otto
> > 
> 
> Ah, I see the info is already there in another post. This edgerouter
> is a slow machine. You can try what I suggested, e.g. by putting a
> sleep after unbound starts in /etc/rc.
> 
> But an easier solution is not to rely on a single resolved and add
> another one in /etc/resolve.conf

Sorry about the typos. *resolver and */etc/resolv.conf

-Otto



Re: clock not set on boot

2020-12-05 Thread Otto Moerbeek
On Sun, Dec 06, 2020 at 08:33:33AM +0100, Otto Moerbeek wrote:

> On Sat, Dec 05, 2020 at 07:42:48PM -0700, Theo de Raadt wrote:
> 
> > Andy Goblins  wrote:
> > 
> > > Does ntpd need DNS to set the time? Because my reslov.conf points to
> > > 127.0.0.1 and unbound needs the time before it will work properly.
> > 
> > A problem of your own creation.
> > 
> 
> We do attempt to work even in this situation, by doing CD (checking
> disabled) queries if needed. But in some cases this does not work,
> especially if the net and/or unbound is slow to start. 
> 
> e.g. I have one machine that needs a !sleep 1 in its hostname.if
> file, since the network interface is too slow to start up and ntpd
> gives up doing the initial settime.
> 
> Add ntpd_flags=-vv and show me the /var/log/daemon lines.
> 
>   -Otto
> 

Ah, I see the info is already there in another post. This edgerouter
is a slow machine. You can try what I suggested, e.g. by putting a
sleep after unbound starts in /etc/rc.

But an easier solution is not to rely on a single resolved and add
another one in /etc/resolve.conf

-Otto



Re: clock not set on boot

2020-12-05 Thread Otto Moerbeek
On Sat, Dec 05, 2020 at 07:42:48PM -0700, Theo de Raadt wrote:

> Andy Goblins  wrote:
> 
> > Does ntpd need DNS to set the time? Because my reslov.conf points to
> > 127.0.0.1 and unbound needs the time before it will work properly.
> 
> A problem of your own creation.
> 

We do attempt to work even in this situation, by doing CD (checking
disabled) queries if needed. But in some cases this does not work,
especially if the net and/or unbound is slow to start.

E.g. I have one machine that needs a !sleep 1 in its hostname.if
file, since the network interface is too slow to start up and ntpd
gives up doing the initial settime.

Add ntpd_flags=-vv and show me the /var/log/daemon lines.

-Otto



Re: clock not set on boot

2020-12-05 Thread Otto Moerbeek
On Sat, Dec 05, 2020 at 09:10:19PM +, Maurice McCarthy wrote:

> Perhaps add
> 
> ntpd_flags="-s"
> 
> to /etc/rc.conf.local
> 

Nope, that's no longer needed.



Re: OpenBSD as a NAS

2020-12-05 Thread Otto Moerbeek
On Sat, Dec 05, 2020 at 12:36:04PM +, Roderick wrote:

> 
> On Sat, 5 Dec 2020, Georg Bege wrote:
> 
> > keep in mind that the ZFS supported versions may be quite different.
> > 
> > The "one ZFS for many OS" isn't really working in reality,
> > 
> > you may not be able to import your pool into different OS than the one
> > you've created it with.
> 
> Indeed there is this risk. I only superficially testet long ago
> compatibility between Ilumos and FreeBSD.
> 
> But ist there a waranty that UFS partitions created in one system can
> be always mounted without big problems in other systems?
> 
> Rod.
> 

In general there is no warranty at all; see the license. But there is
a good chance it will work, as long as the endianness of the old and
new systems is the same.

-Otto



Re: PayPal pool for developer M1 Mac mini for OpenBSD port

2020-12-02 Thread Otto Moerbeek
On Thu, Dec 03, 2020 at 03:18:54AM +0200, Mihai Popescu wrote:

> I have only good wishes for the project, but I still don't get one thing:
> why do some people start to behave oddly whenever Apple comes into
> discussion.
> They are doing a proprietary thing, closed as hell, no documentation and so
> on. Why is this impulse to write code for such a thing. Just asking ...

It's an interesting new ARM platform with very good performance. Yes,
it is closed, but it's also kind of a nice challenge to overcome that
hurdle. So, mixed feelings about that part.

-Otto



Re: Large Filesystem

2020-11-15 Thread Otto Moerbeek
On Sun, Nov 15, 2020 at 02:57:49PM -0500, Kenneth Gober wrote:

> On Sun, Nov 15, 2020 at 8:59 AM Mischa  wrote:
> 
> > On 15 Nov at 14:52, Otto Moerbeek  wrote:
> > > fsck wil get slower once you start filling it, but since your original
> > > fs had about 104k files it expect it not getting too bad. If the speed
> > > for your usecase is good as well I guess you should be fine.
> >
> > Will see how it behaves and try to document as much as possible.
> > I can always install another BSD on it. ;)
> >
> 
> To give a very rough idea, here is a sample running fsck on an FFS2
> file system with a fairly large number of files:
> 
> 
> $ df -ik /nfs/archive
> 
> Filesystem  1K-blocks  Used Avail Capacity iused   ifree  %iused
> Mounted on
> 
> /dev/sd1g   12308149120 7477490128 4215251536    64% 4800726 383546408
> 1%   /nfs/archive
> 
> $ doas time fsck -f /nfs/archive
> 
> ** /dev/sd1g (6d3438729df51b22.g) (NO WRITE)
> 
> ** Last Mounted on /nfs/archive
> 
> ** Phase 1 - Check Blocks and Sizes
> 
> ** Phase 2 - Check Pathnames
> 
> ** Phase 3 - Check Connectivity
> 
> ** Phase 4 - Check Reference Counts
> 
> ** Phase 5 - Check Cyl groups
> 
> 4800726 files, 934686266 used, 603832374 free (35534 frags, 75474605
> blocks, 0.0% fragmentation)
>  3197.25 real35.86 user66.03 sys
> 
> This is on older hardware, and not running the most recent release.
> The server is a Dell PowerEdge 2900 with a PERC H700 controller, and
> 4 WD Red Pro 8TB disks (WD8001FFWX-6) forming a RAID10 volume
> containing 3 small 1TB file systems and 1 large 12TB file system.  The
> OS is OpenBSD 6.1/amd64.  All the file systems on this volume are
> mounted with the softdep option and the big one has noatime as well.

If you upgrade, there's a good chance fsck will be faster.

-Otto
> 
> The time to run fsck is really only an issue when the server reboots
> unexpectedly (i.e. due to a power outage).  Coming up after a proper
> reboot or shutdown is very fast due to the file systems being clean.
> A UPS can help avoid most of these power-related reboots.  Alas, this
> particular server was connected to a UPS with a bad battery so it has
> rebooted due to power outages at least a half-dozen times this year,
> each of them involving a fairly long fsck delay.  I finally took the time
> last week to replace the UPS batteries so going forward this should
> be much less of a problem.  I do recommend the use of a UPS (and
> timely replacement of batteries when needed) if you are going to
> host very large FFS2 volumes.
> 
> I have never lost files due to a problem with FFS2 (or with FFS for that
> matter), but that is no reason not to perform regular backups.  For this
> particular file system I only back it up twice a year, but the data on it
> doesn't change often.  File systems with more 'normal' patterns of usage
> get backed up weekly.  The practice of taking regular backups also helps
> ensure that 'bit rot' is detected early enough that it can be corrected.
> 
> -ken



Re: Large Filesystem

2020-11-15 Thread Otto Moerbeek
On Sun, Nov 15, 2020 at 02:43:03PM +0100, Mischa wrote:

> On 15 Nov at 14:25, Otto Moerbeek  wrote:
> > On Sun, Nov 15, 2020 at 02:14:47PM +0100, Mischa wrote:
> > 
> > > On 15 Nov at 13:04, Otto Moerbeek  wrote:
> > > > On Sat, Nov 14, 2020 at 05:59:37PM +0100, Otto Moerbeek wrote:
> > > > 
> > > > > On Sat, Nov 14, 2020 at 04:59:22PM +0100, Mischa wrote:
> > > > > 
> > > > > > On 14 Nov at 15:54, Otto Moerbeek  wrote:
> > > > > > > On Sat, Nov 14, 2020 at 03:13:57PM +0100, Leo Unglaub wrote:
> > > > > > > 
> > > > > > > > Hey,
> > > > > > > > my largest filesystem with OpenBSD on it is 12TB and for the 
> > > > > > > > minimal usecase
> > > > > > > > i have it works fine. I did not loose any data or so. I have it 
> > > > > > > > mounted with
> > > > > > > > the following flags:
> > > > > > > > 
> > > > > > > > > local, noatime, nodev, noexec, nosuid, softdep
> > > > > > > > 
> > > > > > > > The only thing i should mention is that one time the server 
> > > > > > > > crashed and i
> > > > > > > > had to do a fsck during the next boot. It took around 10 hours 
> > > > > > > > for the 12TB.
> > > > > > > > This might be something to keep in mind if you want to use this 
> > > > > > > > on a server.
> > > > > > > > But if my memory serves me well otto did some changes to fsck 
> > > > > > > > on ffs2, so
> > > > > > > > maybe thats a lot faster now.
> > > > > > > > 
> > > > > > > > I hope this helps you a little bit!
> > > > > > > > Greetings from Vienna
> > > > > > > > Leo
> > > > > > > > 
> > > > > > > > Am 14.11.2020 um 13:50 schrieb Mischa:
> > > > > > > > > I am currently in the process of building a large filesystem 
> > > > > > > > > with
> > > > > > > > > 12 x 6TB 3.5" SAS in raid6, effectively ~55TB of storage, to 
> > > > > > > > > serve as a
> > > > > > > > > central, mostly download, platform with around 100 concurrent
> > > > > > > > > connections.
> > > > > > > > > 
> > > > > > > > > The current system is running FreeBSD with ZFS and I would 
> > > > > > > > > like to
> > > > > > > > > see if it's possible on OpenBSD, as it's one of the last two 
> > > > > > > > > systems
> > > > > > > > > on FreeBSD left.:)
> > > > > > > > > 
> > > > > > > > > Has anybody build a large filesystem using FFS2? Is it a good 
> > > > > > > > > idea?
> > > > > > > > > How does it perform? What are good tests to run?
> > > > > > > > > 
> > > > > > > > > Your help and suggestions are really appriciated!
> > > > > > > > 
> > > > > > > 
> > > > > > > It doesn't always has to be that bad, on current:
> > > > > > > 
> > > > > > > [otto@lou:22]$ dmesg | grep sd[123]
> > > > > > > sd1 at scsibus1 targ 2 lun 0:  
> > > > > > > naa.5000c500c3ef0896
> > > > > > > sd1: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > > > > > > sd2 at scsibus1 targ 3 lun 0:  
> > > > > > > naa.5000c500c40e8569
> > > > > > > sd2: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > > > > > > sd3 at scsibus3 targ 1 lun 0: 
> > > > > > > sd3: 30519295MB, 512 bytes/sector, 62503516672 sectors
> > > > > > > 
> > > > > > > [otto@lou:20]$ df -h /mnt 
> > > > > > > Filesystem SizeUsed   Avail Capacity  Mounted on
> > > > > > > /dev/sd3a 28.9T5.1G   27.4T 0%/mnt
> > > > > > > 
> > > > > > > [otto@lou:20]$ time doas fsck -f /dev/rsd3a 
> > > > > > > ** /dev/rsd3a
> > > > > > > ** File system is alr

Re: Large Filesystem

2020-11-15 Thread Otto Moerbeek
On Sun, Nov 15, 2020 at 02:14:47PM +0100, Mischa wrote:

> On 15 Nov at 13:04, Otto Moerbeek  wrote:
> > On Sat, Nov 14, 2020 at 05:59:37PM +0100, Otto Moerbeek wrote:
> > 
> > > On Sat, Nov 14, 2020 at 04:59:22PM +0100, Mischa wrote:
> > > 
> > > > On 14 Nov at 15:54, Otto Moerbeek  wrote:
> > > > > On Sat, Nov 14, 2020 at 03:13:57PM +0100, Leo Unglaub wrote:
> > > > > 
> > > > > > Hey,
> > > > > > my largest filesystem with OpenBSD on it is 12TB and for the 
> > > > > > minimal usecase
> > > > > > i have it works fine. I did not loose any data or so. I have it 
> > > > > > mounted with
> > > > > > the following flags:
> > > > > > 
> > > > > > > local, noatime, nodev, noexec, nosuid, softdep
> > > > > > 
> > > > > > The only thing i should mention is that one time the server crashed 
> > > > > > and i
> > > > > > had to do a fsck during the next boot. It took around 10 hours for 
> > > > > > the 12TB.
> > > > > > This might be something to keep in mind if you want to use this on 
> > > > > > a server.
> > > > > > But if my memory serves me well otto did some changes to fsck on 
> > > > > > ffs2, so
> > > > > > maybe thats a lot faster now.
> > > > > > 
> > > > > > I hope this helps you a little bit!
> > > > > > Greetings from Vienna
> > > > > > Leo
> > > > > > 
> > > > > > Am 14.11.2020 um 13:50 schrieb Mischa:
> > > > > > > I am currently in the process of building a large filesystem with
> > > > > > > 12 x 6TB 3.5" SAS in raid6, effectively ~55TB of storage, to 
> > > > > > > serve as a
> > > > > > > central, mostly download, platform with around 100 concurrent
> > > > > > > connections.
> > > > > > > 
> > > > > > > The current system is running FreeBSD with ZFS and I would like to
> > > > > > > see if it's possible on OpenBSD, as it's one of the last two 
> > > > > > > systems
> > > > > > > on FreeBSD left.:)
> > > > > > > 
> > > > > > > Has anybody build a large filesystem using FFS2? Is it a good 
> > > > > > > idea?
> > > > > > > How does it perform? What are good tests to run?
> > > > > > > 
> > > > > > > Your help and suggestions are really appriciated!
> > > > > > 
> > > > > 
> > > > > It doesn't always has to be that bad, on current:
> > > > > 
> > > > > [otto@lou:22]$ dmesg | grep sd[123]
> > > > > sd1 at scsibus1 targ 2 lun 0:  
> > > > > naa.5000c500c3ef0896
> > > > > sd1: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > > > > sd2 at scsibus1 targ 3 lun 0:  
> > > > > naa.5000c500c40e8569
> > > > > sd2: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > > > > sd3 at scsibus3 targ 1 lun 0: 
> > > > > sd3: 30519295MB, 512 bytes/sector, 62503516672 sectors
> > > > > 
> > > > > [otto@lou:20]$ df -h /mnt 
> > > > > Filesystem SizeUsed   Avail Capacity  Mounted on
> > > > > /dev/sd3a 28.9T5.1G   27.4T 0%/mnt
> > > > > 
> > > > > [otto@lou:20]$ time doas fsck -f /dev/rsd3a 
> > > > > ** /dev/rsd3a
> > > > > ** File system is already clean
> > > > > ** Last Mounted on /mnt
> > > > > ** Phase 1 - Check Blocks and Sizes
> > > > > ** Phase 2 - Check Pathnames
> > > > > ** Phase 3 - Check Connectivity
> > > > > ** Phase 4 - Check Reference Counts
> > > > > ** Phase 5 - Check Cyl groups
> > > > > 176037 files, 666345 used, 3875083616 free (120 frags, 484385437
> > > > > blocks, 0.0% fragmentation)
> > > > > 1m47.80s real 0m14.09s user 0m06.36s system
> > > > > 
> > > > > But note that fsck for FFS2 will get slower once more inodes are in
> > > > > use or have been in use.
> > > > > 
> > > > > Also, creating the fs with both blockszie and fragment size of 64k
> > > > > will make fsck faster (d

Re: Large Filesystem

2020-11-15 Thread Otto Moerbeek
On Sat, Nov 14, 2020 at 05:59:37PM +0100, Otto Moerbeek wrote:

> On Sat, Nov 14, 2020 at 04:59:22PM +0100, Mischa wrote:
> 
> > On 14 Nov at 15:54, Otto Moerbeek  wrote:
> > > On Sat, Nov 14, 2020 at 03:13:57PM +0100, Leo Unglaub wrote:
> > > 
> > > > Hey,
> > > > my largest filesystem with OpenBSD on it is 12TB and for the minimal 
> > > > usecase
> > > > i have it works fine. I did not loose any data or so. I have it mounted 
> > > > with
> > > > the following flags:
> > > > 
> > > > > local, noatime, nodev, noexec, nosuid, softdep
> > > > 
> > > > The only thing i should mention is that one time the server crashed and 
> > > > i
> > > > had to do a fsck during the next boot. It took around 10 hours for the 
> > > > 12TB.
> > > > This might be something to keep in mind if you want to use this on a 
> > > > server.
> > > > But if my memory serves me well otto did some changes to fsck on ffs2, 
> > > > so
> > > > maybe thats a lot faster now.
> > > > 
> > > > I hope this helps you a little bit!
> > > > Greetings from Vienna
> > > > Leo
> > > > 
> > > > Am 14.11.2020 um 13:50 schrieb Mischa:
> > > > > I am currently in the process of building a large filesystem with
> > > > > 12 x 6TB 3.5" SAS in raid6, effectively ~55TB of storage, to serve as 
> > > > > a
> > > > > central, mostly download, platform with around 100 concurrent
> > > > > connections.
> > > > > 
> > > > > The current system is running FreeBSD with ZFS and I would like to
> > > > > see if it's possible on OpenBSD, as it's one of the last two systems
> > > > > on FreeBSD left.:)
> > > > > 
> > > > > Has anybody build a large filesystem using FFS2? Is it a good idea?
> > > > > How does it perform? What are good tests to run?
> > > > > 
> > > > > Your help and suggestions are really appriciated!
> > > > 
> > > 
> > > It doesn't always has to be that bad, on current:
> > > 
> > > [otto@lou:22]$ dmesg | grep sd[123]
> > > sd1 at scsibus1 targ 2 lun 0:  
> > > naa.5000c500c3ef0896
> > > sd1: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > > sd2 at scsibus1 targ 3 lun 0:  
> > > naa.5000c500c40e8569
> > > sd2: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > > sd3 at scsibus3 targ 1 lun 0: 
> > > sd3: 30519295MB, 512 bytes/sector, 62503516672 sectors
> > > 
> > > [otto@lou:20]$ df -h /mnt 
> > > Filesystem SizeUsed   Avail Capacity  Mounted on
> > > /dev/sd3a 28.9T5.1G   27.4T 0%/mnt
> > > 
> > > [otto@lou:20]$ time doas fsck -f /dev/rsd3a 
> > > ** /dev/rsd3a
> > > ** File system is already clean
> > > ** Last Mounted on /mnt
> > > ** Phase 1 - Check Blocks and Sizes
> > > ** Phase 2 - Check Pathnames
> > > ** Phase 3 - Check Connectivity
> > > ** Phase 4 - Check Reference Counts
> > > ** Phase 5 - Check Cyl groups
> > > 176037 files, 666345 used, 3875083616 free (120 frags, 484385437
> > > blocks, 0.0% fragmentation)
> > > 1m47.80s real 0m14.09s user 0m06.36s system
> > > 
> > > But note that fsck for FFS2 will get slower once more inodes are in
> > > use or have been in use.
> > > 
> > > Also, creating the fs with both blockszie and fragment size of 64k
> > > will make fsck faster (due to less inodes), but that should only be
> > > done if the files you are going to store ar relatively big (generally
> > > much bigger than 64k).
> > 
> > Good to know. This will be mostly large files indeed.
> > That would be "newfs -i 64"?
> 
> Nope, newfs -b 65536 -f 65536 

To clarify: the default block size for large filesystems is already
2^16, but this value is taken from the label, so if another fs was on
that partition before, it might have changed. The default fragsize is
blocksize/8. When not specified on the command line, it is also taken
from the label.

Inode density is derived from the number of fragments (normally 1
inode per 4 fragments); if you increase fragment size, the number of
fragments drops, and so does the number of inodes.

A fragment is the minimal allocation unit, so if you have lots of
small files you will waste a lot of space and potentially run out of
inodes. You only want to increase fragment size if you mostly store
large files.

-Otto
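The density rule above can be checked with a little arithmetic. The sketch below uses sd3's size from the dmesg quoted earlier in this thread (62503516672 512-byte sectors) and the "1 inode per 4 fragments" default mentioned above; treat the numbers as estimates of scale, not exactly what newfs would print.

```shell
# Estimate the inode counts implied by "1 inode per 4 fragments"
# for a filesystem the size of sd3 above. Estimates only.
sectors=62503516672
bytes=$((sectors * 512))

frags_8k=$((bytes / 8192))       # 8k fragments
frags_64k=$((bytes / 65536))     # newfs -b 65536 -f 65536

inodes_8k=$((frags_8k / 4))
inodes_64k=$((frags_64k / 4))

echo "8k fragments:  $inodes_8k inodes"
echo "64k fragments: $inodes_64k inodes"
echo "reduction factor: $((inodes_8k / inodes_64k))"
```

With 64k fragments there are 8x fewer inodes for fsck to scan, which is where the speedup described in this thread comes from.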



Re: Large Filesystem

2020-11-14 Thread Otto Moerbeek
On Sat, Nov 14, 2020 at 04:59:22PM +0100, Mischa wrote:

> On 14 Nov at 15:54, Otto Moerbeek  wrote:
> > On Sat, Nov 14, 2020 at 03:13:57PM +0100, Leo Unglaub wrote:
> > 
> > > Hey,
> > > my largest filesystem with OpenBSD on it is 12TB and for the minimal 
> > > usecase
> > > i have it works fine. I did not loose any data or so. I have it mounted 
> > > with
> > > the following flags:
> > > 
> > > > local, noatime, nodev, noexec, nosuid, softdep
> > > 
> > > The only thing i should mention is that one time the server crashed and i
> > > had to do a fsck during the next boot. It took around 10 hours for the 
> > > 12TB.
> > > This might be something to keep in mind if you want to use this on a 
> > > server.
> > > But if my memory serves me well otto did some changes to fsck on ffs2, so
> > > maybe thats a lot faster now.
> > > 
> > > I hope this helps you a little bit!
> > > Greetings from Vienna
> > > Leo
> > > 
> > > Am 14.11.2020 um 13:50 schrieb Mischa:
> > > > I am currently in the process of building a large filesystem with
> > > > 12 x 6TB 3.5" SAS in raid6, effectively ~55TB of storage, to serve as a
> > > > central, mostly download, platform with around 100 concurrent
> > > > connections.
> > > > 
> > > > The current system is running FreeBSD with ZFS and I would like to
> > > > see if it's possible on OpenBSD, as it's one of the last two systems
> > > > on FreeBSD left.:)
> > > > 
> > > > Has anybody build a large filesystem using FFS2? Is it a good idea?
> > > > How does it perform? What are good tests to run?
> > > > 
> > > > Your help and suggestions are really appriciated!
> > > 
> > 
> > It doesn't always has to be that bad, on current:
> > 
> > [otto@lou:22]$ dmesg | grep sd[123]
> > sd1 at scsibus1 targ 2 lun 0:  
> > naa.5000c500c3ef0896
> > sd1: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > sd2 at scsibus1 targ 3 lun 0:  
> > naa.5000c500c40e8569
> > sd2: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > sd3 at scsibus3 targ 1 lun 0: 
> > sd3: 30519295MB, 512 bytes/sector, 62503516672 sectors
> > 
> > [otto@lou:20]$ df -h /mnt 
> > Filesystem SizeUsed   Avail Capacity  Mounted on
> > /dev/sd3a 28.9T5.1G   27.4T 0%/mnt
> > 
> > [otto@lou:20]$ time doas fsck -f /dev/rsd3a 
> > ** /dev/rsd3a
> > ** File system is already clean
> > ** Last Mounted on /mnt
> > ** Phase 1 - Check Blocks and Sizes
> > ** Phase 2 - Check Pathnames
> > ** Phase 3 - Check Connectivity
> > ** Phase 4 - Check Reference Counts
> > ** Phase 5 - Check Cyl groups
> > 176037 files, 666345 used, 3875083616 free (120 frags, 484385437
> > blocks, 0.0% fragmentation)
> > 1m47.80s real 0m14.09s user 0m06.36s system
> > 
> > But note that fsck for FFS2 will get slower once more inodes are in
> > use or have been in use.
> > 
> > Also, creating the fs with both blockszie and fragment size of 64k
> > will make fsck faster (due to less inodes), but that should only be
> > done if the files you are going to store ar relatively big (generally
> > much bigger than 64k).
> 
> Good to know. This will be mostly large files indeed.
> That would be "newfs -i 64"?

Nope, newfs -b 65536 -f 65536 

-Otto

> Is there a way to see how many inodes that would create?
> 
> > As for the speed of general operation, I wouldn't know. I never used
> > such large firessytems for anything other than archive storage. The fs
> > above I only have been using for filesystem dev work.
> > 
> > -Otto
> 



Re: Large Filesystem

2020-11-14 Thread Otto Moerbeek
On Sat, Nov 14, 2020 at 03:13:57PM +0100, Leo Unglaub wrote:

> Hey,
> my largest filesystem with OpenBSD on it is 12TB and for the minimal usecase
> i have it works fine. I did not loose any data or so. I have it mounted with
> the following flags:
> 
> > local, noatime, nodev, noexec, nosuid, softdep
> 
> The only thing i should mention is that one time the server crashed and i
> had to do a fsck during the next boot. It took around 10 hours for the 12TB.
> This might be something to keep in mind if you want to use this on a server.
> But if my memory serves me well otto did some changes to fsck on ffs2, so
> maybe thats a lot faster now.
> 
> I hope this helps you a little bit!
> Greetings from Vienna
> Leo
> 
> Am 14.11.2020 um 13:50 schrieb Mischa:
> > I am currently in the process of building a large filesystem with
> > 12 x 6TB 3.5" SAS in raid6, effectively ~55TB of storage, to serve as a
> > central, mostly download, platform with around 100 concurrent
> > connections.
> > 
> > The current system is running FreeBSD with ZFS and I would like to
> > see if it's possible on OpenBSD, as it's one of the last two systems
> > on FreeBSD left.:)
> > 
> > Has anybody build a large filesystem using FFS2? Is it a good idea?
> > How does it perform? What are good tests to run?
> > 
> > Your help and suggestions are really appriciated!
> 

It doesn't always have to be that bad; on -current:

[otto@lou:22]$ dmesg | grep sd[123]
sd1 at scsibus1 targ 2 lun 0:  naa.5000c500c3ef0896
sd1: 15259648MB, 512 bytes/sector, 31251759104 sectors
sd2 at scsibus1 targ 3 lun 0:  naa.5000c500c40e8569
sd2: 15259648MB, 512 bytes/sector, 31251759104 sectors
sd3 at scsibus3 targ 1 lun 0: 
sd3: 30519295MB, 512 bytes/sector, 62503516672 sectors

[otto@lou:20]$ df -h /mnt 
Filesystem SizeUsed   Avail Capacity  Mounted on
/dev/sd3a 28.9T5.1G   27.4T 0%/mnt

[otto@lou:20]$ time doas fsck -f /dev/rsd3a 
** /dev/rsd3a
** File system is already clean
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
176037 files, 666345 used, 3875083616 free (120 frags, 484385437
blocks, 0.0% fragmentation)
1m47.80s real 0m14.09s user 0m06.36s system

But note that fsck for FFS2 will get slower once more inodes are in
use or have been in use.

Also, creating the fs with both block size and fragment size of 64k
will make fsck faster (due to fewer inodes), but that should only be
done if the files you are going to store are relatively big (generally
much bigger than 64k).

As for the speed of general operation, I wouldn't know. I have never
used such large filesystems for anything other than archive storage;
the fs above I have only been using for filesystem dev work.

-Otto



Re: memory usage at a given time

2020-11-14 Thread Otto Moerbeek
On Sat, Nov 14, 2020 at 12:23:56PM +0100, Otto Moerbeek wrote:

> It could be I'm mistaken an cache is part of tot.

Indeed both act and cache are part of tot.

-Otto
> 
> Mihai Popescu  schreef op 14 november 2020 11:49:35 CET:
> >On Sat, Nov 14, 2020 at 12:30 PM Otto Moerbeek  wrote:
> >
> >> On Sat, Nov 14, 2020 at 12:21:30PM +0200, Mihai Popescu wrote:
> >>
> >> [ .. ]
> >> > CPU0: 22.4% user,  0.0% nice,  3.8% sys,  0.6% spin,  0.6% intr,
> >72.7%
> >> idle
> >> > CPU1: 21.2% user,  0.0% nice,  3.0% sys,  0.2% spin,  0.0% intr,
> >75.6%
> >> idle
> >> > Memory: Real: 1235M/2914M act/tot Free: 4505M Cache: 1054M Swap:
> >0K/7913M
> >>
> >> Cutting some corners, but the basics go like this:
> >>
> >> Currently, no swap is used.
> >>
> >> You see 4505G is free. Roughly you can use that amount more. If free
> >> becomes low, cache will be reduced and/or pages swapped out, making
> >> more pages free so they can become used and part of tot.
> >>
> >> tot + free + cache should add up to available RAM (which is a bit
> >less
> >> than what you have in the machine, since the kernel also needs to fit
> >> somewhere and uses memory of its own).
> >>
> >> -Otto
> >>
> >>
> >Here is my confusion.
> >tot + free + cache would be:
> >2914M + 4505M + 1054M = 8473M
> >
> >dmesg shows me this:
> >real mem = 8029429760 (7657MB)
> >avail mem = 7770787840 (7410MB)
> >here should be the memory for the integrated video card:
> >7657MB - 7410MB = 247MB <- plausible
> >
> >Am I correct, or am I hit by the bit/byte units conversion.
> >
> >Thank you.
> 
> -- 
> Verstuurd vanaf mijn Android apparaat met K-9 Mail. Excuseer mijn beknoptheid.
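With the correction that cache is counted inside tot, the figures quoted above do add up. A quick check, using the numbers from the top and dmesg output quoted in this thread:

```shell
# All values in MB, copied from the quoted top/dmesg output.
tot=2914      # "act/tot" total from top's Memory line
free=4505     # Free from top
cache=1054    # Cache from top
avail=7410    # dmesg "avail mem"

echo "tot + free         = $((tot + free)) MB (avail mem: $avail MB)"
echo "tot + free + cache = $((tot + free + cache)) MB (counts cache twice)"
```

tot + free lands within about 10 MB of avail mem, while adding cache on top overshoots by roughly the cache size, consistent with cache being part of tot.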



Re: memory usage at a given time

2020-11-14 Thread Otto Moerbeek
It could be I'm mistaken and cache is part of tot.

Mihai Popescu  schreef op 14 november 2020 11:49:35 CET:
>On Sat, Nov 14, 2020 at 12:30 PM Otto Moerbeek  wrote:
>
>> On Sat, Nov 14, 2020 at 12:21:30PM +0200, Mihai Popescu wrote:
>>
>> [ .. ]
>> > CPU0: 22.4% user,  0.0% nice,  3.8% sys,  0.6% spin,  0.6% intr,
>72.7%
>> idle
>> > CPU1: 21.2% user,  0.0% nice,  3.0% sys,  0.2% spin,  0.0% intr,
>75.6%
>> idle
>> > Memory: Real: 1235M/2914M act/tot Free: 4505M Cache: 1054M Swap:
>0K/7913M
>>
>> Cutting some corners, but the basics go like this:
>>
>> Currently, no swap is used.
>>
>> You see 4505G is free. Roughly you can use that amount more. If free
>> becomes low, cache will be reduced and/or pages swapped out, making
>> more pages free so they can become used and part of tot.
>>
>> tot + free + cache should add up to available RAM (which is a bit
>less
>> than what you have in the machine, since the kernel also needs to fit
>> somewhere and uses memory of its own).
>>
>> -Otto
>>
>>
>Here is my confusion.
>tot + free + cache would be:
>2914M + 4505M + 1054M = 8473M
>
>dmesg shows me this:
>real mem = 8029429760 (7657MB)
>avail mem = 7770787840 (7410MB)
>here should be the memory for the integrated video card:
>7657MB - 7410MB = 247MB <- plausible
>
>Am I correct, or am I hit by the bit/byte units conversion.
>
>Thank you.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: memory usage at a given time

2020-11-14 Thread Otto Moerbeek
On Sat, Nov 14, 2020 at 12:21:30PM +0200, Mihai Popescu wrote:

> On Sat, Nov 14, 2020 at 9:29 AM Otto Moerbeek  wrote:
> 
> >
> >
> > vmstat only swows pi an po, pages paged in and out, not swap usage.
> >
> > For sysyat: the vmstat view does not show swap usage, but it does show
> > paging/swap traffic. The swap view does (per swap device), as does the
> > uvm view (swpginuse, this is a total swap pages in use).
> >
> > top also shows swap usage.
> >
> > -Otto
> >
> 
> I know, but what I am trying to say is I don't know how to compute a "big
> total" memory usage needed before hitting swap. No matter how I add, the
> final number is not in the line with the dmesg reported memory. Here is top:
> 
> $ top
> load averages:  0.33,  0.23,  0.10thinkc.my.domain
> 12:20:39
> 47 processes: 46 idle, 1 on processor  up
>  4:17
> CPU0: 22.4% user,  0.0% nice,  3.8% sys,  0.6% spin,  0.6% intr, 72.7% idle
> CPU1: 21.2% user,  0.0% nice,  3.0% sys,  0.2% spin,  0.0% intr, 75.6% idle
> Memory: Real: 1235M/2914M act/tot Free: 4505M Cache: 1054M Swap: 0K/7913M

Cutting some corners, but the basics go like this:

Currently, no swap is used.

You see 4505M is free. Roughly you can use that amount more. If free
becomes low, cache will be reduced and/or pages swapped out, making
more pages free so they can become used and part of tot.

tot + free + cache should add up to available RAM (which is a bit less
than what you have in the machine, since the kernel also needs to fit
somewhere and uses memory of its own).

-Otto



Re: memory usage at a given time

2020-11-13 Thread Otto Moerbeek
On Sat, Nov 14, 2020 at 02:26:47AM +0200, Mihai Popescu wrote:

> Hello,
> 
> My computer has 2 x 4GB memory, as one can see in dmesg. A part of it is
> used by the video card, I'm not sure how much, maybe around 256MB or less I
> want to know if I will hit the swap space when I will let it run on 1 x 4GB
> memory, but I'm not sure how to interpret some of the following outputs or
> if I need to run other commands:
> 
> $ dmesg
> OpenBSD 6.8-current (GENERIC.MP) #175: Wed Nov 11 10:02:40 MST 2020
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 8029429760 (7657MB)
> avail mem = 7770787840 (7410MB)
> [ ... ]
> spdmem0 at iic0 addr 0x52: 4GB DDR3 SDRAM PC3-10600
> spdmem1 at iic0 addr 0x53: 4GB DDR3 SDRAM PC3-10600
> [ ... ]
> 
> $ systat
>1 users Load 0.19 0.37 0.31thinkc.my.domain
> 02:23:22
> 
> memory totals (in KB)PAGING   SWAPPING
> Interrupts
>real   virtual free   in  out   in  out  360
> total
> Active  1530836   1530836  2928304   ops100
> clock
> All 4668644   4668644 11031684   pages  237 ipi
> 
> radeondr
> Proc:r  d  s  wCsw   Trp   Sys   Int   Sof  Flt   forks  13
> ahci0
>  2   259   769   746  304022   288  520   fkppw
> ohci0
>   fksvm
> ehci0
>0.0%Int   0.1%Spn   1.1%Sys   5.1%Usr  93.7%Idle   pwait   8
> ohci1
> |||||||||||   175 relck
> ehci1
> =>>   175 rlkok
> azalia0
>   noram
> ohci2
> Namei Sys-cacheProc-cacheNo-cache  56 ndcpy   2 bge0
> Calls hits%hits %miss   % fltcp
> ohci3
>   102   79   7722  22 295 zfod
>  pckbc0
>   cow
> Disks   sd0   cd0   63307 fmin
> seeks   84409 ftarg
> xfers26   itarg
> speed  410K 2 wired   3
> IPKTS
>   sec   0.0   pdfre   1
> OPKTS
> 
> $ vmstat
>  procsmemory   pagediskstraps  cpu
>  r   s   avm fre  flt  re  pi  po  fr  sr sd0 cd0  int   sys   cs us sy
> id
>  1 259 1504M   2848M 1450   0   0   0   0   0   2   0  174 13338 3982 13  3
> 83

vmstat only shows pi and po, pages paged in and out, not swap usage.

For systat: the vmstat view does not show swap usage, but it does show
paging/swap traffic. The swap view does (per swap device), as does the
uvm view (swpginuse, the total number of swap pages in use).

top also shows swap usage.

-Otto



Re: Malloc options

2020-11-12 Thread Otto Moerbeek
On Thu, Nov 12, 2020 at 05:40:39AM -0600, ed...@pettijohn-web.com wrote:

> On Nov 12, 2020 3:06 AM, Stuart Henderson  wrote:
> 
>   On 2020-11-11, ed...@pettijohn-web.com 
>   wrote:
>   > Thanks for the quick reply. I'll stick with "s" for now and if its
>   > unbearably slow I'll try others.
> 
>   'S' not 's', they're case-sensitive (from the manual, "Unless
>   otherwise
>   noted uppercase means on, lowercase means off.")
> 
> 
> Luckily it's a typo in the email. However, I don't recall reading that.
> Must have skimmed past it.
> Thanks,
> Edgar 

This can be handy when you have e.g. the sysctl set, but want to override:

# sysctl vm.malloc_conf=S

All processes run with S

MALLOC_OPTIONS=sC executable

Runs this executable without S but with canaries.

-Otto



Re: Malloc options

2020-11-11 Thread Otto Moerbeek
On Wed, Nov 11, 2020 at 10:09:19AM -0600, ed...@pettijohn-web.com wrote:

> I'm trying to compile a program that is using a MALLOC_OPTIONS of "A"
> which doesn't exist. Reading the manual all of the options look good to
> me so what would be the best? I'm going to go with "S" unless otherwise
> instructed. 
> Thanks,
> Edgar

A long time ago A existed; now it is the default: if an inconsistency
is detected, malloc aborts.

While developing and for "sensitive" programs S is best. ssh and sshd
use it.

If S is too slow, any subset of CFJ might provide a middle ground.
Not using any flags is also pretty safe already compared to other
malloc implementations.

-Otto



Re: Set environment variable for non-interactive shell

2020-11-06 Thread Otto Moerbeek
On Fri, Nov 06, 2020 at 07:38:35PM +0100, Kirill Peskov wrote:

> Unfortunately manpage for login.conf does not give any example, only
> brief description:
> 
> setenv envlist  A list of environment
> variables and associated
> values to be set for the
> class.

The envlist type is defined a bit further down in the man page.

-Otto
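For reference, an envlist is a comma-separated list of name=value pairs, so the entry asked about might look like the sketch below. MY_ENV=DEV is the example variable from the question; the other capabilities are copied from the question and the rest of the class is elided.

```
default:\
	:path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin /usr/local/sbin:\
	:umask=022:\
	:setenv=MY_ENV=DEV:\
```

Multiple variables would be written as `:setenv=MY_ENV=DEV,OTHER=value:`.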


> so if I would like to set for example global variable MY_ENV=DEV for all 
> users and any login method, then what should I put here instead of XX?
> 
> default:\
> :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin 
> /usr/local/sbin:\
> :umask=022:\
> :setenv=XX:\
> :...blabla...:\
> 
> 
> 
> On 06.11.20 16:28, Todd C. Miller wrote:
> > Typically, this kind of thing is done in /etc/login.conf.
> >
> >  - todd



Re: disk setup question

2020-10-29 Thread Otto Moerbeek
On Thu, Oct 29, 2020 at 02:44:39PM +0100, Aleksander De wrote:

> Hi.
> 
> Are there any downsides or potential issues which may happen when
> extending boundaries for OpenBSD partition on >2TB disk while using
> MBR for booting it at the same time? I need MBR otherwise the machine
> will not boot. BIOS/RAID controller does not support UEFI.
> 
> Here you can see MBR with its 2TB limit:
> # fdisk sd0
> Disk: sd0   geometry: 267349/255/63 [4294961685 Sectors]
> Offset: 0   Signature: 0xAA55
> Starting Ending LBA Info:
>  #: id  C   H   S -  C   H   S [   start:size ]
> ---
>  0: 00  0   0   0 -  0   0   0 [   0:   0 ] unused
>  1: 00  0   0   0 -  0   0   0 [   0:   0 ] unused
>  2: 00  0   0   0 -  0   0   0 [   0:   0 ] unused
> *3: A6  0   4   5 - 267342  56  42 [ 256:  4294852544 ] OpenBSD
> 
> In disklabel I have extended boundaries ('b' command') in order to be able to
> utilize whole disk - and use big encrypted partition 'e'. I have small
> root partition at the beginning in order to boot from it and access from
> remote via ssh + decrypt big partition manually, then start services.
> 
> # disklabel sd0
> # /dev/rsd0c:
> type: SCSI
> disk: SCSI disk
> label: Block Device
> duid: 288724ae82038959
> flags:
> bytes/sector: 512
> sectors/track: 255
> tracks/cylinder: 511
> sectors/cylinder: 130305
> cylinders: 44967
> total sectors: 5859442688
> boundstart: 256
> boundend: 5859442688
> drivedata: 0
> 
> 16 partitions:
> #size   offset  fstype [fsize bsize   cpg]
>   a: 14724192  256  4.2BSD   2048 16384 12960 # /
>   b: 16809362 14724448swap# none
>   c:   58594426880  unused
>   d: 14724192 31533824  4.2BSD   2048 16384 1
>   e:   5813184640 46258048RAID
> 
> the same in more human-readable form:
> # disklabel -E sd0
> Label editor (enter '?' for help at any prompt)
> sd0> p g
> OpenBSD area: 256-5859442688; size: 2794.0G; free: 0.0G
> #size   offset  fstype [fsize bsize   cpg]
>   a: 7.0G  256  4.2BSD   2048 16384 12960 # /
>   b: 8.0G 14724448swap# none
>   c:  2794.0G0  unused
>   d: 7.0G 31533824  4.2BSD   2048 16384 1
>   e:  2771.9G 46258048RAID
> 
> I don't need whole 3TB now, just feel better knowing that I can use it safe.
> If it may cause some issues I can keep 2TB limit to OpenBSD partition.
> --
> Aleksander
> 

I know of no issues with doing that; you should be fine.

-Otto



Re: search contains unknown domain in resolv.conf

2020-10-28 Thread Otto Moerbeek
On Tue, Oct 27, 2020 at 11:44:21PM +0300, Andreas X wrote:

> ignore domain-search;
> supersede domain-name mail.myserver.tld;
> supersede domain-search mail.myserver.tld;
> 
> None of these lines have worked in dhclient.conf

The domain names need to be quoted.

What did you do exactly? Did you restart dhclient? Anything in logs?
What are the contents of resolv.conf after re-acquiring a lease?

-Otto
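Putting the quoting remark together with the earlier advice, a dhclient.conf along these lines is what was being suggested. The domain and the extra name server are the poster's placeholders, so substitute your own values:

```
ignore domain-search;
supersede domain-name "mail.myserver.tld";
supersede domain-name-servers 127.0.0.1;
```

String-valued options such as domain names must be quoted; the earlier attempts failed because they were not.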

> 
> Anything else I could try?
> 
> Thank you.
> 
> 
> 
> 27 Ekim 2020 Salı tarihinde Otto Moerbeek  yazdı:
> 
> > On Tue, Oct 27, 2020 at 02:32:46PM +0300, Andreas X wrote:
> >
> > > Greetings. On OpenBSD 6.8, I have unbound enabled in my server, (server
> > > gets its IP via DHCP from my server provider)
> > > In resolv.conf I have a "search your-server.de" line and I don't know
> > what
> > > hostname is that.
> > > My own hostname is something different.
> > >
> > > That seems the older hostname during setup (I forgot to set hostname
> > during
> > > installation). I changed hostname, removed it from resolv.conf, rebooted,
> > > it's back again (DHCP - generated by re0 dhclient) How can I
> > remove/change
> > > that line?
> > >
> > > P.S: I have unbound enabled, therefore I created dhclient.conf and added:
> > > supersede domain-name-servers 127.0.0.1, my.server.provider.IP;
> > >
> > > openbsdl# cat /etc/resolv.conf
> > > # Generated by re0 dhclient
> > > search your-server.de
> > > nameserver 127.0.0.1
> > > lookup file bind
> > >
> > > Thanks.
> >
> > You can tell dhclient to ignore things.
> >
> > something like
> >
> > ignore domain-search;
> >
> > in dhclient.conf. See dhclient.conf and dhcp-options man pages.
> >
> > -Otto
> >



Re: search contains unknown domain in resolv.conf

2020-10-27 Thread Otto Moerbeek
On Tue, Oct 27, 2020 at 02:32:46PM +0300, Andreas X wrote:

> Greetings. On OpenBSD 6.8, I have unbound enabled in my server, (server
> gets its IP via DHCP from my server provider)
> In resolv.conf I have a "search your-server.de" line and I don't know what
> hostname is that.
> My own hostname is something different.
> 
> That seems the older hostname during setup (I forgot to set hostname during
> installation). I changed hostname, removed it from resolv.conf, rebooted,
> it's back again (DHCP - generated by re0 dhclient) How can I remove/change
> that line?
> 
> P.S: I have unbound enabled, therefore I created dhclient.conf and added:
> supersede domain-name-servers 127.0.0.1, my.server.provider.IP;
> 
> openbsdl# cat /etc/resolv.conf
> # Generated by re0 dhclient
> search your-server.de
> nameserver 127.0.0.1
> lookup file bind
> 
> Thanks.

You can tell dhclient to ignore things. 

something like 

ignore domain-search;

in dhclient.conf. See dhclient.conf and dhcp-options man pages.

-Otto



Re: getaddrinfo(3) in CGI program

2020-10-22 Thread Otto Moerbeek
On Thu, Oct 22, 2020 at 10:16:28AM -0400, ben wrote:

> Hello, Misc;
> 
> I'm attempting to write a CGI progam in C which uses getaddrinfo(3), however
> upon running the script getaddrinfo doesn't seem to run. I have a feeling this
> is due to linking issues as other have experienced a similar issue with glibc.
> 
> Here is the ldd output of the binary:
> 
> StartEnd  Type  Open Ref GrpRef Name
> 0d8efcc5 0d8efcc55000 exe   10   0  test
> 0d910d122000 0d910d216000 rlib  01   0  /usr/lib/libc.so.96.0
> 0d91079b6000 0d91079b6000 ld.so 01   0  /usr/libexec/ld.so
> 
> I've moved the libraries listed in the ldd command output to the /var/www
> directory for proper chrooting, and still getaddrinfo doesn't work.
> 
> Has anyone experienced this before and is there a possible solution? Thank you
> in advance.
> 
> 
> Ben Raskin.
> 

In these cases ktrace is your friend.

My guess would be that your chroot does not contain an etc/resolv.conf
and/or etc/hosts file, and that you do not have a resolver running on
127.0.0.1.

-Otto
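A sketch of the fix being suggested: populate the chroot's etc/ with the resolver files, then re-test. The paths are assumptions (/var/www is httpd's default chroot); CHROOT is overridable here so the script can be tried safely against a scratch directory first.

```shell
#!/bin/sh
# Populate a chroot with the files the libc resolver looks for.
# CHROOT defaults to a scratch dir for safe testing; on the real
# server, run e.g.:  CHROOT=/var/www sh populate-chroot.sh
CHROOT=${CHROOT:-./demo-chroot}

mkdir -p "$CHROOT/etc"
# On a real system you would cp /etc/resolv.conf and /etc/hosts instead
# of writing minimal stand-ins as done here.
printf 'nameserver 127.0.0.1\n' > "$CHROOT/etc/resolv.conf"
printf '127.0.0.1 localhost\n'  > "$CHROOT/etc/hosts"

ls "$CHROOT/etc"
```

If name lookups still fail, ktrace the CGI binary and inspect the output with kdump to see exactly which files it tried and failed to open.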



Re: Disable touchpad acceleration? (wsmouse)

2020-10-13 Thread Otto Moerbeek
On Tue, Oct 13, 2020 at 11:38:11PM -0400, Brennan Vincent wrote:

> Hello,
> 
> I am using the wsmouse driver with x11, and no amount of googling or reading
> man pages has helped me figure out how to disable acceleration and have
> completely flat/linear response. Is this possible?
> 
> I know that I can change sensitivity with `mouse.tp.scaling=`, but
> I don't think this affects acceleration.
> 
> 

Check xset (and maybe xinput, but I've never used that).

-Otto



Re: dump LOB status

2020-09-26 Thread Otto Moerbeek
On Fri, Sep 25, 2020 at 07:49:20AM +0200, Otto Moerbeek wrote:

> On Fri, Sep 25, 2020 at 08:42:38AM +0300, Juha Erkkilä wrote:
> 
> > 
> > > On 24. Sep 2020, at 15.36, Otto Moerbeek  wrote:
> > > 
> > > On Tue, Sep 22, 2020 at 08:37:22PM +0300, Juha Erkkilä wrote:
> > >> Actually, I tested this again and now it appears
> > >> dump and restore both work correctly. Previously,
> > >> I first tested dump/restore with an empty filesystem,
> > >> then with some files, and it may be that the second
> > >> time I was accidentally testing restore with the first
> > >> dump file.
> > >> 
> > >> My tests were only with a small amount of files,
> > >> I will do a better test with proper data (about
> > >> 0.5 terabytes and over 10 files) and will
> > >> report again here in a next few days.
> > > 
> > > Lookin through FreeBSD commits I think you want the main.c one as
> > > well, otherwise silent corruption of the dump is still possible.
> > > 
> > >   -Otto
> > 
> > With that patch I get a message:
> > 
> > fatal: morestack on g0
> >   DUMP: fs is too large for dump!
> >   DUMP: The ENTIRE dump is aborted.
> > 
> > This is on a 2 terabyte filesystem with 0.5 terabytes
> > of data “successfully” backed up (or at least I considered
> > the backup and restore as successful).
> 
> Hmm, I neeed to dig into the dump format and see if the math is right.

Indeed, that commit was reverted in FreeBSD. This should do better. I
do not like the assert FreeBSD has, so I turned it into a quit().

-Otto

Index: tape.c
===
RCS file: /cvs/src/sbin/dump/tape.c,v
retrieving revision 1.45
diff -u -p -r1.45 tape.c
--- tape.c  28 Jun 2019 13:32:43 -  1.45
+++ tape.c  26 Sep 2020 06:30:37 -
@@ -330,7 +330,10 @@ flushtape(void)
}
 
blks = 0;
-   if (spcl.c_type != TS_END) {
+   if (spcl.c_type != TS_END && spcl.c_type != TS_CLRI &&
+   spcl.c_type != TS_BITS) {
+   if (spcl.c_count > TP_NINDIR)
+   quit("c_count too large\n");
for (i = 0; i < spcl.c_count; i++)
if (spcl.c_addr[i] != 0)
blks++;



Re: dump LOB status

2020-09-24 Thread Otto Moerbeek
On Fri, Sep 25, 2020 at 08:42:38AM +0300, Juha Erkkilä wrote:

> 
> > On 24. Sep 2020, at 15.36, Otto Moerbeek  wrote:
> > 
> > On Tue, Sep 22, 2020 at 08:37:22PM +0300, Juha Erkkilä wrote:
> >> Actually, I tested this again and now it appears
> >> dump and restore both work correctly. Previously,
> >> I first tested dump/restore with an empty filesystem,
> >> then with some files, and it may be that the second
> >> time I was accidentally testing restore with the first
> >> dump file.
> >> 
> >> My tests were only with a small amount of files,
> >> I will do a better test with proper data (about
> >> 0.5 terabytes and over 10 files) and will
> >> report again here in a next few days.
> > 
> > Lookin through FreeBSD commits I think you want the main.c one as
> > well, otherwise silent corruption of the dump is still possible.
> > 
> > -Otto
> 
> With that patch I get a message:
> 
> fatal: morestack on g0
>   DUMP: fs is too large for dump!
>   DUMP: The ENTIRE dump is aborted.
> 
> This is on a 2 terabyte filesystem with 0.5 terabytes
> of data “successfully” backed up (or at least I considered
> the backup and restore as successful).

Hmm, I need to dig into the dump format and see if the math is right.

-Otto



Re: dump LOB status

2020-09-24 Thread Otto Moerbeek
On Tue, Sep 22, 2020 at 08:37:22PM +0300, Juha Erkkilä wrote:

> 
> > On 22. Sep 2020, at 15.04, Juha Erkkilä  wrote:
> > 
> >> On 22. Sep 2020, at 9.00, Otto Moerbeek  wrote:
> >> Maybe by hand, but not by using patch(1), the context differs a bit.
> >> 
> >> Next obvious question: did you test if it fixes your problem? That
> >> means, do you get a dump that can be restored again?
> >> 
> >>-Otto
> > 
> > Thanks Otto for a very good question!  So no,
> > do not use that patch as is, it breaks restore
> > as it can not be used to restore any files.
> 
> Actually, I tested this again and now it appears
> dump and restore both work correctly. Previously,
> I first tested dump/restore with an empty filesystem,
> then with some files, and it may be that the second
> time I was accidentally testing restore with the first
> dump file.
> 
> My tests were only with a small amount of files,
> I will do a better test with proper data (about
> 0.5 terabytes and over 10 files) and will
> report again here in a next few days.

Looking through FreeBSD commits I think you want the main.c one as
well; otherwise silent corruption of the dump is still possible.

-Otto

Index: main.c
===
RCS file: /cvs/src/sbin/dump/main.c,v
retrieving revision 1.61
diff -u -p -r1.61 main.c
--- main.c  28 Jun 2019 13:32:43 -  1.61
+++ main.c  24 Sep 2020 10:24:45 -
@@ -92,7 +92,7 @@ main(int argc, char *argv[])
int ch, mode;
struct tm then;
struct statfs fsbuf;
-   int i, anydirskipped, bflag = 0, Tflag = 0, honorlevel = 1;
+   int i, anydirskipped, c_count, bflag = 0, Tflag = 0, honorlevel = 1;
ino_t maxino;
time_t t;
int dirlist;
@@ -442,6 +442,9 @@ main(int argc, char *argv[])
 #endif
maxino = (ino_t)sblock->fs_ipg * sblock->fs_ncg;
mapsize = roundup(howmany(maxino, NBBY), TP_BSIZE);
+   c_count = howmany(mapsize * sizeof(char), TP_BSIZE);
+   if (c_count > TP_NINDIR)
+   quit("fs is too large for dump!");
usedinomap = calloc((unsigned) mapsize, sizeof(char));
dumpdirmap = calloc((unsigned) mapsize, sizeof(char));
dumpinomap = calloc((unsigned) mapsize, sizeof(char));
Index: tape.c
===
RCS file: /cvs/src/sbin/dump/tape.c,v
retrieving revision 1.45
diff -u -p -r1.45 tape.c
--- tape.c  28 Jun 2019 13:32:43 -  1.45
+++ tape.c  24 Sep 2020 10:24:45 -
@@ -330,7 +330,8 @@ flushtape(void)
}
 
blks = 0;
-   if (spcl.c_type != TS_END) {
+   if (spcl.c_type != TS_END && spcl.c_type != TS_CLRI &&
+   spcl.c_type != TS_BITS) {
for (i = 0; i < spcl.c_count; i++)
if (spcl.c_addr[i] != 0)
blks++;



Re: dump LOB status

2020-09-22 Thread Otto Moerbeek
On Mon, Sep 21, 2020 at 10:23:55PM +0300, Juha Erkkilä wrote:

> 
> 
> > On 16. Sep 2020, at 20.27, Juha Erkkilä  wrote:
> > 
> > 
> >> On 16. Sep 2020, at 0.18, Kenneth Gober  wrote:
> >> I took a very quick look at the source and it appears that 213 is shown in
> >> octal.  I believe that the 200 bit indicates that a core file was produced,
> >> and 13 is probably a signal number (13 octal equals 11 decimal which would
> >> be SIGSEGV).  I am not sure whether the size of the file system is itself
> >> the cause, I have been using dump(8) to back up a large (currently 6.7TB)
> >> volume to tape for years (several tapes, actually) and it works fine,
> >> although that system is still on 6.1/amd64.  I looked in CVS and didn't see
> >> any obvious diffs between 6.1 and 6.6 that jumped out at me as potential
> >> causes, so perhaps the issue has been latent for a long time and I haven't
> >> seen it because it's triggered by the particulars of one or more files
> >> rather than the overall file system size.  Maybe if an individual file gets
> >> too big, or is too 'sparse' or something?
> > 
> > I can reproduce this on -current from Fri Sep 11 11:30:09
> > with a freshly created and an empty filesystem of 2 terabytes.
> 
> It looks like the same issue has been fixed in
> FreeBSD: https://svnweb.freebsd.org/base?view=revision=334979 
> 
> 
> The diff applies cleanly to the current OpenBSD source tree.

Maybe by hand, but not by using patch(1); the context differs a bit.

Next obvious question: did you test if it fixes your problem? That
means, do you get a dump that can be restored again?

-Otto



Re: Does DNS need TCP?

2020-09-21 Thread Otto Moerbeek
On Sun, Sep 20, 2020 at 10:17:47PM -0400, Predrag Punosevac wrote:

> Nicolai  wrote :
> 
> > On Sun, Sep 20, 2020 at 12:43:41AM -0400, Predrag Punosevac wrote:
> > 
> > > For number of years I had in my /var/unbound/etc/unbound.conf line
> > > 
> > > do-tcp: no
> > 
> > > To make things worse I was blocking port TCP port 53. 
> > 
> > Just curious, why did you do that?
> 
> When I start using Unbound on OpenBSD it was not the part of the base.
> There was not such a thing as the default unbound.conf file. I vividly
> remember reading NLnet Labs Documentation three full days before
> deciding on my defaults. Even once Unbound became the part of the base,
> (IIRC 5.7) the defaults were not carved in stone. They changed quite a
> bit over the time.

Unbound itself has TCP switched on by default.

> 
> As of the port blocking unfortunately I am old enough to remember this
> post 
> 
> http://cr.yp.to/djbdns/tcp.html#why
> 
> and the remark that TCP is only needed for records larger than 512
> bytes. 
> 
> "You want to publish record sets larger than 512 bytes. (This is almost
> always a mistake.)"
> 
> I had no need for TCP port 53 to be open. Until month and a half ago
> things worked as expected and I have more important things to do than to
> fix things which don't appear to be broken.

He's talking about publishing here; you are talking about resolving.
You have no control over the sizes of the record sets others publish.

djb is both respected and an outlier. Never take his opinion for
granted without consulting other sources.

Just one example: dig +dnssec akamai.com txt

> 
> The following 
> 
> https://www.openbsd.org/faq/pf/
> 
> is also evolving. It has been almost 15 years since the OpenBSD became
> my daily driver and I would swear (but I am not going to look through
> Internet archive) that there was a time when UDP port 53 was the only
> open domain service in the minimal working example.

I think if you look at the CVS history of the default pf.conf you'll
see that outgoing traffic was never blocked by default.

-Otto

> 
> 
> > 
> > On my authoritative servers roughly 1 in 1000 queries are over TCP, even
> > though no answers are over 512 bytes.  Like most people, I don't use
> > DNSSEC, and unlike most people, I do use DNSCurve.
> > 
> 
> I try to stay away from a universal quantification (a professional
> deformation).  I do use DNSSEC more or less since it became available. I
> used it before the time it became default in unbound.conf file of
> OpenBSD. That is an example of the OpenBSD unbound.conf default which
> actually changed not so long time ago.
> 
> 
> 
> > I've seen "in the wild" authoritative servers that always set TC=1 but
> > that's exceedingly rare and a bad idea for general use.
> > 
> > If you block 53/udp then your life will change for the worse a LOT
> > faster than if you merely block 53/tcp, but both are used, and both
> > should be allowed.  Blocking either will lead to downtime.
> > 
> > If you don't understand the defaults then leave them be.  Put your
> > energy into fixing things that are visibly broken.
> >
> 
> That is exactly the reason that I kept 53/tcp closed past it useful
> shelf life. I actually have more interesting things to do than fixing
> the stuff which are only marginally important for my life. 
> 
> 
> > 
> > Just a related PSA: please don't block ICMP either.  It's important,
> > necessary, and good.
> 
> I am not blocking and I have never blocked it although I do have some
> restrictions in place since I read the first edition of the book of PF. 
> As you know the book is overdue for 4th edition. As you see the only
> constant in life is change. 
> 
> 
> Cheers,
> Predrag
> 
> > 
> > Nicolai
> 



Re: Does DNS need TCP?

2020-09-20 Thread Otto Moerbeek
On Sun, Sep 20, 2020 at 12:43:41AM -0400, Predrag Punosevac wrote:

> 
> 
> Hi Misc,
> 
> I have been a double as a system admin for our small university research
> group for a number of years now but every now and then I get reminded of
> my own ignorance. One of those moments happened a month and a half ago
> when pkg management tools stopped working on all my FreeBSD file servers
> and jail hosts. After waisting an hour, I got to the bottom of my
> problem. Namely, my caching DNS Unbound resolvers (obviously running of
> OpenBSD) which also serve my LAN and DMZ authoritatively could no longer
> resolve 
> 
> pkg.freebsd.org.
> 
> After waisting another hour it became clear that authoritative DNS for 
> pkg.freebsd.org no longer was serving using UDP protocol and was
> expecting my DNS resolver to use TCP instead of UDP for name queries. 
> For number of years I had in my /var/unbound/etc/unbound.conf line
> 
> do-tcp: no
> 
> even though I was aware that OpenBSD 6.7 is shipped with
> 
> do-tcp: yes
> 
> To make things worse I was blocking port TCP port 53. 
> 
> I am not much of a DNS expert but I was under impression that TCP was
> only used for publishing record sets larger than 512 bytes. However, it
> appears that I am mistaken.
> 
> https://serverfault.com/questions/181956/is-it-true-that-a-nameserver-have-to-answer-queries-over-tcp
> 
> That is not just a random garbage thread. The person whose answer was
> accepted claims to be the author of RFC 5966. There is another
> interesting post getting a lot of thumbs downs who is bringing back some
> of old fights started by Daniel Bernstein.  
> 
> There is a second less illuminating thread 
> 
> https://serverfault.com/questions/404840/when-do-dns-queries-use-tcp-instead-of-udp
> 
> According to above threads it appears that DNSSEC validation requires
> TCP port 53 and do-tcp: yes to work properly.
> 
> Could a kind soul who runs DNS for living point me to the documentation
> which I can use to educate myself.

https://tools.ietf.org/html/rfc7766 says it all.

The TCP requirement is related to DNSSEC because DNSSEC makes the DNS
replies bigger, but the custom of dumping more and more into TXT
records is another reason. The recommendation to use a UDP buffer
size of 1232 to avoid big UDP packets and thus IP fragmentation also
makes TCP fallback needed more often. See https://dnsflagday.net/2020/

For all practical purposes, setting up DNS without TCP is broken.

-Otto




Re: Troubleshooting pf congestion

2020-09-14 Thread Otto Moerbeek
On Mon, Sep 14, 2020 at 11:19:46AM -0400, Scott Reese wrote:

> Greetings:
> 
> I am troubleshooting an issue: users complaining about network performance. 
> The firewall
> is an OpenBSD 6.7 system with patches applied. I've traced the issue and I'm 
> seeing the
> congestion counter incrementing on system. The problems that we're seeing fit 
> with what
> I have been able to find about congestion - when the firewall is congested it 
> continues
> passing packets that match existing state entries but it will not create any 
> new state
> entries until the congestion clears.
> 
> I'm having trouble troubleshooting it beyond that point because I have not 
> been able to
> find any additional information about what the congestion counter is 
> counting. There is
> the information in the pfctl man page: "congestion: network interface queue 
> congested",
> but beyond that I can't really find any information about exactly what 
> network interface
> queue is congested.
> 
> I'm not seeing packets being dropped, either on the switch side or firewall 
> side that
> correspond with the congestion counter going up. The average on the 
> congestion counter
> stays around 10/s, but what it's really doing is going up by 100-300/s for 
> short periods
> and then not moving for longer periods.
> 
> If anyone could spare a couple of sentences or a share a link to a page 
> detailing what
> state causes the system to consider itself contested, I would appreciate it.
> 
> Thanks for your time.
> 
> -Scott

openbsd-archive.7691.n7.nabble.com/PF-congestion-question-td156490.html

Description and potential remedy are still true, AFAIK.

-Otto

> 
> 
> System dmesg:
> 
> OpenBSD 6.7 (GENERIC.MP) #6: Thu Sep  3 14:08:18 MDT 2020
> 
> r...@syspatch-67-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 8386699264 (7998MB)
> avail mem = 8119902208 (7743MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x7fb76000 (62 entries)
> bios0: vendor American Megatrends Inc. version "2.2" date 05/23/2018
> bios0: Supermicro X11SSL-F
> acpi0 at bios0: ACPI 5.0
> acpi0: sleep states S0 S4 S5
> acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG HPET LPIT SSDT SSDT SSDT 
> DBGP DBG2 SSDT SSDT UEFI SSDT DMAR EINJ ERST BERT HEST
> acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) 
> RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) RP12(S4) PXSX(S4) 
> RP13(S4) PXSX(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Xeon(R) CPU E3-1280 v6 @ 3.90GHz, 3901.62 MHz, 06-9e-09
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 24MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Xeon(R) CPU E3-1280 v6 @ 3.90GHz, 3900.01 MHz, 06-9e-09
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> cpu2: Intel(R) Xeon(R) CPU E3-1280 v6 @ 3.90GHz, 3900.01 MHz, 06-9e-09
> cpu2: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> cpu3: Intel(R) Xeon(R) CPU E3-1280 v6 @ 3.90GHz, 3900.01 MHz, 06-9e-09
> cpu3: 

Re: Crashing 64bit (AMD) 6.7 kernel on APU2

2020-08-30 Thread Otto Moerbeek
On Sun, Aug 30, 2020 at 03:33:17PM +1000, Damian McGuckin wrote:

> 
> Hi,
> 
> For the first time ever, we have seen a crashing kernel. Having never
> experienced this before on any OpenBSD release for over 20 years, I have no
> debugging experience. We have simply reverted to 32bit to see it that is the
> issue. The system works flawlessly with 6.3 in 32 bit mode but we thought we
> should update.
> 
> This is on an APU2 with an AMD64 release.
> 
> Has anybody seen the same problem?

Without any useful details nobody can answer that question.

-Otto

> 
> Thanks - Damian
> 
> Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer
> 



Re: Confused by adjfreq(2)

2020-08-15 Thread Otto Moerbeek
On Sat, Aug 15, 2020 at 09:10:26AM +0100, Julian Smith wrote:

> On Mon, 10 Aug 2020 09:53:10 +0200
> Otto Moerbeek  wrote:
> 
> > On Sun, Aug 09, 2020 at 09:46:00PM +0100, Julian Smith wrote:
> > 
> > > I've just used adjfreq() directly to correct my hardware clock,
> > > which was running an hour ahead of UTC (due to my hardware
> > > previously running Windows).
> > > 
> > > But i've struggled to understand the adjfreq(2) man page, so ended
> > > up finding a value for  by trial and error.
> > > 
> > > I ended up with this code:
> > > 
> > > double x = 1.5;
> > > int64_t newfreq = ((1ll << 32) * 1e9) * x;
> > > adjfreq(newfreq, NULL);  
> > 
> > This does not look like actual code, first arg should be a pointer.
> 
> Ah yes, apologies for this, i foolishly hand-wrote the code above in
> the post. The actual code i used did pass a pointer:
> 
> e = adjfreq(, );
> 
> > 
> > > 
> > > In this table, the second column is the increment in the time as
> > > shown by running date(1) twice, over a 10 second period as measured
> > > using my phone as a timer, for different values of x:
> > > 
> > > x  10s
> > > -
> > > 0  10
> > > 0.112
> > > 0.25   13
> > > 0.515
> > > 0.75   17
> > > 0.819
> > > 0.919
> > > 1.021
> > > 1.1 1
> > > 1.25:   3
> > > 1.5:6
> > > 1.75:   8
> > > 2: 10
> > > 2.25:  10
> > > 2.5:   10  
> > 
> > The only user of adjfreq(2) in base is ntpd(8), which caps it's
> > adjustments between +/-MAX_FREQUENCY_ADJUST = 128e-5.
> > 
> > It is very well possible the calculations in the kernel go wrong with
> > large(r) values. The API exists for gradual adjustments, not for
> > anything big.  Scott Cheloha  has been working
> > on the kernel side of things, he might know more, so I Cc'ed him,
> > don't know if her reads misc@.
> 
> Thanks for doing this.
> 
> Using a big adjustment was very convenient for fixing my problem, so it
> might be of general use/interest. I guess alternatives would be to get
> control early in boot and fix it up; or i think one can tell the OS that
> the hardware clock is set to a particular offset from UTC (can't find
> the man page for this right now, but i'm sure i came across it when
> investigating adjfreq).

ntpd fixes the clock very early in the boot using both HTTPS and NTP
time sources, but is conservative: it does not do backward
adjustments. There is also utc_offset in src/conf/param.c.

-Otto




Re: How many IPs can I block before taking a performance hit?

2020-08-12 Thread Otto Moerbeek
On Wed, Aug 12, 2020 at 08:11:14AM -0400, Alan McKay wrote:

> Hey folks,
> 
> This is one that is difficult to test in a test environment.
> 
> I've got OpenBSD 6.5 on a relatively new pair of servers each with 8G RAM.
> 
> With some scripting I'm looking at feeding block IPs to the firewalls
> to block bad-guys in near real time, but in theory if we got attacked
> by a bot net or something like that, it could result in a few thousand
> IPs being blocked.  Possibly even 10s of thousands.
> 
> Are there any real-world data out there on how big of a block list we
> can handle without impacting performance?
> 
> We're doing the standard /etc/blacklist to load a table and then have
> a block on the table right at the top of the ruleset.
> 
> thanks,
> -Alan
> 
> -- 
> "You should sit in nature for 20 minutes a day.
>  Unless you are busy, then you should sit for an hour"
>  - Zen Proverb
> 

Typical answer: "it depends".  Having on the order of 10k rules
might not be a smart idea.  But if you are using tables you should do
fine for many, many IPs.
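
A sketch of the table-based approach (pf.conf fragment; the file path
and table name are illustrative). A table lookup is a single radix-tree
search, so tens of thousands of entries stay cheap, whereas 10k
separate rules would each be evaluated in sequence:

```
table <blacklist> persist file "/etc/blacklist"
block in quick from <blacklist>

# Add/remove entries at runtime without reloading the ruleset:
#   pfctl -t blacklist -T add 203.0.113.7
#   pfctl -t blacklist -T delete 203.0.113.7
#   pfctl -t blacklist -T show
```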

-Otto



Re: explicit_bzero vs. alternatives

2020-08-11 Thread Otto Moerbeek
On Tue, Aug 11, 2020 at 08:20:32AM +0200, Otto Moerbeek wrote:

> On Tue, Aug 11, 2020 at 08:13:24AM +0200, Philipp Klaus Krause wrote:
> 
> > Am 11.08.20 um 02:52 schrieb Theo de Raadt:
> > > 
> > > But no, WG14 are the lords and masters in the high castle, and now 6
> > > years after the ship sailed something Must Be Done, it must look like
> > > They Solved The Problem, and so they'll create an incompatible API.
> > > 
> > > Will they be heroes?  No, not really.  Changing the name is villainous.
> > > 
> > 
> > The purpose of WG14 is to codify existing practise, not to invent (see
> > N2086 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2086.htm, 8. and
> > 13.).
> > 
> > WG14 has reserved some identifiers for future extensions of the
> > standard. E.g. those starting with mem_. Naturally, others then choose
> > identifiers that do not conflict with this, such as explicit_bzero. But
> > if that name is then used in the standard unchanged, it would mean that
> > future extensions only use exactly those identifiers not reserved for
> > future extensions.
> > 
> > Philipp
> 
> But if we would use reserved identifiers, we would be castigated for that.
> 
> Don't you see your process does not work?
> 
>   -Otto
> 

Let me elaborate. IMO, if the WG refuses to include in the standard a
common function used on various platforms, and instead insists on
naming it differently, it just creates confusion and work, and thus
bugs. In that, it hinders progress instead of enabling it.

-Otto



Re: explicit_bzero vs. alternatives

2020-08-11 Thread Otto Moerbeek
On Tue, Aug 11, 2020 at 08:13:24AM +0200, Philipp Klaus Krause wrote:

> Am 11.08.20 um 02:52 schrieb Theo de Raadt:
> > 
> > But no, WG14 are the lords and masters in the high castle, and now 6
> > years after the ship sailed something Must Be Done, it must look like
> > They Solved The Problem, and so they'll create an incompatible API.
> > 
> > Will they be heroes?  No, not really.  Changing the name is villainous.
> > 
> 
> The purpose of WG14 is to codify existing practise, not to invent (see
> N2086 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2086.htm, 8. and
> 13.).
> 
> WG14 has reserved some identifiers for future extensions of the
> standard. E.g. those starting with mem_. Naturally, others then choose
> identifiers that do not conflict with this, such as explicit_bzero. But
> if that name is then used in the standard unchanged, it would mean that
> future extensions only use exactly those identifiers not reserved for
> future extensions.
> 
> Philipp

But if we used reserved identifiers, we would be castigated for that.

Don't you see your process does not work?

-Otto



Re: Confused by adjfreq(2)

2020-08-10 Thread Otto Moerbeek
On Sun, Aug 09, 2020 at 09:46:00PM +0100, Julian Smith wrote:

> I've just used adjfreq() directly to correct my hardware clock, which
> was running an hour ahead of UTC (due to my hardware previously running
> Windows).
> 
> But i've struggled to understand the adjfreq(2) man page, so ended up
> finding a value for  by trial and error.
> 
> I ended up with this code:
> 
> double x = 1.5;
> int64_t newfreq = ((1ll << 32) * 1e9) * x;
> adjfreq(newfreq, NULL);

This does not look like actual code; the first arg should be a pointer.

> 
> In this table, the second column is the increment in the time as shown
> by running date(1) twice, over a 10 second period as measured using my
> phone as a timer, for different values of x:
> 
> x  10s
> -
> 0  10
> 0.112
> 0.25   13
> 0.515
> 0.75   17
> 0.819
> 0.919
> 1.021
> 1.1 1
> 1.25:   3
> 1.5:6
> 1.75:   8
> 2: 10
> 2.25:  10
> 2.5:   10

The only user of adjfreq(2) in base is ntpd(8), which caps its
adjustments between +/-MAX_FREQUENCY_ADJUST = 128e-5.

It is very well possible the calculations in the kernel go wrong with
large(r) values. The API exists for gradual adjustments, not for
anything big.  Scott Cheloha  has been working
on the kernel side of things, he might know more, so I Cc'ed him;
I don't know if he reads misc@.

-Otto

> 
> So using x=1.5 makes OpenBSD's clock run at 0.6x of real time. I used
> this value to correct the one hour error in just over two hours.
> 
> But i wonder whether anyone could explain the values in the above
> table? The code in src/sys/kern/kern_tc.c:tc_windup() might be
> relevant, but i'm not sure what exactly it is doing.
> 
> As far as i can tell, the actual int64 values corresponding to the
> values of x above are:
> 
> x=0.00 newfreq=0
> x=0.10 newfreq=4294967296
> x=0.25 newfreq=10737418240
> x=0.50 newfreq=21474836480
> x=0.75 newfreq=32212254720
> x=0.80 newfreq=34359738368
> x=0.90 newfreq=38654705664
> x=1.00 newfreq=42949672960
> x=1.10 newfreq=47244640256
> x=1.25 newfreq=53687091200
> x=1.50 newfreq=64424509440
> x=1.75 newfreq=75161927680
> x=2.00 newfreq=85899345920
> x=2.25 newfreq=-9223372036854775808
> x=2.50 newfreq=-9223372036854775808
> 
> So these values increase monotonically, except for presumably wrapping
> errors for x=2.25 and x=2.5. So how come there is a big difference in
> behaviour between x=1.0 (21x) and x=1.1 (1x) ?
> 
> Thanks for any help here,
> 
> - Jules
> 
> -- 
> http://op59.net
> 
> 



Re: grow a filesystem on a softraid

2020-07-22 Thread Otto Moerbeek
On Wed, Jul 22, 2020 at 02:20:41PM +0200, Leo Unglaub wrote:

> Hey,
> i have the following setup: I have the drive sd1 with 20GB and on there i
> have one partition "a" with the type RAID. On that raid i have used bioctl
> to create an encrypted partition. When i decrypt sd1a it becomes sd3 and on
> there i have my normal sd3a with the type FFS.
> 
> It works great but now i have to grow that disk in size. I used "disklabel
> -E sd1", enlarged the boundried to the new size and then enlarged the disk.
> It worked great, disklabel -h sd1 shows already the new size. But when i
> decrypt by using "bioctl -cC -l sd1a" the new sd3 is still on the old size.
> The problem is that i cannot enlarge the boundried on that sd3 disk. Any
> ideas what i can do in this case?

Backup, recreate the RAID, restore.

The RAID metadata includes the size and, AFAIK, there is no way to
change that after creation.
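
An outline of that procedure, using the device names from this thread
(a sketch only; check each step against bioctl(8) and your own layout
before touching real data):

```
# 1. back up the filesystem on the decrypted volume
dump -0auf /backup/vmail.dump /dev/rsd3a
# 2. detach the crypto volume and recreate it over the enlarged sd1a;
#    the new metadata picks up the new size
bioctl -d sd3
bioctl -c C -l sd1a softraid0
# 3. relabel and newfs the fresh sd3, then restore
disklabel -E sd3
newfs sd3a
mount /dev/sd3a /mnt && cd /mnt && restore -rf /backup/vmail.dump
```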

-Otto

> 
> 
> 
> Here is the disklabel from sd1, the disk with the RAID partition that i
> could resize correctly.
> 
> > # disklabel sd1 # /dev/rsd1c:
> > type: SCSI
> > disk: SCSI disk
> > label: vol-vmai
> > duid: 8243870d445950cf
> > flags:
> > bytes/sector: 512
> > sectors/track: 80
> > tracks/cylinder: 16
> > sectors/cylinder: 1280
> > cylinders: 16383
> > total sectors: 419430400
> > boundstart: 128
> > boundend: 419430400
> > drivedata: 0
> > 
> > 16 partitions:
> > #size   offset  fstype [fsize bsize   cpg]
> >   a:419430272  128RAID  c:
> > 4194304000  unused
> 
> And here is the disklabel from the sd3 disk, the one that i get when i
> decrypt sd1a.
> 
> > # disklabel sd3
> > # /dev/rsd3c:
> > type: SCSI
> > disk: SCSI disk
> > label: SR CRYPTO VMAIL
> > duid: f7defe201b90c524
> > flags:
> > bytes/sector: 512
> > sectors/track: 63
> > tracks/cylinder: 255
> > sectors/cylinder: 16065
> > cylinders: 1305
> > total sectors: 20969584
> > boundstart: 64
> > boundend: 20969584
> > drivedata: 0
> > 
> > 16 partitions:
> > #size   offset  fstype [fsize bsize   cpg]
> >   a: 20964736   64  4.2BSD   2048 16384 12960 # 
> > /var/vmail
> >   c: 209695840  unused
> 
> 
> Maybe someone of you could be so kind and give me a hint into the right
> direction. That would be so nice, thanks!
> 
> Greetings
> Leo
> 
> 
> > # dmesg
> > OpenBSD 6.7 (RAMDISK_CD) #177: Thu May  7 11:19:02 MDT 2020
> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/RAMDISK_CD
> > real mem = 4177383424 (3983MB)
> > avail mem = 4046757888 (3859MB)
> > mainbus0 at root
> > bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf6a50 (10 entries)
> > bios0: vendor Hetzner version "2017" date 11/11/2017
> > bios0: Hetzner vServer
> > acpi0 at bios0: ACPI 1.0
> > acpi0: tables DSDT FACP APIC
> > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> > cpu0 at mainbus0: apid 0 (boot processor)
> > cpu0: Intel Xeon Processor (Skylake, IBRS), 2100.23 MHz, 06-55-04
> > cpu0: 
> > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,CLWB,AVX512CD,AVX512BW,AVX512VL,PKU,MD_CLEAR,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,MELTDOWN
> > cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 
> > 64b/line 16-way L2 cache
> > cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> > cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> > cpu0: apic clock running at 1000MHz
> > cpu at mainbus0: not configured
> > ioapic0 at mainbus0: apid 0 pa 0xfec0, version 11, 24 pins
> > acpiprt0 at acpi0: bus 0 (PCI0)
> > acpicpu at acpi0 not configured
> > "ACPI0006" at acpi0 not configured
> > "PNP0A03" at acpi0 not configured
> > acpicmos0 at acpi0
> > "PNP0A06" at acpi0 not configured
> > "PNP0A06" at acpi0 not configured
> > "PNP0A06" at acpi0 not configured
> > "QEMU0002" at acpi0 not configured
> > "ACPI0010" at acpi0 not configured
> > cpu0: using VERW MDS workaround
> > pvbus0 at mainbus0: KVM
> > pci0 at mainbus0 bus 0
> > pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
> > "Intel 82371SB ISA" rev 0x00 at pci0 dev 1 function 0 not configured
> > pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 
> > 0 wired to compatibility, channel 1 wired to compatibility
> > pciide0: channel 0 disabled (no drives)
> > atapiscsi0 at pciide0 channel 1 drive 0
> > scsibus0 at atapiscsi0: 2 targets
> > cd0 at scsibus0 targ 0 lun 0:  removable
> > cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
> > "Intel 82371AB Power" rev 0x03 at pci0 dev 1 function 3 not configured
> > vga1 at pci0 dev 2 function 0 "Bochs VGA" rev 0x02
> > vga1: aperture needed
> > wsdisplay1 at vga1 

Re: NSD Problems (Reverse Direction)

2020-07-08 Thread Otto Moerbeek
On Thu, Jul 09, 2020 at 01:19:47AM +, ken.hendrick...@l3harris.com wrote:

> What am I doing wrong???  I'm using nsd on OpenBSD.
> 
> 
> 
> 
> 
> nsd works only in the forward direction: from a name to an IP address.
> I'm using my named zone files from way back.
> nsd-checkzone says that the zone files are good.
> Here are the startup logs for nsd:
> --
> Jul  8 20:30:20 Soekris2 nsd[85856]: nsd starting (NSD 4.2.4)
> Jul  8 20:30:21 Soekris2 nsd[78426]: zone 10.24.172.in-addr.arpa read with 
> success
> Jul  8 20:30:21 Soekris2 nsd[78426]: zone 20.24.172.in-addr.arpa read with 
> success
> Jul  8 20:30:21 Soekris2 nsd[78426]: zone 30.24.172.in-addr.arpa read with 
> success
> Jul  8 20:30:21 Soekris2 nsd[78426]: zone 2.168.192.in-addr.arpa read with 
> success
> Jul  8 20:30:21 Soekris2 nsd[78426]: zone Foo.Bar read with success
> Jul  8 20:30:21 Soekris2 nsd[78426]: nsd started (NSD 4.2.4), pid 71631
> --
> 
> 
> 
> 
> 
> nsd works in the forward direction (not shown).
> nsd fails in the reverse direction:
> --
> 117 Soekris2# nslookup
> > server 127.0.0.1 
> Default server: 127.0.0.1
> Address: 127.0.0.1#53
> > set port 53053
 ^
> > 172.24.20.1
> Server:   127.0.0.1
> Address:  127.0.0.1#53
  ^^
You're not asking the server you expect.

Dunno why, I never use it. Maybe it has to do with the recent cleanup of
nslookup and friends. I prefer dig.

-Otto
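For reverse lookups dig does the octet reversal itself (dig -x 172.24.20.1 @127.0.0.1 -p 53053 queries the right server and port in one go). As a sketch of what the reverse name looks like, here is a hypothetical helper (rev_name is not a real tool, just an illustration):

```shell
# Build the in-addr.arpa name a reverse query asks for; dig -x does
# exactly this internally. rev_name is a hypothetical helper.
rev_name() {
  printf '%s\n' "$1" | awk -F. '{ printf "%s.%s.%s.%s.in-addr.arpa\n", $4, $3, $2, $1 }'
}
rev_name 172.24.20.1
```

Note the octets come out reversed, which is why the NXDOMAIN above is for 1.20.24.172.in-addr.arpa.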

> 
> ** server can't find 1.20.24.172.in-addr.arpa: NXDOMAIN
> --
> 
> 
> 
> 
> 
> Here is an example reverse-direction file: db.20.24.172.in-addr.arpa
> --
> ;
> ; BIND reverse data file for 20.24.172.in-arpa.arpa.
> ;
> ; Origin added to names not ending in a dot:20.24.172.in-addr.arpa.
> 
> $TTL3h
> 
> @ IN SOA Soekris1.Foo.Bar. root.Soekris1.Foo.Bar. (
>  2020070501 ; Serial
>   10800 ; Refresh   3 hours
>3600 ; Retry 1 hour
>  604800 ; Expire1 week
>3600 )   ; Negative Caching  1 hour
> 
> ; Name Servers
> 
> ;IN NS  Cherub.Foo.Bar.
> ;IN NS  Tux.Foo.Bar.
> IN  NS  Soekris1.Foo.Bar.
> IN  NS  Soekris2.Foo.Bar.
> IN  NS  PcEngines1.Foo.Bar.
> IN  NS  PcEngines2.Foo.Bar.
> 
> ; Network Name
> 0   IN  PTR Wired.20.
> 
> 1   IN  PTR WirelessAccess.Foo.Bar.
> 2   IN  PTR WirelessRouter.Foo.Bar.
> --
> 
> 
> 
> 
> 
> Any ideas?
> 
> Why would nsd work in the forward direction,
> but not in the reverse direction,
> if all of the zone files are good?
> 
> What is different between nsd and named?
> 
> 
>   
> 
> 



Re: strlcpy version speed tests?

2020-07-04 Thread Otto Moerbeek
On Sat, Jul 04, 2020 at 09:07:35AM -0400, Brian Brombacher wrote:

> 
> >> On Jul 1, 2020, at 1:14 PM, gwes  wrote:
> >> 
> >> On 7/1/20 8:05 AM, Luke Small wrote:
> >> I spoke to my favorite university computer science professor who said
> >> ++n is faster than n++ because the function needs to store the initial
> >> value, increment, then return the stored value in the former case,
> >> while the later merely increments, and returns the value. Apparently,
> >> he is still correct on modern hardware.
> > For decades the ++ and *p could be out of order, in different
> > execution units, writes speculatively queued, assigned to aliased registers,
> > etc, etc, etc.
> > 
> > Geoff Steckel
> 
> Hey Luke,
> 
> I love the passion but try to focus your attention on the fact that there are 
> multiple architectures supported and compiler optimizations are key here.  Go 
> with Marc’s approach using arch/ asm.  Implementations can be made over time 
> for the various arch’s, if such an approach is desirable by the project.  You 
> can pull a well-optimized version based on your code, for your arch, and then 
> slim it down a bunch.
> 
> Cheers,
> Brian
> 
> [Not a project developer.  Just an observer.]
> 
> 

Another data point for consideration: the pdp11 instruction set had
post-increment and pre-decrement indirect memory reference
instructions. If I'm not mistaken, using pre-increment or
post-decrement on this architecture would impose a penalty. So your
university computer science professor making such sweeping statements
maybe doesn't deserve to be your favorite.

-Otto



Re: disklabel: autoalloc failed

2020-06-26 Thread Otto Moerbeek
On Fri, Jun 26, 2020 at 05:53:24PM +, Rupert Gallagher wrote:

> Ref. disklabel(8)
> > The maximum disk and partition size is 64PB.
> 
> Is that so? Let see...
> 
> OpenBSD 6.7 (GENERIC.MP) #2: Thu Jun  4 09:55:08 MDT 2020
> 
> $> doas dmesg | grep sd3
> sd3 at scsibus2 targ 2 lun 0:  
> naa.5000c500c3ad5c90
> sd3: 4769307MB, 512 bytes/sector, 9767541168 sectors
> 
> $> doas disklabel -p t sd3
> # /dev/rsd3c:
> type: SCSI
> disk: SCSI disk
> label: ST5000LM000-2AN1
> duid: [omitted]
> flags:
> bytes/sector: 512
> sectors/track: 255
> tracks/cylinder: 511
> sectors/cylinder: 130305
> cylinders: 74959
> total sectors: 9767541168 # total bytes: 4.5T
> boundstart: 256
> boundend: 4294852800

Here's our problem. Use the b command to extend the OpenBSD area to the
whole disk.

-Otto
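The clipped boundend is the telltale: 4294852800 sits just below 2^32 sectors, i.e. the 2 TiB ceiling of a 32-bit sector count at 512 bytes/sector (why the label ended up bounded there is not stated in the thread; a 32-bit-limited partitioning step is one plausible cause). The arithmetic, using the numbers from the disklabel output above:

```shell
# boundend from the disklabel output above; 2^32 sectors of 512 bytes
# is 2 TiB, the most a 32-bit sector count can describe.
boundend=4294852800
echo "$(( boundend * 512 / (1024 * 1024 * 1024 * 1024) )) TiB"   # OpenBSD area size, truncated
echo "$(( (1 << 32) - boundend )) sectors below 2^32"
```

That matches the "size: 2.0T" the label editor prints for a 4.5T disk.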

> drivedata: 0
> 
> 16 partitions:
> #    size   offset  fstype [fsize bsize   cpg]
>   c: 4.5T    0  unused
> 
> $> doas disklabel -E sd3
> sd3> p t
> OpenBSD area: 256-4294852800; size: 2.0T; free: 2.0T
> ..^^ :(
> #    size   offset  fstype [fsize bsize   cpg]
>   c: 4.5T    0  unused
> 
> $> echo "/ 4T" >label
> 
> $> doas disklabel -w -A -T label sd3
> disklabel: autoalloc failed
> 
> :(
> 
> $> doas disklabel -E sd3
> Label editor (enter '?' for help at any prompt)
> sd3> p t
> OpenBSD area: 256-4294852800; size: 2.0T; free: 2.0T
> #size   offset  fstype [fsize bsize   cpg]
>   c: 4.5T0  unused
> sd3> a
> partition: [a]
> offset: [256]
> size: [4294852544]
> FS type: [4.2BSD]
> sd3*> p t
> OpenBSD area: 256-4294852800; size: 2.0T; free: 0.0T
> #size   offset  fstype [fsize bsize   cpg]
>   a: 2.0T  256  4.2BSD   8192 65536 1
>   c: 4.5T0  unused
> sd3*>
> 
> :(
> 



Re: Stuck in Needbuf state, trying to understand (6.7)

2020-06-25 Thread Otto Moerbeek
On Thu, Jun 25, 2020 at 09:51:41PM -0400, sven falempin wrote:

> Hello,
> 
> I have a script that mostly untar stuff on a vnd device.
> And i have the same problem with syspatch
> 
> The program state gets into needbuf forever, ( the top state ).
> 
> I'm trying to figure out what is happening,
> I have a feeling it may be an entropy exhaustion
> but it's just a guess.

You can't always trust feelings.

> 
> vmstat -m goes near 100% usage quickly
> and swap/memory is like empty according to top.
> 
> Is it possible to get out of `vmstat -m` logged memory,
> could it be a limit in login.conf that I reach without knowing ?
> 
> Once the problem is present, I cannot do anything  but reboot
> which does not help to understand what is going on.
> 
> Please hAlp.

You fail to report almost all relevant information, so we cannot help.

-Otto



Re: OpenBSD 6.6/amd64 kernel crash: softdep_deallocate_dependencies: unrecovered I/O error

2020-06-07 Thread Otto Moerbeek
On Sun, Jun 07, 2020 at 02:18:50PM +0200, Antonius Bekasi wrote:

> Hi Misc,
> 
> 
> 
> This is just a report. My lovely OpenBSD firewall crashed lately too much. So 
> here kernel debugger output.
> 
> I was using softdep in every partition. Now i removed every softdep from 
> fstab. I think using softdep under /usr partition was the source of my fault: 
> 
> 
> 
> # cat /etc/fstab.old
> 
> 6d79d3e0564cde27.b none swap sw
> 
> 6d79d3e0564cde27.a / ffs rw 1 1
> 
> 6d79d3e0564cde27.g /log ffs rw,softdep,nodev,nosuid 1 2
> 
> 6d79d3e0564cde27.d /tmp ffs rw,softdep,nodev,nosuid 1 2
> 
> 6d79d3e0564cde27.f /usr ffs rw,softdep,wxallowed,nodev 1 2
> 
> 6d79d3e0564cde27.e /var ffs rw,softdep,nodev,nosuid 1 2
> 
> 
> 
> 
> Here ddb outputs:
> 
> 
> 
> ddb{0}> show panic
> 
> softdep_deallocate_dependencies: unrecovered I/O error

This is a clear sign of hardware failing.

-Otto

> 
> ddb{0}> trace
> 
> db_enter() at db_enter+0x10
> 
> panic() at panic+0x128
> 
> softdep_deallocate_dependencies(fd83e1c6af00) at 
> softdep_deallocate_depende
> 
> ncies+0x49
> 
> brelse(fd83e1c6af00) at brelse+0xdc
> 
> sd_buf_done(fd83e1eb4548) at sd_buf_done+0x124
> 
> scsi_done(fd83e1eb4548) at scsi_done+0x24
> 
> ahci_port_intr(80277d00,8000) at ahci_port_intr+0xa88
> 
> ahci_ata_cmd_timeout(802a0f00) at ahci_ata_cmd_timeout+0x226
> 
> softclock(0) at softclock+0x142
> 
> softintr_dispatch(0) at softintr_dispatch+0xf2
> 
> Xsoftclock(0,40,1388,0,40,81f3c6f8) at Xsoftclock+0x1f
> 
> acpicpu_idle() at acpicpu_idle+0x1d2
> 
> sched_idle(81f3bff0) at sched_idle+0x225
> 
> end trace frame: 0x0, count: -13
> 
> ddb{0}> machine ddbcpu 1
> 
> Stopped at      x86_ipi_db+0x12:        leave
> 
> x86_ipi_db(800022008ff0) at x86_ipi_db+0x12
> 
> x86_ipi_handler() at x86_ipi_handler+0x80
> 
> Xresume_lapic_ipi(c,800022008ff0,80002202bff0,0,0,8000cef0) 
> at X
> 
> resume_lapic_ipi+0x23
> 
> __mp_acquire_count(81fbd988,1) at __mp_acquire_count+0x82
> 
> mi_switch() at mi_switch+0x243
> 
> sleep_finish(8000335e7eb8,1) at sleep_finish+0x84
> 
> tsleep(fd83f0d36b60,118,81c9866f,bb9) at tsleep+0xcb
> 
> kqueue_scan(fd83f0d36b60,40,48097867800,8000335e8270,8000cef0,f
> 
> fff8000335e82b8) at kqueue_scan+0x113
> 
> sys_kevent(8000cef0,8000335e8320,8000335e8380) at 
> sys_kevent+0x
> 
> 2a9
> 
> syscall(8000335e83f0) at syscall+0x389
> 
> Xsyscall(6,48,7f7f1f80,48,0,48097867800) at Xsyscall+0x128
> 
> end of kernel
> 
> end trace frame: 0x7f7f1f40, count: 4
> 
> ddb{1}> trace
> 
> x86_ipi_db(800022008ff0) at x86_ipi_db+0x12
> 
> x86_ipi_handler() at x86_ipi_handler+0x80
> 
> Xresume_lapic_ipi(c,800022008ff0,80002202bff0,0,0,8000cef0) 
> at X
> 
> resume_lapic_ipi+0x23
> 
> __mp_acquire_count(81fbd988,1) at __mp_acquire_count+0x82
> 
> mi_switch() at mi_switch+0x243
> 
> sleep_finish(8000335e7eb8,1) at sleep_finish+0x84
> 
> tsleep(fd83f0d36b60,118,81c9866f,bb9) at tsleep+0xcb
> 
> kqueue_scan(fd83f0d36b60,40,48097867800,8000335e8270,8000cef0,f
> 
> fff8000335e82b8) at kqueue_scan+0x113
> 
> sys_kevent(8000cef0,8000335e8320,8000335e8380) at 
> sys_kevent+0x
> 
> 2a9
> 
> syscall(8000335e83f0) at syscall+0x389
> 
> Xsyscall(6,48,7f7f1f80,48,0,48097867800) at Xsyscall+0x128
> 
> end of kernel
> 
> end trace frame: 0x7f7f1f40, count: -11
> 
> ddb{1}>
> 
> x86_ipi_db(800022008ff0) at x86_ipi_db+0x12
> 
> end trace frame: 0x8000335e7d40, count: 0
> 
> ddb{1}>
> 
> x86_ipi_db(800022008ff0) at x86_ipi_db+0x12
> 
> end trace frame: 0x8000335e7d40, count: 0
> 
> ddb{1}>
> 
> x86_ipi_db(800022008ff0) at x86_ipi_db+0x12
> 
> end trace frame: 0x8000335e7d40, count: 0
> 
> ddb{1}>
> 
> x86_ipi_db(800022008ff0) at x86_ipi_db+0x12
> 
> end trace frame: 0x8000335e7d40, count: 0
> 
> ddb{1}> machine ddbcpu 2
> 
> Stopped at      x86_ipi_db+0x12:        leave
> 
> x86_ipi_db(800022019ff0) at x86_ipi_db+0x12
> 
> x86_ipi_handler() at x86_ipi_handler+0x80
> 
> Xresume_lapic_ipi(d,800022019ff0,8000223e8000,0,0,8000223e80d8) 
> at X
> 
> resume_lapic_ipi+0x23
> 
> _kernel_lock() at _kernel_lock+0xa2
> 
> timeout_del_barrier(8000223e80d8) at timeout_del_barrier+0xa2
> 
> msleep(81efbe70,81efbe90,20,81c70ac3,0) at msleep+0xf5
> 
> taskq_next_work(81efbe70,800022409010) at taskq_next_work+0x38
> 
> taskq_thread(81efbe70) at taskq_thread+0x6f
> 
> end trace frame: 0x0, count: 7
> 
> ddb{2}> trace
> 
> x86_ipi_db(800022019ff0) at x86_ipi_db+0x12
> 
> x86_ipi_handler() at x86_ipi_handler+0x80
> 
> Xresume_lapic_ipi(d,800022019ff0,8000223e8000,0,0,8000223e80d8) 
> at X
> 
> resume_lapic_ipi+0x23
> 
> _kernel_lock() at _kernel_lock+0xa2
> 
> timeout_del_barrier(8000223e80d8) at timeout_del_barrier+0xa2
> 
> 

Re: Filling a 4TB Disk with Random Data

2020-06-05 Thread Otto Moerbeek
On Thu, Jun 04, 2020 at 08:39:24PM -0700, Justin Noor wrote:

> Thanks you @misc.
> 
> Using dd with a large block size will likely be the course of action.
> 
> I really need to refresh my memory on this stuff. This is not something we
> do, or need to do, everyday.
> 
> Paul your example shows:
> 
> bs=1048576
> 
> How did you choose that number? Could you have gone even bigger? Obviously
> it is a multiple of 512.
> 
> The disks in point are 4TB Western Digital Blues. They have 4096 sector
> sizes.
> 
> I used a 16G USB stick as a sacrificial lamb to experiment with dd.
> Interestingly, there is no difference in time between 1m, 1k, and 1g. How
> is that possible? Obviously this will not be an accurate comparison of the
> WD disks, but it was still a good practice exercise.

Did you write to the raw device? That makes a big difference.

At some point increasing buffer size will not help, since you already are
hitting some other (hw or sw) limit to the bandwidth.

-Otto

> 
> Also Paul, to clarify a point you made, did you mean forget the random data
> step, and just encrypt the disks with softraid0 crypto? I think I like that
> idea because this is actually a traditional pre-encryption step. I don't
> agree with it, but I respect the decision. For our purposes, encryption
> only helps if the disks are off the machine, and someone is trying to
> access them. This automatically implies that they were stolen. The chances
> of disk theft around here are slim to none. We have no reason to worry
> about forensics either - we're not storing nuclear secrets.
> 
> Thanks for your time
> 
> 
> On Mon, Jun 1, 2020 at 7:28 AM Paul de Weerd  wrote:
> 
> > On Mon, Jun 01, 2020 at 06:58:01AM -0700, Justin Noor wrote:
> > | Hi Misc,
> > |
> > | Has anyone ever filled a 4TB disk with random data and/or zeros with
> > | OpenBSD?
> >
> > I do this before disposing of old disks.  Have written random data to
> > several sizes of disk, not sure if I ever wiped a 4TB disk.
> >
> > | How long did it take? What did you use (dd, openssl)? Can you share the
> > | command that you used?
> >
> > It takes quite some time, but OpenBSD (at least on modern hardware)
> > can generate random numbers faster than you can write them to spinning
> > disks (may be different with those fast nvme(4) disks).
> >
> > I simply used dd, with a large block size:
> >
> > dd if=/dev/random of=/dev/sdXc bs=1048576
> >
> > And then you wait.  The time it takes really depends on two factors:
> > the size of the disk and the speed at which you write (whatever the
> > bottleneck).  If you start, you can send dd the 'INFO' signal (`pkill
> > -INFO dd` (or press Ctrl-T if your shell is set up for it with `stty
> > status ^T`))  This will give you output a bit like:
> >
> > 30111+0 records in
> > 30111+0 records out
> > 31573671936 bytes transferred in 178.307 secs (177074202 bytes/sec)
> >
> > Now take the size of the disk in bytes, divide it by that last number
> > and subtract the second number.  This is a reasonable ball-park
> > indication of time remaining.
> >
> > Note that if you're doing this because you want to prevent others from
> > reading back even small parts of your data, you are better of never
> > writing your data in plain text (e.g. using softraid(4)'s CRYPTO
> > discipline), or (if it's too late for that), to physically destroy the
> > storage medium.  Due to smart disks remapping your data in case of
> > 'broken' sectors, some old data can never be properly overwritten.
> >
> > Cheers,
> >
> > Paul 'WEiRD' de Weerd
> >
> > --
> > >[<++>-]<+++.>+++[<-->-]<.>+++[<+
> > +++>-]<.>++[<>-]<+.--.[-]
> >  http://www.weirdnet.nl/
> >



Re: Message WARNING: CHECK AND RESET THE DATE! in kvm guests

2020-06-02 Thread Otto Moerbeek
On Tue, Jun 02, 2020 at 08:28:13AM +, Carlos Lopez wrote:

> Hi Otto,
> 
>  After some days without problems, it has happened again:
> 
> root on sd0a (912329c5d9d2b184.a) swap on sd0b dump on sd0b
> WARNING: clock gained 3 days
> WARNING: CHECK AND RESET THE DATE!
> 
>  With version 6.6 I never had these problems. I am using default conf for 
> ntpd, but some errors appears:

If you read what I wrote earlier, this is not a problem. ntpd sets the
time ok, so nothing to worry about.

-Otto

> 
> Jun  2 06:32:01 obsdfw ntpd[91858]: ntp engine ready
> Jun  2 06:32:01 obsdfw ntpd[91858]: constraint reply from 9.9.9.9: offset 
> 1.298291
> Jun  2 06:32:03 obsdfw ntpd[91858]: cancel settime because dns probe failed
> Jun  2 06:32:03 obsdfw savecore: no core dump
> Jun  2 06:32:04 obsdfw ftp-proxy[73808]: listening on 127.0.0.1 port 8021
> Jun  2 06:32:08 obsdfw ntpd[91858]: constraint reply from 216.58.211.36: 
> offset 1.270038
> Jun  2 06:32:29 obsdfw ntpd[91858]: peer 162.159.200.123 now valid
> Jun  2 06:32:30 obsdfw ntpd[91858]: peer 162.159.200.1 now valid
> Jun  2 06:32:30 obsdfw ntpd[91858]: peer 147.156.7.18 now valid
> Jun  2 06:32:32 obsdfw ntpd[91858]: peer 162.159.200.123 now valid
> Jun  2 06:32:33 obsdfw ntpd[91858]: reply from 147.156.7.26: not synced 
> (alarm), next query 3184s
> Jun  2 06:33:04 obsdfw ntpd[91858]: peer 147.156.7.18 now invalid
> Jun  2 06:33:27 obsdfw ntpd[61176]: adjusting local clock by 1.707163s
> Jun  2 06:33:50 obsdfw ntpd[91858]: peer 147.156.7.18 now valid
> Jun  2 06:35:06 obsdfw ntpd[61176]: adjusting local clock by 1.212163s
> Jun  2 06:37:18 obsdfw ntpd[61176]: adjusting local clock by 0.559285s
> Jun  2 06:39:28 obsdfw ntpd[91858]: clock is now synced
> Jun  2 06:39:28 obsdfw ntpd[91858]: constraint reply from 9.9.9.9: offset 
> -0.872650
> Jun  2 06:39:28 obsdfw ntpd[91858]: constraint reply from 216.58.211.36: 
> offset -0.880034
> Jun  2 07:02:16 obsdfw ntpd[61176]: adjusting clock frequency by -0.447350 to 
> 4.954650ppm
> Jun  2 07:25:37 obsdfw ntpd[91858]: peer 147.156.7.26 now valid
> Jun  2 07:28:08 obsdfw ntpd[61176]: adjusting clock frequency by -0.050205 to 
> 4.904445ppm
> Jun  2 07:34:08 obsdfw ntpd[91858]: reply from 147.156.7.26: not synced 
> (alarm), next query 3170s
> Jun  2 08:25:24 obsdfw ntpd[91858]: reply from 147.156.7.18: not synced 
> (alarm), next query 3154s
> 
> Ntpctl -s all output:
> 5/5 peers valid, constraint offset -1s, clock synced, stratum 4
> 
> peer
>wt tl st  next  poll  offset   delay  jitter
> 162.159.200.123 time.cloudflare.com
>  *  1 10  3   28s   33s 1.520ms 2.347ms 0.712ms
> 162.159.200.123 from pool pool.ntp.org
>  *  1 10  3   29s   32s 1.530ms 2.394ms 0.406ms
> 147.156.7.26 from pool pool.ntp.org
> 1 10  2   10s   34s 0.081ms20.071ms 0.387ms
> 162.159.200.1 from pool pool.ntp.org
> 1 10  3   11s   30s 1.502ms 2.442ms 0.134ms
> 147.156.7.18 from pool pool.ntp.org
> 1 10  2 3005s 3154s 1.199ms19.994ms 0.321ms
> 
> On 25/05/2020, 10:20, "Otto Moerbeek"  wrote:
> 
> On Mon, May 25, 2020 at 07:53:47AM +, Carlos Lopez wrote:
> 
> > Hi all,
> > 
> >  After upgrading four kvm guests to OpenBSD 6.7, I see the following 
> messages when these guests starts:
> > 
> > WARNING: clock gained 2 days
> > WARNING: CHECK AND RESET THE DATE!
> 
> This means the clock compared to the last mounted filesystem time differ.
> 
> Show what ntpd is doing after boot (see /var/log/daemon). If ntpd sets
> the time ok, there is nothing further to be done. It's just a warning
> that the kernel initially isn't sure about the time.
> 
>   -Otto
> 
> > 
> >  All four guests are fully patched. Dmesg output:
> > 
> > OpenBSD 6.7 (GENERIC) #1: Sat May 16 16:07:20 MDT 2020
> > 
> r...@syspatch-67-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> > real mem = 788389888 (751MB)
> > avail mem = 752021504 (717MB)
> > mpath0 at root
> > scsibus0 at mpath0: 256 targets
> > mainbus0 at root
> > bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf5af0 (9 entries)
> > bios0: vendor SeaBIOS version "1.11.1-4.module+el8.1.0+4066+0f1aadab" 
> date 04/01/2014
> > bios0: Red Hat KVM
> > acpi0 at bios0: ACPI 3.0
> > acpi0: sleep states S5
> > acpi0: tables DSDT FACP APIC MCFG
> > acpi0: wakeup devices
> > acpitimer0 at acpi0: 3579545 Hz, 24 bits
> > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> > cpu0 at main

Re: OpenBSD 6.7 and ffs2 FAQs

2020-05-28 Thread Otto Moerbeek
On Wed, May 27, 2020 at 10:54:59PM +0200, Otto Moerbeek wrote:

> I got some questions on ffs2 in 6.7. This is to set the record
> straight, feel free to share on forums like reddit that I do not read,
> let alone post on.
> 
> 1. Using 6.7, the *installer* defaults to ffs2 for new filesystems for
>almost all platforms.
> 
> 2. Using 6.7, a newfs "by hand" still gets you ffs1, unless you use the
>-O2 flag or the partition > 1TB.
> 
> 3. In -current, newfs defaults to ffs2 for all platforms.
> 
> 4. ffs2 is faster than ffs2 when creating filesystems and almost always when
>fscking them.

"ffs2 is faster than ffs1" ...

> 
> 5. ffs2 uses 64-bit timestamps and block numbers. So it handles dates
>after 2038 and much larger partitions. This does not mean that super
>large partitions are always a good idea, there are still drawbacks:
>e.g. they do need lots of memory to fsck, especially when many inodes
>are in use.
> 
> 6. I have no plans for writing a conversion tool. You can convert an
>ffs1 filesystem to ffs2 using single user mode: umount; dump; newfs
>-O2; restore; mount. Or see it as an opportunity to reinstall and
>   get a nice clean system without cruft collected over the years.
> 
> Hope this help in clearing up some of questions people have,
> 
>   -Otto
> 
> 
> 



Re: OpenBSD 6.7 and ffs2 FAQs

2020-05-27 Thread Otto Moerbeek
On Thu, May 28, 2020 at 07:48:57AM +0200, Matthias wrote:

> On a fresh 6.7 installation, mount(8) shows 'type ffs'. Is there any way
> to figure out the version number?

dumpfs /dev/rsdXY | head -1

-Otto
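The first line of dumpfs output names the superblock format, so a quick check only needs to match FFS1 or FFS2 in it. A parsing sketch; the sample line below is an assumption about dumpfs's exact wording, not output captured in this thread (on a real system, pipe `dumpfs /dev/rsdXY | head -1` in instead):

```shell
# Classify a filesystem from dumpfs's first line; the sample line is
# hypothetical stand-in input for the real dumpfs output.
line='magic 19540119 (FFS2) time Thu May 28 09:00:00 2020'
case $line in
  *FFS2*) fsver=ffs2 ;;
  *FFS1*) fsver=ffs1 ;;
  *)      fsver=unknown ;;
esac
echo "$fsver"
```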

> 
> 
> On 2020-05-27 22:54, Otto Moerbeek wrote:
> > I got some questions on ffs2 in 6.7. This is to set the record
> > straight, feel free to share on forums like reddit that I do not read,
> > let alone post on.
> > 
> > 1. Using 6.7, the *installer* defaults to ffs2 for new filesystems for
> > almost all platforms.
> > 
> > 2. Using 6.7, a newfs "by hand" still gets you ffs1, unless you use the
> > -O2 flag or the partition > 1TB.
> > 
> > 3. In -current, newfs defaults to ffs2 for all platforms.
> > 
> > 4. ffs2 is faster than ffs2 when creating filesystems and almost always when
> > fscking them.
> > 
> > 5. ffs2 uses 64-bit timestamps and block numbers. So it handles dates
> > after 2038 and much larger partitions. This does not mean that super
> > large partitions are always a good idea, there are still drawbacks:
> > e.g. they do need lots of memory to fsck, especially when many inodes
> > are in use.
> > 
> > 6. I have no plans for writing a conversion tool. You can convert an
> > ffs1 filesystem to ffs2 using single user mode: umount; dump; newfs
> > -O2; restore; mount. Or see it as an opportunity to reinstall and
> >get a nice clean system without cruft collected over the years.
> > 
> > Hope this help in clearing up some of questions people have,
> > 
> > -Otto
> > 
> > 
> > 
> > 
> 



OpenBSD 6.7 and ffs2 FAQs

2020-05-27 Thread Otto Moerbeek
I got some questions on ffs2 in 6.7. This is to set the record
straight, feel free to share on forums like reddit that I do not read,
let alone post on.

1. Using 6.7, the *installer* defaults to ffs2 for new filesystems for
   almost all platforms.

2. Using 6.7, a newfs "by hand" still gets you ffs1, unless you use the
   -O2 flag or the partition > 1TB.

3. In -current, newfs defaults to ffs2 for all platforms.

4. ffs2 is faster than ffs2 when creating filesystems and almost always when
   fscking them.

5. ffs2 uses 64-bit timestamps and block numbers. So it handles dates
   after 2038 and much larger partitions. This does not mean that super
   large partitions are always a good idea, there are still drawbacks:
   e.g. they do need lots of memory to fsck, especially when many inodes
   are in use.

6. I have no plans for writing a conversion tool. You can convert an
   ffs1 filesystem to ffs2 using single user mode: umount; dump; newfs
   -O2; restore; mount. Or see it as an opportunity to reinstall and
   get a nice clean system without cruft collected over the years.

Hope this helps in clearing up some of the questions people have,

-Otto





Re: Kernel relinking on old boxen at every boot

2020-05-25 Thread Otto Moerbeek
On Mon, May 25, 2020 at 05:35:17PM +0200, ULF wrote:

> Hello Devs,
> 
> I followed, some time ago, the proposal of a user who suggested a diff for
> an "opt out" of KARL to be placed in /etc/rc.conf.local, proposal which
> which wasn't welcomed well.
> 
> While agreeing that on servers and modern machines this is a great security
> feature which implies quite a small overhead, on the other side I am the
> owner of several old i386 boxen, mainly run just for hobby purposes for
> some hours a month, as, I could suppose, some other hobbists might do.
> 
> On Pentium 3's every boot means at least 5-7 minutes wait to have a usable
> machine, while on lower end boxen 10 minutes were already a desirable
> target, because on first gen Pentiums the time is well above.
> 
> This does not only meet pure number crunching, but, on old hardwares, also
> means extra stress for old disks which, especially on laptops, will become
> one day irreplaceable because of shortage. Not to consider extra
> electricity and time, whenever the machine needs a reboot.
> 
> Maybe other old platforms, beyond i386, might be affected this way too.
> 
> My question is:
> 
> considering that an opt out option has been already turned down, could at
> least old architectures be benefited of a "delay" option e.g. like tune2fs
> sets a fsck every n-th boot, could KARL, just for very old machines be
> tuned, say, to be applied every 10/20 boots?
> 
> Thank you very much for your attention.
> Ulf

I run 

nice /usr/libexec/reorder_kernel &

And my landisk is usable from the start.

-Otto



Re: 6.7 boot crashes on "entry point at" X1 Carbon gen7 i7-10510U

2020-05-25 Thread Otto Moerbeek
On Mon, May 25, 2020 at 08:29:56AM +0200, Otto Moerbeek wrote:

> On Sun, May 24, 2020 at 09:46:09PM +0900, John Mettraux wrote:
> 
> > On Sun, May 24, 2020 at 8:38 PM Otto Moerbeek  wrote:
> > >
> > > On Sun, May 24, 2020 at 08:26:43PM +0900, John Mettraux wrote:
> > >
> > > > On Sun, May 24, 2020 at 5:36 PM Stuart Henderson  
> > > > wrote:
> > > > >
> > > > > On 2020-05-23, John Mettraux  wrote:
> > > > > >
> > > > > > (...)
> > > > > >
> > > > > > Hard power down is the only way out, but rebooting still leads
> > > > > > immediately to the
> > > > > > "entry point at 0x1001000" wall. It is consistent. I tried boot -c,
> > > > > > boot -d or boot -s,
> > > > > > still the same wall.
> > > > > >
> > > > > > I have just tried with the install67.img snapshot (22-May-2020 21:12
> > > > > > 476545024) from
> > > > > > https://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/ but it leads 
> > > > > > to the same
> > > > > > "entry point at 0x1001000" right after boot.
> > > > >
> > > > > Try downloading a 6.7 kernel (bsd.mp) to e.g. /bsd.test, and from the 
> > > > > 6.6 boot
> > > > > loader type "b /bsd.test". Do you still get the hang? This will give 
> > > > > you an idea
> > > > > whether the problem in 6.7 is with the newer boot loader or the 
> > > > > kernel.
> > > >
> > > > I can confirm that the hang doesn't happen with the 6.7 kernel and the 
> > > > 6.6 boot
> > > > loader (bootx64 3.46). The hang happens with the 6.7 boot loader (3.50).
> > > >
> > > > I will try to do a 6.7 install with the 3.46 boot loader.
> > >
> > > Can you also try using legacy boot mode (mbr)? There should be some
> > > setting in the bios to enable that.
> > >
> > > -Otto
> > 
> > I tried to set the boot mode to [Legacy First] and [Legacy Only]. In
> > both cases the Boot 3.47 kicked in
> > and allowed me to install.
> > 
> > I performed the install on the machine drive (sd0) with MBR and the
> > install was successful.
> > Dmesg below for the resulting 6.7 Snapshot.
> > 
> > I tried to install on sd0 with GPT. The install warned me "An EFI/GPT
> > disk may not boot. Proceed?"
> > I answered yes. The install proceeded but upon reboot it froze with
> > the "entry point at 0x1001000".
> > This was with bootx64 3.50.
> > 
> > I am going to re-install with sd0 MBR.
> > 
> > Thanks a lot!
> > 
> > John
> 
> I have an x1 6th generation that also does not like to boot using EFI.
> There's is a difference though: it already had problems with EFI
> when I initially installed it in Feb 2019. 
> 
> I'll see if I can find some time to make a more detail diagnosis.

I just tried, and an EFI boot with the latest snap works on it. efifb(4)
is not configured, but for the rest it seems to work ok using bootx64
3.50 and BIOS version 1.44.

-Otto



Re: Message WARNING: CHECK AND RESET THE DATE! in kvm guests

2020-05-25 Thread Otto Moerbeek
On Mon, May 25, 2020 at 07:53:47AM +, Carlos Lopez wrote:

> Hi all,
> 
>  After upgrading four kvm guests to OpenBSD 6.7, I see the following messages 
> when these guests starts:
> 
> WARNING: clock gained 2 days
> WARNING: CHECK AND RESET THE DATE!

This means the kernel clock and the timestamp on the last mounted filesystem differ.

Show what ntpd is doing after boot (see /var/log/daemon). If ntpd sets
the time ok, there is nothing further to be done. It's just a warning
that the kernel initially isn't sure about the time.

-Otto

> 
>  All four guests are fully patched. Dmesg output:
> 
> OpenBSD 6.7 (GENERIC) #1: Sat May 16 16:07:20 MDT 2020
> r...@syspatch-67-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> real mem = 788389888 (751MB)
> avail mem = 752021504 (717MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf5af0 (9 entries)
> bios0: vendor SeaBIOS version "1.11.1-4.module+el8.1.0+4066+0f1aadab" date 
> 04/01/2014
> bios0: Red Hat KVM
> acpi0 at bios0: ACPI 3.0
> acpi0: sleep states S5
> acpi0: tables DSDT FACP APIC MCFG
> acpi0: wakeup devices
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel Core Processor (Broadwell), 1900.30 MHz, 06-3d-02
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 
> 64b/line 16-way L2 cache
> cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 1000MHz
> ioapic0 at mainbus0: apid 0 pa 0xfec0, version 11, 24 pins
> acpimcfg0 at acpi0
> acpimcfg0: addr 0xb000, bus 0-255
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpicpu0 at acpi0: C1(@1 halt!)
> "ACPI0006" at acpi0 not configured
> acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
> acpicmos0 at acpi0
> "PNP0A06" at acpi0 not configured
> "PNP0A06" at acpi0 not configured
> "QEMU0002" at acpi0 not configured
> "ACPI0010" at acpi0 not configured
> cpu0: using Broadwell MDS workaround
> pvbus0 at mainbus0: KVM
> pvclock0 at pvbus0
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "Intel 82G33 Host" rev 0x00
> vga1 at pci0 dev 1 function 0 "Red Hat QXL Video" rev 0x04
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> ppb0 at pci0 dev 2 function 0 vendor "Red Hat", unknown product 0x000c rev 
> 0x00: apic 0 int 22
> pci1 at ppb0 bus 1
> virtio0 at pci1 dev 0 function 0 "Qumranet Virtio 1.x Network" rev 0x01
> vio0 at virtio0: address 00:50:56:f3:d8:1f
> virtio0: msix shared
> ppb1 at pci0 dev 2 function 1 vendor "Red Hat", unknown product 0x000c rev 
> 0x00: apic 0 int 22
> pci2 at ppb1 bus 2
> virtio1 at pci2 dev 0 function 0 "Qumranet Virtio 1.x Network" rev 0x01
> vio1 at virtio1: address 00:50:56:b8:2b:4a
> virtio1: msix shared
> ppb2 at pci0 dev 2 function 2 vendor "Red Hat", unknown product 0x000c rev 
> 0x00: apic 0 int 22
> pci3 at ppb2 bus 3
> virtio2 at pci3 dev 0 function 0 "Qumranet Virtio 1.x Console" rev 0x01
> virtio2: no matching child driver; not configured
> ppb3 at pci0 dev 2 function 3 vendor "Red Hat", unknown product 0x000c rev 
> 0x00: apic 0 int 22
> pci4 at ppb3 bus 4
> virtio3 at pci4 dev 0 function 0 "Qumranet Virtio 1.x Storage" rev 0x01
> vioblk0 at virtio3
> scsibus1 at vioblk0: 2 targets
> sd0 at scsibus1 targ 0 lun 0: 
> sd0: 16384MB, 512 bytes/sector, 33554432 sectors
> virtio3: msix shared
> ppb4 at pci0 dev 2 function 4 vendor "Red Hat", unknown product 0x000c rev 
> 0x00: apic 0 int 22
> pci5 at ppb4 bus 5
> virtio4 at pci5 dev 0 function 0 vendor "Qumranet", unknown product 0x1045 
> rev 0x01
> viomb0 at virtio4
> virtio4: apic 0 int 22
> ppb5 at pci0 dev 2 function 5 vendor "Red Hat", unknown product 0x000c rev 
> 0x00: apic 0 int 22
> pci6 at ppb5 bus 6
> virtio5 at pci6 dev 0 function 0 "Qumranet Virtio 1.x RNG" rev 0x01
> viornd0 at virtio5
> virtio5: apic 0 int 22
> ppb6 at pci0 dev 2 function 6 vendor "Red Hat", unknown product 0x000c rev 
> 0x00: apic 0 int 22
> pci7 at ppb6 bus 7
> pcib0 at pci0 dev 31 function 0 "Intel 82801IB LPC" rev 0x02
> ahci0 at pci0 dev 31 function 2 "Intel 82801I AHCI" rev 0x02: msi, AHCI 1.0
> scsibus2 at ahci0: 32 targets
> ichiic0 at pci0 dev 31 function 3 "Intel 82801I SMBus" rev 0x02: apic 0 int 16
> iic0 at ichiic0
> isa0 at pcib0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
> pckbc0 at 

Re: 6.7 boot crashes on "entry point at" X1 Carbon gen7 i7-10510U

2020-05-25 Thread Otto Moerbeek
On Sun, May 24, 2020 at 09:46:09PM +0900, John Mettraux wrote:

> On Sun, May 24, 2020 at 8:38 PM Otto Moerbeek  wrote:
> >
> > On Sun, May 24, 2020 at 08:26:43PM +0900, John Mettraux wrote:
> >
> > > On Sun, May 24, 2020 at 5:36 PM Stuart Henderson  
> > > wrote:
> > > >
> > > > On 2020-05-23, John Mettraux  wrote:
> > > > >
> > > > > (...)
> > > > >
> > > > > Hard power down is the only way out, but rebooting still leads
> > > > > immediately to the
> > > > > "entry point at 0x1001000" wall. It is consistent. I tried boot -c,
> > > > > boot -d or boot -s,
> > > > > still the same wall.
> > > > >
> > > > > I have just tried with the install67.img snapshot (22-May-2020 21:12
> > > > > 476545024) from
> > > > > https://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/ but it leads to 
> > > > > the same
> > > > > "entry point at 0x1001000" right after boot.
> > > >
> > > > Try downloading a 6.7 kernel (bsd.mp) to e.g. /bsd.test, and from the 
> > > > 6.6 boot
> > > > loader type "b /bsd.test". Do you still get the hang? This will give 
> > > > you an idea
> > > > whether the problem in 6.7 is with the newer boot loader or the kernel.
> > >
> > > I can confirm that the hang doesn't happen with the 6.7 kernel and the 
> > > 6.6 boot
> > > loader (bootx64 3.46). The hang happens with the 6.7 boot loader (3.50).
> > >
> > > I will try to do a 6.7 install with the 3.46 boot loader.
> >
> > Can you also try using legacy boot mode (mbr)? There should be some
> > setting in the bios to enable that.
> >
> > -Otto
> 
> I tried to set the boot mode to [Legacy First] and [Legacy Only]. In
> both cases the Boot 3.47 kicked in
> and allowed me to install.
> 
> I performed the install on the machine drive (sd0) with MBR and the
> install was successful.
> Dmesg below for the resulting 6.7 Snapshot.
> 
> I tried to install on sd0 with GPT. The install warned me "An EFI/GPT
> disk may not boot. Proceed?"
> I answered yes. The install proceeded but upon reboot it froze with
> the "entry point at 0x1001000".
> This was with bootx64 3.50.
> 
> I am going to re-install with sd0 MBR.
> 
> Thanks a lot!
> 
> John

I have an x1 6th generation that also does not like to boot using EFI.
There is a difference though: it already had problems with EFI
when I initially installed it in Feb 2019. 

I'll see if I can find some time to make a more detailed diagnosis.

-Otto

> 
> ---dmesg---
> 
> OpenBSD 6.7-current (RAMDISK_CD) #204: Fri May 22 20:38:04 MDT 2020
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/RAMDISK_CD
> real mem = 16197828608 (15447MB)
> avail mem = 15702892544 (14975MB)
> random: good seed from bootblocks
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.2 @ 0x6cc77000 (65 entries)
> bios0: vendor LENOVO version "N2QET18W (1.12 )" date 12/10/2019
> bios0: LENOVO 20R1CTO1WW
> acpi0 at bios0: ACPI 6.1
> acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT SSDT TPM2 SSDT HPET APIC
> MCFG ECDT SSDT SSDT SSDT NHLT BOOT SSDT LPIT WSMT SSDT DBGP DBG2 MSDM
> BATB DMAR UEFI FPDT
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz, 7498.82 MHz, 06-8e-0c
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: apic clock running at 24MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
> cpu at mainbus0: not configured
> cpu at mainbus0: not configured
> cpu at mainbus0: not configured
> cpu at mainbus0: not configured
> cpu at mainbus0: not configured
> cpu at mainbus0: not configured
> cpu at mainbus0: not configured
> ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 120 pins
> acpiec0 at acpi0
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus -1 (RP01)
> acpiprt2 at acpi0: bus -1 (RP02)
> acpiprt3 at acpi0: bus -1 (RP03)
> acpiprt4 at acpi0: bus -1 (RP04

Re: 6.7 boot crashes on "entry point at" X1 Carbon gen7 i7-10510U

2020-05-24 Thread Otto Moerbeek
On Sun, May 24, 2020 at 08:26:43PM +0900, John Mettraux wrote:

> On Sun, May 24, 2020 at 5:36 PM Stuart Henderson  wrote:
> >
> > On 2020-05-23, John Mettraux  wrote:
> > >
> > > (...)
> > >
> > > Hard power down is the only way out, but rebooting still leads
> > > immediately to the
> > > "entry point at 0x1001000" wall. It is consistent. I tried boot -c,
> > > boot -d or boot -s,
> > > still the same wall.
> > >
> > > I have just tried with the install67.img snapshot (22-May-2020 21:12
> > > 476545024) from
> > > https://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/ but it leads to the 
> > > same
> > > "entry point at 0x1001000" right after boot.
> >
> > Try downloading a 6.7 kernel (bsd.mp) to e.g. /bsd.test, and from the 6.6 
> > boot
> > loader type "b /bsd.test". Do you still get the hang? This will give you an 
> > idea
> > whether the problem in 6.7 is with the newer boot loader or the kernel.
> 
> Hello,
> 
> I can confirm that the hang doesn't happen with the 6.7 kernel and the 6.6 
> boot
> loader (bootx64 3.46). The hang happens with the 6.7 boot loader (3.50).
> 
> I will try to do a 6.7 install with the 3.46 boot loader.
> 
> Thanks a lot!
> 
> John
> 

Can you also try using legacy boot mode (mbr)? There should be some
setting in the bios to enable that.

-Otto



Re: Unable to sysupgrade to 6.7!

2020-05-23 Thread Otto Moerbeek
On Sat, May 23, 2020 at 09:33:59AM +0200, Federico Giannici wrote:

> I was unable to upgrade my amd64 6.6 workstation to 6.7.
> 
> Here is what happens: https://www.neomedia.it/tmp/67.jpg
> 
> Then system reboot in an infinite loop (finally I changed the boot image to
> "bsd" and returned to previous 6.6).
> 
> Other system infos at the end of the email.
> 
> Anybody knows what the problem could be?
> 
> I don't understand if it's related to the "CHECK AND RESET THE DATE" warning
> (never seen before and in normal 6.6 boots). I tried to advance the BIOS
> clock two hours (I'm UTC+2) so it shows the correct time, but nothing
> changed.

This has nothing to do with the time or date.

You have an inconsistency on your filesystem that fsck -p cannot fix.

Boot into bsd.rd and fsck /dev/rsd2a

That said, the auto upgrade script should abort in this case.

-Otto
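The repair steps above can be sketched as a small shell routine. This is a sketch only: sd2a is this poster's softraid root partition, so the device name is an assumption to check against your own disklabel output, and the commands are meant to be run from the shell of the bsd.rd ramdisk kernel. The function is defined but deliberately not invoked:

```shell
# Sketch of the recovery described in this thread, assuming the root
# filesystem lives on sd2a. Boot the ramdisk kernel first ("boot> bsd.rd"
# at the boot prompt), choose (S)hell, then run the commands by hand.
repair_root_fs() {
    fsck /dev/rsd2a        # interactive: review and confirm each fix
    # fsck -y /dev/rsd2a   # alternative: accept every proposed fix
}
```

After fsck completes cleanly, reboot into the normal kernel and retry the upgrade.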

> 
> Thanks.
> 
> 
> 
> casa:/home/giannici# df
> Filesystem  1K-blocks  Used Avail Capacity  Mounted on
> /dev/sd2a92878104  42769116  4546508448%/
> /dev/sd2d   143421884 103486968  3276382476%/home
> 
> 
> casa:/home/giannici# mount
> /dev/sd2a on / type ffs (local, wxallowed)
> /dev/sd2d on /home type ffs (local, nodev, nosuid)
> 
> 
> casa:/home/giannici# disklabel sd2
> # /dev/rsd2c:
> type: SCSI
> disk: SCSI disk
> label: SR RAID 1
> duid: 409ffd737f4265d7
> flags:
> bytes/sector: 512
> sectors/track: 63
> tracks/cylinder: 255
> sectors/cylinder: 16065
> cylinders: 30400
> total sectors: 488391473
> boundstart: 64
> boundend: 488376000
> drivedata: 0
> 
> 16 partitions:
> #size   offset  fstype [fsize bsize   cpg]
>   a:188747616   64  4.2BSD   2048 16384 12958 # /
>   b: 10490450188747680swap# none
>   c:4883914730  unused
>   d:289137856199238144  4.2BSD   4096 32768 26062 # /home
> 
> 
> casa:/home/giannici# bioctl sd2
> Volume  Status   Size Device
> softraid0 0 Online   250056434176 sd2 RAID1
>   0 Online   250056434176 0:0.0   noencl 
>   1 Online   250056434176 0:1.0   noencl 
> 
> 
> casa:/home/giannici# fdisk sd2
> Disk: sd2 geometry: 30400/255/63 [488391473 Sectors]
> Offset: 0 Signature: 0xAA55
> Starting Ending LBA Info:
>  #: id  C   H   S -  C   H   S [   start:size ]
> ---
>  0: 00  0   0   0 -  0   0   0 [   0:   0 ] unused
>  1: 00  0   0   0 -  0   0   0 [   0:   0 ] unused
>  2: 00  0   0   0 -  0   0   0 [   0:   0 ] unused
> *3: A6  0   1   2 -  30399 254  63 [  64:   488375936 ] OpenBSD
> 
> 
> casa:/home/giannici# dmesg
> OpenBSD 6.6 (GENERIC.MP) #8: Fri Apr 17 15:06:32 MDT 2020
> 
> r...@syspatch-66-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 17075994624 (16284MB)
> avail mem = 16545767424 (15779MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.2 @ 0xe6f40 (70 entries)
> bios0: vendor American Megatrends Inc. version "1407" date 04/02/2020
> bios0: ASUSTeK COMPUTER INC. PRIME X570-P
> acpi0 at bios0: ACPI 6.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP APIC FPDT FIDT SSDT WSMT SSDT SSDT MCFG HPET SSDT
> UEFI WPBT IVRS PCCT SSDT CRAT CDIT SSDT
> acpi0: wakeup devices GPP0(S4) GPP2(S4) GPP3(S4) GPP4(S4) GPP5(S4) GPP6(S4)
> GPP7(S4) GPP8(S4) X161(S4) GPP9(S4) X162(S4) GPPA(S4) GPPB(S4) GPPC(S4)
> GPPD(S4) GPPE(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 32 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: AMD Ryzen 5 3600 6-Core Processor, 3593.74 MHz, 17-71-00
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,IBPB,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu0: 32KB 64b/line 8-way I-cache, 32KB 64b/line 8-way D-cache, 512KB
> 64b/line 8-way L2 cache, 32MB 64b/line disabled L3 cache
> cpu0: ITLB 64 4KB entries fully associative, 64 4MB entries fully
> associative
> cpu0: DTLB 64 4KB entries fully associative, 64 4MB entries fully
> associative
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, C-substates=1.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: AMD Ryzen 5 3600 6-Core Processor, 3593.25 MHz, 17-71-00
> cpu1: 
> 

Re: Convert ffs1 to ffs2?

2020-05-20 Thread Otto Moerbeek
On Wed, May 20, 2020 at 11:30:00AM +0300, Михаил Попов wrote:

> > "Possible" is irrelevant. Lots of things are _possible_ but not done.
> 
> Then only rsyncing?
> 
> Why not adding at least one of a well tested journaled FS like XFS to OpenBSD?
> Is XFS too fat and complex to be secure?
> 
> Does OpenBSD work well if system root is stored via NFS, say on a Linux ZFS?
> 

rsync is one option, if you can keep both partitions online.
dump followed by newfs and restore also works and allows you to keep
the backup on another system.

As for other FS types or adding journalling to FFS: nobody willing and
*able* showed up so far.

-Otto
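The dump/newfs/restore route mentioned above can be sketched roughly as follows. It is a sketch, not a recipe: the partition (sd1a), the dump file location (/altroot/usr.dump), and the mount point (/mnt) are all assumptions to replace with your own, and as noted the dump can equally live on another system. The function is only defined here; invoke it by hand after double-checking the names:

```shell
# Rough sketch of converting a partition to FFS2 via dump and restore,
# with assumed names (sd1a, /altroot/usr.dump, /mnt). newfs destroys the
# old filesystem; the level-0 dump is the only copy of the data afterwards.
convert_to_ffs2() {
    umount /dev/sd1a                            # take the filesystem offline
    dump -0af /altroot/usr.dump /dev/rsd1a      # full (level 0) dump to a file
    newfs -O 2 /dev/rsd1a                       # recreate the partition as FFS2
    mount /dev/sd1a /mnt
    (cd /mnt && restore -rf /altroot/usr.dump)  # repopulate from the dump
}
```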
 





Re: OpenBSD/sparc64 6.7-beta not working on silver Blade 2500

2020-05-02 Thread Otto Moerbeek
On Thu, Apr 09, 2020 at 12:24:17PM +0200, Otto Moerbeek wrote:

> On Wed, Apr 08, 2020 at 07:03:29PM +0200, Sigi Rudzio wrote:
> 
> > Am Mi., 8. Apr. 2020 um 08:37 Uhr schrieb Otto Moerbeek :
> > >
> > > On Wed, Apr 08, 2020 at 01:11:29AM +0200, Sigi Rudzio wrote:
> > >
> > > > Hello misc@,
> > > >
> > > > while testing the FFS2 patches by otto@ I noticed that I was unable
> > > > to install -current on my silver Blade 2500.
> > > >
> > > > tar fails while unpacking the sets, although the files are ok, if I 
> > > > rename
> > > > base67.tgz to base66.tgz to fool the 6.6 installer I can install 
> > > > 6.7-beta,
> > > > although it fails/panics with a lot of "Illegal instruction" errors on 
> > > > the
> > > > first boot.
> > > >
> > > > otto@ already told me that this might be fallout from the recent clang
> > > > changes.
> > >
> > > Well, I said it *could be*, but you did not show me the actual errors
> > > in that email. The errors below indicate a hardware issue, i.e. a bad
> > > disk.
> > >
> > > -Otto
> > 
> > Sorry, I should have been clearer.
> > 
> > I tried another disk from my working V240 and got the same errors.
> > 
> > Installing on an IDE hard disk works and the system is working fine,
> > even writing large files to the SCSI disk is fine, but I can make it panic
> > with tar.
> > So I guess this is some kind of SCSI bus hardware issue that 6.6 doesn't
> > trigger and some change between 6.6 and 6.7-beta does trigger it.
> > 
> > Sorry for the noise and thanks for the help!
> 
> Hmm, you might be onto something. I have seen similar issues on a
> Blade 150 but blamed it on hardware. Sadly I do not have easy access
> to the machine atm.

In the meantime krw@ committed (with your help) a fix for a DMA issue
on mpi(4) that seems to make my Blade150 much more reliable. Thanks to
both of you!

-Otto



Re: /bin/sh echo \n

2020-04-26 Thread Otto Moerbeek
On Sun, Apr 26, 2020 at 01:04:05PM +0200, Otto Moerbeek wrote:

> On Sun, Apr 26, 2020 at 12:27:24PM +0200, Thomas de Grivel wrote:
> 
> > Hello,
> > 
> > I was testing some scripting using /bin/sh and I could not find this
> > behaviour in the documentation :
> > 
> > > $ /bin/sh
> > > $ echo -n '\n'
> > >
> > > $
> > 
> > It seems that ksh even in sh (posix ?) mode does expansion of \n to an
> > actual newline.
> 
> Nope, this is a property of the builtin 'echo'. echo (and the more
> general print) are described in the Command execution section of
> ksh(1).
> 
> > 
> > First is there a way to turn off the \n expansion in simple quotes in 
> > /bin/sh ?
> 
> Not with echo, but print has -r

Oops, echo has -E of course

> 
>   -Otto
> > 
> > Second I don't see this feature described neither in man sh nor man
> > ksh so is it a known behaviour of ksh ?
> > 
> > Thanks a ton,
> > 
> > -- 
> >  Thomas de Grivel
> >  kmx.io
> > 
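The behaviour discussed above is easiest to sidestep with printf(1), which makes escape handling explicit. This sketch relies only on standard printf semantics, so it holds in any POSIX shell, not just OpenBSD's ksh-based /bin/sh:

```shell
# OpenBSD's builtin echo expands \n even inside single quotes; echo -E
# (or print -r) suppresses that. printf makes the choice explicit:
printf '%s\n' '\n'   # argument passed through %s: prints backslash then n
printf 'a\nb\n'      # \n in the format string is a real newline
```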



Re: /bin/sh echo \n

2020-04-26 Thread Otto Moerbeek
On Sun, Apr 26, 2020 at 12:27:24PM +0200, Thomas de Grivel wrote:

> Hello,
> 
> I was testing some scripting using /bin/sh and I could not find this
> behaviour in the documentation :
> 
> > $ /bin/sh
> > $ echo -n '\n'
> >
> > $
> 
> It seems that ksh even in sh (posix ?) mode does expansion of \n to an
> actual newline.

Nope, this is a property of the builtin 'echo'. echo (and the more
general print) are described in the Command execution section of
ksh(1).

> 
> First is there a way to turn off the \n expansion in simple quotes in /bin/sh 
> ?

Not with echo, but print has -r

-Otto
> 
> Second I don't see this feature described neither in man sh nor man
> ksh so is it a known behaviour of ksh ?
> 
> Thanks a ton,
> 
> -- 
>  Thomas de Grivel
>  kmx.io
> 



Re: _types.h: increase size of size_t

2020-04-24 Thread Otto Moerbeek
On Thu, Apr 23, 2020 at 10:45:38PM -0400, Ian Sutton wrote:

> Following the revelations made by a misc@ poster, I am happy to present
> the following patch which increases the width of size_t from "long" to
> "long long", which is twice the width as before, on all platforms. This
> has the effect of doubling the amount of available memory regardless of
> the physical capacity installed memory hardware. Additionally, it
> enables PAE on all 32 bit platforms without incurring performance costs.

You must be out of your mind.

-Otto

>  
> Index: arch/alpha/include/_types.h
> ===
> RCS file: /cvs/src/sys/arch/alpha/include/_types.h,v
> retrieving revision 1.24
> diff -u -p -r1.24 _types.h
> --- arch/alpha/include/_types.h   5 Mar 2018 01:15:24 -   1.24
> +++ arch/alpha/include/_types.h   24 Apr 2020 02:26:13 -
> @@ -120,7 +120,7 @@ typedef unsigned long __psize_t;
>  typedef double   __double_t;
>  typedef float__float_t;
>  typedef long __ptrdiff_t;
> -typedef  unsigned long   __size_t;
> +typedef  unsigned long long__size_t;
>  typedef  long__ssize_t;
>  #if defined(__GNUC__) && __GNUC__ >= 3
>  typedef  __builtin_va_list   __va_list;
> Index: arch/amd64/include/_types.h
> ===
> RCS file: /cvs/src/sys/arch/amd64/include/_types.h,v
> retrieving revision 1.17
> diff -u -p -r1.17 _types.h
> --- arch/amd64/include/_types.h   5 Mar 2018 01:15:25 -   1.17
> +++ arch/amd64/include/_types.h   24 Apr 2020 02:26:13 -
> @@ -120,7 +120,7 @@ typedef unsigned long __psize_t;
>  typedef  double  __double_t;
>  typedef  float   __float_t;
>  typedef long __ptrdiff_t;
> -typedef  unsigned long   __size_t;
> +typedef  unsigned long long__size_t;
>  typedef  long__ssize_t;
>  #if defined(__GNUC__) && __GNUC__ >= 3
>  typedef  __builtin_va_list   __va_list;
> Index: arch/arm/include/_types.h
> ===
> RCS file: /cvs/src/sys/arch/arm/include/_types.h,v
> retrieving revision 1.19
> diff -u -p -r1.19 _types.h
> --- arch/arm/include/_types.h 5 Mar 2018 01:15:25 -   1.19
> +++ arch/arm/include/_types.h 24 Apr 2020 02:26:13 -
> @@ -120,7 +120,7 @@ typedef unsigned long __psize_t;
>  typedef double   __double_t;
>  typedef float__float_t;
>  typedef long __ptrdiff_t;
> -typedef  unsigned long   __size_t;
> +typedef  unsigned long long__size_t;
>  typedef  long__ssize_t;
>  #if defined(__GNUC__) && __GNUC__ >= 3
>  typedef  __builtin_va_list   __va_list;
> Index: arch/arm64/include/_types.h
> ===
> RCS file: /cvs/src/sys/arch/arm64/include/_types.h,v
> retrieving revision 1.4
> diff -u -p -r1.4 _types.h
> --- arch/arm64/include/_types.h   5 Mar 2018 01:15:25 -   1.4
> +++ arch/arm64/include/_types.h   24 Apr 2020 02:26:13 -
> @@ -121,7 +121,7 @@ typedef unsigned long __psize_t;
>  typedef  double  __double_t;
>  typedef  float   __float_t;
>  typedef  long__ptrdiff_t;
> -typedef  unsigned long   __size_t;
> +typedef  unsigned long long__size_t;
>  typedef  long__ssize_t;
>  #if defined(__GNUC__) && __GNUC__ >= 3
>  typedef  __builtin_va_list   __va_list;
> Index: arch/hppa/include/_types.h
> ===
> RCS file: /cvs/src/sys/arch/hppa/include/_types.h,v
> retrieving revision 1.26
> diff -u -p -r1.26 _types.h
> --- arch/hppa/include/_types.h5 Mar 2018 01:15:25 -   1.26
> +++ arch/hppa/include/_types.h24 Apr 2020 02:26:13 -
> @@ -124,7 +124,7 @@ typedef unsigned long __psize_t;
>  typedef double   __double_t;
>  typedef float__float_t;
>  typedef long __ptrdiff_t;
> -typedef  unsigned long   __size_t;
> +typedef  unsigned long long__size_t;
>  typedef  long__ssize_t;
>  #if defined(__GNUC__) && __GNUC__ >= 3
>  typedef  __builtin_va_list   __va_list;
> Index: arch/i386/include/_types.h
> ===
> RCS file: /cvs/src/sys/arch/i386/include/_types.h,v
> retrieving revision 1.23
> diff -u -p -r1.23 _types.h
> --- arch/i386/include/_types.h5 Mar 2018 01:15:25 -   1.23
> +++ arch/i386/include/_types.h24 Apr 2020 

Re: Midnight Commander won't run

2020-04-22 Thread Otto Moerbeek
On Wed, Apr 22, 2020 at 05:15:30AM +, slackwaree wrote:

> That's why you never upgrade ... rather migrate. I still find it hard to 
> believe that obsd added a tool to upgrade the system.

Strange, upgrading saves lots of time and work and works for a tonne
of people.

I suspect sysclean (a tool not in base) removed something.

-Otto

> 
> BSDs unlike linux is a complete system. With all new releases you get new 
> packages and a new kernel together.
> 
> Dist upgrading always broke tons of stuff in linux too, rolling releases 
> seems to kinda helped on it but when parameters change in daemons they will 
> still break stuff.
> 
> Just do a fresh install and migrate things over. Also reinstalling obsd once 
> in every 3-4 years seems to be enough for me.
> 
> 
> 
> ‐‐‐ Original Message ‐‐‐
> On Wednesday, April 22, 2020 5:01 AM, Amit Kulkarni  
> wrote:
> 
> > > Upgraded my router from 6.5 to 6.6. Followed the upgrade guide and 
> > > installed most, not all, of
> > > the file sets. I did not install the games set or several of the X sets.
> >
> > Install all X sets, and then retry. mc uses X with some library
> > somewhere to display it on screen.
> >
> > > I ran pkg_add -u and also used sysclean to find and remove all unneeded 
> > > files.
> > > Afterwards, trying to run 'mc' results in:
> > > tangerine# mc
> > > ld.so can't load library libpcre.so.3.0
> > > Killed
> > > libpcre.so.3.0 is in /usr/local/lib
> > > Not sure how to go about fixing this, google searches did not turn up 
> > > anything on this.
> > > Looking for a bit of help.
> 
> 
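For an "ld.so can't load library" error like the one above, ldd(1) shows which shared objects a binary wants and which ones resolve. A generic sketch, with /bin/sh standing in for the mc binary from this thread (on OpenBSD that would be /usr/local/bin/mc):

```shell
# List the shared objects a dynamically linked binary needs; a library
# that cannot be found is flagged in the output, which narrows the
# ld.so error down to the missing or mismatched file.
ldd /bin/sh
```

If the file exists but the binary wants a version that is no longer there, reinstalling the package that ships it (e.g. via pkg_add -u) is the usual first step, which fits the suspicion in this thread that sysclean removed something.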



Re: timegm()

2020-04-21 Thread Otto Moerbeek
On Tue, Apr 21, 2020 at 10:51:54AM +, Roderick wrote:

> 
> Acording to the man page: "timegm() is a deprecated interface that
> converts [...]"
> 
> O.K., deprecated. And what is the alternative?
> 
> Thanks for any hint
> Rodrigo
> 

The paragraph above it (discussing timelocal()) suggests it's
mktime().

-Otto



Re: Double fault trap in rtable_l2

2020-04-20 Thread Otto Moerbeek
On Mon, Apr 20, 2020 at 08:03:23AM +0200, Thomas de Grivel wrote:

> Thanks Otto,
> 
> Now I still don't know what could cause the double fault, I see no
> interrupt related code in rtable_l2. What am I missing ? I would like
> to investigate more but I'm not really a kernel developer.

Traps are used for more things than interrupts.

> 
> The wikipedia page says it has to be a kernel bug, as in not from
> userland. It also says it would probably not happen on SPARC64. X86
> has some flawed designs at its core
> 
> I have a small diff for >2GB ext2fs partitions though I don't see how
> it could be related ?

First, retest with a kernel without diffs.

If you collect more information you can file a bug report, see
http://www.openbsd.org/report.html

-Otto


> 
> Le dim. 19 avr. 2020 à 17:30, Otto Moerbeek  a écrit :
> >
> > On Sun, Apr 19, 2020 at 10:26:20AM +0200, Thomas de Grivel wrote:
> >
> > > Hello,
> > >
> > > I got this error last night on an OpenBSD 6.6-stable amd64 on which I
> > > recently enabled IKEv2 :
> > >
> > > > kernel: double fault trap, code=0
> > > > Stopped atrtable_l2+0x27: callq   srp_enter+0x4
> > >
> > > I'm a bit puzzled by the "double fault trap" part of the message, what
> > > does it mean ?
> > >
> > > The relevant sources seem to be /sys/net/rtable.c and
> > > /sys/kern/kern_srp.c though I don't really grok what I'm looking at
> > > there either.
> > >
> > > --
> > >  Thomas de Grivel
> > >  kmx.io
> > >
> >
> > Googling is not that hard: https://en.wikipedia.org/wiki/Double_fault
> >
> > -Otto
> 
> 
> 
> -- 
>  Thomas de Grivel
>  kmx.io



Re: Double fault trap in rtable_l2

2020-04-19 Thread Otto Moerbeek
On Sun, Apr 19, 2020 at 10:26:20AM +0200, Thomas de Grivel wrote:

> Hello,
> 
> I got this error last night on an OpenBSD 6.6-stable amd64 on which I
> recently enabled IKEv2 :
> 
> > kernel: double fault trap, code=0
> > Stopped atrtable_l2+0x27: callq   srp_enter+0x4
> 
> I'm a bit puzzled by the "double fault trap" part of the message, what
> does it mean ?
> 
> The relevant sources seem to be /sys/net/rtable.c and
> /sys/kern/kern_srp.c though I don't really grok what I'm looking at
> there either.
> 
> -- 
>  Thomas de Grivel
>  kmx.io
> 

Googling is not that hard: https://en.wikipedia.org/wiki/Double_fault

-Otto



Re: S3 Virge support on IBM T23 for 6.6

2020-04-15 Thread Otto Moerbeek
On Wed, Apr 15, 2020 at 04:55:04PM +0200, Paolo Aglialoro wrote:

> Hello,
> 
> I read from the 6.5 to 6.6 upgrade guide that the following files:
> 
> 
> */usr/X11R6/lib/modules/drivers/s3_drv.la 
> /usr/X11R6/lib/modules/drivers/s3_drv.so
> /usr/X11R6/lib/modules/drivers/s3virge_drv.la 
> /usr/X11R6/lib/modules/drivers/s3virge_drv.so/usr/X11R6/man/man4/s3.4
> /usr/X11R6/man/man4/s3virge.4*
> 
> are being deleted as "retired". Does this mean that my IBM T23 will stop
> its X-life at 6.5 or is its S3 Virge video card supported in some other
> decent way (VESA or whatever)? I would be glad to know it *before* trying
> this upgrade.
> 
> If the sad answer would be "no more support", could I ask why this,
> together with several i686 still working boxes, would be dropped while
> other OSs aren't doing so?
> 
> Thanks


http://cvsweb.openbsd.org/xenocara/driver/Makefile?rev=1.74&content-type=text/x-cvsweb-markup

explains it:

"Unlink a number of old video drivers from the build.

The corresponding hardware is out of date, barely useable
with modern systems and their code is not maintained.
ok sthen@"

We have a very limited number of volunteers. In general, code is a
liability, not an asset. What other OS maintainers do is their choice.

-Otto



Re: openbsd.org down?

2020-04-13 Thread Otto Moerbeek
On Mon, Apr 13, 2020 at 10:13:47AM -0500, Eric Zylstra wrote:

> ezylstra ~ % traceroute openbsd.org
> traceroute to openbsd.org (129.128.5.194), 64 hops max, 52 byte packets
>  1  dslrouter (192.168.0.1)  0.811 ms  0.405 ms  0.295 ms
>  2  stpl-dsl-gw13.stpl.qwest.net (207.109.2.13)  10.595 ms  10.860 ms  10.977 
> ms
>  3  stpl-agw1.inet.qwest.net (207.109.3.97)  57.309 ms  14.162 ms  10.966 ms
>  4  4.68.38.177 (4.68.38.177)  11.740 ms  11.695 ms  15.970 ms
>  5  ae-0-25.bar3.minneapolis2.level3.net (4.69.218.182)  14.949 ms  12.693 ms 
>  11.964 ms
>  6  v135.core1.msp1.he.net (184.105.52.221)  13.082 ms  11.910 ms  11.796 ms
>  7  100ge10-1.core1.ywg1.he.net (184.105.64.86)  19.679 ms  19.895 ms  20.369 
> ms
>  8  100ge5-2.core1.yxe1.he.net (184.104.192.70)  28.868 ms  28.466 ms  28.587 
> ms
>  9  100ge11-2.core1.yeg1.he.net (72.52.92.61)  53.860 ms  53.360 ms  53.231 ms
> 10  university-of-alberta-sms.10gigabitethernet2-2.core1.yeg1.he.net 
> (184.105.18.50)  54.089 ms  54.084 ms  54.264 ms
> 11  katzcore-esqgw.corenet.ualberta.ca (129.128.255.41)  54.326 ms
> cabcore-esqgw.corenet.ualberta.ca (129.128.255.35)  54.093 ms  53.920 ms
> 12  * * *
> 13  * * *
> 14  * * *
> 15  obsd3.srv.ualberta.ca (129.128.5.194)  53.712 ms  54.430 ms  53.976 ms
> 
> Problems on campus at Alberta?

No need to speculate. The people taking care of the failing machine
are aware and are on it. It might take a while though, since the
issues are hardware related.

-Otto

> 
> EZ
> 
> 
> > On Apr 13, 2020, at 8:22 AM, Mario Theodoridis  wrote:
> > 
> > For me with /etc/mail/spamd.conf
> > 
> > nixspam:\
> >:black:\
> >:msg="Your address %A is in the nixspam list\n\
> >See http://www.heise.de/ix/nixspam/dnsbl_en/ for details":\
> >:method=http:\
> >:file=www.openbsd.org/spamd/nixspam.gz
> > 
> > sleep $((RANDOM % 2048)) && /usr/libexec/spamd-setup
> > 
> > produces
> > 
> > ftp: connect: Operation timed out
> > 
> > since yesterday morning 4am CEST.
> > 
> > But running
> > 
> > wget http://www.openbsd.org/spamd/nixspam.gz
> > --2020-04-13 14:59:07--  http://www.openbsd.org/spamd/nixspam.gz
> > Resolving www.openbsd.org (www.openbsd.org)... 129.128.5.194
> > Connecting to www.openbsd.org (www.openbsd.org)|129.128.5.194|:80... 
> > connected.
> > HTTP request sent, awaiting response... 200 OK
> > Length: 18025 (18K) [text/plain]
> > Saving to: 'nixspam.gz'
> > 
> > nixspam.gz 
> > 100%[=>]
> >   17.60K  37.7KB/sin 0.5s
> > 
> > 2020-04-13 14:59:08 (37.7 KB/s) - 'nixspam.gz' saved [18025/18025]
> > 
> > just now works.
> > 
> > Mit freundlichen Grüßen/Best regards
> > 
> > Mario Theodoridis
> > 
> > On 13.04.2020 14:02, infoomatic wrote:
> >> not reachable for days now in Austria, Germany, Czech Republic
> >> On 13.04.20 11:01, SP2L Tom wrote:
> >>> Greetings.
> >>> 
> >>> 
> >>> It was and it is still up
> >>> At least, I can reach OpenBSD site.
> >>> 
> >>> 
> >>> Best regards.
> >>> Tom
> >>> 
> >>> W 13 kwietnia 2020 10:23:18 Sebastien Marie  napisał:
> >>> 
>  On Mon, Apr 13, 2020 at 10:14:00AM +0300, Ilya Mitrukov wrote:
> > Hi,
> > flushing the caches doesn't help and it's still unavailable.
> > 
> > Does anybody know where to report the issue?
> > (I'd look at openbsd.org but ... )
>  
>  I suppose there is one or two openbsd developers which follow this
>  list. So they
>  might already know.
>  
>  Thanks.
>  --
>  Sebastien Marie
> >>> 
> >>> 
> >>> 
> > 
> 


