Re: I just updated to main-n261544-cee09bda03c8 based (via source) and now /etc/machine-id and /var/db/machine-id disagree ; more

2023-03-16 Thread Colin Percival

I think the current situation should be sorted out aside from potential issues
for people who upgraded to a "broken" version before updating to the latest
code -- CCing bapt and tijl just in case since they're more familiar with this
than I am.

Colin Percival

On 3/16/23 15:55, Mark Millard wrote:


# cat /etc/hostid /etc/machine-id /var/db/machine-id
a4f7fbeb-f668-11de-b280-ebb65474e619
a4f7fbebf66811deb280ebb65474e619
7227cd89727a462186e3ba680d0ee142

(I'll not be keeping these values for the example system.)

# ls -Tld /etc/hostid /etc/machine-id /var/db/machine-id
-rw-r--r--  1 root  wheel  37 Dec 31 16:00:18 2009 /etc/hostid
-rw-r--r--  1 root  wheel  33 Mar 16 15:16:18 2023 /etc/machine-id
-r--r--r--  1 root  wheel  33 Mar  3 23:03:25 2023 /var/db/machine-id

I observed the delete-old-files deleting
/etc/machine-id during the upgrade. It did
nothing with /var/db/machine-id .

Also, modern hostid generation was switched to
random to avoid an exposure. But the update kept
the old hostid and propagated it (without the "-"s) into
/etc/machine-id . So /etc/machine-id now has the
same exposure.
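
The relationship between the three values listed above can be checked mechanically. This is a minimal sketch, assuming (as the listing suggests) that /etc/machine-id is simply the hostid UUID with the "-" separators removed; the function name is made up for illustration.

```python
import uuid

def machine_id_from_hostid(hostid: str) -> str:
    """Return the 32-hex-digit, dash-free form of a hostid UUID."""
    # uuid.UUID validates the text; .hex is the same value without dashes.
    return uuid.UUID(hostid.strip()).hex

# Values copied from the example system above.
hostid = "a4f7fbeb-f668-11de-b280-ebb65474e619"
etc_machine_id = "a4f7fbebf66811deb280ebb65474e619"
var_db_machine_id = "7227cd89727a462186e3ba680d0ee142"

print(machine_id_from_hostid(hostid) == etc_machine_id)
print(machine_id_from_hostid(hostid) == var_db_machine_id)
print(uuid.UUID(hostid).version, uuid.UUID(var_db_machine_id).version)
```

The version nibble is what distinguishes a time-based UUID (version 1) from a random one (version 4); here the hostid parses as version 1 and the /var/db/machine-id value as version 4.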

Later I'll see if stable/13 also got such behavior
for its upgrade.

I've not been dealing with releng/13.2 but upgrades
from releng/13.1 and before likely have the same
questions for what the handling should be vs. what it
might actually be. Different ways of upgrading might
not be in agreement, for all I know.

===
Mark Millard
marklmi at yahoo.com




--
Colin Percival
FreeBSD Deputy Release Engineer & EC2 platform maintainer
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: ns8250: UART FCR is broken

2022-10-28 Thread Colin Percival

On Mon, 24 Oct 2022 at 22:11, void  wrote:

this started appearing in dmesg

ns8250: UART FCR is broken
ns8250: UART FCR is broken

For the list: While I was correct that bhyve didn't emulate FCR_XMT_RST,
this warning was actually caused by a bug in my earlier commit.  It should
be fixed now:


commit 5ad8c32c722b58da4c153f241201af51b11f3152
Author: Colin Percival 
AuthorDate: 2022-10-28 04:42:44 +
Commit: Colin Percival 
CommitDate: 2022-10-28 19:20:28 +

ns8250: Fix sense of LSR_TEMT FCR check

When flushing the UART, we need to drain manually if LSR_TEMT is
*not* asserted, aka. if the transmit FIFO is not empty.

Reported by:    void
Fixes:  c4b68e7e53bb "ns8250: Check if flush via FCR succeeded"
Differential Revision:  https://reviews.freebsd.org/D37185
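
For readers who want the sense of the check spelled out, here is a small Python model. It is only a sketch of the condition described in the commit message, not the driver code; the helper names are invented, and the second function merely illustrates what an inverted sense looks like.

```python
LSR_TEMT = 0x40  # "transmitter empty" bit of the 16550 line status register

def needs_manual_drain(lsr: int) -> bool:
    """After an FCR flush, drain by hand only if the transmitter is NOT empty."""
    return (lsr & LSR_TEMT) == 0

def needs_manual_drain_inverted(lsr: int) -> bool:
    """Illustrative wrong sense: asks for a drain exactly when nothing is left."""
    return (lsr & LSR_TEMT) != 0

for lsr in (0x00, 0x60):
    print(hex(lsr), needs_manual_drain(lsr), needs_manual_drain_inverted(lsr))
```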


--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: ns8250: UART FCR is broken

2022-10-26 Thread Colin Percival

On 10/26/22 13:48, Ed Maste wrote:

On Mon, 24 Oct 2022 at 22:11, void  wrote:

this started appearing in dmesg

ns8250: UART FCR is broken
ns8250: UART FCR is broken


This message was added as part of Colin's work to support FreeBSD in
the Firecracker VMM
https://cgit.freebsd.org/src/commit/?id=c4b68e7e53bb352be3fa16995b99764c03097e66

In this case it indicates that bhyve has the same bug/missing
functionality as Firecracker -- it doesn't implement the FCR_XMT_RST
or FCR_RCV_RST bits. You can safely ignore the message, and it will
disappear once someone adds the required support to bhyve. We should
probably also have the kernel emit the message only once. I've CC'd
Colin for comment.


Indeed, looking at usr.sbin/bhyve/uart_emul.c it looks like FCR_XMT_RST is
not emulated.  This is different from Firecracker, which doesn't emulate
anything from the FCR and where I was seeing the receive side not
being flushed, but I'm glad my warning was able to flag a bug. :-)

If "void" is comfortable with kernel hacking, it would be great to confirm
that the warning is indeed coming from the transmit side not being flushed;
a printf("drain = %d\n", drain); would be sufficient.

And yes, only emitting this warning once per device (or once per boot?)
would probably be good.

--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: FreeBSD 13.0-RC5 Now Available

2021-04-05 Thread Colin Percival
On 4/4/21 1:50 PM, Alan Somers wrote:
> On Sat, Apr 3, 2021 at 9:34 AM Glen Barber <g...@freebsd.org> wrote:
>
> The fifth RC build of the 13.0-RELEASE release cycle is now available.
>
> In the past, making these releases required pushing updates to
> https://svnweb.freebsd.org/base/user/cperciva/freebsd-update-build/ .

Historically, we often made changes directly on the update builders and
then brought the svn tree back into sync later.

> However, that repo is read-only now.  I assume that it's been gitified, but
> I can't find the new location.  Where is it?

I think the freebsd-update build code might be homeless right now.  I know I
have seen emails mentioning that it needs to land somewhere but I don't recall
any decision being reached.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: SVN r327433 fails to build

2017-12-31 Thread Colin Percival
On 12/31/17 12:17, Colin Percival wrote:
> Oops!  It never occurred to me that I had to worry about userland programs
> defining _KERNEL and then including kernel headers... I think I know how to
> fix this, just testing now.

Should be fixed now.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: SVN r327433 fails to build

2017-12-31 Thread Colin Percival
On 12/31/17 09:35, Herbert J. Skuhra wrote:
> On Sun, 31 Dec 2017 15:19:21 +0100,
> Michael Butler <i...@protected-networks.net> wrote:
>>
>> ===> lib/libprocstat (obj,all,install)
>> Building /usr/obj/usr/src/amd64.amd64/lib/libprocstat/zfs/zfs.o
>> In file included from /usr/src/lib/libprocstat/zfs.c:41:
>> /usr/src/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:217:9:
>> error: 'curthread' macro redefined [-Werror,-Wmacro-redefined]
> 
> This is caused by
> 
> 
> r327429 | cperciva | 2017-12-31 10:23:35 +0100 (Sun, 31 Dec 2017) | 2 lines
> 
> Use the TSLOG framework to record entry/exit timestamps for VFS_MOUNT calls.
> 
> 

Oops!  It never occurred to me that I had to worry about userland programs
defining _KERNEL and then including kernel headers... I think I know how to
fix this, just testing now.

Thanks,
-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: [Request for review] Profiling the FreeBSD kernel boot

2017-12-22 Thread Colin Percival
On 12/22/17 09:08, Mark Johnston wrote:
> On Fri, Dec 22, 2017 at 09:44:46AM +0000, Colin Percival wrote:
>> For the past few months I've been working on code for profiling the FreeBSD
>> "kernel boot", i.e., everything between when kernel code starts running and
>> when we first enter userland as init(8).  This is not trivial since it's
>> impossible to use tools like dtrace to monitor things prior to when said
>> tools are running.
> 
> In the case of DTrace, this isn't quite true. We support so-called
> boot-time DTrace on x86. The caveat is that we can only start tracing
> after the SI_SUB_DTRACE_ANON sysinit has been executed. That sysinit
> can't come earlier than SI_SUB_SMP, since it needs to be able to measure
> TSC skew between CPUs in order to initialize DTrace's high-resolution
> timer.

Right.  Also, even aside from details like measuring the TSC skew between
CPUs, DTrace needs things like traps, memory allocation, and mutexes, none
of which exist when we enter hammer_time (or any of the other MD startup
code).

What I meant is that it's impossible to use DTrace to monitor things which
happened prior to when the DTrace *kernel bits* are initialized.

> I don't think boot-time DTrace is quite what you want for this exercise,
> but it does come in handy sometimes.

Absolutely.  And for a long time I considered trying to splice together
a basic profiling mechanism for pre-DTrace-initialization with using DTrace
from when it's ready onwards... but I decided that it would be easier at
least to start with to simply use a single mechanism throughout.

> In case it's of interest: to use boot-time DTrace, invoke dtrace(1) as
> you normally would and add -A. Rather than starting to trace, dtrace(1)
> will save a representation of the D script to a file which gets read by
> the loader during the next boot. The results of the trace can be fetched
> with "dtrace -a". For instance, to print the amount of time elapsed in
> microseconds during each vprintf() call, along with a stack: [...]

Thanks for the example!  I think it's very likely that I'll make use of
boot-time DTrace for tracking down some of the performance warts I've found
-- the ones which happen after DTrace is initialized, that is.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


[Request for review] Profiling the FreeBSD kernel boot

2017-12-22 Thread Colin Percival
Hi everyone,

For the past few months I've been working on code for profiling the FreeBSD
"kernel boot", i.e., everything between when kernel code starts running and
when we first enter userland as init(8).  This is not trivial since it's
impossible to use tools like dtrace to monitor things prior to when said
tools are running.  The goal of this exercise is to help me track down the
places where we're wasting time during the boot, and then to fix them.

The approach I've taken is to add some macros -- most notably TSENTER() and
TSEXIT() -- which by default compile to nothing, but if the TSLOG kernel
option is enabled they compile to code which logs the cycle count (e.g., on
x86 the value from the RDTSC instruction) along with some other data (in the
case of TSENTER and TSEXIT, the fact that we're entering/exiting a function).
This can then be dumped via a sysctl (debug.tslog) and processed in userland
to convert function entries/exits into stacks and to visualize the time spent
in the boot process.
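
To give a feel for the userland half, here is a minimal Python sketch of turning entry/exit records into stacks. The record layout (cycle count, kind, function name) and the sample numbers are assumptions made up for illustration; this is not the format of debug.tslog nor the scripts in the tarball.

```python
from collections import defaultdict

def fold_stacks(records):
    """Turn (cycles, kind, function) records into {stack: self cycles}.

    kind is "ENTER" or "EXIT"; records must be ordered by cycle count.
    """
    stack = []                      # currently active functions
    self_cycles = defaultdict(int)  # "a;b;c" -> cycles spent in c itself
    last = None
    for cycles, kind, func in records:
        if stack and last is not None:
            # Time since the previous event belongs to the innermost frame.
            self_cycles[";".join(stack)] += cycles - last
        if kind == "ENTER":
            stack.append(func)
        elif kind == "EXIT":
            if not stack or stack[-1] != func:
                raise ValueError(f"unbalanced EXIT for {func}")
            stack.pop()
        else:
            raise ValueError(f"unknown record kind {kind!r}")
        last = cycles
    return dict(self_cycles)

# Made-up sample: one function called from another.
records = [
    (100, "ENTER", "mi_startup"),
    (120, "ENTER", "vfs_mountroot"),
    (180, "EXIT", "vfs_mountroot"),
    (200, "EXIT", "mi_startup"),
]
print(fold_stacks(records))
```

The "a;b;c" keys follow the folded-stack convention commonly fed to flame graph generators, which is one plausible way to get from such a log to charts like the ones linked below.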

Two examples:

A flame chart of my laptop booting HEAD:
http://www.daemonology.net/timestamping/tslog-laptop.svg

A flame chart of an EC2 c5.4xlarge instance booting 11.1-RELEASE:
http://www.daemonology.net/timestamping/tslog-c5.4xlarge.svg

The patches (10 of them, to be applied in order), userland scripts, and very
brief usage instructions are at:
http://www.daemonology.net/timestamping/tslog.tgz

I hope to commit the patches in the next week, since I'm planning on writing
a paper to submit to AsiaBSDCon (which has a deadline of December 31st); so
if anyone has interest/time to look at this in the near future (I mean, it's
not like anyone is going to be busy this weekend, right?) I'd love to have
some feedback before it goes into the tree.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: RFC: Removing hpt* drivers from GENERIC

2017-12-19 Thread Colin Percival
On 10/25/17 15:56, O'Connor, Daniel wrote:
>> On 26 Oct 2017, at 08:13, Colin Percival <cperc...@tarsnap.com> wrote:
>> [Proposal for removing hpt* drivers since hpt27xx and hptnr take a long
>> time in DEVICE_PROBE.]
> 
> Seems sensible to me, but also worth contacting the blob authors if
> possible and asking them what gives (and if they can fix it).

Turns out that they were indeed able to fix it, with startling rapidity.
delphij@ committed r325683 (MFCed as r32600[56]) which reduces the time
spent in these DEVICE_PROBE routines from ~150 ms down to ~37 *us* on my
laptop.  So my immediate desire for faster booting has been satisfied with
regard to these drivers.

I know some people (CCed) were enthusiastic about removing these from GENERIC
on the basis that we shouldn't have binary blobs in GENERIC; while I'm
certainly sympathetic to this, I'd suggest that it should be done by someone
who has time to look at the other binary blobs in the tree and formulate a
general policy rather than just picking on the hpt* drivers.  Unfortunately,
that person is not me; I have 12 days left to submit a talk to AsiaBSDCon
about my work on profiling the kernel boot (which is how I noticed the slow
probing originally) and then a long list of other places to speed up the
boot performance.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


RFC: Removing hpt* drivers from GENERIC

2017-10-25 Thread Colin Percival
Hi developers,

I'd like to remove the hpt* drivers from GENERIC.  These are the drivers
for the HighPoint storage hardware -- SATA (hptnr) and RAID (hpt27xx, hptiop,
hptmv, hptrr).

My reason for wanting to remove them is that the hpt27xx and hptnr drivers
spend ~150 ms in their DEVICE_PROBE routines every time the system boots.
Since they are roughly 1000x slower than the median driver, this is clearly
excessive; unfortunately the time is being spent inside a binary blob, so
there is no apparent way to fix the drivers.  (The other three drivers from
the same vendor -- hptiop, hptmv, and hptrr -- don't exhibit this particular
bug, but I don't see any strong argument in favour of not removing them along
with the two problem drivers.)

All of these are available via kernel modules, so the impact upon users
should be minimal.  Obviously I would not plan on MFCing this change.

Any objections?

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Time to increase MAXPHYS?

2017-06-03 Thread Colin Percival
On January 24, 1998, in what was later renumbered to SVN r32724, dyson@
wrote:
> Add better support for larger I/O clusters, including larger physical
> I/O.  The support is not mature yet, and some of the underlying implementation
> needs help.  However, support does exist for IDE devices now.

and increased MAXPHYS from 64 kB to 128 kB.  Is it time to increase it again,
or do we need to wait at least two decades between changes?

This is hurting performance on some systems; in particular, EC2 "io1" disks
are optimized for 256 kB I/Os, EC2 "st1" (throughput optimized spinning rust)
disks are optimized for 1 MB I/Os, and Amazon's NFS service (EFS) recommends
using a maximum I/O size of 1 MB (and despite NFS not being *physical* I/O it
seems to still be limited by MAXPHYS).
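
As a back-of-the-envelope illustration, the following Python sketch counts how many requests a transfer becomes under a given cap. It assumes the simplest possible model, in which anything larger than MAXPHYS is split into MAXPHYS-sized pieces; the function name is hypothetical.

```python
KB = 1024

def requests_needed(io_size: int, maxphys: int) -> int:
    """Number of requests a transfer is split into under a per-request cap."""
    if io_size <= 0 or maxphys <= 0:
        raise ValueError("sizes must be positive")
    return -(-io_size // maxphys)  # ceiling division

for label, size in (("io1, 256 kB", 256 * KB), ("st1 / EFS, 1 MB", 1024 * KB)):
    print(label, requests_needed(size, 128 * KB))
```

Under that model a 256 kB I/O becomes two requests and a 1 MB I/O becomes eight with the current 128 kB limit.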

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: NFS client perf. degradation when SCHED_ULE is used (was when SMP enabled)

2017-06-03 Thread Colin Percival
On 05/28/17 13:16, Rick Macklem wrote:
> cperciva@ is running a highly parallelized buildworld and he sees
> slightly better elapsed times and much lower system CPU for SCHED_ULE.
> 
> As such, I suspect it is the single threaded, processes mostly sleeping
> waiting for I/O case that is broken.
> I suspect this is how many people use NFS, since a highly parallelized make
> would not be a typical NFS client task, I think?

Running `make buildworld -j36` on an EC2 "c4.8xlarge" instance (36 vCPUs, 60
GB RAM, 10 GbE) with GENERIC-NODEBUG, ULE has a slight edge over 4BSD:

GENERIC-NODEBUG, SCHED_4BSD:
1h14m12.48s real    6h25m44.59s user    1h4m53.42s sys
1h15m25.48s real    6h25m12.20s user    1h4m34.23s sys
1h13m34.02s real    6h25m14.44s user    1h4m09.55s sys
1h13m44.04s real    6h25m08.60s user    1h4m40.21s sys
1h14m59.69s real    6h25m53.13s user    1h4m55.20s sys
1h14m24.00s real    6h24m59.29s user    1h5m37.31s sys

GENERIC-NODEBUG, SCHED_ULE:
1h13m00.61s real    6h02m47.59s user    26m45.89s sys
1h12m30.18s real    6h01m39.97s user    26m16.45s sys
1h13m08.43s real    6h01m46.94s user    26m39.20s sys
1h12m18.94s real    6h02m26.80s user    27m39.71s sys
1h13m21.38s real    6h00m46.13s user    27m14.96s sys
1h12m01.80s real    6h02m24.48s user    27m18.37s sys

Running `make buildworld -j2` on an EC2 "m4.large" instance (2 vCPUs, 8 GB RAM,
~ 500 Mbps network), 4BSD has a slight edge over ULE on real and sys
time but is slightly worse on user time:

GENERIC-NODEBUG, SCHED_4BSD:
6h29m25.17s real    7h2m56.02s user     14m52.63s sys
6h29m36.82s real    7h2m58.19s user     15m14.21s sys
6h28m27.61s real    7h1m38.24s user     14m56.91s sys
6h27m05.42s real    7h1m38.57s user     15m04.31s sys

GENERIC-NODEBUG, SCHED_ULE:
6h34m19.41s real    6h59m43.99s user    18m8.62s sys
6h33m55.08s real    6h58m44.91s user    18m4.31s sys
6h34m49.68s real    6h56m03.58s user    17m49.83s sys
6h35m22.14s real    6h58m12.62s user    17m52.05s sys

Note that in both cases there is lots of idle time (although far more in the
-j36 case); this is partly due to a lack of parallelism in buildworld, but
largely due to having /usr/obj mounted on Amazon EFS.

These differences all seem within the range which could result from cache
effects due to threads staying on one CPU rather than bouncing around; so
whatever Rick is tripping over, it doesn't seem to be affecting these tests.
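
For anyone wanting to compare the runs numerically rather than by eye, here is a small Python sketch that parses time(1)-style durations and averages them. Only the -j36 sys columns from above are included; the helper name is made up.

```python
import re
from statistics import mean

def to_seconds(t: str) -> float:
    """Convert a duration such as "1h4m53.42s" or "26m45.89s" to seconds."""
    m = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?([\d.]+)s", t)
    if m is None:
        raise ValueError(f"unrecognised duration: {t!r}")
    hours, minutes, seconds = m.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds)

# sys times of the six -j36 runs, copied from the tables above.
sys_4bsd = ["1h4m53.42s", "1h4m34.23s", "1h4m09.55s",
            "1h4m40.21s", "1h4m55.20s", "1h5m37.31s"]
sys_ule = ["26m45.89s", "26m16.45s", "26m39.20s",
           "27m39.71s", "27m14.96s", "27m18.37s"]

mean_4bsd = mean(to_seconds(t) for t in sys_4bsd)
mean_ule = mean(to_seconds(t) for t in sys_ule)
print(f"-j36 mean sys: 4BSD {mean_4bsd:.0f}s, ULE {mean_ule:.0f}s, "
      f"ratio {mean_4bsd / mean_ule:.2f}")
```

The same helper can be pointed at the real and user columns, or at the -j2 runs, to see where the two schedulers trade places.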

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


clang/llvm 3.9.0 mysteriously zeroing variables?

2016-12-03 Thread Colin Percival
Starting with r309124 (when clang/llvm 3.9.0 was imported) I'm seeing EC2
instances panic on boot with a division-by-zero error; the code in question
is in blkfront.c, printing out the size of disks:

>   device_printf(dev, "%juMB <%s> at %s",
>   (uintmax_t) sectors / (1048576 / sector_size),
>   device_get_desc(dev),
>   xenbus_get_node(dev));

My first thought was that 'sector_size' must be either zero or very large...
but no, when I add printf("sector_size = %ju\n", (uintmax_t)sector_size), it's
entirely normal.  What's more, adding that printf makes the division-by-zero
panic go away.
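
To spell out why those were the two suspects: with integer arithmetic the expression can only divide by zero if sector_size is 0 (the inner division) or larger than 1048576 (the inner quotient truncates to 0). A small Python sketch of the same arithmetic, with a hypothetical function name and made-up sector counts:

```python
def disk_size_mb(sectors: int, sector_size: int) -> int:
    """Mirror `sectors / (1048576 / sector_size)` using integer division."""
    sectors_per_mb = 1048576 // sector_size  # fails if sector_size == 0
    return sectors // sectors_per_mb         # fails if sector_size > 1048576

for size in (512, 4096, 0, 2 * 1048576):
    try:
        print(size, disk_size_mb(20971520, size))
    except ZeroDivisionError:
        print(size, "division by zero")
```

Any ordinary sector size such as 512 or 4096 goes through cleanly, which is what makes the observed panic so puzzling.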

I'd think I was just hallucinating, but earlier today I heard that a similarly
"impossible" panic had been observed in the NFS client code when compiled with
clang/llvm 3.9.0.

So... is anyone else seeing unexpected panics or other odd behaviour starting
after clang/llvm 3.9.0 was imported?

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: freebsd-update

2014-01-29 Thread Colin Percival
On 01/29/14 12:51, Lars Engels wrote:
> On Sat, Jan 25, 2014 at 09:11:04AM -0600, Mark Felder wrote:
>> On Sat, Jan 25, 2014, at 5:32, Lars Engels wrote:
>>> Also using freebsd-update behind a proxy is really slow. Even with a
>>> very fast internet connection (normally download rates ca. 3 MBytes /
>>> s) downloading all the tiny binary diff files took more than 8 hours.
>>> Maybe freebsd-update's backend could create a tarball of all those
>>> diffs and provide this?
>>
>> Even streaming the tar instead of waiting for the freebsd-update server
>> to produce the tarball would be an improvement. I have no experience
>> doing that over a WAN but I don't see why it would be unreliable.
>
> Colin, what do you think? Is it possible?

Anything is *possible*, but given that the number of patches available is
typically at least 10x the number being fetched this doesn't seem like it
would be very efficient.

FWIW, the performance problems with proxies are limited to HTTP proxies
which don't speak HTTP/1.1.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: freebsd-update

2014-01-29 Thread Colin Percival
On 01/29/14 14:26, Adrian Chadd wrote:
> On 29 January 2014 13:51, Colin Percival cperc...@freebsd.org wrote:
>> FWIW, the performance problems with proxies are limited to HTTP proxies
>> which don't speak HTTP/1.1.
>
> Did you / others ever actually benchmark this?

The fact that performance sucks when proxies break HTTP pipelining?  Yes,
but it's also implied by the RTT/request limit for non-pipelined requests.
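
The RTT/request limit can be made concrete with a few lines of Python. This is a sketch of the lower bound only (transfer time is ignored), and the request count, round-trip time and pipelining depth below are made-up figures, not measurements.

```python
def min_fetch_ms(n_requests: int, rtt_ms: int, in_flight: int = 1) -> int:
    """Lower bound on fetch time in milliseconds from round trips alone."""
    if in_flight < 1:
        raise ValueError("in_flight must be at least 1")
    # Each batch of `in_flight` requests costs at least one round trip.
    batches = -(-n_requests // in_flight)  # ceiling division
    return batches * rtt_ms

# Hypothetical: 100,000 small patch files over a 100 ms round trip.
serial = min_fetch_ms(100_000, 100)
pipelined = min_fetch_ms(100_000, 100, in_flight=8)
print(f"one at a time: {serial / 3_600_000:.1f} h, "
      f"8 in flight: {pipelined / 60_000:.1f} min")
```

With in_flight=1 the bound grows linearly with the number of files no matter how fast the link is, which is why a fast connection does not help.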

> I know that Squid supports pipelined requests but only a handful
> (defaulting to 1) at a time, as the actual error semantics for
> HTTP/1.1 pipelining wasn't well defined.

I'm not sure what the poorly defined error semantics are, but I suppose
that doesn't matter.  Does Squid now reply with HTTP/1.1 headers?  The
phttpget code won't even try to pipeline requests unless it sees that --
as required by the HTTP specification.

> So flipping it around - which intermediaries that are actually in use
> by companies and such actually support pipelining at the level that
> you're doing it?

I don't know.  People usually don't tell me when things work.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


making PANIC_REBOOT_WAIT_TIME a tunable

2013-12-02 Thread Colin Percival
Hi all,

It seems that PANIC_REBOOT_WAIT_TIME has been a compile-time setting forever;
and I can't see any reason for this, but I assume there was one... at some
point in the distant past.

The attached patch makes it a loader tunable and sysctl.  My reason for wanting
this is to make EC2 images reboot faster after a panic (not that it happens
very often, of course) -- there's no point waiting for a key press at the
console because the EC2 console is output-only.
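
As a rough model of the countdown loop touched by the attached patch, the Python sketch below counts how many 100 ms console polls are made for a given setting. It models only the loop visible in the diff (skipped for 0 and -1, otherwise ten polls per second); what the kernel does outside that loop is not modelled, and the function name is invented.

```python
POLL_MS = 100  # DELAY(1000 * 100) microseconds, i.e. 1/10th second

def countdown_polls(panic_reboot_wait_time: int) -> int:
    """Maximum number of 100 ms console polls the countdown performs."""
    if panic_reboot_wait_time in (0, -1):
        return 0  # the countdown branch is skipped for both values
    # `loop > 0` is false from the start for other negative settings.
    return max(panic_reboot_wait_time * 10, 0)

for setting in (15, 1, 0, -1):
    polls = countdown_polls(setting)
    print(setting, polls, f"{polls * POLL_MS / 1000:.1f}s")
```

With the tunable in place, an EC2 image could set kern.panic_reboot_wait_time (the name used in the patch) to 0 at boot instead of rebuilding the kernel.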

Any objections?

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Index: sys/kern/kern_shutdown.c
===================================================================
--- sys/kern/kern_shutdown.c	(revision 258085)
+++ sys/kern/kern_shutdown.c	(working copy)
@@ -89,6 +89,11 @@
 #ifndef PANIC_REBOOT_WAIT_TIME
 #define PANIC_REBOOT_WAIT_TIME 15 /* default to 15 seconds */
 #endif
+int panic_reboot_wait_time = PANIC_REBOOT_WAIT_TIME;
+SYSCTL_INT(_kern, OID_AUTO, panic_reboot_wait_time, CTLFLAG_RW | CTLFLAG_TUN,
+    &panic_reboot_wait_time, 0,
+    "Seconds to wait before rebooting after a panic");
+TUNABLE_INT("kern.panic_reboot_wait_time", &panic_reboot_wait_time);
 
 /*
  * Note that stdarg.h and the ANSI style va_start macro is used for both
@@ -485,12 +490,12 @@
 	int loop;
 
 	if (howto & RB_DUMP) {
-		if (PANIC_REBOOT_WAIT_TIME != 0) {
-			if (PANIC_REBOOT_WAIT_TIME != -1) {
+		if (panic_reboot_wait_time != 0) {
+			if (panic_reboot_wait_time != -1) {
 				printf("Automatic reboot in %d seconds - "
 				       "press a key on the console to abort\n",
-					PANIC_REBOOT_WAIT_TIME);
-				for (loop = PANIC_REBOOT_WAIT_TIME * 10;
+					panic_reboot_wait_time);
+				for (loop = panic_reboot_wait_time * 10;
 				     loop > 0; --loop) {
 					DELAY(1000 * 100); /* 1/10th second */
 					/* Did user type a key? */

Re: Automated submission of kernel panic reports: sysutils/panicmail

2013-11-05 Thread Colin Percival
On 11/05/13 09:27, John Baldwin wrote:
> One of my previous employers maintained a database of panics and I added ways
> to recognize known panics and tag them.  I ended up relying a lot on stack
> trace details from specific OS versions to mark a panic as an instance of a
> specific bug.  Also, you may have very different stack traces even on the same
> build version for a single bug.  In the case of my employer we had a
> constrained set of kernel configs and specific build versions to work with.
> It might be harder to correctly match panics in the wild what with patched
> trees and random kernel configs.

Right, I'm sure there will be panics I can't match up against anything else --
but this is fine.  If I get enough panic reports, I can still get useful data
out even if some of them aren't immediately usable.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Automated submission of kernel panic reports: sysutils/panicmail

2013-11-04 Thread Colin Percival
Hi all,

After considerable review on freebsd-hackers (thanks dt71 and jilles!) I have
now added sysutils/panicmail to the FreeBSD ports tree.  If you install this
and add
	panicmail_enable="YES"
to your /etc/rc.conf, a panic report will be generated and sent to root@ for
you to review and submit (via email).  You can skip the reviewing step and
submit panics automatically by setting panicmail_autosubmit="YES".

The panics submitted are encrypted to an RSA key which I hold in order to keep
them secure in transit; and I intend to keep the raw panic reports confidential
except to the minimum extent necessary for other developers to help me process
the incoming reports.

If I receive enough panic reports to be useful, I hope to provide developers
with aggregate statistics.  This may include:

* regular email reports listing the top panics, to help guide developers
towards the most fertile areas for stability improvements;

* email to specific developers alerting them to recurring panics in code they
maintain (especially if it becomes clear that the panic has been recently
introduced); and

* guidance to re@ and secteam@ about how often a particular panic occurs if
an errata notice is being considered

as well as other yet-to-be-imagined reports of a similarly aggregate and
anonymized nature.

So please install the sysutils/panicmail port and enable it in rc.conf!  This
all depends on getting useful data, and I can't do that without your help.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: Automated submission of kernel panic reports: sysutils/panicmail

2013-11-04 Thread Colin Percival
On 11/04/13 02:47, Bob Bishop wrote:
> On 4 Nov 2013, at 10:41, Colin Percival wrote:
>> After considerable review on freebsd-hackers (thanks dt71 and jilles!) I have
>> now added sysutils/panicmail to the FreeBSD ports tree. [etc]
>
> Nice. Is this applicable to all supported branches?

Yes... the code should work all the way back to 5.0 (it's an rc.d script),
although I doubt ports infrastructure will allow you to install anything
from today's ports tree on a system running FreeBSD 5.0.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: Automated submission of kernel panic reports: sysutils/panicmail

2013-11-04 Thread Colin Percival
On 11/04/13 10:49, d...@gmx.com wrote:
> Colin Percival wrote, On 11/04/2013 11:41:
>> After considerable review on freebsd-hackers (thanks dt71 and jilles!) I have
>> now added sysutils/panicmail to the FreeBSD ports tree.
>
> The pkesh script is probably still in need of a big review (S00N(TM)...).

Go for it!  It's a very simple script.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: Automated submission of kernel panic reports: sysutils/panicmail

2013-11-04 Thread Colin Percival
On 11/04/13 04:49, Alfred Perlstein wrote:
> Colin, have you had a few minutes to check out the crash reporting
> facilities in FreeNAS?

Yes.

> The reason I ask is that:
>
> 1) we would like to share code.
> 2) we have this running for a few months now and have a huge corpus of
>    information.
> 3) we are building a nice UI (screenshots attached) over it, we have a
>    couple of thousands of lines of code we can share for this.

Once I have a useful number of panics collected, I was hoping to take the best
pieces from FreeNAS's processing, from the SoC project, and from the processing
I've been doing of automatic panic reports from EC2 instances.

> We send a minimal set of information: kernel stack trace, ddb buffer and
> hardware.  Just enough to get some very, very handy stuff.

I'm currently sending the dump header and what I get from kgdb 'bt'.  If I find
that I'm missing something important, I can always add it to a new version of
the panicmail port. ;-)

> I can share with you offline the crash server code, it's django and relatively
> straightforward.

I'll come back to you about this once I have some data.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: Automated submission of kernel panic reports: sysutils/panicmail

2013-11-04 Thread Colin Percival
On 11/04/13 18:26, Thomas Mueller wrote:
> Question that arises is how does the system know where to send the email, and
> through what SMTP server, especially if panicmail_autosubmit=YES.

The code assumes that your system knows how to deliver email.  An out-of-the-box
FreeBSD install has sendmail and can do this.  If you don't enable
panicmail_autosubmit then it also assumes you're reading or forwarding root's
email -- which you should be doing anyway.

 In the case of a kernel panic, wouldn't the system crash/freeze, and would it 
 then be able to compose an email message?

The email is generated from the crashdump when the system next boots.
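
As a rough illustration of the setup being discussed, an /etc/rc.conf fragment might look like the sketch below. Only panicmail_autosubmit is named in this thread; dumpdev is the standard rc.conf knob for enabling crash dumps, and panicmail_enable is an assumption based on the usual rc.d naming convention, so check the port's own documentation before relying on it.

```shell
# /etc/rc.conf fragment (sketch, not taken from the port's documentation)
dumpdev="AUTO"              # standard knob: let the kernel write crash dumps to swap
panicmail_enable="YES"      # assumption: conventional enable knob for the rc.d script
panicmail_autosubmit="YES"  # from the thread: submit without waiting for root to forward
```

With autosubmit left off, the report would instead land in root's mailbox, which is why reading or forwarding root's email matters.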

 I use mail/mpop and mail/msmtp rather than messing with sendmail or postfix; 
 have multiple email accounts and inboxes.
 
 Now come to think of it, I don't think I ever sent an email from FreeBSD as 
 root, only as nonroot.

Don't you get daily run output and security run output emails?

 Something like panicmail ought to be ported to NetBSD pkgsrc, considering 
 that NetBSD seems so much more unstable and crash-prone than FreeBSD on my 
 hardware.

Go right ahead.  It's a small shell script -- might even work fine without
any changes.  It's BSD licensed, of course.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


HELP WANTED: Figure out why svnlite build is sometimes not reproducible

2013-10-27 Thread Colin Percival
Hi all,

Doing freebsd-update builds, I've now had two instances where /usr/bin/svnlite
has built inexplicably differently -- changes scattered all over the binary.
This is a problem for freebsd-update because it means that at some point in the
future the builds may not be able to correctly identify if that binary needs to
be distributed as part of a security update.

The svn* binaries had build date+time stamps in them until I nuked them in
r257129, but those are cleanly self-contained -- this is something else building
differently.

Unfortunately despite the freebsd-update builds running into this, I haven't
been able to reproduce it myself and so I can't track down what is causing this.

If anyone can provide assistance with this, it would be very gratefully 
received.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: HELP WANTED: Figure out why svnlite build is sometimes not reproducible

2013-10-27 Thread Colin Percival
On 10/27/13 14:52, Erik Cederstrand wrote:
 On 27/10/2013, at 22.03, Colin Percival cperc...@freebsd.org wrote:
 Doing freebsd-update builds, I've now had two instances where /usr/bin/svnlite
 has built inexplicably differently -- changes scattered all over the binary.
 
 Which kind of changes? Are you aware of the -D flag to ar(1) (wipes 
 timestamps in archives)? Are you always using the same SRCDIR/DESTDIR (this 
 affects the __FILE__ macro)? Same DEBUG_FLAGS?

Changes in lots of non-7-bit-ASCII bits all over the file.  I'm guessing
it's executable code.

Yes, aware of -D flag.  That's a red herring since this isn't an archive;
and all the other binaries are fine.

Yes, all the build context is the same -- this is happening inside a
chroot with the same build script running every time.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: RFC: support for first boot rc.d scripts

2013-10-16 Thread Colin Percival
On 10/14/13 10:00, Ian Lepore wrote:
 The embedded systems we create at $work have readonly root and mfs /var,
 but we do have writable storage on another filesystem.  It would work
 for us (not that we need this feature right now) if there were an rcvar
 that pointed to the marker file.  Of course to make it work, something
 would have to get the alternate filesystem mounted early enough to be
 useful (that is something we do already with a custom rc script).

New patch attached.  This one re-probes for the firstboot sentinel
after ${early_late_divider}, so you can set firstboot_sentinel to
/path/to/my/writable/storage as long as that's available once the
boot process reaches FILESYSTEMS (or NETWORKING, or whatever you
set early_late_divider to).  I figure that if we can assume all the
local rc.d scripts are available at that point we can assume that
wherever people decide to put the firstboot sentinel will also be
available at that point.
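
The two-pass probe described above can be tried outside of /etc/rc with a small simulation. This is only a sketch: probe_firstboot is a hypothetical helper name, the temporary directory stands in for a filesystem that gets mounted between the two rcorder passes, and nothing here calls rcorder itself.

```shell
#!/bin/sh
# Simulation of the firstboot sentinel re-probe; not the actual /etc/rc code.

# probe_firstboot PATH (hypothetical helper): print the extra rcorder
# arguments for the given sentinel path: "-s firstboot" if the sentinel
# is missing, nothing if it exists.
probe_firstboot() {
	if [ -e "$1" ]; then
		echo ""
	else
		echo "-s firstboot"
	fi
}

workdir=$(mktemp -d)
firstboot_sentinel="$workdir/storage/firstboot"

# First pass: the writable filesystem is not "mounted" yet.
early_skip=$(probe_firstboot "$firstboot_sentinel")

# Stand-in for the late filesystem appearing with the sentinel on it.
mkdir -p "$workdir/storage"
: > "$firstboot_sentinel"

# Second pass, after the early/late divider: probe again.
late_skip=$(probe_firstboot "$firstboot_sentinel")

echo "early pass: rcorder ${early_skip} ..."
echo "late pass:  rcorder ${late_skip} ..."

rm -rf "$workdir"
```

In the early pass the firstboot scripts are skipped; once the sentinel becomes visible, the second pass no longer passes the skip flag, which is the behaviour the patch aims for.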

Does anyone see any problems with this?

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid

Index: etc/defaults/rc.conf
===
--- etc/defaults/rc.conf(revision 256432)
+++ etc/defaults/rc.conf(working copy)
@@ -619,6 +619,9 @@
 accounting_enable=NO # Turn on process accounting (or NO).
 ibcs2_enable=NO  # Ibcs2 (SCO) emulation loaded at startup (or NO).
 ibcs2_loaders=coff   # List of additional Ibcs2 loaders (or NO).
+firstboot_sentinel="/firstboot"	# Scripts with "firstboot" keyword are run if
+   # this file exists.  Should be on a R/W filesystem so
+   # the file can be deleted after the boot completes.
 
 # Emulation/compatibility services provided by /etc/rc.d/abi
 sysvipc_enable=NO# Load System V IPC primitives at startup (or NO).
Index: etc/rc
===
--- etc/rc  (revision 256432)
+++ etc/rc  (working copy)
@@ -82,10 +82,15 @@
fi
 fi
 
+# If the firstboot sentinel doesn't exist, we want to skip firstboot scripts.
+if ! [ -e ${firstboot_sentinel} ]; then
+   skip_firstboot="-s firstboot"
+fi
+
 # Do a first pass to get everything up to $early_late_divider so that
 # we can do a second pass that includes $local_startup directories
 #
-files=`rcorder ${skip} /etc/rc.d/* 2>/dev/null`
+files=`rcorder ${skip} ${skip_firstboot} /etc/rc.d/* 2>/dev/null`
 
 _rc_elem_done=' '
 for _rc_elem in ${files}; do
@@ -107,7 +112,13 @@
 *) find_local_scripts_new ;;
 esac
 
-files=`rcorder ${skip} /etc/rc.d/* ${local_rc} 2>/dev/null`
+# The firstboot sentinel might be on a newly mounted filesystem; look for it
+# again and unset skip_firstboot if we find it.
+if [ -e ${firstboot_sentinel} ]; then
+   skip_firstboot=
+fi
+
+files=`rcorder ${skip} ${skip_firstboot} /etc/rc.d/* ${local_rc} 2>/dev/null`
 for _rc_elem in ${files}; do
case $_rc_elem_done in
* $_rc_elem *)continue ;;
@@ -116,6 +127,15 @@
run_rc_script ${_rc_elem} ${_boot}
 done
 
+# Remove the firstboot sentinel, and reboot if it was requested.
+if [ -e ${firstboot_sentinel} ]; then
+   rm ${firstboot_sentinel}
+   if [ -e ${firstboot_sentinel}-reboot ]; then
+   rm ${firstboot_sentinel}-reboot
+   kill -INT 1
+   fi
+fi
+
 echo ''
 date
 exit 0

Re: RFC: support for first boot rc.d scripts

2013-10-15 Thread Colin Percival
On 10/15/13 01:58, Nick Hibma wrote:
 Indeed... the way my patch currently does things, it looks for the
 firstboot sentinel at the start of /etc/rc, which means it *has* to
 be on /.  Making the path an rcvar is a good idea (updated patch
 attached) but we still need some way to re-probe for that file after
 mounting extra filesystems.
 
 In many cases a simple 
 
   test -f /firstboot && bla_enable='YES' || bla_enable='NO'
   rm -f /firstboot
 
 in your specific rc.d script would suffice. [...]
 I am not quite sure why we need /firstboot handling in /etc/rc.

Your suggestion wouldn't work if you have several scripts doing it;
the first one would remove the sentinel and the others wouldn't run.
In my EC2 code I have a single script which runs after all the others
and removes the sentinel file, but that still means that every script
has to be executed on every boot (even if just to check if it should
do anything); putting the logic into /etc/rc would allow rcorder to
skip those scripts entirely.

 Perhaps it is a better idea to make this more generic, to move the rc.d 
 script containing a 'runonce' keyword to a subdirectory as the last step in 
 rc (or make that an rc.d script in itself!). That way you could consider 
 moving it back if you need to re-run it. Or have an rc.d script setup 
 something like a database after installing a package by creating a rc.d 
 runonce script.
 
 Default dir could be ./run-once relative to the rc.d dir it is in, 
 configurable through runonce_directory .
 
 Note: The move would need to be done at the very end of rc.d to prevent 
 rcorder returning a different ordering and skipping scripts because of that.

I considered this, but decided that the most common use of "run once" would
be for "run when the system is first booted", and it would be much simpler
to provide just the firstboot functionality.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: RFC: support for first boot rc.d scripts

2013-10-15 Thread Colin Percival
On 10/15/13 13:09, Matthew Fleming wrote:
 We use something like this at work.  However, our version creates a file after
 the firstboot scripts have run, and doesn't run if the file exists.
 
 Is there a reason to prefer one choice over the other?  Naively I'd expect it
 to be better to run when the file doesn't exist, creating when done; it solves
 the problem of making sure the magic file exists before first boot, for the
 other polarity.

I don't see that making sure that the magic file exists is a problem, since
you'd also need to make sure you have knobs turned on in /etc/rc.conf and/or
extra rc.d scripts installed.

In a very marginal sense, deleting a file is safer than creating one, since if
the filesystem is full you can delete but not create.  It also seems to me that
the sensible polarity is that having something extra lying around makes extra
things happen rather than inhibiting them.

But probably the best argument has to do with upgrading systems -- if you update
a 9.2-RELEASE system to 10.1-RELEASE and there's a first boot script in that
new release, you don't want to have it accidentally get run simply because you
failed to create a /firstboot file during the upgrade process.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: RFC: support for first boot rc.d scripts

2013-10-14 Thread Colin Percival
Hi Nick,

On 10/14/13 00:59, Nick Hibma wrote:
 Sounds useful: We have nanobsd images that configure a hard disk if present, 
 but obviously only need to be run once.
 
 However: NanoBSD uses a memory disk for /etc and stores its permanent 
 scripts in /conf/* (/etc/rc.initdiskless) and/or /cfg (NanoBSD) so I doubt 
 whether the 'embedded systems' argument is of much use, as deleting the 
 script or flagging 'firstboot' is non-permanent.

Yes, it's hard to store state on diskless systems... but I figured
that anyone building a diskless system would know to not create a
"run firstboot scripts" marker.  And not all embedded systems are
diskless...

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: RFC: support for first boot rc.d scripts

2013-10-14 Thread Colin Percival
On 10/14/13 05:07, Hiroki Sato wrote:
 Colin Percival cperc...@freebsd.org wrote
   in 525b258f.3030...@freebsd.org:
 
 cp I've attached a very simple patch which makes /etc/rc:
 
 cp +if ! [ -e /var/db/firstboot ]; then
 cp + skip="$skip -s firstboot"
 cp +fi
 
  At this stage, it is possible that /var/db does not exist because it
  is before rc.d/mountcritlocal.

Ah, good point.  I guess we need something on / then?

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: RFC: support for first boot rc.d scripts

2013-10-14 Thread Colin Percival
On 10/14/13 10:00, Ian Lepore wrote:
 On Mon, 2013-10-14 at 09:51 -0700, Colin Percival wrote:
 Yes, it's hard to store state on diskless systems... but I figured
 that anyone building a diskless system would know to not create a
 "run firstboot scripts" marker.  And not all embedded systems are
 diskless...
 
 The embedded systems we create at $work have readonly root and mfs /var,
 but we do have writable storage on another filesystem.  It would work
 for us (not that we need this feature right now) if there were an rcvar
 that pointed to the marker file.  Of course to make it work, something
 would have to get the alternate filesystem mounted early enough to be
 useful (that is something we do already with a custom rc script).

Indeed... the way my patch currently does things, it looks for the
firstboot sentinel at the start of /etc/rc, which means it *has* to
be on /.  Making the path an rcvar is a good idea (updated patch
attached) but we still need some way to re-probe for that file after
mounting extra filesystems.

 Note that I'm not asking for any changes here, just babbling.

Babbling is good.  Between us we might babble a useful solution. ;-)

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Index: etc/defaults/rc.conf
===
--- etc/defaults/rc.conf(revision 256432)
+++ etc/defaults/rc.conf(working copy)
@@ -619,6 +619,9 @@
 accounting_enable=NO # Turn on process accounting (or NO).
 ibcs2_enable=NO  # Ibcs2 (SCO) emulation loaded at startup (or NO).
 ibcs2_loaders=coff   # List of additional Ibcs2 loaders (or NO).
+firstboot_sentinel="/firstboot"	# Scripts with "firstboot" keyword are run if
+   # this file exists.  Should be on a R/W filesystem so
+   # the file can be deleted after the boot completes.
 
 # Emulation/compatibility services provided by /etc/rc.d/abi
 sysvipc_enable=NO# Load System V IPC primitives at startup (or NO).
Index: etc/rc
===
--- etc/rc  (revision 256432)
+++ etc/rc  (working copy)
@@ -81,6 +81,9 @@
skip="$skip -s nojailvnet"
fi
 fi
+if ! [ -e ${firstboot_sentinel} ]; then
+   skip="$skip -s firstboot"
+fi
 
 # Do a first pass to get everything up to $early_late_divider so that
 # we can do a second pass that includes $local_startup directories
@@ -116,6 +119,13 @@
run_rc_script ${_rc_elem} ${_boot}
 done
 
+if [ -e ${firstboot_sentinel} ]; then
+   rm ${firstboot_sentinel}
+   if [ -e ${firstboot_sentinel}-reboot ]; then
+   rm ${firstboot_sentinel}-reboot
+   kill -INT 1
+   fi
+fi
 echo ''
 date
 exit 0

RFC: support for first boot rc.d scripts

2013-10-13 Thread Colin Percival
Hi all,

I've attached a very simple patch which makes /etc/rc:

1. Skip any rc.d scripts with the "firstboot" keyword if /var/db/firstboot
does not exist,

2. If /var/db/firstboot and /var/db/firstboot-reboot exist after running rc.d
scripts, reboot.

3. Delete /var/db/firstboot (and firstboot-reboot) after the first boot.

The purpose of this is to support "run on first boot" rc.d scripts.  These can
be useful for both virtual machines and embedded systems; unlike conventional
desktops and servers, these may have a lengthy gap between installing and
turning on the system.

As examples of what such scripts could do:

* In Amazon EC2, I use a first boot script to download an SSH public key
from EC2 so that users can log in to newly provisioned EC2 instances.

* Now that (starting from 10.0-BETA1) it is possible to use FreeBSD Update
to update everything on EC2 instances, I'm planning on writing a script which
runs 'freebsd-update fetch install' when the system first boots, and then
reboots if there were updates installed.  (I imagine this would be useful
to other embedded / VM providers too.)

* Once packages are provided (properly) for 10.0 I'd like to allow people to
specify a list of packages they want installed onto an EC2 instance and have
them downloaded and installed when the EC2 instance launches.

I'd like to get this into HEAD in the near future in the hope that I can
convince re@ that this is a simple enough (and safe enough) change to merge
before 10.0-RELEASE.

Comments?

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Index: etc/rc
===
--- etc/rc  (revision 256432)
+++ etc/rc  (working copy)
@@ -81,6 +81,9 @@
skip="$skip -s nojailvnet"
fi
 fi
+if ! [ -e /var/db/firstboot ]; then
+   skip="$skip -s firstboot"
+fi
 
 # Do a first pass to get everything up to $early_late_divider so that
 # we can do a second pass that includes $local_startup directories
@@ -116,6 +119,13 @@
run_rc_script ${_rc_elem} ${_boot}
 done
 
+if [ -e /var/db/firstboot ]; then
+   rm /var/db/firstboot
+   if [ -e /var/db/firstboot-reboot ]; then
+   rm /var/db/firstboot-reboot
+   kill -INT 1
+   fi
+fi
 echo ''
 date
 exit 0

panic: UMA: Increase vm.boot_pages with 32 CPUs

2013-08-12 Thread Colin Percival
Hi all,

A HEAD@254238 kernel fails to boot in EC2 with
 panic: UMA: Increase vm.boot_pages
on 32-CPU instances.  Instances with up to 16 CPUs boot fine.

I know there has been some mucking about with VM recently -- anyone want
to claim this, or should I start doing a binary search?

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


Re: Time to bump default VM_SWZONE_SIZE_MAX?

2012-08-24 Thread Colin Percival
On 08/24/12 07:13, John Baldwin wrote:
 On Friday, August 24, 2012 8:45:43 am Dag-Erling Smørgrav wrote:
 John Baldwin j...@freebsd.org writes:
 Note that on i386 you can't get more than 4GB of RAM without PAE, and if you
 have any modern x86 box with > 4GB of RAM, you are most likely running amd64
 on it, not i386.  I think i386 would be fine to just keep the limit it had.

 The limit we had was insufficient for 8 GB of swap.
 
 In absolute or practical terms?  Not all swap blocks are fully utilized.  At
 Y! the install script we used would compute the maximum theoretical swap zone
 needed and then cut it in half, and this worked quite well.  Also, keep in
 mind, this is for i386, not amd64.  At this point i386 is going to be used on
 smaller systems (e.g. netbooks, etc.), not servers that have lots of swap.

I'd like to see i386 bumped slightly, just so that the rule of "allocate swap
space equal to max(RAM, min(2*RAM, 8 GB))" (which I've seen in lots of places)
is more likely to be safe.  If I'm understanding things correctly, bumping from
32 MB up to 34.5 MB should give us a theoretical 16 GiB or a safe 8 GiB limit
on swap usage (2^17 structures which are 276 bytes each on i386).
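
The figures in the paragraph above can be checked with a few lines of shell arithmetic. This is a sketch under stated assumptions: the 2^17 structures and 276 bytes come from the message, while 32 pages per swblock (the figure given elsewhere in this thread) and 4 KiB pages are assumptions of the sketch.

```shell
#!/bin/sh
# Sanity check of the i386 numbers; variable names are illustrative only.
swblocks=$((1 << 17))        # 2^17 swblock structures (from the message)
swblock_bytes=276            # bytes per structure on i386 (from the message)
pages_per_swblock=32         # assumption: pages tracked per swblock
page_kib=4                   # assumption: 4 KiB pages

zone_kib=$((swblocks * swblock_bytes / 1024))    # 35328 KiB = 34.5 MiB
swap_max_gib=$((swblocks * pages_per_swblock * page_kib / 1024 / 1024))
swap_safe_gib=$((swap_max_gib / 2))              # assuming 50% swblock utilization

echo "zone size: ${zone_kib} KiB"
echo "theoretical swap: ${swap_max_gib} GiB, safe swap: ${swap_safe_gib} GiB"
```

The sizes are kept in KiB so that the intermediate products stay small even on shells with 32-bit arithmetic.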

But I agree that the real issue was with amd64, not i386.

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Re: Time to bump default VM_SWZONE_SIZE_MAX?

2012-08-14 Thread Colin Percival
On 08/13/12 14:23, Peter Jeremy wrote:
 On 2012-Aug-12 15:44:07 -0700, Colin Percival cperc...@freebsd.org
 wrote:
 If I'm understanding things correctly, the maxswzone value -- set by
 the kern.maxswzone loader tunable or to VM_SWZONE_SIZE_MAX by default --
 should be approximately 9 MiB per GiB of swap space.
 
 I'm not sure how you got that value.  By default, struct swblock is 288
 bytes (280 bytes on 32-bit archs) and can store up to 32 pages of swap (the
 comment in vm/swap_pager.c:swap_pager_swap_init() is wrong). For x86, this
 is 2.25 MiB per GiB (best case).

I got that value from a previous mailing list discussion -- I think it was
based on the (incorrect) value of 16 pages per swblock and not exceeding
50% swblock utilization.

 Realistically, I'd say that the default VM_SWZONE_SIZE_MAX can handle about
 9GB swap (at least, that was my experience).

The 50% utilization rule of thumb would make it 7 GB, but yes, same order
of magnitude.
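
For anyone wanting to reproduce the numbers traded in this exchange, here is a small sketch. It assumes 4 KiB pages and uses the 288-byte swblock size quoted above; zone_kib_per_gib is a hypothetical helper name, not anything from the kernel sources.

```shell
#!/bin/sh
# zone_kib_per_gib SWBLOCK_BYTES PAGES_PER_SWBLOCK UTIL_PERCENT
# Prints the swap zone memory (in KiB) needed per GiB of swap,
# assuming 4 KiB pages.
zone_kib_per_gib() {
	_pages_per_gib=$((1024 * 1024 / 4))      # 262144 pages per GiB
	_effective=$(($2 * $3 / 100))            # pages actually used per swblock
	echo $((_pages_per_gib / _effective * $1 / 1024))
}

best_case=$(zone_kib_per_gib 288 32 100)    # 2304 KiB = 2.25 MiB per GiB
old_figure=$(zone_kib_per_gib 288 16 50)    # 9216 KiB = 9 MiB per GiB

# Swap covered by the 32 MiB default zone, in tenths of a GiB.
zone_kib=$((32 * 1024))
best_tenths=$((zone_kib * 10 / best_case))  # 142, i.e. about 14.2 GiB
half_tenths=$((best_tenths / 2))            # 71, i.e. about 7 GiB at 50% utilization
old_tenths=$((zone_kib * 10 / old_figure))  # 35, i.e. about 3.5 GiB

echo "best case: ${best_case} KiB per GiB, old figure: ${old_figure} KiB per GiB"
echo "32 MiB zone covers (tenths of GiB): ${best_tenths} best, ${half_tenths} at 50%, ${old_tenths} with the old figure"
```

The 9 MiB per GiB figure falls out of 16 pages per swblock at 50% utilization, and the 7 GB estimate falls out of 32 pages per swblock at 50% utilization, matching the explanation given in the reply.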

 BTW, if you plan on allocating lots of swap, be aware that each swap device
 is limited to 32GiB - see vm/swap_pager.c:swaponsomething().

Yes, noted.  I'm not actually using lots of swap myself, but I was writing
code for EC2 instances to set up reasonable amounts of swap at boot time,
and I don't want to accidentally autoconfigure more swap than FreeBSD can
safely use.
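
A boot-time sizing helper along these lines might look like the sketch below. It is not the EC2 code: swap_size_mib is a hypothetical function, the max(RAM, min(2*RAM, 8 GB)) rule is the one quoted elsewhere in this thread, and clamping to the 32 GiB per-device limit mentioned above is just one possible way to avoid configuring more swap than can be used.

```shell
#!/bin/sh
# swap_size_mib RAM_MIB [CAP_MIB]  (hypothetical helper)
# Prints a swap size in MiB following max(RAM, min(2*RAM, 8 GB)),
# clamped to CAP_MIB (default 32768 MiB = 32 GiB).
swap_size_mib() {
	_ram=$1
	_cap=${2:-32768}
	_size=$((_ram * 2))
	if [ "$_size" -gt 8192 ]; then _size=8192; fi     # min(2*RAM, 8 GB)
	if [ "$_ram" -gt "$_size" ]; then _size=$_ram; fi  # max(RAM, ...)
	if [ "$_size" -gt "$_cap" ]; then _size=$_cap; fi  # assumed per-device clamp
	echo "$_size"
}

for ram_mib in 1024 4096 16384 65536; do
	echo "RAM ${ram_mib} MiB -> swap $(swap_size_mib "$ram_mib") MiB"
done
```

The cap is a parameter so that a different limit (for example one derived from kern.maxswzone) could be substituted without touching the rule itself.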

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid



Time to bump default VM_SWZONE_SIZE_MAX?

2012-08-12 Thread Colin Percival
Hi all,

If I'm understanding things correctly, the maxswzone value -- set by the
kern.maxswzone loader tunable or to VM_SWZONE_SIZE_MAX by default -- should
be approximately 9 MiB per GiB of swap space.

The current default for VM_SWZONE_SIZE_MAX was set in August 2002 to 32 MiB;
meaning that anyone who wants to use more than ~ 3.5 GB of swap space ought
to set kern.maxswzone in /boot/loader.conf.

Is it time to increase this default on amd64?  (I understand that keeping the
value low on i386 is important due to KVA limitations, but amd64 has far more
address space available...)

-- 
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid


mount -u /path/containing/a/symlink broken in 9.0

2011-12-13 Thread Colin Percival
Hi all,

I just discovered after upgrading the portsnap buildbox from 8.2 to 9.0-rc3 that
# mount -u /path/containing/a/symlink
now fails with 'not currently mounted'.  Can anyone tell me if this change was
deliberate?

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid


Re: freebsd-update not checking disk space?

2011-10-25 Thread Colin Percival
On 10/25/11 00:52, René Ladan wrote:
 I tried to upgrade a server at work from 8.2-RELEASE-i386 to 9.0-RC1-i386
 using freebsd-update.  Running 'freebsd-update install' to install the new
 kernel failed because there was insufficient disk space.  This resulted in
 freebsd-update thinking everything was ok but left the kernel unbootable.

Yes, this is a known issue in freebsd-update -- it needs to be much smarter
about detecting and handling errors.  Unfortunately I haven't had time to
deal with this (and it's a lot of work).

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid


Re: portsnap5 problem, portsnap error handling

2011-10-04 Thread Colin Percival
On 10/04/11 14:08, Jilles Tjoelker wrote:
 The important part is the error from [. Because the check is for
 inequality, in case of a [ syntax error the equal path is taken and
 the script continues as if everything is fine.
 
 The script arrives there because of a missing backslash so that the
 fetch(1) command's exit status is not checked.
 
 The below patch should fix this fairly simply. [...]

Aha!  I don't know how many times I've looked at that code and thought "I
don't understand, why isn't it returning from this function?"  Please feel
free to commit this fix.
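
The failure mode Jilles describes is easy to reproduce in isolation. The sketch below is not the portsnap code; check_hash and check_hash_safe are hypothetical functions that only illustrate how an inequality test silently takes the "equal" path when [ itself reports a syntax error.

```shell
#!/bin/sh
# check_hash GOT WANT: deliberately fragile, unquoted comparison.
check_hash() {
	# With $1 empty this expands to `[ != WANT ]`, which [ rejects as a
	# syntax error.  The error status is non-zero, so the mismatch
	# branch is skipped and the function carries on as if all were well.
	if [ $1 != $2 ]; then
		echo "mismatch"
		return 1
	fi
	echo "ok"
}

# check_hash_safe GOT WANT: quoting keeps the [ expression well-formed.
check_hash_safe() {
	if [ "$1" != "$2" ]; then
		echo "mismatch"
		return 1
	fi
	echo "ok"
}

fragile=$(check_hash "" abc123 2>/dev/null)
robust=$(check_hash_safe "" abc123) || true

echo "unquoted test with empty input: ${fragile}"
echo "quoted test with empty input:   ${robust}"
```

Quoting the operands (or checking the exit status of the command that produced them) is what makes the empty-input case report a mismatch instead of falling through.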

For the benefit of the list: It looks like there was a short period this
morning when several portsnap mirrors were out of sync due to the mirroring
cron jobs running into a network outage somewhere around portsnap-master.
Everything should be back to normal now.

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid


(CTRL-C to abort) console spam

2010-12-28 Thread Colin Percival
Hi all,

The '(CTRL-C to abort)' which gets printed while dumping is irritating me
because EC2's console has a very limited buffer and having this spammed
makes it impossible to see any printfs immediately prior.  (Turning off
dumping solves the problem, but creates a worse problem, namely a lack of
dumps.)  I'd like to add a sysctl to optionally disable this message, with
a default behaviour of printing the message.

Two questions for the list:
1. Any objections to adding such a sysctl?
2. What colour should the sysctl^W^W^W^W name should the sysctl be given?

I'm leaning towards debug.quietdump, but it's not a strong preference.

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid


Re: (CTRL-C to abort) console spam

2010-12-28 Thread Colin Percival
Hi all,

On 12/28/10 07:37, Colin Percival wrote:
 The '(CTRL-C to abort)' which gets printed while dumping is irritating me
 because EC2's console has a very limited buffer and having this spammed
 makes it impossible to see any printfs immediately prior.

Never mind, I've had pointed out to me that the '(CTRL-C to abort)' is only
printed when there's console input, and I've found a bug in the Xen console
code which explains why FreeBSD was thinking that there was input.

Move along, nothing to see here. :-)

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid


TIMEOUT (was: Re: Official request: Please make GNU grep the default)

2010-08-14 Thread Colin Percival
Hi all,

Over the past 18 hours, I've received 22 emails in this thread.

In email number 5, sent a mere 25 minutes after the thread started, gabor@
said that he agreed that the performance penalty in BSD grep compared to
GNU grep was excessive and that he was going to revert back to having GNU
grep as the default.

Why are we still discussing this?  If and when gabor@ (or someone else) has
improved BSD grep performance and thinks that it's time to flip the switch
back again, I'm sure there will be ample opportunity for everybody to run
their favourite grep benchmarks, report numbers, and discuss the performance
differences before BSD grep is (re-)made the default.

-- 
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid


user:sys time ratio

2003-11-30 Thread Colin Percival
  I've got a system running 5.2-BETA from 27/11/03, with the malloc_abort, 
malloc_junk, DEBUG=-g, DDB, INVARIANT*, and WITNESS* debugging options 
changed (as was done in 5.1-RELEASE).
  When running `make buildworld`, I see large amounts of sys time; e.g., 27 
minutes user and 14 minutes sys for building 5.2, or 14 minutes user and 10 
minutes sys for building 4.9.  I expected the ratio of user:sys to be much 
larger than this, and mailing list traffic indicates that a 4:1 ratio is 
typical.  (FWIW, prior to changing the debugging options, the user:sys time 
ratio was around 1:1.)
  Can anyone suggest why the kernel seems to be behaving so sluggishly?

  The system hardware is P4 2.8GHz, 865G, 2GB DDR, IDE drives; there is 
very little disk activity, so I'm sure that isn't the issue; and disabling 
HTT results in about a 2% improvement in both user and sys times.

Colin Percival



Re: user:sys time ratio

2003-11-30 Thread Colin Percival
At 15:30 30/11/2003 +0100, Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Colin Percival writes:
   When running `make buildworld`, I see large amounts of sys time; e.g., 27
minutes user and 14 minutes sys for building 5.2, or 14 minutes user and 10
minutes sys for building 4.9.  I expected the ratio of user:sys to be much
larger than this, and mailing list traffic indicates that a 4:1 ratio is
typical.
I've seen UNIX systems have typical system/user splits from 1/9 to 9/1
it all depends on what you're doing.
  Sure, but buildworld is a fairly well-defined benchmark; I wouldn't 
expect to see such a large difference when running exactly the same code on 
different systems.

Colin Percival



Re: user:sys time ratio

2003-11-30 Thread Colin Percival
  Robert Watson suggested that I compare performance from UP and SMP kernels:

# /usr/bin/time -hl sh -c 'make -s buildworld 2>&1' > /dev/null
                Real        User        Sys
  UP kernel     38m33.29s   27m10.09s   10m59.15s
     (retest)   38m33.18s   27m04.40s   11m05.73s
  SMP w/o HTT   41m01.54s   27m10.27s   13m29.82s
     (retest)   39m47.50s   27m08.05s   12m12.20s
  SMP w/HTT     42m17.16s   28m12.82s   14m04.93s
     (retest)   44m09.61s   28m15.31s   15m44.86s
  That enabling HTT degrades performance is not surprising, since I'm not 
passing the -j option to make; but a 5% performance delta between UP and 
SMP kernels is rather surprising (to me, at least), and the fact that the 
system time varies so much on the SMP kernel also seems peculiar.
  Is this normal?
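
The size of these differences can be worked out directly from the table above. What follows is only a minimal Python sketch: the names `to_seconds`, `runs` and `mean_times` are illustrative, the figures are copied from the runs listed above, and averaging the two runs of each configuration is an assumption about how to summarize them, not something stated in this message.

```python
import re

def to_seconds(stamp):
    """Convert a time(1)-style value such as '38m33.29s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)

# (real, user, sys) for the two runs of each configuration, from the table.
runs = {
    "UP kernel":   [("38m33.29s", "27m10.09s", "10m59.15s"),
                    ("38m33.18s", "27m04.40s", "11m05.73s")],
    "SMP w/o HTT": [("41m01.54s", "27m10.27s", "13m29.82s"),
                    ("39m47.50s", "27m08.05s", "12m12.20s")],
    "SMP w/HTT":   [("42m17.16s", "28m12.82s", "14m04.93s"),
                    ("44m09.61s", "28m15.31s", "15m44.86s")],
}

def mean_times(name):
    """Average real, user and sys seconds over the runs of one configuration."""
    columns = zip(*runs[name])
    return tuple(sum(map(to_seconds, col)) / len(col) for col in columns)

# Compare every configuration against the UP kernel baseline.
base_real, base_user, base_sys = mean_times("UP kernel")
for name in runs:
    real, user, sys_time = mean_times(name)
    print(f"{name:12s} real {100 * (real / base_real - 1):+5.1f}%  "
          f"sys {100 * (sys_time / base_sys - 1):+5.1f}%  "
          f"user:sys {user / sys_time:.2f}")
```

Averaged this way, the SMP kernel without HTT comes out roughly 5% slower in real time than the UP kernel, and nearly all of that difference is in the sys column rather than the user column.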

Colin Percival



Re: 40% slowdown with dynamic /bin/sh

2003-11-25 Thread Colin Percival
At 00:23 26/11/2003 -0500, Michael Edenfield wrote:
Static /bin/sh:
  real    385m29.977s
  user    111m58.508s
  sys     93m14.450s
Dynamic /bin/sh:
  real    455m44.852s
  user    113m17.807s
  sys     103m16.509s
  Given that user+sys << real in both cases, it looks like you're running 
out of memory; it's not surprising that dynamic linking has an increased 
cost in such circumstances, since reading the diverse files into memory 
will take longer than reading a single static binary.
  I doubt many systems will experience this sort of performance delta.
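
The gap between CPU time and wall-clock time can be made explicit with a little arithmetic. This is a minimal Python sketch using the figures quoted above; the names `timings`, `cpu_fraction` and `wait_seconds` are illustrative. It only shows how much of the real time is not accounted for by user+sys. That the unaccounted time is due to memory pressure is the hypothesis stated above, not something the numbers prove.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds

# Figures copied from the quoted time output.
timings = {
    "static":  {"real": to_seconds(385, 29.977),
                "user": to_seconds(111, 58.508),
                "sys":  to_seconds(93, 14.450)},
    "dynamic": {"real": to_seconds(455, 44.852),
                "user": to_seconds(113, 17.807),
                "sys":  to_seconds(103, 16.509)},
}

def cpu_fraction(t):
    """Fraction of wall-clock time spent executing (user + sys)."""
    return (t["user"] + t["sys"]) / t["real"]

def wait_seconds(t):
    """Wall-clock time not accounted for by user + sys."""
    return t["real"] - t["user"] - t["sys"]

for name, t in timings.items():
    print(f"{name:8s} cpu busy {100 * cpu_fraction(t):.0f}%  "
          f"waiting {wait_seconds(t) / 60:.0f} min")

# How the extra wall-clock time of the dynamic run splits up.
extra_cpu = ((timings["dynamic"]["user"] + timings["dynamic"]["sys"])
             - (timings["static"]["user"] + timings["static"]["sys"]))
extra_wait = wait_seconds(timings["dynamic"]) - wait_seconds(timings["static"])
print(f"extra cpu {extra_cpu / 60:.0f} min, extra waiting {extra_wait / 60:.0f} min")
```

In both runs the CPU is busy for only about half of the elapsed time, and most of the additional wall-clock time in the dynamic case is waiting rather than computing.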

Colin Percival



Re: 40% slowdown with dynamic /bin/sh

2003-11-24 Thread Colin Percival
<Newbie developer question>
  Would it be possible to ship a static /bin/sh and a dynamic 
/bin/dynamic-sh, with /bin/sh execing /bin/dynamic-sh if it is invoked 
interactively?  If I'm understanding the issues correctly, a dynamic 
/bin/sh is desired for the benefit of interactive users, while the 
performance of a static /bin/sh is only an issue in non-interactive cases.
</Newbie developer question>
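
The dispatch being asked about could look roughly like the following. This is a hypothetical Python sketch of the decision logic only, not how /bin/sh is or would be implemented: `DYNAMIC_SH` is the path suggested in the question, `is_interactive` and `dispatch` are invented names, and the interactivity test is a simplification (interactive if -i is given, or if there are no arguments and both stdin and stderr are terminals).

```python
import os
import sys

# Hypothetical path, taken from the proposal; no such binary is assumed to exist.
DYNAMIC_SH = "/bin/dynamic-sh"

def is_interactive(args, stdin_is_tty, stderr_is_tty):
    """Simplified approximation of when a shell considers itself interactive."""
    if "-i" in args:
        return True
    if args:
        # Simplification: any other argument is treated as a script or -c string.
        return False
    return stdin_is_tty and stderr_is_tty

def dispatch(args):
    """Replace this process with the dynamic shell when invoked interactively."""
    if is_interactive(args, os.isatty(0), os.isatty(2)):
        os.execv(DYNAMIC_SH, [DYNAMIC_SH] + list(args))
    # Otherwise the static shell would simply carry on (not modeled here).
```

The point of the design is that the check happens once at startup, so non-interactive invocations, such as the many short-lived shells started during a build, never pay the dynamic-linking cost.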

Colin Percival



Re: Unfortunate dynamic linking for everything

2003-11-18 Thread Colin Percival
At 17:06 18/11/2003 -0700, Scott Long wrote:
Our rationale for encouraging Gordon is as follows:

1.  4.x upgrade path:  As we approach 5-STABLE, a lot of users might want
to upgrade from 4-STABLE.  Historically in 4.x, the / partition has
been very modest in size.  One just simply cannot cram the bloat that
has grown in 5.x into a 4.x partition scheme.  Of course there is the
venerable 'dump - clean install - restore' scheme, but we were looking
for something a little more user-friendly.
  Of course, making / dynamic results in the added complication of removing 
old libraries from /usr/lib, now that some of them have moved to /lib...

3.  Binary security updates: there is a lot of interest in providing a
binary update mechanism for doing security updates.  Having a dynamic
root means that vulnerable libraries can be updated without having to
update all of the static binaries that might use them.
  As far as I'm concerned, this is a non-issue.  Identifying which static 
binaries need to be replaced is now a solved problem, replacing them is 
easy, and if binary patches are used, there is effectively no impact on 
bandwidth usage either.

  On the issue of performance, however: I know people have benchmarked 
fork-bombs, but has anyone done benchmarks with moderate numbers of 
long-lived, library-intensive, processes?  It seems to me that dynamic 
linking could have caching advantages.

Colin Percival



Re: Unfortunate dynamic linking for everything

2003-11-18 Thread Colin Percival
At 21:54 18/11/2003 -0500, Garance A Drosihn wrote:
Many freebsd users (me for one) are still living on a modem,
where even one bump of 1.5 meg is a significant issue...

Remember that the issue we're talking about is security
updates, not full system upgrades.  Everyone would want
the security updates, even if they're on a slow link.
  If people rebuild from source, the binary sizes don't affect the update 
time.  If people use FreeBSD Update -- which is the only binary security 
update tool around -- then they're using binary patches, and that 1.5MB is 
actually closer to 10 kB.
  The bandwidth usage associated with updating a system is only a concern 
for people who roll their own binary update mechanism -- and those people 
aren't likely to be doing everything over a modem.
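
To put the two sizes in terms of download time, here is a rough Python sketch. The 56 kbit/s link speed is an assumption (the thread only says "a modem"), protocol overhead and compression are ignored, and `transfer_seconds` is an illustrative name.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealized transfer time: payload bits divided by the link rate."""
    return size_bytes * 8 / link_bits_per_second

MODEM_BPS = 56_000       # assumption: nominal 56k modem, not stated in the thread
FULL_BINARY = 1.5e6      # the 1.5 MB figure quoted above
BINARY_PATCH = 10e3      # the roughly 10 kB binary patch

full = transfer_seconds(FULL_BINARY, MODEM_BPS)
patch = transfer_seconds(BINARY_PATCH, MODEM_BPS)
print(f"full binary: {full:.0f} s, binary patch: {patch:.1f} s")
```

Under these assumptions the difference is between a few minutes and a second or two per updated binary.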

Colin Percival
