Re: Possible PEBKAC bug for fwget(8)?

2023-07-07 Thread Peter Jeremy
On 2023-Jul-07 08:03:40 +0100, Graham Perrin  wrote:
>PCI pictured at
><https://en.wikipedia.org/wiki/Peripheral_Component_Interconnect>, somehow I
>don't imagine finding that type of slot inside the HP EliteBook where I ran
>the command ;-)

Whilst you probably don't have a full-size PCI or PCIe connector in
your laptop, it's very likely that it has a Mini PCIe connector for
the WiFi adapter.  Even without that, there are virtual PCI buses
inside your CPU chip - have a look at the output of "pciconf -lv".
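
To pick the network controllers out of that output, a small filter along these lines may help. It is only a sketch: the function name is made up, and it assumes each device starts on an unindented line of the form "name@pciD:B:S:F: class=0x...", with PCI base class 0x02 meaning "network controller".

```sh
# Print the names of PCI devices whose class code marks them as network
# controllers (base class 0x02), reading "pciconf -lv" output on stdin.
# Assumption: device lines are unindented; indented detail lines are skipped.
pci_network_devices() {
    awk '/^[^ \t]/ && /class=0x02/ { sub(/@.*/, "", $1); print $1 }'
}
```

Typical use would be "pciconf -lv | pci_network_devices"; a WiFi adapter on Mini PCIe should show up here with its driver name, or as "noneN" if no driver has attached.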

-- 
Peter Jeremy


signature.asc
Description: PGP signature


ntpd fails on recent -current/arm64

2023-04-23 Thread Peter Jeremy
Somewhere between c283016-g607bc91d90a3 and c283077-g7f658f99f7ed,
some change in the kernel has made ntpd stop working on my arm64 test
box.  (My amd64 test box is a couple of days behind so I'm not sure if
it's arm-specific).

What I've identified so far:
* The problem is in the kernel, not userland.
* The impact seems to be limited to ntpd (in particular, ntpdate works).
* ntpd appears to be correctly exchanging NTP packets with peers.
* ntpd is not responding to "ntpq -p" queries.
* ntp_gettime and ntp_adjtime both return TIME_ERROR to ntptime.

I've looked through the commits and, beyond much of netinet being
roto-tilled, I can't see anything obvious.

Is anyone else seeing anything similar?  Can anyone suggest where
to look next?

-- 
Peter Jeremy


Re: Beadm can't create snapshot

2022-08-23 Thread Peter Jeremy
On 2022-Aug-23 15:19:34 +0200, Ronald Klop  wrote:
>From: Kyle Evans
>> I was not aware that beadm touches loader.conf, but I find that
>> slightly horrifying. I won't personally make bectl do that, but I
>> guess I could at least document that it doesn't...
>
>Today I looked up something for boot environments myself and read this: 
>https://wiki.freebsd.org/BootEnvironments#Setting_Boot_Dataset
>
>"In order for boot environments to be effective, you must let the bootfs zpool 
>property control which dataset gets mounted as the root. Particularly, 
>/etc/fstab must be purged of any / mount, and /boot/loader.conf must not be 
>setting vfs.root.mountfrom directly. "
>
>So it is documented somewhere at least.

Looking at the wiki history, Kyle wrote that in January 2020.  I
wonder if he recalls where that requirement came from.

I've gone rummaging through the mailing list history and other wiki
pages.  It seems that vfs.root.mountfrom used to be required - e.g.
 https://lists.freebsd.org/pipermail/freebsd-fs/2011-September/012482.html
 https://lists.freebsd.org/pipermail/svn-src-head/2011-October/030641.html
and people wanted to change that - e.g.
 https://lists.freebsd.org/pipermail/freebsd-current/2009-October/012933.html
 https://lists.freebsd.org/pipermail/freebsd-fs/2010-March/008010.html
resulting in it becoming optional in May 2012:
 https://lists.freebsd.org/pipermail/svn-src-head/2012-May/036902.html

Based on the quoted wiki entry, it seems that sometime between May
2012 and January 2020, vfs.root.mountfrom went from "must be set" to
"must not be set" and I can't find anywhere where that is publicised.
This is a serious problem because we now have the situation where
some documentation still says to set vfs.root.mountfrom - e.g.
 https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/Mirror step 2.6
and people are still using it without being warned that it shouldn't
be used - e.g. the thread starting
 https://lists.freebsd.org/pipermail/freebsd-fs/2020-July/028351.html

I've had a look at the beadm source and it preserves/updates
vfs.root.mountfrom if it's present in loader.conf but doesn't add it
if it's not present.

IMO, if bectl isn't going to update loader.conf, it needs to warn and
fail if loader.conf contains a vfs.root.mountfrom that points to a
BE that's different to bootfs.  (And ideally, a similar check of
/etc/fstab, though beadm doesn't touch that).
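
The check proposed above could look roughly like this. It is a sketch only and not something bectl does: the function names are made up, and it assumes the usual vfs.root.mountfrom="zfs:<dataset>" form in loader.conf.

```sh
# Print the value of vfs.root.mountfrom from a loader.conf-style file,
# or nothing if it is not set.  The last assignment wins.
loader_mountfrom() {
    # $1: path to loader.conf
    sed -n 's/^[[:space:]]*vfs\.root\.mountfrom[[:space:]]*=[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Return 0 if loader.conf agrees with the given bootfs dataset, 1 otherwise.
# $1: path to loader.conf, $2: bootfs dataset name
check_mountfrom() {
    mf=$(loader_mountfrom "$1")
    if [ -z "$mf" ]; then
        return 0    # not set, so bootfs alone decides
    fi
    if [ "$mf" = "zfs:$2" ]; then
        return 0
    fi
    echo "warning: vfs.root.mountfrom=$mf but bootfs=$2" >&2
    return 1
}
```

On a live system the second argument could come from "zpool get -H -o value bootfs zroot", where "zroot" stands in for the actual pool name.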

-- 
Peter Jeremy


Re: Beadm can't create snapshot

2022-08-22 Thread Peter Jeremy
On 2022-Aug-22 10:56:51 +0200, "Patrick M. Hausen"  wrote:
>> On 22.08.2022 at 10:45, Peter Jeremy wrote:
>> On 2022-Aug-17 18:07:20 +0200, "Patrick M. Hausen"  wrote:
>>> Isn't beadm retired in favour of bectl?
>> 
>> 2) "bectl activate" doesn't update /boot/loader.conf so the wrong
>>   root filesystem is mounted.
>
>You mean the vfs.root.mountfrom option? I thought that, too, was deprecated and
>replaced by the bootfs property of the zpool.

I've looked through mailing list archives and searched the 'net and
haven't found anything saying vfs.root.mountfrom is deprecated.
loader(8) mentions that it will fall back to using "currdev" if there's
no root entry in /etc/fstab and vfs.root.mountfrom isn't set.

At the very least, it's an undocumented incompatibility between beadm
and bectl: I can't take an existing system that's using beadm and just
switch to using bectl.

-- 
Peter Jeremy


Re: Beadm can't create snapshot

2022-08-22 Thread Peter Jeremy
On 2022-Aug-17 18:07:20 +0200, "Patrick M. Hausen"  wrote:
>Isn't beadm retired in favour of bectl?

bectl still has a number of bugs:
1) The output from "bectl list" is in filesystem/bename order rather
   than creation date order.  This is an issue if you use (eg) git
   commit hashes as the name.
2) "bectl activate" doesn't update /boot/loader.conf so the wrong
   root filesystem is mounted.
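
As a workaround for item 1, the script-friendly output can be re-sorted by hand. This is a sketch under assumptions: the function name is made up, and it relies on "bectl list -H" printing tab-separated fields with the creation time last, formatted as "YYYY-MM-DD HH:MM" so that a text sort is also a date sort.

```sh
# Read "bectl list -H" output on stdin and print the BE names,
# oldest first.  Ties on the timestamp fall back to name order.
sort_bes_by_creation() {
    awk -F '\t' '{ print $NF "\t" $1 }' | sort | cut -f 2
}
```

Usage would be "bectl list -H | sort_bes_by_creation".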

That said, "bectl create" appears to be a workable replacement for
"beadm create" and avoids the current "'snapshots_changed' is
readonly" bugs.

-- 
Peter Jeremy


Re: recover deleted file

2022-04-16 Thread Peter Jeremy
On 2022-Apr-17 01:13:02 +0300, Sami Halabi  wrote:
>I understand its hard to undelete since no one designed UFS/ZFS to do so..
>that why I asked in later replies to see if someone would step in and
>implement such a "feature" and I suggested some directions/thoughts.

As you point out, neither UFS nor ZFS was designed to support an
"undelete" function: Once an inode has no references (open files
or directory entries), the inode and all associated data blocks are
returned to the free list and could be used by a subsequent allocation.

What semantics would you like UFS or ZFS to implement instead?  Is it
just that the inode and associated data blocks should stay in limbo
for some period?  If so, what controls the period?  What if a file is
truncated to 0 or overwritten before being unlinked?  How much would
you be willing to pay for "undelete" functionality?

>As soren@ suggested in later reply it maybe would be easier to implement
>custom rm script that moves files to "Recycle bin" directory (and empty it
>after some period)

Alternatively, you could alias "rm" to "rm -i".

>but as a programmer I know that perfection is needed :)
>so It might start as a simple task and end in many what-if's
>(unfortunately I did my last C programming in late 2003!).

This doesn't need to be C.  You could do this in your scripting
language of choice.  Or you could offer to pay someone to do this
for you.
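
A minimal shell sketch of the "Recycle bin" idea, for illustration only: TRASH_DIR, the function names and the directory naming scheme are all made up here, and two files with the same basename in one call would overwrite each other.

```sh
# Where binned files go; override by setting TRASH_DIR beforehand.
TRASH_DIR="${TRASH_DIR:-$HOME/.trash}"

trash() {
    # One directory per invocation; its mtime records when the files were
    # binned, since mv keeps each file's own mtime unchanged.
    dest="$TRASH_DIR/$(date +%Y%m%d-%H%M%S).$$"
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        mv -- "$f" "$dest/" || return 1
    done
}

trash_expire() {
    # $1: age in days; remove binned directories older than that.
    find "$TRASH_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +"$1" \
        -exec rm -rf -- {} +
}
```

Something like "trash report.txt" then replaces "rm report.txt", and a periodic "trash_expire 30" empties the bin. Note that this only covers files removed through the wrapper; it does nothing for truncation or overwrites.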

>What amazes me is that this "feature" was asked too much in the last decade
>or two and no one ever implemented it, maybe it's not needed in daily
>usage, but in disasters it would be super useful, save admins many time
>and nerves..

I went rummaging back through my mail archives and it actually doesn't
seem to come up that often.  You seem to be about the 3rd person this
century on the lists I read.  I did find a discussion in zfs-discuss
from May/June 2006 about supporting undelete but it seems that no
agreement on the desired behaviour was achieved.

>For now I did some backup tools locally and used chflags to mark them
>undeletable so I wouldn't do that mistake again,

You could also consider snapshots - both UFS and ZFS support snapshots.

If the information is very critical (you mentioned legal consequences)
then you might like to consider real-time replication of the MySQL redo
logs to another system - though that won't necessarily protect you
from someone accidentally doing a "DELETE FROM xxx;" or "DROP TABLE xxx;".

-- 
Peter Jeremy


Re: Rock64 configuration fails to boot for main 22c4ab6cb015 but worked for main 06bd74e1e39c (Nov 21): e.MMC mishandled?

2021-12-08 Thread Peter Jeremy
On 2021-Dec-09 08:19:30 +0100, Emmanuel Vadot  wrote:
>
> Hi Mark,
>
>On Wed, 8 Dec 2021 20:36:20 -0800
>Mark Millard via freebsd-current  wrote:
>
>> [ Note: w...@freebsd.org is only a guess, based on:
>> https://lists.freebsd.org/archives/dev-commits-src-main/2021-December/001931.html
>>  ]
>> 
>> Attempting to update to:
>> 
>> main-n251456-22c4ab6cb015-dirty: Tue Dec  7 19:38:53 PST 2021
>> 
>> resulted in boot failure (showing some boot -v output):
[hang just before root is mounted]
> Could you try reverting 
>8661e085fb953855dbc7059f21a64a05ae61b22c "mmc: Fix HS200/HS400
>capability check" and let me know ?

I had exactly the same boot failure but was still working backwards
through the root mount code trying to isolate the issue.  Reverting
8661e085fb953855dbc7059f21a64a05ae61b22c solves the problem for me.
I'd noticed the mmc1 difference and mmcsd1 error:
 mmc1:  bus: 8bit, 200MHz (HS200 timing)
 mmc1:  memory: 30310400 blocks, erase sector 1024 blocks
mmc1: setting transfer rate to 150.000MHz (HS200 timing)

but I didn't think it was the cause.

I had tracked down that the hang was somewhere between
https://cgit.freebsd.org/src/tree/sys/kern/vfs_mountroot.c#n779 and
https://cgit.freebsd.org/src/tree/sys/kern/vfs_mountroot.c#n1008
which led me to suspect that the problem might be in the geom
layer (eg g_waitidle()) but was still considering where to add
my next tranche of printf's when I saw Mark's mail.

-- 
Peter Jeremy


Re: Install to ZFS root is using device names hence failing when device tree is changed.

2021-09-07 Thread Peter Jeremy
On 2021-Sep-06 17:45:31 +0200, Karel Gardas  wrote:
>just installed 14-current snapshot from 2.9. on uefi amd64 machine. 
>Installed from USB memstick which was detected as da0 into the ssd 
>hanging on usb3 in external enclosure which was detected as da1.
>
>ZFS root pool is then using /dev/da1p3 as swap and /dev/da1p1 as 
>/boot/efi and probably also something as root zpool.
>
>Anyway, expected thing happen. When I pulled out USB stick identified as 
>da0 on reboot, the drive on USB3 switch from da1 to da0 and result is 
>unbootable system with complains about various /dev/da1xx drives missing 
>for swap efi boot etc.

Can you give more details about exactly what the errors are and when they
occur during the boot cycle?  In particular:
* Low-level boot (anything prior to the FreeBSD kernel) knows nothing
  about da0 or da1, so any problems there are associated with your
  BIOS config, not FreeBSD.
* The swap partition will, by default, appear as a hard-wired device
  name in /etc/fstab - that will definitely need updating.  This will
  prevent the "swapon" working but won't prevent the boot.
* ZFS doesn't care about device names - it looks for ZFS labels on all
  possible devices.
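
To find the hard-wired swap entries mentioned in the second point, something like the following could be used. It is a sketch: the function name is made up, and it only looks for da/ada device nodes in the first column of an fstab(5)-style file.

```sh
# List swap entries that refer to a raw daN/adaN device node rather than
# a label.  Fields per fstab(5): device, mountpoint, fstype, options, ...
hardwired_swap() {
    awk '$1 !~ /^#/ && $3 == "swap" && $1 ~ /^\/dev\/a?da[0-9]+/ { print $1 }' "$1"
}
```

Running "hardwired_swap /etc/fstab" on the system described would be expected to print /dev/da1p3. One way to make such an entry independent of probe order is to give the partition a GPT label (for example "gpart modify -i 3 -l swap0 da0", where the label name is arbitrary) and refer to /dev/gpt/swap0 in fstab instead.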

-- 
Peter Jeremy


Re: Files in /etc containing empty VCSId header

2021-06-09 Thread Peter Jeremy via freebsd-current
On 2021-Jun-08 17:13:45 -0600, Ian Lepore  wrote:
>On Tue, 2021-06-08 at 15:11 -0700, Rodney W. Grimes wrote:
>> There is a command for that which does or use to do a pretty
>> decent job of it called whereis(1).

Thanks.  That looks useful.

>revolution > whereis ntp.conf
>ntp.conf:
>revolution > whereis netif
>netif:
>revolution > whereis services
>services:
>
>So how does that help me locate the origin of these files in the source
>tree?

It works for me™:
server% whereis ntp.conf
ntp.conf: /usr/src/usr.sbin/ntp/ntpd/ntp.conf
server% whereis netif   
netif: /usr/src/libexec/rc/rc.d/netif
server% whereis services
services: /usr/src/contrib/unbound/services

Is your source tree somewhere other than /usr/src?

-- 
Peter Jeremy


Re: geli broken in 13.0-BETA4 and later on armv8

2021-03-06 Thread Peter Jeremy via freebsd-current
On 2021-Mar-06 10:39:02 -0800, Oleksandr Tymoshenko  wrote:
>Peter Jeremy via freebsd-current (freebsd-current@freebsd.org) wrote:
>> [Adding arm@ and making it clearer that this is armv8-only]
>> 
>> On 2021-Mar-06 20:26:19 +1100, Peter Jeremy  
>> wrote:
>> >On 2021-Mar-06 19:18:37 +1100, Peter Jeremy via freebsd-stable 
>> > wrote:
>> >>Somewhere between 13.0-ALPHA2 (c256201-g02611ef8ee9) and 13.0-BETA4
>> >>(releng/13.0-n244592-e32bc253629), geli (at least on my RockPro64 -
>> >>RK3399, arm64) has changed so that a geli-encrypted partition (using
>> >>AES-XTS 128) that was readable on 13.0-ALPHA2 becomes garbage on
>> >>13.0-BETA4.
>> >
>> >I've confirmed that the problem is f76393a6305b - reverting that
>> >commit fixes the problem in releng/13.0.
>> >
>> >I've further verified that the bug is still present in main (14.x)
>> >at 028616d0dd69.
>
>Could you test this patch and let me know if it fixes the issue?
>
>https://people.freebsd.org/~gonzo/patches/armv8crypto-xts-fix.diff

Yes, it does.  Thank you very much.

-- 
Peter Jeremy


Re: geli broken in 13.0-BETA4 and later on armv8

2021-03-06 Thread Peter Jeremy via freebsd-current
[Adding arm@ and making it clearer that this is armv8-only]

On 2021-Mar-06 20:26:19 +1100, Peter Jeremy  wrote:
>On 2021-Mar-06 19:18:37 +1100, Peter Jeremy via freebsd-stable 
> wrote:
>>Somewhere between 13.0-ALPHA2 (c256201-g02611ef8ee9) and 13.0-BETA4
>>(releng/13.0-n244592-e32bc253629), geli (at least on my RockPro64 -
>>RK3399, arm64) has changed so that a geli-encrypted partition (using
>>AES-XTS 128) that was readable on 13.0-ALPHA2 becomes garbage on
>>13.0-BETA4.
>
>I've confirmed that the problem is f76393a6305b - reverting that
>commit fixes the problem in releng/13.0.
>
>I've further verified that the bug is still present in main (14.x)
>at 028616d0dd69.

-- 
Peter Jeremy


Re: geli broken in 13.0-BETA4 and later

2021-03-06 Thread Peter Jeremy via freebsd-current
On 2021-Mar-06 19:18:37 +1100, Peter Jeremy via freebsd-stable wrote:
>Somewhere between 13.0-ALPHA2 (c256201-g02611ef8ee9) and 13.0-BETA4
>(releng/13.0-n244592-e32bc253629), geli (at least on my RockPro64 -
>RK3399, arm64) has changed so that a geli-encrypted partition (using
>AES-XTS 128) that was readable on 13.0-ALPHA2 becomes garbage on
>13.0-BETA4.

I've confirmed that the problem is f76393a6305b - reverting that
commit fixes the problem in releng/13.0.

I've further verified that the bug is still present in main (14.x)
at 028616d0dd69.

-- 
Peter Jeremy


Re: New Xorg - different key-codes

2020-03-11 Thread Peter Jeremy
On 2020-Mar-11 10:29:08 +0100, Niclas Zeising  wrote:
>This has to do with switching to using evdev to handle input devices on 
>FreeBSD 12 and CURRENT.  There's been several reports, and suggested 
>solutions to this, as well as an UPDATING entry detailing the change.

The UPDATING entry says that it's switched from devd to udev.  There's no
mention of evdev or that the keycodes have been roto-tilled.  It's basically
a vanilla "things have been changed, see the documentation" entry.  Given
that entry, it's hardly surprising that people are confused.

-- 
Peter Jeremy


Re: System clock is slow

2020-03-09 Thread Peter Jeremy
On 2020-Mar-09 19:59:09 -0400, Theron  wrote:
>Since switching from 12.1-RELEASE to CURRENT I've noticed timing 
>problems with audio applications.  It turns out that the problem is not 
>with the audio drivers, but with the system clock driver, which now 
>reports passage of time 0.3% too slow.  Although I discovered this only 
>recently, it's been broken since r352684 made on Sept. 25.  Has anyone 
>else noticed?

Note that r352684 was MFC'd to both 11-stable (r353007) and 12-stable
(r353006) in early October and I don't recall seeing any adverse
reports before this.

Are you running NTP?  If so, is NTP maintaining lock and what is the
reported PLL frequency (ntpq -c kerni)?
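
For scale, the reported 0.3% can be converted into the units ntpd works in. The helper below is only an illustration (its name is made up). The point of the comparison is that the reference ntpd limits its frequency correction to 500 ppm, so an error of this size would be expected to show up as ntpd failing to hold lock.

```sh
# Convert a clock error given in percent into parts per million and
# into seconds gained or lost per day.
clock_error() {
    awk -v pct="$1" 'BEGIN {
        printf "%.0f ppm, %.1f s/day\n", pct * 10000, pct / 100 * 86400
    }'
}
```

"clock_error 0.3" prints "3000 ppm, 259.2 s/day", i.e. roughly six times the 500 ppm limit.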

What does "sysctl kern.timecounter" report and have you tried using
any of the alternative timecounters listed in kern.timecounter.choice?

Are you overclocking your CPU (or doing anything else non-standard)?

-- 
Peter Jeremy


Re: Which AMD CPUs are supported -- temperature

2020-02-16 Thread Peter Jeremy
On 2020-Feb-13 13:27:17 -0800, Chris  wrote:
>My BIOS appears to have the correct temp reading. Would it be of any use
>to anyone besides myself, if I were to decompile it, and get the source
>for the temp reading/monitoring from it?

I would definitely like to have this information.  If you are able to
share the two constants (both step size and reference temperature), that
would be great.

-- 
Peter Jeremy


Re: Which AMD CPUs are supported -- temperature

2020-02-12 Thread Peter Jeremy
On 2020-Feb-12 15:23:51 -0500, mike tancsa  wrote:
>Not sure about the older Athlon CPUs, but the 2 generations of Ryzen's I
>have seem correct as well as an APU
>
>CPU: AMD GX-412TC SOC    (998.17-MHz K8-class CPU)

OTOH, I'm not confident about temperatures on my APU.  The publicly
available data just says that the SoC reports "a temperature on its own
scale" relative to a Tctl_max which "is specified in the power and thermal
data sheet" (that I have been unable to locate).  Everyone seems to assume
that the step size is 0.125K but I haven't found that publicly documented
anywhere.  The AMD Product Brief states that the maximum temperature is
90°C but using that as Tctl_max gives me temperature readings that don't
look right.

>And on a fanless APU
>
># sysctl -a dev.cpu.0.temperature
>dev.cpu.0.temperature: 62.6C
>
># sysctl -a dev.amdtemp.0.core0.sensor0
>dev.amdtemp.0.core0.sensor0: 63.1C

At what ambient temperature?  I see a similar value from my (idle) APU3
but don't believe the (implied) ~35K junction-to-ambient difference.

-- 
Peter Jeremy


Re: head -r356066 reaching kern.ipc.nmbclusters on Rock64 (CortexA53 with 4GiByte of RAM) while putting files on it via nfs: some evidence

2020-01-04 Thread Peter Jeremy
Sorry for the delay in responding.

On 2019-Dec-27 21:59:49 -0800, Mark Millard via freebsd-arm wrote:
>The following sort of sequence leads to the Rock64 not
>responding on the console or over ethernet, after notifying
>of nmbclusters having been reached. (This limits what
>information I have of what things were like at the end.)

There's a bug in the dwc(4) driver such that it can leak mbuf clusters.
I've been running with the following patch but need to clean it up
somewhat before I can commit it:

Index: sys/dev/dwc/if_dwc.c
===================================================================
--- sys/dev/dwc/if_dwc.c	(revision 356350)
+++ sys/dev/dwc/if_dwc.c	(working copy)
@@ -755,7 +755,6 @@
 dwc_rxfinish_locked(struct dwc_softc *sc)
 {
 	struct ifnet *ifp;
-	struct mbuf *m0;
 	struct mbuf *m;
 	int error, idx, len;
 	uint32_t rdes0;
@@ -762,9 +761,8 @@
 
 	ifp = sc->ifp;
 
-	for (;;) {
+	for (; ; sc->rx_idx = next_rxidx(sc, sc->rx_idx)) {
 		idx = sc->rx_idx;
-
 		rdes0 = sc->rxdesc_ring[idx].tdes0;
 		if ((rdes0 & DDESC_RDES0_OWN) != 0)
 			break;
@@ -773,9 +771,9 @@
 		    BUS_DMASYNC_POSTREAD);
 		bus_dmamap_unload(sc->rxbuf_tag, sc->rxbuf_map[idx].map);
 
+		m = sc->rxbuf_map[idx].mbuf;
 		len = (rdes0 >> DDESC_RDES0_FL_SHIFT) & DDESC_RDES0_FL_MASK;
 		if (len != 0) {
-			m = sc->rxbuf_map[idx].mbuf;
 			m->m_pkthdr.rcvif = ifp;
 			m->m_pkthdr.len = len;
 			m->m_len = len;
@@ -784,24 +782,33 @@
 			/* Remove trailing FCS */
 			m_adj(m, -ETHER_CRC_LEN);
 
+			/* Consume the mbuf and mark it as consumed */
+			sc->rxbuf_map[idx].mbuf = NULL;
 			DWC_UNLOCK(sc);
 			(*ifp->if_input)(ifp, m);
 			DWC_LOCK(sc);
+			m = NULL;
 		} else {
 			/* XXX Zero-length packet ? */
 		}
 
-		if ((m0 = dwc_alloc_mbufcl(sc)) != NULL) {
-			if ((error = dwc_setup_rxbuf(sc, idx, m0)) != 0) {
-				/*
-				 * XXX Now what?
-				 * We've got a hole in the rx ring.
-				 */
+		if (m == NULL) {
+			if ((m = dwc_alloc_mbufcl(sc)) == NULL) {
+				if_inc_counter(sc->ifp, IFCOUNTER_IQDROPS, 1);
+				continue;
 			}
-		} else
+		}
+
+		if ((error = dwc_setup_rxbuf(sc, idx, m)) != 0) {
+			m_free(m);
+			device_printf(sc->dev,
+			    "dwc_setup_rxbuf returned %d\n", error);
 			if_inc_counter(sc->ifp, IFCOUNTER_IQDROPS, 1);
-
-		sc->rx_idx = next_rxidx(sc, sc->rx_idx);
+			/*
+			 * XXX Now what?
+			 * We've got a hole in the rx ring.
+			 */
+		}
 	}
 }

-- 
Peter Jeremy


buildworld has mandatory dependency on optional executable.

2019-10-30 Thread Peter Jeremy
I've just discovered that "make buildworld" has a mandatory dependency
on kbdcontrol (see
https://svnweb.freebsd.org/base/head/Makefile.inc1?annotate=354138#l2207 )
but, if WITHOUT_LEGACY_CONSOLE is defined then kbdcontrol isn't built
(https://svnweb.freebsd.org/base/head/usr.sbin/Makefile?annotate=352949#l162 )
and the installed version will be deleted by "make delete-old":
https://svnweb.freebsd.org/base/head/tools/build/mk/OptionalObsoleteFiles.inc?annotate=353358#l4520

This seems undesirable...

The "make buildworld" failure doesn't make the cause obvious - it just
reports "*** Error code 1" in bootstrap-tools.  Having traced the failure,
I now see ".ERROR_TARGET='_bootstrap-tools-link-kbdcontrol'" but that was
only obvious in hindsight.
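
Until the dependency is sorted out, a preflight check before starting the build makes the cause visible up front. This is a sketch with a made-up helper name, and it assumes the build locates the host kbdcontrol through PATH.

```sh
# Fail early, with a readable message, if a host tool is missing from PATH.
require_tool() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing host tool: $1" >&2
        return 1
    fi
}
```

For example: "require_tool kbdcontrol && make buildworld".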

-- 
Peter Jeremy


Re: Reproducable deadlock in NFS client

2019-10-03 Thread Peter Jeremy
On 2019-Oct-03 23:28:07 +, Rick Macklem  wrote:
>1 - kib@ just put a patch up on phabricator that reorganizes the handling
>  of vnode_pager_setsize().
>  D21883
>  (If you could test this patch, that might be the best approach.)

That fixes my problem.  I've added a note to D21883.

>ps: Btw, capturing "procstat -kk" and "ps axHl" would give you/us more info.
> (The "H" on "ps" shows the iod threads.)
>  If you can drop into the debugger when it is hung as above, you could
>  capture the stuff listed here:
>https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

Thanks for the pointer and sorry for leaving that out.

-- 
Peter Jeremy


Reproduceable deadlock in NFS Client

2019-10-03 Thread Peter Jeremy
My diskless Rock64 has taken to deadlocking reproducibly whilst
building libprivatesqlite3.a as part of buildworld when running
r352792.  At the time of the deadlock, the relevant running process
is:
ar -crD libprivatesqlite3.a sqlite3.o

And those files are:
-rw-r--r--1 root  wheel  3178496  4 Oct 01:10 libprivatesqlite3.a
-rw-r--r--1 root  wheel  7975272  4 Oct 01:10 sqlite3.o

The "ar" reports it's in bo_wwait and, after about 30 minutes, I get:
deadlres_td_sleep_q: possible deadlock detected for 0xfd00012c9560, blocked for 1800613 ticks

cpuid = 2
time = 1570117920
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x28
 pc = 0x0054b83c  lr = 0x000e2b08
 sp = 0x4030a790  fp = 0x4030a9a0

db_trace_self_wrapper() at vpanic+0x18c
 pc = 0x000e2b08  lr = 0x0027fb54
 sp = 0x4030a9b0  fp = 0x4030aa50

vpanic() at panic+0x44
 pc = 0x0027fb54  lr = 0x0027f904
 sp = 0x4030aa60  fp = 0x4030aae0

panic() at deadlkres+0x33c
 pc = 0x0027f904  lr = 0x0021c19c
 sp = 0x4030aaf0  fp = 0x4030ab50

deadlkres() at fork_exit+0x7c
 pc = 0x0021c19c  lr = 0x002404f4
 sp = 0x4030ab60  fp = 0x4030ab90

fork_exit() at fork_trampoline+0x10
 pc = 0x002404f4  lr = 0x0056743c
 sp = 0x4030aba0  fp = 0x0000


-- 
Peter Jeremy


Re: panic: sleeping thread on r352386

2019-09-18 Thread Peter Jeremy
On 2019-Sep-17 15:24:30 +0300, Konstantin Belousov  wrote:
>Try this.
>
>diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
>index 63ea4736707..a23b4ba4efa 100644

Sorry for the delay but I'm not seeing problems with this version of
your patch (now r352457) either.  Thank you for your efforts.

-- 
Peter Jeremy


Re: panic: sleeping thread on r352386

2019-09-17 Thread Peter Jeremy
On 2019-Sep-17 11:06:58 +0300, Konstantin Belousov  wrote:
>Try the following change, which more accurately tries to avoid
>vnode_pager_setsize().  The real cause requires much more extensive
>changes.
>
>diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
>index 63ea4736707..16dc7745c77 100644
>--- a/sys/fs/nfsclient/nfs_clport.c
>+++ b/sys/fs/nfsclient/nfs_clport.c
...

With that patch, I'm back to "Sleeping thread (...) owns a non-sleepable
lock" panics.

-- 
Peter Jeremy


Re: "Sleeping with non-sleepable lock" in NFS on recent -current

2019-09-16 Thread Peter Jeremy
On 2019-Sep-16 11:19:02 +0300, Konstantin Belousov  wrote:
>diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
>index 471e029a8b5..63ea4736707 100644
...

Thanks, that patch seems much more stable.

-- 
Peter Jeremy


Re: "Sleeping with non-sleepable lock" in NFS on recent -current

2019-09-16 Thread Peter Jeremy
On 2019-Sep-16 09:32:52 +0300, Konstantin Belousov  wrote:
>On Mon, Sep 16, 2019 at 04:12:05PM +1000, Peter Jeremy wrote:
>> I'm consistently seeing panics in the NFS code on recent -current on arm64.
>> The panics are one of the following two:
>> Sleeping on "vmopar" with the following non-sleepable locks held:
>> exclusive sleep mutex NEWNFSnode lock (NEWNFSnode lock) r = 0 
>> (0xfd0078b346f0) locked @ /usr/src/sys/fs/nfsclient/nfs_clport.c:432
>> 
>> Sleeping thread (tid 100077, pid 35) owns a non-sleepable lock
>> 
>> Both panics have nearly identical backtraces (see below).  I'm running
>> diskless on a Rock64 with both filesystem and swap over NFS.  The panics
>> can be fairly reliably triggered by any of:
>> * "make -j4 buildworld"
>> * linking the kernel (as part of buildkernel)
>> * "make installworld"
>> 
>> Has anyone else seen this?
...

>Weird since this should have been fixed long time ago.  Anyway, please
>try the following, it should fix the rest of cases.
>
>diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
...
>@@ -540,7 +541,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr 
>*nap, void *nvaper,
>   } else {
>   np->n_size = vap->va_size;
>   np->n_flag |= NSIZECHANGED;
>-  vnode_pager_setsize(vp, np->n_size);
>+  setnsize = 1;

Should this else block include a "nsize = np->n_size;"?  Without it,
nsize will remain set to 0, which looks wrong.

-- 
Peter Jeremy


"Sleeping with non-sleepable lock" in NFS on recent -current

2019-09-16 Thread Peter Jeremy
I'm consistently seeing panics in the NFS code on recent -current on arm64.
The panics are one of the following two:
Sleeping on "vmopar" with the following non-sleepable locks held:
exclusive sleep mutex NEWNFSnode lock (NEWNFSnode lock) r = 0 (0xfd0078b346f0) locked @ /usr/src/sys/fs/nfsclient/nfs_clport.c:432

Sleeping thread (tid 100077, pid 35) owns a non-sleepable lock

Both panics have nearly identical backtraces (see below).  I'm running
diskless on a Rock64 with both filesystem and swap over NFS.  The panics
can be fairly reliably triggered by any of:
* "make -j4 buildworld"
* linking the kernel (as part of buildkernel)
* "make installworld"

Has anyone else seen this?

The first panic (sleeping on vmopar) has a backtrace:
sched_switch() at mi_switch+0x19c
 pc = 0x002ab368  lr = 0x0028a9f4
 sp = 0x61192660  fp = 0x61192680

mi_switch() at sleepq_switch+0x100
 pc = 0x0028a9f4  lr = 0x002d56dc
 sp = 0x61192690  fp = 0x611926d0

sleepq_switch() at sleepq_wait+0x48
 pc = 0x002d56dc  lr = 0x002d5594
 sp = 0x611926e0  fp = 0x61192700

sleepq_wait() at _sleep+0x2c4  [***]
 pc = 0x002d5594  lr = 0x00289eec
 sp = 0x61192710  fp = 0x611927b0

_sleep() at vm_object_page_remove+0x178  [***]
 pc = 0x00289eec  lr = 0x0052211c
 sp = 0x611927c0  fp = 0x61192820

vm_object_page_remove() at vnode_pager_setsize+0xc0
 pc = 0x0052211c  lr = 0x00539a70
 sp = 0x61192830  fp = 0x61192870

vnode_pager_setsize() at nfscl_loadattrcache+0x2e8
 pc = 0x00539a70  lr = 0x001ed4b4
 sp = 0x61192880  fp = 0x611928e0

nfscl_loadattrcache() at ncl_writerpc+0x104
 pc = 0x001ed4b4  lr = 0x001e2158
 sp = 0x611928f0  fp = 0x61192a40

ncl_writerpc() at ncl_doio+0x36c
 pc = 0x001e2158  lr = 0x001f0370
 sp = 0x61192a50  fp = 0x61192ae0

ncl_doio() at nfssvc_iod+0x228
 pc = 0x001f0370  lr = 0x001f1d88
 sp = 0x61192af0  fp = 0x61192b50

nfssvc_iod() at fork_exit+0x7c
 pc = 0x001f1d88  lr = 0x0023ff5c
 sp = 0x61192b60  fp = 0x61192b90

fork_exit() at fork_trampoline+0x10
 pc = 0x0023ff5c  lr = 0x00562c34
 sp = 0x61192ba0  fp = 0x


For the second panic, the [***] change to:
sleepq_wait() at vm_page_sleep_if_busy+0x80
vm_page_sleep_if_busy() at vm_object_page_remove+0xfc


-- 
Peter Jeremy


Re: "panic: Duplicate alloc" in dwmmc_attach on Rock64

2019-06-24 Thread Peter Jeremy
On 2019-Jun-21 20:59:39 +1000, Peter Jeremy  wrote:
>Since r349169, my Rock64 has consistently panic'd whilst attaching
>rockchip_dwmmc1.  A kernel built at r349135 works OK.  The relevant
>output looks like:
>rockchip_dwmmc0: (RockChip)> mem 0xff50-0xff503fff irq 40 on ofwbus0
>rockchip_dwmmc0: Hardware version ID is 270a
>mmc0:  on rockchip_dwmmc0
>rockchip_dwmmc1: (RockChip)> mem 0xff52-0xff523fff irq 42 on ofwbus0
>rockchip_dwmmc1: Hardware version ID is 270a
>panic: Duplicate alloc of 0xfd89cf50 from zone 0xfd817540(16) 
>slab 0xfd89cf90(0)

I did some more digging and narrowed this down to r349151 (which has nothing
that would be an obvious cause).  And the problem went away somewhere
between r349269 and r349288.  Since there's nothing obvious there either, I
presume this is something more subtle like a race condition that has been
provoked by the code changes.
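
Narrowing a revision range like this is usually done by bisection. The sketch below shows the idea with a made-up function name and a caller-supplied test command. It is only valid when every revision before the culprit passes and every one from the culprit on fails, which is exactly what a race condition like the one suspected here can violate.

```sh
# Find the first failing revision in (good, bad].
# $1: known-good revision, $2: known-bad revision, remaining arguments:
# a command that is run with the candidate revision appended and
# returns 0 if that revision is good.
first_bad_rev() {
    lo=$1 hi=$2
    shift 2
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$@" "$mid"; then
            lo=$mid     # still good: culprit is later
        else
            hi=$mid     # already bad: culprit is here or earlier
        fi
    done
    echo "$hi"
}
```

With a hypothetical "build_and_boot" script that checks out, builds and boots a given revision, the call would be "first_bad_rev 349135 349169 build_and_boot".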

-- 
Peter Jeremy


"panic: Duplicate alloc" in dwmmc_attach on Rock64

2019-06-21 Thread Peter Jeremy
Since r349169, my Rock64 has consistently panic'd whilst attaching
rockchip_dwmmc1.  A kernel built at r349135 works OK.  The relevant
output looks like:
rockchip_dwmmc0:  mem 0xff50-0xff503fff irq 40 on ofwbus0
rockchip_dwmmc0: Hardware version ID is 270a
mmc0:  on rockchip_dwmmc0
rockchip_dwmmc1:  mem 0xff52-0xff523fff irq 42 on ofwbus0
rockchip_dwmmc1: Hardware version ID is 270a
panic: Duplicate alloc of 0xfd89cf50 from zone 0xfd817540(16) slab 0xfd89cf90(0)

cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x28
 pc = 0x00535d54  lr = 0x000df10c
 sp = 0x000104d0  fp = 0x000106e0

db_trace_self_wrapper() at vpanic+0x18c
 pc = 0x000df10c  lr = 0x00278218
 sp = 0x000106f0  fp = 0x00010790

vpanic() at panic+0x44
 pc = 0x00278218  lr = 0x00277fc8
 sp = 0x000107a0  fp = 0x00010820

panic() at uma_dbg_alloc+0x144
 pc = 0x00277fc8  lr = 0x004fa4b0
 sp = 0x00010830  fp = 0x00010850

uma_dbg_alloc() at uma_zalloc_arg+0x9b0
 pc = 0x004fa4b0  lr = 0x004f9960
 sp = 0x00010860  fp = 0x000108e0

uma_zalloc_arg() at malloc+0x9c
 pc = 0x004f9960  lr = 0x00252a8c
 sp = 0x000108f0  fp = 0x00010920

malloc() at bounce_bus_dmamem_alloc+0x4c
 pc = 0x00252a8c  lr = 0x00533b64
 sp = 0x00010930  fp = 0x00010960

bounce_bus_dmamem_alloc() at dwmmc_attach+0x5fc
 pc = 0x00533b64  lr = 0x00556f14
 sp = 0x00010970  fp = 0x000109e0

dwmmc_attach() at device_attach+0x3f4
 pc = 0x00556f14  lr = 0x002abd8c
 sp = 0x000109f0  fp = 0x00010a40

device_attach() at bus_generic_new_pass+0x12c
 pc = 0x002abd8c  lr = 0x002adb40
 sp = 0x00010a50  fp = 0x00010a80
...

I've looked through all the intervening commits and don't see any
smoking gun.  Does anyone have any suggestions?

-- 
Peter Jeremy




Re: error: yacc.h: No such file or directory

2019-06-20 Thread Peter Jeremy
On 2019-Jun-18 07:01:31 -0700, Enji Cooper  wrote:
>
>> On Jun 18, 2019, at 06:59, Enji Cooper  wrote:
>> PS This is one of the reasons why I wasn’t quick to discount Peter Jeremy’s 
>> reported build issue.
>
>Correction: I meant Julian Stacey.

I'm not sure how I feel about being confused with jhs.

Actually, I had also seen this problem in both mkesdb_static and
mkcsmapper_static but hadn't reported it because I was investigating
something else and wasn't certain that it wasn't self-inflicted.

-- 
Peter Jeremy




Re: FreeBSD 12 kernel broken

2019-03-24 Thread Peter Jeremy
On 2019-Mar-22 19:08:18 +0300, Rozhuk Ivan  wrote:
>ld: error: undefined symbol: xz_dec_init
>>>> referenced by g_uzip_lzma.c:106 (/usr/src/sys/geom/uzip/g_uzip_lzma.c:106)
>>>>   g_uzip_lzma.o:(g_uzip_lzma_ctor)
>
>ld: error: undefined symbol: xz_dec_run
>>>> referenced by g_uzip_lzma.c:81 (/usr/src/sys/geom/uzip/g_uzip_lzma.c:81)
>>>>   g_uzip_lzma.o:(g_uzip_lzma_decompress)
>
>ld: error: undefined symbol: xz_dec_end
>>>> referenced by g_uzip_lzma.c:60 (/usr/src/sys/geom/uzip/g_uzip_lzma.c:60)
>>>>   g_uzip_lzma.o:(g_uzip_lzma_free)
>--- kernel.full ---
>*** [kernel.full] Error code 1

Are you talking about FreeBSD 12 or FreeBSD 13?

-- 
Peter Jeremy




Re: Optimization bug with floating-point?

2019-03-14 Thread Peter Jeremy
On 2019-Mar-13 23:30:07 -0700, Steve Kargl  
wrote:
>AFAICT, all libm float routines need to be modified to conditional
>include ieeefp.h and call fpsetprec(FP_PD).  This will work around
>issues is FP and libm.  FreeBSD needs to issue an erratum about 
>the numerical issues with clang.

I vaguely recall looking into the x87 initialisation a long time ago
and seem to recall that the startup code (either crtX or in the kernel)
does a fninit() to set the precision.  I don't recall exactly where.

IMO, calling fpsetprec() in every libm float function is overkill. It
should be enough to fpsetprec() before main() and add a note in the
man pages that libm is built to use the default FPU configuration and
changing the configuration (precision or rounding) may result in larger
errors.

-- 
Peter Jeremy




Re: how to browse svnweb source?

2018-05-29 Thread Peter Jeremy
On 2018-May-28 18:06:07 -0700, Jeffrey Bouquet  
wrote:
>> > Suddenly the site www.secnetix.de/olli/FreeBSD/svnews which showed 
>> > sequential
>> > source as for example xx1966 on april 3  xx2040 on april 4 this year, 
>> > is not loading
>> > in the browser.

That site is not associated with the FreeBSD Project so you would need to
discuss the absence of information on that site with whoever runs it.

>I tried that url every which way, sorting the headings, etc, and onscreen
>would be at best, a description of the new source but not specifically which
>files were changed and their complete path. Nothing like the url mentioned 
>above at
>.de in the latter's overview. 

Without knowing what that site displayed, it's very difficult to know where
(or if) svnweb provides the information.  Given a known revision, you can
check (eg) https://svnweb.freebsd.org/base?view=revision&revision=333926

If you want a sequential list of commits, you might be better off with (eg)
https://lists.freebsd.org/pipermail/svn-src-all/

-- 
Peter Jeremy




Re: Strange ARC/Swap/CPU on yesterday's -CURRENT

2018-03-20 Thread Peter Jeremy

On 2018-Mar-11 10:43:58 -1000, Jeff Roberson <jrober...@jroberson.net> wrote:
>Also, if you could try going back to r328953 or r326346 and let me know if 
>the problem exists in either.  That would be very helpful.  If anyone is 
>willing to debug this with me contact me directly and I will send some 
>test patches or debugging info after you have done the above steps.

I ran into this on 11-stable and tracked it to r326619 (MFC of r325851).
I initially got around the problem by reverting that commit but either
it or something very similar is still present in 11-stable r331053.

I've seen it in my main server (32GB RAM) but haven't managed to reproduce
it in smaller VBox guests - one difficulty I faced was artificially filling
ARC.

-- 
Peter Jeremy




Re: Build error: 'emmintrin.h' file not found

2018-01-24 Thread Peter Jeremy
On 2018-Jan-24 17:34:33 +0100, Florian Limberger <f...@snakeoilproductions.net> 
wrote:
>since a few days I can't build 12-CURRENT anymore, due to the 'emmintrin.h'
>header missing.

I ran into a similar problem about a month ago.  First of all, does
your host system have emmintrin.h?  E.g. what is the output of "find
/usr/lib/clang -name emmintrin.h" ?

-- 
Peter Jeremy




Re: Unable to build 12-current/amd64

2017-12-25 Thread Peter Jeremy
On 2017-Dec-23 13:42:40 +0100, Dimitry Andric <d...@freebsd.org> wrote:
>On 23 Dec 2017, at 10:56, Peter Jeremy <pe...@rulingia.com> wrote:
>> 
>> Since r326496, buildworld on my 12-current/amd64 system has consistently
>> died as follows.
>...
>> /usr/src/contrib/llvm/tools/clang/lib/Basic/SourceManager.cpp:1166:10: fatal 
>> error: 'emmintrin.h' file not found
>> #include <emmintrin.h>
>> ^
>> 1 error generated.
>> *** Error code 1
>> 
>> Stop.
>> make[4]: stopped in /usr/src/lib/clang/libclang
>> 
>> I'm building on a 12.0-CURRENT VirtualBox guest at r326430.  I've checked
>> that my /usr/src is clean and deleted /usr/obj to no effect.  I have dug
>> into SourceManager.cpp and the #include is protected by a #if __SSE2__,
>> which is relying on clang internal checks to define (and my CPU supports
>> SSE2).  Does anyone have any ideas to explain what is going on?
>
>First of all, does your host system have emmintrin.h?  E.g. what is the
>output of "find /usr/lib/clang -name emmintrin.h" ?

Aha.  Somehow my entire /usr/lib/clang/5.0.0 tree was missing.  I'm not sure
if that was an installworld glitch or something I accidentally did.  In any
case, restoring it has fixed the problem.  Thanks for the pointer.
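
For anyone wanting to run the same check from a script before starting a
buildworld, here is a minimal sketch in Python of what the find(1) command
quoted above does.  The function name find_header is made up for
illustration, and /usr/lib/clang is simply the path used in this thread.
An empty result corresponds to the "tree is missing" situation described
above.

```python
import os


def find_header(root, name="emmintrin.h"):
    """Return the sorted paths of every file called `name` below `root`.

    A missing or empty `root` simply yields an empty list, because
    os.walk() silently skips directories it cannot read.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # Path taken from the thread; adjust for your own installation.
    hits = find_header("/usr/lib/clang")
    if hits:
        for path in hits:
            print(path)
    else:
        print("emmintrin.h not found below /usr/lib/clang")
```

If nothing is printed by the equivalent find(1) command, or the sketch
reports the header as missing, the host compiler's resource directory is
the first thing to look at.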

-- 
Peter Jeremy




Unable to build 12-current/amd64

2017-12-23 Thread Peter Jeremy
Since r326496, buildworld on my 12-current/amd64 system has consistently
died as follows.  I have no problems building on i386 or building
12-current/amd64 on 11-stable.

...
>>> stage 3: cross tools
--
cd /usr/src; INSTALL="sh /usr/src/tools/install.sh"  
TOOLS_PREFIX=/usr/obj/usr/src/amd64.amd64/tmp  
PATH=/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/sbin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/bin:/usr/obj/usr/src/amd64.amd64/tmp/legacy/bin:/sbin:/bin:/usr/sbin:/usr/bin
  WORLDTMP=/usr/obj/usr/src/amd64.amd64/tmp  MAKEFLAGS="-m 
/usr/src/tools/build/mk  -m /usr/src/share/mk" make  -f Makefile.inc1  DESTDIR= 
 OBJTOP='/usr/obj/usr/src/amd64.amd64/tmp/obj-tools'  OBJROOT='${OBJTOP}/'  
MAKEOBJDIRPREFIX=  BOOTSTRAPPING=1200054  BWPHASE=cross-tools  SSP_CFLAGS=  
MK_HTML=no NO_LINT=yes MK_MAN=no  -DNO_PIC MK_PROFILE=no -DNO_SHARED  
-DNO_CPU_CFLAGS MK_WARNS=no MK_CTF=no  MK_CLANG_EXTRAS=no MK_CLANG_FULL=no  
MK_LLDB=no MK_TESTS=no  MK_INCLUDES=yes  TARGET=amd64 TARGET_ARCH=amd64  
MK_GDB=no MK_LLD_IS_LD=no MK_TESTS=no cross-tools
...
===> lib/clang/libclang (all)
...
c++  -O2 -pipe -I/usr/obj/usr/src/amd64.amd64/tmp/obj-tools/lib/clang/libclang 
-I/usr/obj/usr/src/amd64.amd64/tmp/obj-tools/lib/clang/libllvm 
-I/usr/src/contrib/llvm/tools/clang/lib/Driver 
-I/usr/src/contrib/llvm/tools/clang/include -I/usr/src/lib/clang/include 
-I/usr/src/contrib/llvm/include -DLLVM_BUILD_GLOBAL_ISEL -D__STDC_LIMIT_MACROS 
-D__STDC_CONSTANT_MACROS 
-DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd12.0\" 
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd12.0\" 
-DDEFAULT_SYSROOT=\"/usr/obj/usr/src/amd64.amd64/tmp\" -ffunction-sections 
-fdata-sections -gline-tables-only -MD -MF.depend.Basic_SourceLocation.o 
-MTBasic/SourceLocation.o -Qunused-arguments 
-I/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/include  -std=c++11 
-fno-exceptions -fno-rtti -gline-tables-only -stdlib=libc++ 
-Wno-c++11-extensions  -c 
/usr/src/contrib/llvm/tools/clang/lib/Basic/SourceLocation.cpp -o 
Basic/SourceLocation.o
c++  -O2 -pipe -I/usr/obj/usr/src/amd64.amd64/tmp/obj-tools/lib/clang/libclang 
-I/usr/obj/usr/src/amd64.amd64/tmp/obj-tools/lib/clang/libllvm 
-I/usr/src/contrib/llvm/tools/clang/lib/Driver 
-I/usr/src/contrib/llvm/tools/clang/include -I/usr/src/lib/clang/include 
-I/usr/src/contrib/llvm/include -DLLVM_BUILD_GLOBAL_ISEL -D__STDC_LIMIT_MACROS 
-D__STDC_CONSTANT_MACROS 
-DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd12.0\" 
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd12.0\" 
-DDEFAULT_SYSROOT=\"/usr/obj/usr/src/amd64.amd64/tmp\" -ffunction-sections 
-fdata-sections -gline-tables-only -MD -MF.depend.Basic_SourceManager.o 
-MTBasic/SourceManager.o -Qunused-arguments 
-I/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/include  -std=c++11 
-fno-exceptions -fno-rtti -gline-tables-only -stdlib=libc++ 
-Wno-c++11-extensions  -c 
/usr/src/contrib/llvm/tools/clang/lib/Basic/SourceManager.cpp -o 
Basic/SourceManager.o
/usr/src/contrib/llvm/tools/clang/lib/Basic/SourceManager.cpp:1166:10: fatal 
error: 'emmintrin.h' file not found
#include <emmintrin.h>
 ^
1 error generated.
*** Error code 1

Stop.
make[4]: stopped in /usr/src/lib/clang/libclang

I'm building on a 12.0-CURRENT VirtualBox guest at r326430.  I've checked
that my /usr/src is clean and deleted /usr/obj to no effect.  I have dug
into SourceManager.cpp and the #include is protected by a #if __SSE2__,
which is relying on clang internal checks to define (and my CPU supports
SSE2).  Does anyone have any ideas to explain what is going on?

-- 
Peter Jeremy




Re: get_swap_pager(x) failed

2017-12-13 Thread Peter Jeremy
On 2017-Dec-13 11:23:46 +0000, Gary Palmer <gpal...@freebsd.org> wrote:
>An open question would be why ARC is not reducing if the system is
>under memory pressure.  It's meant to, but there have been various
>bugs in that implementation.

The OP doesn't say what version of -current he is running but I would
point the finger at r325851.  I have discovered that, in 11-stable,
r326619 (which is the MFC of r325851) stops ARC responding to memory
backpressure.

-- 
Peter Jeremy




Re: dump trying to access incorrect block numbers?

2017-07-07 Thread Peter Jeremy
On 2017-Jul-07 10:44:36 -0400, Michael Butler <i...@protected-networks.net> 
wrote:
>Recent builds doing a backup (dump) cause nonsensical errors in syslog:

I can't directly offer any ideas but some more background might help:
When did you first notice this (what SVN revision)?
Do you know what the last good SVN revision was?
Is this a new or old filesystem?
Is the filesystem mounted/active or not when you dump it?
What are the relevant parameters for the filesystem on ada0s3a?
Are you running softupdates, journalling etc?
Which dump(8) phase is reporting the errors?
What are the exact dump and fsck commands you ran?

>I now have two UFS-based systems showing the same symptoms - what's up 
>with this?

Was there anything you did on either filesystem that might have triggered it?

-- 
Peter Jeremy




Re: ino64? r318606 -> r318739 OK; r318739 -> r318781 fails SIGSEGV

2017-05-24 Thread Peter Jeremy
On 2017-May-24 20:21:54 +0300, Konstantin Belousov <kostik...@gmail.com> wrote:
>No SIGSEGV etc, so I think that the effects seen are due to build system.
>rm -rf obj/* is the safest trick, I believe.

But the behaviour does indicate that meta mode is not doing the right thing
under all circumstances.  It's blatantly breaking in this scenario but could
be causing more subtle (and unnoticed) breakage in other cases.  This makes
me feel that this is worth investigating further.

-- 
Peter Jeremy




Re: ino64? r318606 -> r318739 OK; r318739 -> r318781 fails SIGSEGV

2017-05-24 Thread Peter Jeremy
On 2017-May-24 18:01:42 -0700, "Simon J. Gerraty" <s...@juniper.net> wrote:
>Peter Jeremy <pe...@rulingia.com> wrote:
>> as follows.  My suspicion is that meta mode isn't seeing enough of the
>> differences between the bootstrap and main build steps and so causing make
>> to incorrectly skip steps.
>
>I see a number of places in src/Makefile* where BUILD_TOOLS_META=.NOMETA
>is added to env of things like CROSSENV, CD2MAKE, LIBCOMPATWMAKEENV
>
>Use of .NOMETA could be leading to problems - but I'm not familiar with
>where BUILD_TOOLS_META is used.

I've not looked at the guts of how meta mode works or is inhibited either.

In my case, I have "WITH_META_MODE=yes" in /etc/src-env.conf and was
using "make buildworld" - which failed.  The upgrade worked cleanly
when I manually deleted all the .meta files.  If I get a round tuit,
I'll try to revert to before the update and have a closer look at what
broke with the "normal" build, if no-one else beats me to it.

-- 
Peter Jeremy




Re: ino64? r318606 -> r318739 OK; r318739 -> r318781 fails SIGSEGV

2017-05-24 Thread Peter Jeremy
On 2017-May-24 08:47:41 -0700, Ngie Cooper <yaneurab...@gmail.com> wrote:
>There was another report on the list about a stale MAKEOBJDIRPREFIX 
> causing someone grief. I think it's safe to say that meta mode and -DNO_CLEAN 
> might not work across this transition--in particular meta mode tends to err 
> on the side of not to rebuilding things.

I ran into a very similar problem trying to update from r318744 to r318781.
In my case, even two "make clean" wasn't enough and "make buildworld" died
as follows.  My suspicion is that meta mode isn't seeing enough of the
differences between the bootstrap and main build steps and so causing make
to incorrectly skip steps.

--
>>> stage 2.3: build tools
--
cd /usr/src; MAKEOBJDIRPREFIX=/usr/obj  INSTALL="sh /usr/src/tools/install.sh"  
TOOLS_PREFIX=/usr/obj/usr/src/tmp  
PATH=/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/sbin:/bin:/usr/sbin:/usr/bin
  WORLDTMP=/usr/obj/usr/src/tmp  MAKEFLAGS="-m /usr/src/tools/build/mk  -m 
/usr/src/share/mk" /usr/obj/usr/src/make.amd64/bmake  -f Makefile.inc1  
TARGET=amd64 TARGET_ARCH=amd64  DESTDIR=  BOOTSTRAPPING=1200031  SSP_CFLAGS=  
-DNO_LINT  -DNO_CPU_CFLAGS MK_WARNS=no MK_CTF=no  MK_CLANG_EXTRAS=no 
MK_CLANG_FULL=no  MK_LLDB=no MK_TESTS=no build-tools
...
===> usr.bin/mkesdb_static (obj,build-tools)
Building /usr/obj/usr/src/usr.bin/mkesdb_static/citrus_bcs.o
Building /usr/obj/usr/src/usr.bin/mkesdb_static/citrus_db_factory.o
Building /usr/obj/usr/src/usr.bin/mkesdb_static/citrus_db_hash.o
Building /usr/obj/usr/src/usr.bin/mkesdb_static/citrus_lookup_factory.o
Building /usr/obj/usr/src/usr.bin/mkesdb_static/lex.c
Building /usr/obj/usr/src/usr.bin/mkesdb_static/lex.o
/usr/src/usr.bin/mkesdb/lex.l:44:10: fatal error: 'yacc.h' file not found
#include "yacc.h"
 ^~~~
 1 error generated.
 *** Error code 1

Stop.
bmake[3]: stopped in /usr/src/usr.bin/mkesdb_static
.ERROR_TARGET='lex.o'
.ERROR_META_FILE='/usr/obj/usr/src/usr.bin/mkesdb_static/lex.o.meta'
.MAKE.LEVEL='3'
MAKEFILE=''
.MAKE.MODE='meta missing-filemon=yes missing-meta=yes silent=yes verbose'
.CURDIR='/usr/src/usr.bin/mkesdb_static'
.MAKE='/usr/obj/usr/src/make.amd64/bmake'
.OBJDIR='/usr/obj/usr/src/usr.bin/mkesdb_static'
.TARGETS='build-tools'
DESTDIR=''
LD_LIBRARY_PATH=''
MACHINE='amd64'
MACHINE_ARCH='amd64'
MAKEOBJDIRPREFIX='/usr/obj'
MAKESYSPATH='/usr/src/share/mk'
MAKE_VERSION='20161212'
PATH='/usr/obj/usr/src/tmp/legacy/usr/sbin:/usr/obj/usr/src/tmp/legacy/usr/bin:/usr/obj/usr/src/tmp/legacy/bin:/sbin:/bin:/usr/sbin:/usr/bin'
SRCTOP='/usr/src'
OBJTOP='/usr/obj/usr/src'
.MAKE.MAKEFILES='/usr/src/share/mk/sys.mk /usr/src/share/mk/local.sys.env.mk 
/usr/src/share/mk/src.sys.env.mk /etc/src-env.conf 
/usr/src/share/mk/bsd.mkopt.mk /usr/src/share/mk/bsd.suffixes.mk /etc/make.conf 
/usr/src/share/mk/local.sys.mk /usr/src/share/mk/src.sys.mk 
/usr/src/usr.bin/mkesdb_static/Makefile /usr/src/usr.bin/mkesdb/Makefile.inc 
/usr/src/tools/build/mk/bsd.prog.mk /usr/src/share/mk/bsd.prog.mk 
/usr/src/share/mk/bsd.init.mk /usr/src/share/mk/bsd.opts.mk 
/usr/src/share/mk/bsd.cpu.mk /usr/src/share/mk/local.init.mk 
/usr/src/share/mk/src.init.mk /usr/src/usr.bin/mkesdb_static/../Makefile.inc 
/usr/src/share/mk/bsd.own.mk /usr/src/share/mk/bsd.compiler.mk 
/usr/src/share/mk/bsd.compiler.mk /usr/src/share/mk/bsd.libnames.mk 
/usr/src/share/mk/src.libnames.mk /usr/src/share/mk/src.opts.mk 
/usr/src/share/mk/bsd.nls.mk /usr/src/share/mk/bsd.confs.mk 
/usr/src/share/mk/bsd.files.mk /usr/src/share/mk/bsd.incs.mk 
/usr/src/share/mk/bsd.links.mk /usr/src/share/mk/bsd.man.mk 
/usr/src/share/mk/bsd.dep.mk /usr/src/share/mk/bsd.clang-analyze.mk 
/usr/src/share/mk/bsd.obj.mk /usr/src/share/mk/bsd.subdir.mk 
/usr/src/share/mk/bsd.sys.mk /usr/src/tools/build/mk/Makefile.boot'
.PATH='. /usr/src/usr.bin/mkesdb_static /usr/src/lib/libc/iconv 
/usr/src/usr.bin/mkesdb'
*** Error code 1

I've done a "find /usr/obj -name \*.meta -print0 | xargs -0 rm" and am still
waiting for that to complete, though it has passed the above failure point.
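
The same cleanup can be sketched in Python, which sidesteps any find/xargs
quoting concerns.  This is only an illustration: remove_meta_files is a
hypothetical helper name and /usr/obj is the object directory from the
build above (MAKEOBJDIRPREFIX), so adjust it to suit.

```python
import os


def remove_meta_files(root):
    """Delete every file ending in .meta below `root`.

    Returns the number of files removed.  Directories are left alone,
    as are all files with other suffixes.
    """
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".meta"):
                os.unlink(os.path.join(dirpath, name))
                removed += 1
    return removed


if __name__ == "__main__":
    # Object directory from the build log above; change as needed.
    count = remove_meta_files("/usr/obj")
    print(f"removed {count} .meta files")
```

Removing only the .meta files forces meta mode to re-evaluate every target
while leaving the existing objects in place.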

-- 
Peter Jeremy




Re: effect of strip(1) on du(1)

2017-03-03 Thread Peter Jeremy
On 2017-Mar-02 22:19:10 -0800, "Rodney W. Grimes" 
<freebsd-...@pdx.rh.cn85.dnsmgr.net> wrote:
>> du(1) is using fts_read(3), which is based on the stat(2) information.
>> The OpenGroup defines st_blocksize as "Number of blocks allocated for
>> this object."  In the case of ZFS, a write(2) may return before any
>> blocks are actually allocated.  And thanks to compression, gang
...
>My gut tells me that this is gona cause problems, is it ONLY
>the st_blocksize data that is incorrect then not such a big
>problem, or are we returning other meta data that is wrong?

Note that it's st_blocks, not st_blocksize.

I did an experiment, writing a (roughly) 113MB file (some data I had
lying around), close()ing it and then stat()ing it in a loop.  This is
FreeBSD 10.3 with ZFS and lz4 compression.  Over the 26ms following the
close(), st_blocks gradually rose from 24169 to 51231.  It then stayed
stable until 4.968s after the close, when st_blocks again started
increasing until it stabilized after a total of 5.031s at 87483.  Based
on this, st_blocks reflects the actual number of blocks physically
written to disk.  None of the other fields in the struct stat vary.
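
A rough sketch of this kind of experiment in Python follows.  It is not
necessarily the program used for the measurements above: the helper names,
the 1MB default size and the polling interval are all assumptions chosen
for illustration.  It writes random (and therefore incompressible) data,
closes the file and then records st_blocks each time the value changes.

```python
import os
import tempfile
import time


def sample_st_blocks(path, duration=1.0, interval=0.01):
    """Poll stat(2) on `path` for about `duration` seconds.

    Returns a list of (elapsed_seconds, st_blocks) tuples, one for the
    first observation and one for each later change in st_blocks.
    """
    samples = []
    last = None
    start = time.monotonic()
    while True:
        elapsed = time.monotonic() - start
        blocks = os.stat(path).st_blocks  # counted in 512-byte units
        if blocks != last:
            samples.append((elapsed, blocks))
            last = blocks
        if elapsed >= duration:
            break
        time.sleep(interval)
    return samples


def write_and_watch(size=1 << 20, duration=1.0, directory=None):
    """Write `size` random bytes to a temp file, close it, watch st_blocks."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(size))
        # The file is closed here; nothing below modifies it.
        return sample_st_blocks(path, duration)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    # Watch for longer than the ~5s delay described above.
    for elapsed, blocks in write_and_watch(duration=6.0):
        print(f"{elapsed:8.3f}s  st_blocks={blocks}")
```

On a filesystem that allocates blocks before write(2) returns, this would
be expected to print a single line; on ZFS the output may show several
steps, depending on pool load and settings.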

The 5s delay is presumably the TXG delay (since this system is basically
unloaded).  I'm not sure why it writes roughly ½ the data immediately
and the rest as part of the next TXG write.

>My expectactions of executing a stat(2) call on a file would
>be that the data returned is valid and stable.  I think almost
>any program would expect that.

I think a case could be made that st_blocks is a valid representation
of "the number of blocks allocated for this object" - with the number
increasing as the data is physically written to disk.  As for it being
stable, consider a (hypothetical) filesystem that can transparently
migrate data between different storage media, with different compression
algorithms etc (ZFS will be able to do this once the mythical block
rewrite code is written).

-- 
Peter Jeremy




Re: effect of strip(1) on du(1)

2017-03-02 Thread Peter Jeremy
On 2017-Mar-02 22:29:46 +0300, Subbsd <sub...@gmail.com> wrote:
>During some interval after strip call, du will show 512B for any file.
>If execute du(1) after strip(1) without delay, this behavior is reproduced 
>100%:

What filesystem are you using?  strip(1) rewrites the target file and du(1)
reports the number of blocks reported by stat(2).  It seems that you are
hitting a situation where the file metadata isn't immediately updated.

-- 
Peter Jeremy


signature.asc
Description: PGP signature


Re: removing SVR4 binary compatibilty layer

2017-02-15 Thread Peter Jeremy
On 2017-Feb-14 10:32:32 -0800, Gleb Smirnoff <gleb...@freebsd.org> wrote:
>  After some discussion on svn mailing list [1], there is intention
>to remove SVR4 binary compatibilty layer from FreeBSD head, meaning
>that FreeBSD 12.0-RELEASE, available in couple of years would
>be shipped without it. There is no intention of merge of the removal.
>The stable@ mailing list added for wider audience.

Can I suggest that we put some warnings into the SVr4 image activation
code and MFC that to at least 11 to try and smoke out anyone who might
actually be using it.

-- 
Peter Jeremy




Re: Somethign missing in my environment?

2016-08-17 Thread Peter Jeremy
On 2016-Aug-16 23:14:45 +0200, Willem Jan Withagen <w...@digiware.nl> wrote:
>And I'm running:
>make -j8 buildworld
>So getting a good target that give the error is hard.
>
>So I continued with make -DNOCLEAN -DNO_CLEAN buildworld.

There's nothing immediately obvious.  I suggest trying without the
"-DNOCLEAN -DNO_CLEAN" - they are shortcuts that aren't guaranteed to
work under all circumstances.  And if that still fails, skip the '-j8'
because it's possible there are still race conditions in buildworld
(though that is very unlikely).

-- 
Peter Jeremy




Re: Somethign missing in my environment?

2016-08-16 Thread Peter Jeremy
On 2016-Aug-16 20:31:57 +0200, Willem Jan Withagen <w...@digiware.nl> wrote:
>I'm trying to compile world, but I keep getting:
>
>/usr/obj/usr/srcs/head/src/tmp/usr/lib/libgcc_s.so: undefined reference
>to `__gxx_personality_v0'
>cc: error: linker command failed with exit code 1 (use -v to see invocation)
>*** [h_raw.full] Error code 1
>
>Even after refetching the complete tree.

We need more context:
- What SVN revision of (presumably) -current is this?
- What architecture are you compiling on/for?
- What do you have in /etc/make.conf and /etc/src.conf?
- What is your current environment?
- What is the output leading up to that error (what is being built)?

-- 
Peter Jeremy




Re: Mosh regression between 10.x and 11-stable

2016-08-11 Thread Peter Jeremy
On 2016-Aug-11 10:06:35 -0700, Ngie Cooper <yaneurab...@gmail.com> wrote:
>
>> On Aug 11, 2016, at 09:30, John Hood <cg...@glup.org> wrote:
>> 
>> I still can't reproduce this on 3 different 11.0-BETA4 servers and a
>> variety of clients and networks.  Can you try and identify a more
>> portable repro or at least figure out why it fails on your system?
>> 
>> Please try applying this patch, too.  It's a shot in the dark, though.
>
>Dumb question: what ssh key type(s) (dsa, rsa, etc) are you using Peter :)?

I'm using ECDSA for both the host and user keys.

-- 
Peter Jeremy




Re: Mosh regression between 10.x and 11-stable

2016-08-11 Thread Peter Jeremy
On 2016-Aug-11 12:30:23 -0400, John Hood <cg...@glup.org> wrote:
>I still can't reproduce this on 3 different 11.0-BETA4 servers and a
>variety of clients and networks.  Can you try and identify a more
>portable repro or at least figure out why it fails on your system?
>
>Please try applying this patch, too.  It's a shot in the dark, though.

That patch seems to fix the problem I'm seeing.  Not waiting for output
to drain is consistent with the symptoms I'm seeing, though I have no
idea why only my Linux client is affected.

-- 
Peter Jeremy




Re: Mosh regression between 10.x and 11-stable

2016-08-11 Thread Peter Jeremy
On 2016-Aug-10 14:32:15 -0400, john hood <cg...@glup.org> wrote:
>On 8/10/16 4:18 AM, Peter Jeremy wrote:
>> I recently updated one of my VPS hosts from 10.3-RELEASE-p5 to 11.0-BETA4
>> r303811 and mosh to that host from my Linux laptop stopped working.  All
>> I get on the laptop is:
>> $ mosh remotehost
>> Connection to remotehost closed.
>> /usr/bin/mosh: Did not find mosh server startup message.

>> 1) the "MOSH CONNECT" message isn't making it out of the local ssh process.
>
>Do you know if the message is getting out of mosh-server?  into sshd?
>Do you know if mosh-server is actually running?  (It will log utmp
>entries on startup.)

mosh-server is running - I can see it from another session and redirecting
verbose output into a file, I get:

mosh-server (mosh 1.2.5) [build mosh 1.2.5]
Copyright 2012 Keith Winstein <mosh-de...@mit.edu>
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

[mosh-server detached, pid = 4202]

Warning: termios IUTF8 flag not defined.
Character-erase of multibyte character sequence
probably does not work properly on this platform.


I can't tell if it's actually writing into the remote ssh process.

>> 2) it's racy because I can get it from "always fails" to "sometimes works".
>
>How do you get it there?

- Add '-v' to the local ssh command.
- ktrace the remote mosh-server process (this seems to make it consistently 
work).

-- 
Peter Jeremy




Mosh regression between 10.x and 11-stable

2016-08-10 Thread Peter Jeremy
I recently updated one of my VPS hosts from 10.3-RELEASE-p5 to 11.0-BETA4
r303811 and mosh to that host from my Linux laptop stopped working.  All
I get on the laptop is:
$ mosh remotehost
Connection to remotehost closed.
/usr/bin/mosh: Did not find mosh server startup message.

I've tried rebuilding mosh (and all dependencies) on the host to no avail.

This isn't the DSA change that's been discussed elsewhere: I can SSH from my
laptop to the host without problem.  I can also manually invoke mosh-client
and mosh-server and it works.  Unfortunately, mosh has no provision for
debugging.  I've tried hacking the mosh perl script to make it more verbose
and that shows that:
1) the "MOSH CONNECT" message isn't making it out of the local ssh process.
2) it's racy because I can get it from "always fails" to "sometimes works".

My suspicion is that something has changed in either sshd or TCP that
is resulting in the connection going away before the stdout from the
remote mosh-server makes it out from the local ssh process.

I've looked at tcpdump's of both successful and failed SSH sessions
but don't see anything obviously different (encryption makes it
difficult to decode the session).

Has anyone else seen this behaviour or have any ideas what might be
causing it?

-- 
Peter Jeremy




FreeBSD 11.0-BETA2 won't boot on an Acer Aspire 5560

2016-07-27 Thread Peter Jeremy
I'm trying to boot the 11.0-BETA2/amd64 memory stick image and the
kernel panics: (Following copied by hand):

ACPI APIC Table: 
...
acpi0:  on motherboard
ACPI Error: Hardware did not change modes (20160527/hwacpi-160)
ACPI Error: Could not transition to APCI mode (20160527/evxfevnt-105)
ACPI Warning: AcpiEnable failed (20160527/utxfinit-184)
acpi0: Could not enable ACPI: AE_NO_HARDWARE_RESPONSE
device_attach: acpi0 attach returned 6

Followed by a NULL dereference panic at nexus_acpi_attach+0x89

The system boots a 10.0-RELEASE/amd64 memstick (the only other image I
have conveniently to hand) without problem.

-- 
Peter Jeremy




Re: Recognizing SMR HDDs

2016-05-26 Thread Peter Jeremy
On 2016-May-26 08:42:53 +0200, Gary Jennejohn <gljennj...@gmail.com> wrote:
>Now that ken@ has checked in the SMR code I'm wondering how I can see
>whether it's having any effect.

camcontrol(8) has been enhanced with SMR options and there's a new
zonectl(8) command - these should be able to report whether the drive
is recognized as a host-aware or host-managed SMR drive.  I believe
that drive-managed SMR drives don't admit to anything.

>Does the fact that the drive appears as a /dev/daX play any role?

USB drives are handled via the SCSI CAM layer rather than as SATA
drives.  It's possible that either the umass(4) driver or your USB
to SATA adapter is not correctly handling the relevant commands.

-- 
Peter Jeremy




Re: qsort() documentation

2016-04-20 Thread Peter Jeremy
On 2016-Apr-20 08:45:00 +0200, Hans Petter Selasky <h...@selasky.org> wrote:
>There is something which I don't understand. Why is quicksort falling 
>back to insertion sort which is an O(N**2) algorithm, when there exist a 
>O(log(N)*log(N)*N) algorithms, which I propose as a solution to the 
>"bad" characteristics of qsort.

O() notation just describes the (normally, worst case) ratio of input size
to runtime for a given algorithm: Increasing the input size by (say) 100×
means an insertion sort will take about 10000× as long to run, whilst the
"best" algorithms would take about 2000× as long.  It says nothing about how
long sorting (say) 1000 items takes with either sort or how they behave on
"typical" inputs.  In general, the fancier algorithms might have better
worst-case O() numbers but they have higher overheads and may not perform
any better on typical inputs - so, for small inputs, insertion sort or
bubble sort may be faster.
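
To make the scaling arithmetic concrete, here is a small sketch in Python
that evaluates idealised cost functions only; it does not time any real
sort.  The function names and the starting size of 1000 items are
assumptions for illustration, and constant factors (which O() notation
ignores and which dominate for small inputs) are deliberately left out.

```python
import math


def growth_factor(cost, n, scale):
    """Return cost(scale * n) / cost(n) for an idealised cost function."""
    return cost(scale * n) / cost(n)


def quadratic(n):
    """O(N**2), e.g. worst-case insertion sort."""
    return n * n


def n_log_n(n):
    """O(N*log(N))."""
    return n * math.log2(n)


def n_log2_n(n):
    """O(N*log(N)*log(N))."""
    return n * math.log2(n) ** 2


if __name__ == "__main__":
    n, scale = 1000, 100
    for name, cost in (("N**2", quadratic),
                       ("N*log(N)", n_log_n),
                       ("N*log(N)**2", n_log2_n)):
        factor = growth_factor(cost, n, scale)
        print(f"{name:>12}: {factor:8.0f}x longer for {scale}x the input")
```

The quadratic case grows by the square of the scale factor regardless of
the starting size, whereas the logarithmic cases depend on where you
start, which is one reason a single O() figure says little about any
particular workload.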

IMO:
- If you're only sorting a small number of items and/or doing it infrequently,
  the sort performance doesn't really matter and you can use any algorithm.
- If you're sorting lots of items and sort performance is a real issue, you
  need to examine the performance of a variety of algorithms on your input
  data and may need to roll your own implementation.

As long as qsort() behaves reasonably and its behaviour is documented
sufficiently well that someone can decide whether or not to rule it out
for their specific application, that is (IMHO) sufficient.

-- 
Peter Jeremy




Re: gettimeofday((void *)-1, NULL) implicates core dump on recent FreeBSD 11-CURRENT

2015-07-08 Thread Peter Jeremy
On 2015-Jul-08 12:22:03 -0700, Garrett Cooper yaneurab...@gmail.com wrote:
>On Jul 8, 2015, at 12:17, Doug Rabson d...@rabson.org wrote:
>
>> As far as I can tell, POSIX doesn't require either EFAULT or any other
>> behaviour - the text in http://www.open-std.org/jtc1/sc22/open/n4217.pdf
>> just says, No errors are defined. Our man page is wrong and any real
>> program which relies on gettimeofday not faulting when given bad inputs is
>> broken.
>
>I would suggest the following:
>1. Document behavior in NOTES about gettimeofday returning EFAULT with the 
>specific scenarios kib mentioned, segfaulting otherwise (wordsmithing the 
>actual info of course). Otherwise, it might confuse people who look at the 
>manpage later.

I would suggest adding a comment to intro(2) noting that not all functions
listed in section 2 are necessarily system calls and may report error
conditions (or maybe perform argument validation) differently when
implemented in userland.

Note that the issues with gettimeofday() also apply to clock_gettime().

I'm not sure if we want to explicitly document the conditions under which
gettimeofday() (or clock_gettime()) are implemented in userland vs syscalls
because that is guaranteed to get stale over time.  How about stating that
these functions are implemented as syscalls only if the AT_TIMEKEEP value
reported by procstat -x is NULL?

-- 
Peter Jeremy




Re: Bug-report of sorts...

2015-01-30 Thread Peter Jeremy
On 2015-Jan-30 22:24:50 +0000, Poul-Henning Kamp p...@phk.freebsd.dk wrote:
>But the point is I never get to the webpage, local_unbound just doesn't
>seem to be able to resolve anything through the DHCP appointed server,
>despite the fact that dig(1) does so just fine.

How about some packet captures showing the request/response differences
between dig(1) and local_unbound?

-- 
Peter Jeremy




Re: [CFT] Paravirtualized KVM clock

2015-01-21 Thread Peter Jeremy
On 2015-Jan-04 11:56:14 -0600, Bryan Venteicher bry...@daemoninthecloset.org 
wrote:
For the last few weeks, I've been working on adding support for KVM clock
in the projects/paravirt branch. Currently, a KVM VM guest will end up
selecting either the HPET or ACPI as the timecounter source. Unfortunately,
this is very costly since every timecounter fetch causes a VM exit. KVM
clock allows the guest to use the TSC instead; it is very similar to the
existing Xen timer.

A somewhat late response but have you looked at
https://github.com/blitz/freebsd/commit/cdc5f872b3e48cc0dda031fc7d6bdedc65c3148f
I've been running this[*] on a Google Compute Engine instance for about 6
months without problems.

[*] I had to patch out the test for KVM_FEATURE_CLOCKSOURCE_STABLE_BIT but
I think that's a GCE issue.

-- 
Peter Jeremy


pgpi9_M8QUFuE.pgp
Description: PGP signature


Re: mk output during builds: duplicate script for target .... ignored

2014-09-06 Thread Peter Jeremy
On 2014-Sep-05 18:18:15 +, Bjoern A. Zeeb 
bzeeb-li...@lists.zabbadoz.net wrote:
Started the last 48 hours at some time:

It's now fixed for me.  I think the fix was r271168.

-- 
Peter Jeremy


pgpv2g5pS98PC.pgp
Description: PGP signature


Re: keyboard break to debugger broken?

2014-07-04 Thread Peter Jeremy
On 2014-Jul-04 02:28:48 -0700, John-Mark Gurney j...@funkthat.com wrote:
So, I recently tried to break into the debugger w/ the various key
sequences that I know about, and none of them worked... I've tried
CTRL-ESC, ALT-ESC, CTRL-ALT-ESC, CTRL-PRTSCR, ALT-PRTSCR and
CTRL-ALT-PRTSCR, and many other different ones...   I've verified that
I can sysctl debug.kdb.enter=1 to enter the debugger, and the
CTRL-ALT-PAUSE works to suspend the machine, and CTRL-ALT-DEL works
to reboot...

Does anyone know if this works?

It works for me on 10.0.  Do you have debug.kdb.break_to_debugger=1
and hw.syscons.kbd_debug=1 (if you're using syscons)?

-- 
Peter Jeremy


pgpRWEUgfMxEM.pgp
Description: PGP signature


Re: OpenSSL vs. LibreSSL (OpenBSD)

2014-04-25 Thread Peter Jeremy
On 2014-Apr-25 05:00:38 -0400, Zack Gold z...@linux.com wrote:
An important thing to note here is motive. The Linux Foundation is
housing this Core Infrastructure Initiative project, and so they are
the ones who get all the money. The Initiative's funds will be
administered by the Linux Foundation and a steering group comprised of
backers of the project as well as key open source developers and other
industry stakeholders. So, it might be in the interest of these
people to not necessarily fix bugs. They might be interested in other
things, like ownership. Though, this may be a bit irrational.

It has occurred to me that Linux (in general, not the Foundation)
contains a number of religious zealots and the current OpenSSL license
is not in keeping with their religion.  And there have been previous
cases where portable open source software has passed into the
maintainership of Linux groups and had all the cross-platform code
excised to make it Linux-only.

-- 
Peter Jeremy


pgpwNAwcA6h9m.pgp
Description: PGP signature


Re: Import of DragonFly Mail Agent

2014-02-24 Thread Peter Jeremy
On 2014-Feb-24 10:44:30 -0600, Bryan Drewery bdrew...@freebsd.org wrote:
troll
I have the Oreilly sendmail book here and it's thicker than The Design
and Implementation of the FreeBSD Operating System. That's quite an
application!

More impressively, ISTR it's thicker than The Magic Garden Explained
- which is the SVR4 internals.

-- 
Peter Jeremy


pgpXr6FrMeCfw.pgp
Description: PGP signature


Re: ZFS command can block the whole ZFS subsystem!

2014-01-05 Thread Peter Jeremy
On 2014-Jan-05 09:11:38 +0100, O. Hartmann ohart...@zedat.fu-berlin.de 
wrote:
On Sun, 5 Jan 2014 10:14:26 +1100
Peter Jeremy pe...@rulingia.com wrote:

 On 2014-Jan-04 23:26:42 +0100, O. Hartmann
 ohart...@zedat.fu-berlin.de wrote:
 zfs list -r BACKUP00
 NAME  USED  AVAIL  REFER  MOUNTPOINT
 BACKUP00 1.48T  1.19T   144K  /BACKUP00
 BACKUP00/backup  1.47T  1.19T  1.47T  /backup
 
 Well, that at least shows it's making progress - it's gone from 2.5T
 to 1.47T used (though I gather that has taken several days).  Can you
 please post the result of
 zfs get all BACKUP00/backup

BACKUP00/backup  dedup  on  local

This is your problem.  Before it can free any block, it has to check
for other references to the block via the DDT and I suspect you don't
have enough RAM to cache the DDT.

Your options are:
1) Wait until the delete finishes.
2) Destroy the pool with extreme prejudice: Forcibly export the pool
   (probably by booting to single user and not starting ZFS) and write
   zeroes to the first and last MB of ada3p1.

BTW, this problem will occur on any filesystem where you've ever
enabled dedup - once there are any dedup'd blocks in a filesystem,
all deletes need to go via the DDT.
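
To illustrate why every delete has to consult the DDT, here is a toy model in Python. It is only a sketch of reference counting by checksum, not ZFS's actual data structures; the class and method names are made up for the example.

```python
import hashlib


class ToyDedupTable:
    """Toy dedup table: block checksum -> reference count (not ZFS code)."""

    def __init__(self):
        self.refs = {}
        self.lookups = 0  # counts every table access, for illustration

    def write(self, data: bytes) -> str:
        """Store a block; identical contents share one table entry."""
        key = hashlib.sha256(data).hexdigest()
        self.lookups += 1
        self.refs[key] = self.refs.get(key, 0) + 1
        return key

    def free(self, key: str) -> bool:
        """Drop one reference; True only if the block can really be released."""
        self.lookups += 1
        self.refs[key] -= 1
        if self.refs[key] == 0:
            del self.refs[key]
            return True
        return False


ddt = ToyDedupTable()
a = ddt.write(b"same contents")
b = ddt.write(b"same contents")
print(ddt.free(a))  # False: another reference still exists
print(ddt.free(b))  # True: last reference gone, block can be freed
```

Each free() costs one table lookup. If the table doesn't fit in RAM, each of those lookups can turn into disk I/O, which is where the time goes.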

-- 
Peter Jeremy


pgp3MDihoDvIU.pgp
Description: PGP signature


Re: ZFS command can block the whole ZFS subsystem!

2014-01-04 Thread Peter Jeremy
On 2014-Jan-03 20:25:35 +0100, O. Hartmann ohart...@zedat.fu-berlin.de 
wrote:
[~] zfs get all BACKUP00
NAME  PROPERTY  VALUE SOURCE
...
BACKUP00  usedbysnapshots   0 -
BACKUP00  usedbydataset 144K  -
BACKUP00  usedbychildren2.53T -
BACKUP00  usedbyrefreservation  0 -

Funny, the disk is supposed to be empty ... but is marked as used by
2.5 TB ...

That says there's another filesystem inside BACKUP00 which has 2.5TB used.

What are the results of:
zpool status -v BACKUP00
zfs list -r BACKUP00

-- 
Peter Jeremy


pgpJndNkyBTKH.pgp
Description: PGP signature


Re: ZFS command can block the whole ZFS subsystem!

2014-01-04 Thread Peter Jeremy
On 2014-Jan-04 23:26:42 +0100, O. Hartmann ohart...@zedat.fu-berlin.de 
wrote:
zfs list -r BACKUP00
NAME  USED  AVAIL  REFER  MOUNTPOINT
BACKUP00 1.48T  1.19T   144K  /BACKUP00
BACKUP00/backup  1.47T  1.19T  1.47T  /backup

Well, that at least shows it's making progress - it's gone from 2.5T
to 1.47T used (though I gather that has taken several days).  Can you
please post the result of
zfs get all BACKUP00/backup

-- 
Peter Jeremy


pgpmSrBIo4DlN.pgp
Description: PGP signature


Re: PACKAGESITE spam

2013-12-26 Thread Peter Jeremy
On 2013-Dec-22 11:53:17 -0800, Darren Pilgrim list_free...@bluerosetech.com 
wrote:
Because of that deinstall log.  When you use `pkg install` to upgrade a 
port, you get something like this:

Jul 10 23:06:40 chombo pkg-static: ca_root_nss-3.15.1 installed
Nov 29 15:04:52 chombo pkg: ca_root_nss reinstalled: 3.15.2_1

That information does not exist in the pkg database.

I agree that's a serious bug/regression in the pkg database: With the
old pkg system, I could tell when a port was installed by looking at
the timestamps on the +COMMENT file.  The install time is needed to
answer questions like does this entry in UPDATING affect me (ie have
I rebuilt the port since the entry date).  It's something I used
regularly and its absence is a PITA.

I shouldn't need to rummage through /var/log/messages - and in any case,
by default FreeBSD only keeps 500K of messages history (about a month
in my case) so the information has probably rotated into the bit bucket.
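
For what it's worth, while the log entries are still around they can be scraped. The following Python sketch handles only the two message formats quoted above; any other pkg log format is unknown to it, and the function and pattern names are invented for the example.

```python
import re

# Syslog prefix: "Mon DD HH:MM:SS host prog: message"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<prog>pkg(?:-static)?):\s(?P<msg>.*)$"
)
# Only the two message shapes seen in this thread.
INSTALLED_RE = re.compile(r"^(?P<name>\S+)-(?P<version>[^-\s]+) installed$")
REINSTALLED_RE = re.compile(r"^(?P<name>\S+) reinstalled: (?P<version>\S+)$")


def parse_event(line):
    """Return (timestamp, name, version, action) or None if unrecognised."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    for action, pattern in (("installed", INSTALLED_RE),
                            ("reinstalled", REINSTALLED_RE)):
        e = pattern.match(m.group("msg"))
        if e:
            return (m.group("stamp"), e.group("name"),
                    e.group("version"), action)
    return None


print(parse_event(
    "Jul 10 23:06:40 chombo pkg-static: ca_root_nss-3.15.1 installed"))
print(parse_event(
    "Nov 29 15:04:52 chombo pkg: ca_root_nss reinstalled: 3.15.2_1"))
```

Note that syslog timestamps carry no year, which is one more reason this is a poor substitute for an install time recorded in the package database.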

I agree that having a pkg audit trail would be useful.  Unfortunately,
what we have today is not an audit trail and isn't especially useful.

-- 
Peter Jeremy


pgpVS_m9BxiAC.pgp
Description: PGP signature


Re: [Call For Help] Clang + OpenJDK + head + amd64 == cocktail of death (for clusters)

2013-07-25 Thread Peter Jeremy
On 2013-Jul-25 10:39:17 +0200, Baptiste Daroussin b...@freebsd.org wrote:
After some investigation we discover that blacklisting openjdk6 allows the
building process to go to completion again.
...
It seems to happen only on head amd64, so far we think it is only
happening when jdk is built with clang.

This mail arrives at an opportune time.  I've just discovered that if
I build openjdk6 with clang (on head/amd64), the resultant jdk SEGV's
if I again try to build openjdk6.  If I build it with USE_GCC=any
then the problem goes away.

I have no time, neither skill to investigate that,

I don't have the time to investigate further but forcing the use of gcc
instead of clang is at least a workaround.

-- 
Peter Jeremy


pgpDa0UXCa_Nr.pgp
Description: PGP signature


Re: access to hard drives is blocked by writes to a flash drive

2013-03-04 Thread Peter Jeremy
On 2013-Mar-03 23:12:40 -0800, Don Lewis truck...@freebsd.org wrote:
On  4 Mar, Konstantin Belousov wrote:
 It could be argued that the current typical value of 16MB for the
 hirunningbufspace is too low, but experiments with increasing it did
 not provided any measureable change in the throughput or latency for
 some loads.

The correct value is probably proportional to the write bandwidth
available.

The problem is that write bandwidth varies widely depending on the
workload.  For spinning rust, this will vary between maybe 64KBps
(512B random writes) and 100-150MBps (single-threaded large sequential
writes).  The (low-end) SSD in my Netbook also has about 100:1 variance
due to erase blocking.  How do you tune hirunningbufspace in the face
of 2 or 3 orders of magnitude variance in throughput?  Especially since
SSDs don't gradually degrade - they hit a brick wall.
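
To put numbers on that, here is a back-of-the-envelope calculation in Python. It assumes the whole 16MB is queued to a single device and simply has to drain at the given rate, which is a simplification of what the buffer cache really does.

```python
def drain_seconds(buffered_bytes, bytes_per_second):
    """Time to flush a given amount of queued writes at a steady rate."""
    return buffered_bytes / bytes_per_second


HIRUNNING = 16 * 1024 * 1024   # the typical 16MB hirunningbufspace quoted above
SLOW = 64 * 1024               # ~64KBps: 512B random writes on spinning rust
FAST = 100 * 1000 * 1000       # ~100MBps: large sequential writes

print(drain_seconds(HIRUNNING, SLOW))            # 256.0 seconds
print(round(drain_seconds(HIRUNNING, FAST), 2))  # 0.17 seconds
```

The same 16MB is either a fraction of a second or more than four minutes of queued I/O, depending purely on the workload.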

-- 
Peter Jeremy


pgpZfJbSDrVSA.pgp
Description: PGP signature


Re: access to hard drives is blocked by writes to a flash drive

2013-03-02 Thread Peter Jeremy
On 2013-Mar-02 18:29:54 +0100, deeptech71 deeptec...@gmail.com wrote:

When one of my flash drives is being heavily written to; typically by
``svn update'' on /usr/src, located on the flash drive; the following
can be said about filesystem behavior:

- ``svn update'' seems to be able to quickly update a bunch of files,
   but is then unable to continue for a period of time. This behavior
   is cyclical, and cycles several times, depending on the amount of
   updating work to be done for a particular run of ``svn update''.

This sounds like normal flash behaviour:  You can only write to erased
blocks.  The SSD firmware attempts to keep a free pool of erased blocks
but if you write too fast, you empty the free pool and need to wait for
the wear-levelling algorithm to move blocks around and erase them.
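
A crude Python model of that free pool shows the shape of the problem. The numbers are arbitrary and real firmware is far more complicated; this only assumes a fixed-size pool that is refilled at a constant erase rate.

```python
def simulate(pool_blocks, write_rate, erase_rate, ticks):
    """Return the number of blocks actually written in each tick."""
    free = pool_blocks
    written = []
    for _ in range(ticks):
        # Background erase refills the pool, up to its maximum size.
        free = min(pool_blocks, free + erase_rate)
        # The writer can only consume blocks that are already erased.
        n = min(write_rate, free)
        free -= n
        written.append(n)
    return written


print(simulate(pool_blocks=10, write_rate=5, erase_rate=1, ticks=6))
# [5, 5, 2, 1, 1, 1]
```

Writes run at full speed until the pool is empty and are then throttled to the erase rate.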

Enabling TRIM (the '-t' flag on tunefs) will help if the drive supports
TRIM (if it doesn't, it'll probably just lock up).  Otherwise, you need
to either put up with it or upgrade to a better SSD.

I run into this regularly with the low-end SuperTalent drive in my
Netbook but have never seen it with the OCZ Agility4 that I use for
L2ARC in my fileserver.

-- 
Peter Jeremy


pgpPsz41Q1HhI.pgp
Description: PGP signature


Re: No ZFS when loading modules from loeader prompt

2013-02-21 Thread Peter Jeremy
On Wed, Feb 20, 2013 at 7:05 AM, O. Hartmann ohart...@zedat.fu-berlin.de 
wrote:
 At the loader prompt, I need to unload the buggy kernel and load the old
 working one via

 load /boot/kernel.old/kernel

 Then I load also the ZFS related modules

 load /boot/kernel.old/opensolaris.ko
 load /boot/kernel.old/zfs.ko

 Issuing boot at the end of that stage boots the kernel - the old one
 -successfully - but there is no working ZFS and no ZFS volume gets
 mounted although the rc.conf is executed correctly.

 What am I doing wrong at that point? Why isn't ZFS run and mount properly?

Last time I ran into this problem, the issue was that unload also
unloaded the zpool.cache file and the ZFS code relied on that to find
the kernel.  I don't recall what the workaround was.

On 2013-Feb-20 08:17:46 -0800, Freddie Cash fjwc...@gmail.com wrote:
Sounds like a perfect use case for Boot Environments.  Create a new BE,
install the new kernel into it, set it as the default, reboot.  If it
fails, you manually set the previous BE as the default, and reboot.  That
way, your known-good, working environment is never affected.

How do you change your BE in the loader?  Or how do you change your
BE when you can't boot?

-- 
Peter Jeremy


pgpHx5Un14coz.pgp
Description: PGP signature


Re: Zpool surgery

2013-01-27 Thread Peter Jeremy
On 2013-Jan-27 14:31:56 -, Steven Hartland kill...@multiplay.co.uk wrote:
- Original Message - 
From: Ulrich Spörlein u...@freebsd.org
 I want to transplant my old zpool tank from a 1TB drive to a new 2TB
 drive, but *not* use dd(1) or any other cloning mechanism, as the pool
 was very full very often and is surely severely fragmented.

Can't you just drop the disk in the original machine, set it as a mirror
then once the mirror process has completed break the mirror and remove
the 1TB disk?

That will replicate any fragmentation as well.  zfs send | zfs recv
is the only (current) way to defragment a ZFS pool.

-- 
Peter Jeremy


pgp7mByYv45q2.pgp
Description: PGP signature


Re: Programmer dvorak layout for syscons

2012-11-19 Thread Peter Jeremy
On 2012-Nov-20 02:42:50 +0200, mbsd m...@isgroup.com.ua wrote:
I've been using this layout for a long time in X and I create kbdmap for
syscons.

Is there any chance of it being put in the source tree? So my question is:
is it worth it?

I suggest you write a PR that includes the keymap and an appropriate
patch for /usr/share/syscons/keymaps/INDEX.keymaps as well as explaining
how it differs from the 9 existing Dvorak keymaps.

-- 
Peter Jeremy


pgpSNEQbnvSGA.pgp
Description: PGP signature


Re: HEADS UP: Forth Optimizations

2012-11-11 Thread Peter Jeremy
On 2012-Nov-10 16:53:10 -0800, Devin Teske devin.te...@fisglobal.com wrote:
Can someone help review this for the commit log?

I've had a look through the proposed patch and my comments follow.
Other than that, it looks good to me.

Index: menu-commands.4th
===
--- menu-commands.4th  (revision 242835)
+++ menu-commands.4th  (working copy)
...
@@ -185,21 +240,21 @@ variable root_state
...
   s set kernel=${kernel_prefix}${kernel[N]}${kernel_suffix}
-\ command to assemble full kernel-path
-  -rot tuck 36 + c! swap\ replace 'N' with array index value
-  evaluate  \ sets $kernel to full kernel-path
+  36 +c! \ replace 'N' with ASCII numeral
+  evaluate

I think the "sets $kernel to full kernel-path" comment is worth keeping.

   s set root=${root_prefix}${root[N]}${root_suffix}
-\ command to assemble root image-path
-  -rot tuck 30 + c! swap\ replace 'N' with array index value
-  evaluate  \ sets $kernel to full kernel-path
+  30 +c! \ replace 'N' with ASCII numeral
+  evaluate

Likewise, this could do with a (corrected) comment that it sets $root
to the full path to root.

Index: menu.4th
===
--- menu.4th   (revision 242835)
+++ menu.4th   (working copy)
@@ -184,18 +223,15 @@ create init_text8 255 allot
 
   \ base name of environment variable
   loader_color? if
-  s ansi_caption[x]
+  dup ansi_caption[x]
   else
-  s menu_caption[x]
+  dup menu_caption[x]
   then

Could this be simplified to

=   dup
=   loader_color? if
=   ansi_caption[x]
=   else
=   menu_caption[x]
=   then

Or, at a higher level, should this whole block be pulled into a new
word (along with similar words for toggled_{ansi,text}[x] and
{ansi,menu}_caption[x][y])?

@@ -227,36 +263,26 @@ create init_text8 255 allot
...
   getenv dup -1 <> if
   \ Assign toggled text to menu caption

Some comments on stack contents around here would make it somewhat
easier to follow what is going on.

@@ -329,19 +340,18 @@ create init_text8 255 allot
...
   \ This is highly unlikely to occur, but to make
   \ sure that things move along smoothly, allocate
   \ a temporary NULL string
 
+  drop ( getenv cruft )
   s 
   then
   then

Is this the memory leak?  If so, can I suggest that this be committed
separately since it is a simple change and is distinct from the other
changes you are proposing.

@@ -357,14 +367,14 @@ create init_text8 255 allot
   \ 
   \ Let's perform what we need to with the above.
 
-  \ base name of menuitem caption var
+  \ Assign array value text to menu caption
+  4 pick

According to the documentation just above this hunk, there are only 4
items on the stack, so 4 pick seems wrong, though it is consistent
with my understanding of the old code.  The 2 pick [char] 0 you
added earlier seems to similarly be out-by-one, though consistent.

@@ -521,17 +528,20 @@ create init_text8 255 allot
 
   \ If this is the ACPI menu option, act accordingly.
   dup menuacpi @ = if
-  acpimenuitem ( -- C-Addr/U | -1 )
+  dup acpimenuitem ( n -- n n c-addr/u | n n -1 )
+  dup -1 <> if
+  13 +c! ( n n c-addr/u -- n ) \ replace 'x'

I think the stack here should be ( n n c-addr/u -- n c-addr/u )

@@ -950,100 +914,43 @@ create init_text8 255 allot
 
   49 \ Iterator start (loop range 49 to 56; ASCII '1' to '8')
   begin
-  \ Unset variables in-order of appearance in menu.4th(8)

Does the order matter?  I notice you've changed it.


pgpjhm7HlFkWe.pgp
Description: PGP signature


Re: [head tinderbox] failure on arm/arm

2012-11-10 Thread Peter Jeremy
On 2012-Nov-10 09:16:32 +1100, Brett brett.ma...@gmx.com wrote:
Just an observation: a few years ago when I got sick of Linux's
headlong rush development model, I subscribed to various BSD
mailing lists to see what else was out there. I considered FreeBSD at
the time - there was a neverending avalanche of [head tinderbox]
failure messages.

The Project tries to avoid it but occasional build failures on the
development branch are very likely to occur.  As a new user, you
would be much better off starting with a release branch.

This told me that I would be more likely to be running code written
by people who knew what they were doing if I went with Open, Net, or
DragonflyBSD.

I think that's being unfair.  Do Open, Net or DFly have an equivalent
to the tinderboxes that do automated test builds and report failures?
And, since you have replied to an ARM failure, DragonflyBSD would not
be an option since it doesn't support ARM.

-- 
Peter Jeremy


pgpggt7LmRYN1.pgp
Description: PGP signature


Re: FORTRAN vs. Fortran (was: November 5th is Clang-Day)

2012-11-03 Thread Peter Jeremy
On 2012-Nov-02 11:21:10 -0500, Brooks Davis bro...@freebsd.org wrote:
On Fri, Nov 02, 2012 at 10:21:19AM +, Anton Shterenlikht wrote:
 It's a shame though that, with LLVM as the
 default compiler, further development of
 FreeBSD/ia64 and FreeBSD/sparc64
 will probably suffer and then stop altogether.

If you read either my announcement or the diff closely you will note that
the default is only changing for x86 architectures.

Even with all the best of intentions, once the x86 architectures (which
cover the bulk of the user and developer mass) migrate to a different
toolchain, the risk of bitrot in the GNU toolchain becomes non-negligible.
And once it breaks, there may not be the critical mass to repair it.
This is basically what happened to the Alpha.

-- 
Peter Jeremy


pgpPdXemjRuOy.pgp
Description: PGP signature


Re: memory warnings r240891 | dmesgg

2012-10-06 Thread Peter Jeremy
On 2012-Oct-04 23:51:09 +0400, Sergey Kandaurov pluk...@gmail.com wrote:
On 4 October 2012 20:18, Darrel levi...@iglou.com wrote:
 warning: total configured swap (2621440 pages) exceeds maximum
 recommended amount (1852656 pages).
...
This is because kernel needs some memory to manage swap too.
Currently for amd64 this roughly reduces to the following rule
(My apologies in advance for the extra simplification):

100MB RAM per 800MB swap space.

That is oversimplified to the point of being wrong.  As of HEAD
r239255 and 9-stable r240097, there's no longer a limit on amd64.  The
limit is still required on 32-bit architectures due to the limited KVA
available.

The actual KVA requirement (RAM is only allocated when the swap space
is actually used) is about 5MB KVA per 1GB swap.  The default swzone
for i386 was 32MiB - which is sufficient for ~7GB swap (the 1852656
pages reported above) and was increased to 34.5MB for i386 in r239730
to support ~8GB swap (this is also in r240097).  (It's all approximate
because of the way swap space is allocated using struct swblock).
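
For anyone wanting to check the page counts in the warning, this Python snippet does the conversion. It assumes 4KiB pages; the 32MiB figure is the i386 default swzone mentioned above.

```python
PAGE_BYTES = 4096          # assumption: 4KiB pages
GIB = 1024 ** 3


def pages_to_gib(pages):
    """Convert a count of swap pages to GiB."""
    return pages * PAGE_BYTES / GIB


configured = 2621440       # "total configured swap" from the warning
recommended = 1852656      # "maximum recommended amount" from the warning

print(pages_to_gib(configured))              # 10.0
print(round(pages_to_gib(recommended), 2))   # 7.07

# KVA per GiB of swap implied by a 32MiB swzone covering the recommended amount.
print(round(32 / pages_to_gib(recommended), 1))   # 4.5 (MiB per GiB)
```

So the warning is about 10GiB configured against roughly 7GiB recommended, and the implied ratio is in line with the "about 5MB KVA per 1GB swap" figure.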

See the thread starting
http://lists.freebsd.org/pipermail/freebsd-current/2012-August/035839.html
for more details.

-- 
Peter Jeremy


pgprxHjDiuWkT.pgp
Description: PGP signature


Re: sysctl kern.ipc.somaxconn limit 65535 why?

2012-10-04 Thread Peter Jeremy
On 2012-Oct-03 19:45:01 +0100, free...@chrysalisnet.org wrote:
In addition we had to migrate all our mysql servers from freebsd to debian
because they were hitting some arbitrary OS limit but I could never figure
out what, sys% usage went through the roof when this limit was hit, issue
didn't occur on debian.

Did you report this issue on any of the FreeBSD mailing lists?
Reporting a problem doesn't guarantee that it will be fixed
(unfortunately) but not reporting a problem makes it extremely
unlikely that it will be fixed.

  I feel recently freebsd is more focused on desktops
and as such developers never develop for a heavy server usage scenario,

This isn't intentionally true but it's true that few developers run
large servers so they may not run into some issues that only impact
large systems.  Again, it's up to people who do run such systems to
provide feedback about bottlenecks and issues they hit so that they can
be fixed.

I keep coming across hardcoded low limits.  As rightly pointed out default

There are lots of defaults that were set some time (potentially
decades) ago and may no longer be optimal.  It's unrealistic to expect
that all the defaults are correct in all circumstances and this is one
area where end users can help by flagging defaults that they find need
tuning.

values nowadays are useless 128 for somaxconn? maybe ok for a desktop.

But, as others have pointed out, this isn't one of them.  Can you
please provide more details on a use scenario where a listen(2)
backlog exceeding 128 is reasonable.

  I can't tell app developers to
fix their apps to work on FreeBSD, they don't care, if it works fine on
windows and linux then the app isn't broken as far as they are concerned.

FreeBSD is not Windows or Linux and never will be.  There are lots of
grey areas in the various standards that *BSD, Linux, Solaris, Windows
etc comply with and some OSs interpret these grey areas differently to
others (in some areas, it seems Linux has deliberately done things
differently to other Unices for no obvious reason, and the GNU
embrace-and-extend philosophy doesn't help).  Writing portable code
takes more than adding some .ac/.am files to an arbitrary blob of code
and just because a developer thinks their app isn't broken doesn't
make them right.

BTW, I note that this was sent to -current?  Are you running HEAD on
production servers?  If so, your feedback on issues you encounter
would be appreciated so that they can be corrected before they make
it into a RELEASE.

-- 
Peter Jeremy


Re: Shouldn't world be able to build without /usr/include?

2012-09-16 Thread Peter Jeremy
No.  The first stage of the buildworld is creating cross-tools - which
run on the existing world (and hence need its include files and libs).

-- 
Peter Jeremy


pgpFV9rJata7v.pgp
Description: PGP signature


Re: pkgng suggestion: renaming /usr/sbin/pkg to /usr/sbin/pkg-bootstrap

2012-08-26 Thread Peter Jeremy
On 2012-Aug-26 12:27:41 -0700, Doug Barton do...@freebsd.org wrote:
On 08/26/2012 12:08, Ian Lepore wrote:
 Maybe it could rename itself to /usr/local/sbin/pkg-bootstrap as part of
 replacing itself, so that you could re-bootstrap your way out of a
 problem later.

That's certainly creative thinking, but I'm still queasy about 2
commands with the same name that do 2 different things. And having it
rename itself adds to the confusion down the road.

I also like the idea of a pkg-bootstrap command.  Possibly a symlink
from pkg to pkg-bootstrap, that gets removed as part of the bootstrap
process, would help - but it should just tell you how to run
pkg-bootstrap.  I don't like the idea of pkg{-bootstrap} autonomously
installing something I didn't ask for.  And I don't like the idea that
all pkg commands get bounced through a /usr/sbin/pkg once it has been
bootstrapped.

Having a simple pkg bootstrapping tool in the base is a good idea. But
the functionality needs to be extremely limited so that we don't
increase the security exposure; and so that we don't end up in a
situation where a bug fix for something in the base limits our ability
to innovate with pkg in the ports tree.

Agreed.  BTW, one thing that needs to be considered is how to recover
from the embedded public key needing to be invalidated (eg due to the
private key being exposed).

-- 
Peter Jeremy


pgp6uilrjhsXu.pgp
Description: PGP signature


Re: dhclient cause up/down cycle after 239356 ?

2012-08-22 Thread Peter Jeremy
On 2012-Aug-22 15:35:01 -0400, John Baldwin j...@freebsd.org wrote:
Hmm.  Perhaps we could use a debouncer to ignore short link flaps?  Kind of
gross (and OpenBSD doesn't do this).  For now this change basically ignores
link up events if they occur within 5 seconds of the link down event.  The 5 is
hardcoded which is kind of yuck.

I'm also a bit concerned about this for similar reasons to adrian@.
We need to distinguish between short link outages caused by (eg) a
switch admin reconfiguring the switch (which needs the lease to be
re-checked) and those caused by broken NICs which report link status
changes when they are touched.  Maybe an alternative is to just ignore
link flaps when they occur within a few seconds of a script_go().
(And/or make the ignore timeout configurable).
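
As a sketch of what a configurable debouncer might look like, here is the logic in Python. This is not dhclient's code; the class name, the "ignore"/"renew" results and the default holdoff are all placeholders for the idea.

```python
class LinkDebouncer:
    """Classify link-up events by how long the link was down."""

    def __init__(self, holdoff=5.0):
        self.holdoff = holdoff   # seconds; configurable rather than hardcoded
        self.down_at = None      # time of the pending link-down, if any

    def link_down(self, now):
        """Remember when the link first went down."""
        if self.down_at is None:
            self.down_at = now

    def link_up(self, now):
        """Return "ignore" for a short flap, "renew" for a real outage."""
        down_at, self.down_at = self.down_at, None
        if down_at is None or now - down_at <= self.holdoff:
            return "ignore"
        return "renew"


d = LinkDebouncer(holdoff=5.0)
d.link_down(1.0)
print(d.link_up(3.0))    # ignore: down for 2 seconds
d.link_down(10.0)
print(d.link_up(70.0))   # renew: down for a minute
```

The weakness is the one described above: a timer alone cannot tell a NIC that flaps when touched from a switch that was briefly reconfigured.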

Apart from fxp(4), does anyone know how many NICs are similarly
broken?

Does anyone know why this issue doesn't bite OpenBSD?  Does it have
a work-around to avoid resetting the link, not report link status
changes or just no-one has noticed the issue?

BTW to jhb: Can you check your mailer's list configuration.  You
appear to be adding freebsd-current@freebsd.org and leaving
curr...@freebsd.org in the Cc list.

-- 
Peter Jeremy


pgp9SoqeQglFI.pgp
Description: PGP signature


Re: dhclient cause up/down cycle after 239356 ?

2012-08-21 Thread Peter Jeremy
On 2012-Aug-21 19:42:17 +0300, Vitalij Satanivskij sa...@ukr.net wrote:
Looks like dhclient does a down/up sequence -

Not intentionally.

Aug 21 19:21:00 home kernel: fxp0: link state changed to UP
Aug 21 19:21:01 home kernel: fxp0: link state changed to DOWN
Aug 21 19:21:01 home dhclient: New IP Address (fxp0): xx.xx.xx.xx
Aug 21 19:21:01 home dhclient: New Subnet Mask (fxp0): 255.255.255.0
Aug 21 19:21:01 home dhclient: New Broadcast Address (fxp0): xx.xx.xx.xx
Aug 21 19:21:01 home dhclient: New Routers (fxp0): xx.xx.xx.xx
Aug 21 19:21:03 home kernel: fxp0: link state changed to UP

I can reproduce this behaviour - but only on fxp (i82559 in my case)
NICs.  My bge (BCM5750) and rl (RTL8139) NICs do not report the
spurious DOWN/UP.  (I don't normally run DHCP on any fxp interfaces,
so I didn't see it during my testing).

The problem appears to be the 
  $IFCONFIG $interface inet alias 0.0.0.0 netmask 255.0.0.0 broadcast 255.255.255.255 up
executed by /sbin/dhclient-script during PREINIT.  This is making the
fxp NIC reset the link (actually, assigning _any_ IP address to an fxp
NIC causes it to reset the link).  The post r239356 dhclient detects
the link going down and exits.

Before r239356 the iface just went down/up without dhclient exiting and
everything worked fine.

For you, anyway.  Failing to detect link down causes problems for me
because my dhclient was not seeing my cable-modem resets and therefore
failing to reacquire a DHCP lease.

-- 
Peter Jeremy


pgptb9EOcZ9Yg.pgp
Description: PGP signature


Re: r239356: does it mean, that synchronous dhcp and dhcplcinet with disabled devd gone?

2012-08-21 Thread Peter Jeremy
On 2012-Aug-21 17:25:23 -0400, John Baldwin j...@freebsd.org wrote:
Ok, this is what I came up with, somewhat loosely based on OpenBSD's dhclient.
I tested that it survives the following:

I've also done some limited testing on both bge and fxp NICs and
haven't run into any problems.  In particular the spurious link resets
from fxp don't seem to cause any problems.

-- 
Peter Jeremy


pgp5gbqPFkDoz.pgp
Description: PGP signature


Re: buildworld c++ internal error

2012-08-20 Thread Peter Jeremy
On 2012-Aug-20 07:17:59 +0900, Randy Bush ra...@psg.com wrote:
the only thing a night's sleep got me was the idea of attaching an
external sata drive and putting swap on it.

You can also swap to a file via NFS.

-- 
Peter Jeremy


pgp62N8KdUtmP.pgp
Description: PGP signature


Re: Time to bump default VM_SWZONE_SIZE_MAX?

2012-08-13 Thread Peter Jeremy
On 2012-Aug-12 15:44:07 -0700, Colin Percival cperc...@freebsd.org wrote:
If I'm understanding things correctly, the maxswzone value -- set by the
kern.maxswzone loader tunable or to VM_SWZONE_SIZE_MAX by default -- should
be approximately 9 MiB per GiB of swap space.

I'm not sure how you got that value.  By default, struct swblock is
288 bytes (280 bytes on 32-bit archs) and can store up to 32 pages of
swap (the comment in vm/swap_pager.c:swap_pager_swap_init() is wrong).
For x86, this is 2.25 MiB per GiB (best case).
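
The arithmetic behind that figure, as a Python sketch. The struct sizes and pages-per-swblock are the values given above; the 4KiB page size is an assumption for x86.

```python
SWBLOCK_BYTES = 288        # sizeof(struct swblock) on 64-bit, per the text
PAGES_PER_SWBLOCK = 32     # swap pages tracked by one swblock
PAGE_BYTES = 4096          # assumption: 4KiB pages

GIB = 1024 ** 3
MIB = 1024 ** 2


def swblock_bytes_per_gib(swblock_bytes=SWBLOCK_BYTES):
    """Bytes of swblock needed per GiB of swap if every slot is used."""
    pages_per_gib = GIB // PAGE_BYTES
    swblocks = pages_per_gib // PAGES_PER_SWBLOCK
    return swblocks * swblock_bytes


def overhead_mib_per_gib(utilisation=1.0):
    """MiB of swblock per GiB of swap at a given swblock utilisation."""
    return swblock_bytes_per_gib() / MIB / utilisation


print(swblock_bytes_per_gib() / MIB)      # 2.25 (64-bit, best case)
print(swblock_bytes_per_gib(280) / MIB)   # 2.1875 (32-bit, best case)
```

Calling overhead_mib_per_gib(0.9) gives roughly 2.5 MiB per GiB, since partially filled swblocks push the real cost above the best case.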

The current default for VM_SWZONE_SIZE_MAX was set in August 2002 to 32 MiB;
meaning that anyone who wants to use more than ~ 3.5 GB of swap space ought
to set kern.maxswzone in /boot/loader.conf.

In practice, you can't fully populate each swblock.  I did a test on
my amd64 box by running multiple copies of a program that allocates
and dirties a big chunk of RAM and then pause()s.  That gave me a 90%
swblock utilisation - which I suspect is higher than a typical
scenario where memory pressure pushes more randomly unused pages out.

Realistically, I'd say that the default VM_SWZONE_SIZE_MAX can handle
about 9GB swap (at least, that was my experience).

BTW, if you plan on allocating lots of swap, be aware that each swap
device is limited to 32GiB - see vm/swap_pager.c:swaponsomething().

-- 
Peter Jeremy


pgpwSk7xMhpGY.pgp
Description: PGP signature


Re: [HEADSUP CFT] pkg 1.0rc1 and schedule

2012-07-16 Thread Peter Jeremy
On 2012-Jul-16 07:18:05 +0100, Matthew Seaman matt...@freebsd.org wrote:
No.  Parallel installs will not work -- the first to start will lock the
DB, and the second won't be able to proceed.

Good - it was the locking I was mostly concerned about.  As long as
the install is locked, it's safe to run multiple port installs on
different terminals without them treading on each other.  (Next step,
outside pkgng, is to allow parallel builds).

Thank you for all the answers.

-- 
Peter Jeremy


pgp0v7MUuicxP.pgp
Description: PGP signature


Re: [HEADSUP CFT] pkg 1.0rc1 and schedule

2012-07-15 Thread Peter Jeremy
On 2012-Jul-12 10:01:10 +, Baptiste Daroussin b...@freebsd.org wrote:
What is pkg
---
pkg is a new package manager for FreeBSD. It is designed as a replacement for
the pkg_* tools, and as a full featured binary package manager.

A couple of specific questions that I haven't seen answered during
this thread or in the wiki:
- Can pkgng cope with parallel installs?  What happens if I
  simultaneously (attempt to) install conflicting packages?
- If I use pkg delete -f, what happens to packages that depended
  on the forcibly-deleted package?
- What happens if I delete a package where I've modified one of the
  files managed by the package?
- What facilities does it have for auditing and repairing the package
  database? (ie checking for inconsistencies between installed files
  and the content of the package database)
- How does it handle the situation where I install a package that
  depends on foo version 1.2.3 but have foo version 1.2.4 (or 1.2.2)
  installed?  What about if I have bar version 1.3, which is ABI-
  compatible with foo version 1.2.3, installed?
- Will it detect that a package install would overwrite an existing
  file?  What does it do in this case?
- I gather it handles update package more intelligently than
  uninstall old package, install new package.  Will it avoid
  replacing an old file with an identical one in the new package?
  If so, what happens to the file metadata (particularly uid, gid
  and mtime)?
- Can it track user-edited configuration files that are associated
  with packages?
- Can it do 2- or 3-way merges of package configuration files?
- The README states Directory leftovers are automatically removed if
  they are not in the MTREE.  How does this work for directories
  that are shared between multiple packages?  Does this mean that if
  I add a file to a directory that was created by a package, that
  file will be deleted automatically if I delete the package?

-- 
Peter Jeremy


pgpJM9KZGxJce.pgp
Description: PGP signature


Re: Use of C99 extra long double math functions after r236148

2012-07-13 Thread Peter Jeremy
On 2012-Jul-11 15:32:47 -0700, Steve Kargl s...@troutmask.apl.washington.edu 
wrote:
I know an approach to implementing many of the missing
functions.

Are you willing to share this insight so someone else could do the work?

  When I do find
some free time, I look at what is missing and start to
put together a new function.  At the moment, it seems
that it takes 3+ years to get a new function written,
tested, and committed.

And, from what I can see, much of this is done quietly - which opens
up the possibility that two people might both implement the same code
or that people will avoid the area in fear of treading on someone
else's toes.  As I said previously, I believe the existing wiki page
could be improved to form a central co-ordinating point to show what
activity is (or isn't) occurring.

but most people seem to push the easy button and want
to grab either cephes or netlib's libm.  There are
technical issues with this approach that I won't 
rehash again.

Doing it properly requires significant effort by people with fairly
specialised skills.  Whilst the project has several people with the
skills, it appears that none of them currently have the time.  In the
meantime, FreeBSD is taking free kicks from other FOSS groups that
have gone down the quick-and-dirty path.

AFAIK, none of the relevant standards (POSIX, IEEE754) have any
precision requirements for functions other than +-*/ and sqrt() - all
of which we have correctly implemented.  I therefore believe that, for
the remaining missing functions, the Project would be best served by
committing the best code that is currently available under a suitable
license and cleaning it up over time (as was done for the current
libm).

-- 
Peter Jeremy


pgpPVXxJTjV0R.pgp
Description: PGP signature


Re: Use of C99 extra long double math functions after r236148

2012-07-13 Thread Peter Jeremy
On 2012-Jul-13 11:58:05 -0400, David Schultz d...@freebsd.org wrote:
I propose we set a timeframe for this, on the order of a few months.
...
If the schedule can't be met, then we can just import Cephes as an
interim solution without further ado.  This provides Bruce and Steve
an opportunity to commit what they have been working on, without
forcing the rest of the FreeBSD community to wait indefinitely for
the pie in the sky.

This sounds good to me as well and I'd be happy to help.

-- 
Peter Jeremy


pgpmY7CNvs676.pgp
Description: PGP signature


Re: Adding support for WC (write-combining) memory to bus_dma

2012-07-12 Thread Peter Jeremy
On 2012-Jul-12 10:40:27 -0400, John Baldwin j...@freebsd.org wrote:
contigmalloc().  In fact, even better is to call kmem_alloc_contig() directly
rather than using contigmalloc().
...
Peter, this is somewhat orthogonal (but related) to your bus_dma patch which is
what prompted me to post this.

Overall, the change seems good to me.  My sole thought on the API was
whether the actual attribute should be passed, rather than having a
couple of new BUS_DMA_ flags but you've addressed that in a followup.

One change is that previously allocated memory was all charged to
M_DEVBUF via the malloc_type_allocated() call in contigmalloc()
whereas now only small allocations are counted.  This would seem to
indicate that large bus_dmamem_alloc() allocations won't be visible in
(eg) vmstat -m.

-- 
Peter Jeremy


pgpZoejmmJeAW.pgp
Description: PGP signature


Re: Use of C99 extra long double math functions after r236148

2012-07-10 Thread Peter Jeremy
On 2012-Jul-08 19:01:07 -0700, Steve Kargl s...@troutmask.apl.washington.edu 
wrote:
Well, on the most popular hardware (that being i386/amd64),
ld80 will use hardware fp instruction while ld128 must be
done completely in software.  The speed difference is
significant.

AFAIK, of the architectures that FreeBSD supports, only sparc64
defines ld128 in the architecture and I don't believe there are any
SPARC chip implementations that implement ld128 math in hardware.

For that matter, I don't believe anything except x86 provides full
IEEE FP support in hardware - most architectures require software
assistance for subnormals and some corner cases.  If your application
happens to hit those cases often, performance will also suffer.

On 2012-Jul-08 20:05:04 -0700, Steve Kargl s...@troutmask.apl.washington.edu 
wrote:
AFAIK, neither gcc in base nor clang would be c99 compliant
even if all of the c99 math functions were available.

That sort of argument can easily get circular.  Let's get the C99 bits
of libm out of the way and then we can have another bikeshed about the
shortcomings of the compiler(s).

On 2012-Jul-08 19:56:52 -0400, David Schultz d...@freebsd.org wrote:
Yes, Bruce has ld128 versions, and clusteradm very kindly got us a
sparc64 machine to test on.  That was about the time I ran out of time
to keep working on it.  If someone wants to pick it up, that would be
great.

I have access to a couple of SPARC systems as well and would be willing
to help work on the missing bits.

On 2012-Jul-10 18:58:01 -0400, David Schultz d...@freebsd.org wrote:
On Tue, Jul 10, 2012, Rainer Hurling wrote:
 powl:   src/extra/trio/triostr.c
 src/extra/trio/trio.c
 src/main/format.c

It's hard to do a good job on powl(), but the simple approach
(exp(log(x)*y)) plus a few special cases may suffice for many uses.

A simplistic exp(log(x)*y) throws away 15 bits of precision (size of
the FP exponent field).  cephes has a powl() that appears to do better
or, alternatively, it shouldn't be too difficult to extend the approach
used by __ieee754_pow() using long doubles.
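
A minimal sketch of the precision loss described above.  It is an
illustration only, not code from libm or cephes, and it uses Python
floats (IEEE doubles, 11-bit exponent field) rather than long doubles,
so the loss it can show is smaller than the 15-bit figure quoted for
long double.  The names naive_pow and rel_error are made up for this
example.  The rounding error in y*log(x) is an absolute error in the
argument of exp(), which turns into a relative error in the result that
grows with the magnitude of y*log(x).

```python
import math
from fractions import Fraction


def naive_pow(x, y):
    """pow() via the identity x**y == exp(y * log(x))."""
    return math.exp(y * math.log(x))


def rel_error(approx, exact):
    """Relative error of a float against an exact rational reference."""
    return abs(Fraction(approx) - exact) / exact


x, y = 3.0, 30
exact = Fraction(3) ** y        # 3**30 < 2**53, so a double can hold it exactly

naive_err = rel_error(naive_pow(x, y), exact)
libm_err = rel_error(math.pow(x, y), exact)

unit = Fraction(1, 2 ** 53)     # half an ulp relative to 1.0 for a double
print("naive exp(y*log(x)):", float(naive_err / unit), "units of 2**-53")
print("platform pow()     :", float(libm_err / unit), "units of 2**-53")
```

The printed figures depend on the platform's libm, so no particular
output is claimed here; the point is only that the first number is
allowed to be many units while a careful pow() should stay near zero
for an exactly representable result.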

 BTW: There seems to be a discrepancy about missing functions listed in
 http://wiki.freebsd.org/MissingMathStuff and in
 http://svnweb.freebsd.org/base/head/lib/msun/src/math.h?r1=227472&r2=236148&pathrev=236148.
 So the wiki is a bit outdated now?
My list:
[elided]

I was thinking that a wiki page would be a good spot to co-ordinate
the work (as well as making it clear what is still to be done).  The
existing page needs some TLC to be useful.

-- 
Peter Jeremy


pgpJMDQgZRF8K.pgp
Description: PGP signature


RAM fragmentation problems

2012-07-05 Thread Peter Jeremy
I am running into a problem with RAM fragmentation causing contigmalloc()
failures and wonder if anyone has a tool that would allow me to
identify the owner(s) of pages of RAM within a region on amd64.

-- 
Peter Jeremy


pgpJ5bQo0Tiwa.pgp
Description: PGP signature


Re: Add new syscons font to FreeBSD current release

2012-06-21 Thread Peter Jeremy
On 2012-Jun-20 17:38:36 +0430, Mohammad Shafiee muhammad.shaf...@gmail.com 
wrote:
I've made a Persian font for FreeBSD syscons.
You can download the font from here:
http://sourceforge.net/projects/bsdpersiancons/

How can I add this font to FreeBSD current release?

As a first step, I'd create a port for it.  See
http://www.freebsd.org/doc/en/books/porters-handbook/

-- 
Peter Jeremy


pgprd7bzEzHR2.pgp
Description: PGP signature


Re: OptionalObsoleteFiles.inc completeness

2012-06-01 Thread Peter Jeremy
On 2012-Jun-01 20:50:24 +0200, Ulrich Spörlein u...@freebsd.org wrote:
Why is xargs even calling /bin/echo when utility is not specified.

Because that's what it's documented as doing.

Shouldn't it just print a certain number of arguments (one in this
case)?

The current approach is simpler - there's always utility and it
defaults to /bin/echo.  Therefore xargs can just always fork/exec.
I agree that special-casing the default to have xargs print the
relevant number of arguments would be more efficient.

-- 
Peter Jeremy


pgpjWzNyZgd8T.pgp
Description: PGP signature


Re: Use of C99 extra long double math functions after r236148

2012-06-01 Thread Peter Jeremy
On 2012-Jun-01 10:29:13 -0400, John Baldwin j...@freebsd.org wrote:
On Friday, June 01, 2012 1:55:10 am Eitan Adler wrote:
 Also, are there BSD licensed naive implementations of these functions
 we can use? Would it be okay to have slow, but accurate versions of
 these functions as a stopgap?

Peter Jeremy more or less has a stopgap already ready judging by the comments 
in the thread thus far.

There's probably an hour's work by either stephen@ or myself to adapt
the work I did on cephes in Sage to a standalone FreeBSD port.
Unfortunately, both stephen@ & I are currently otherwise occupied and
other comments in this thread suggest that the inclusion of such a port
would be strongly opposed.

Note that cephes isn't slow but accurate - it's reasonably fast but
naive and therefore dodgy in edge cases.

-- 
Peter Jeremy


pgpHAsPC0mWbI.pgp
Description: PGP signature


Re: OptionalObsoleteFiles.inc completeness

2012-05-31 Thread Peter Jeremy
On 2012-May-30 13:27:03 +1000, Peter Jeremy pe...@rulingia.com wrote:
On 2012-May-29 02:18:25 +0400, Dmitry Marakasov amd...@amdmi3.ru wrote:
Then you should try to profile it - my script basically runs
delete-old delete-old-libs for every knob (131 of them), and it
hadn't taken more than 4 seconds even once.

I've done some investigating and the problem is that xargs -n1
fork()/exec()s /bin/echo on each file (and there are 5538 files for
me).  Changing this to tr ' ' '\n' reduces make delete-old runtime
to 1.75s - which is much nicer.  I've checked a variety of other
systems running 8.x & 9.x and the 97s seems to be anomalously long so
I'll do some more investigating.

I've tracked the problem down to excessive VM faults caused by
jemalloc.  Whilst executing /bin/echo, jemalloc mmap()s two 4MiB
chunks of memory.  Unless you build with MALLOC_PRODUCTION (which I
hadn't), it then proceeds to verify that both blocks are zero-filled.
This causes 2048 (unnecessary) page faults (out of a total of 2133).
When I rebuilt jemalloc with MALLOC_PRODUCTION, this dropped to 87
page faults (cf 76 on 8.x and 62 on 9.x) and the elapsed time for
make delete-old dropped to slightly more than 8.x & 9.x.
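
The 2048 figure follows directly from the chunk sizes quoted above.  As
a worked check, assuming the base page size is 4KiB (the usual amd64
value; it is not stated in this message):

```python
CHUNK_BYTES = 4 * 1024 * 1024   # each mmap()ed jemalloc chunk: 4MiB
CHUNKS = 2                      # two chunks per /bin/echo invocation
PAGE_BYTES = 4096               # assumption: 4KiB base pages

# Reading every page to verify it is zero-filled touches each page once.
zero_check_faults = CHUNKS * CHUNK_BYTES // PAGE_BYTES

TOTAL_FAULTS = 2133             # total reported for the non-production build
other_faults = TOTAL_FAULTS - zero_check_faults

print(zero_check_faults)        # faults attributable to the zero-fill check
print(other_faults)             # what is left for everything else
```

That leaves 85 faults for everything else, which is in line with the 87
measured after rebuilding with MALLOC_PRODUCTION.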

xargs -n1 is probably a worst case scenario for jemalloc but this
probably similarly affects other short-lived processes (and the shell
scripts that invoke them).  It's a pity that this particular test is a
compile-time option.

I still think that saving 5500 fork()/exec() pairs is a good reason
to switch from xargs -n1 to tr ' ' '\n'.

-- 
Peter Jeremy


pgp66hvYrS7pF.pgp
Description: PGP signature


Re: OptionalObsoleteFiles.inc completeness

2012-05-29 Thread Peter Jeremy
On 2012-May-29 02:18:25 +0400, Dmitry Marakasov amd...@amdmi3.ru wrote:
* Peter Jeremy (pe...@rulingia.com) wrote:
 My experience is that it now takes about 2½ minutes on 10.x with warm
 caches, compared to less than 1 second on 8.x.

Now = after applying my patch or after changing system? Which knobs
were enabled?

Now as in -current as against 8.x.  But, that 2½ mins was wrong,
sorry.  I recalled 150s but actually checking, it's really 1:50
(110s).  It occurred to me that was an oldish -current (r235127) so I
updated to r236183 and the time dropped to 107s.  Since this is an
oldish P4, I tried a UP kernel and that reduced it to 96s.  Your patch
made no noticeable change (ministat reported no difference with 95%
confidence).

The system is amd64 with no MK_* knobs defined.

Then you should try to profile it - my script basically runs
delete-old delete-old-libs for every knob (131 of them), and it
hadn't taken more than 4 seconds even once.

I've done some investigating and the problem is that xargs -n1
fork()/exec()s /bin/echo on each file (and there are 5538 files for
me).  Changing this to tr ' ' '\n' reduces make delete-old runtime
to 1.75s - which is much nicer.  I've checked a variety of other
systems running 8.x & 9.x and the 97s seems to be anomalously long so
I'll do some more investigating.
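
A rough sketch of why the two approaches differ in cost while producing
the same output.  It is an illustration, not the Makefile code: a child
Python interpreter stands in for /bin/echo, the file names are invented,
and the function names are made up for this example.  One path starts a
child process per word, as xargs -n1 does; the other only translates
characters, as tr does.

```python
import subprocess
import sys

# Hypothetical stand-in for /bin/echo: a child that prints its one argument.
ECHO = [sys.executable, "-c", "import sys; print(sys.argv[1])"]


def per_item_children(text):
    """Like `xargs -n1`: run one child process for every word."""
    out = []
    spawned = 0
    for word in text.split():
        done = subprocess.run(ECHO + [word], capture_output=True,
                              text=True, check=True)
        out.append(done.stdout)
        spawned += 1
    return "".join(out), spawned


def in_process(text):
    """Like `tr ' ' '\\n'`: swap characters, no per-word child process."""
    return text.replace(" ", "\n"), 0


# Invented example input: a space-separated list ending in a newline.
files = "usr/bin/foo usr/lib/libbar.so usr/share/man/man1/foo.1.gz\n"

slow_out, slow_children = per_item_children(files)
fast_out, fast_children = in_process(files)

print(slow_out == fast_out)             # same text either way
print(slow_children, fast_children)     # children started per approach
```

With 5538 files the first approach pays for 5538 process start-ups,
each with its own allocator initialisation, which is where the time
goes.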

-- 
Peter Jeremy


pgp23vtZvpadf.pgp
Description: PGP signature


Re: OptionalObsoleteFiles.inc completeness

2012-05-28 Thread Peter Jeremy
On 2012-May-27 18:05:41 +0400, Dmitry Marakasov amd...@amdmi3.ru wrote:
2) Is this ok to backport the list from current to stable branches? Pro
- it's really simple, con - it will contain files never installed with
this (old) branch.

Another con:  make delete-old on -current takes about 2 orders of
magnitude longer to run than on 8.x.  I would prefer to see some
effort put into speeding it up before it was backported.

-- 
Peter Jeremy


pgptJtyQZ4Lv8.pgp
Description: PGP signature


Re: OptionalObsoleteFiles.inc completeness

2012-05-28 Thread Peter Jeremy
On 2012-May-28 23:55:42 +0400, Dmitry Marakasov amd...@amdmi3.ru wrote:
* Peter Jeremy (pe...@rulingia.com) wrote:

 2) Is this ok to backport the list from current to stable branches? Pro
 - it's really simple, con - it will contain files never installed with
 this (old) branch.
 
 Another con:  make delete-old on -current takes about 2 orders of
 magnitude longer to run than on 8.x.  I would prefer to see some
 effort put into speeding it up before it was backported.

Is that really a reason while it is still under 4 seconds and is not
usually run more often than updates (which take minutes if not hours)?

My experience is that it now takes about 2½ minutes on 10.x with warm
caches, compared to less than 1 second on 8.x.  For most of that time,
there's no output and there's no warning of the increased time.  I
actually wrote about the poor performance here a couple of weeks ago.

-- 
Peter Jeremy


pgpj1hAqZ4ktC.pgp
Description: PGP signature


Re: Use of C99 extra long double math functions after r236148

2012-05-28 Thread Peter Jeremy
On 2012-May-28 11:01:24 -0500, Stephen Montgomery-Smith step...@missouri.edu 
wrote:
One thing that could be done is to have a math/cephes port that adds 
the extra C99 math functions.  This is already done in the math/sage 
port, using a rather clever patch due to Peter Jeremy, that applies to 
the cephes code.

What it would do is to create a /usr/local/lib/libm.so that would 
provide the extra functions not currently included in /lib/libm.so, and 
then link in /lib/libm.so as well.  It would also create its own 
/usr/local/include/math.h and /usr/local/include/complex.h as well.

Basically, as long as the compiler searches /usr/local/{include,lib}
before the base include/lib then math.h, complex.h and -lm give
the application a complete C99 math implementation by using base
functions where they exist and cephes functions where they don't.

The patch I wrote for sage can be found at
http://trac.sagemath.org/sage_trac/ticket/9543
If there's any interest, I could produce a port for this.

Another option would be to import cephes into base and use it to
provide the missing C99 functions.  Cephes includes copyright notices
but the closest I can find to a license is:
   Some software in this archive may be from the book _Methods and
 Programs for Mathematical Functions_ (Prentice-Hall or Simon & Schuster
 International, 1989) or from the Cephes Mathematical Library, a
 commercial product. In either event, it is copyrighted by the author.
 What you see here may be used freely but it comes with no support or
 guarantee.

-- 
Peter Jeremy


pgpYmCz2gMd3i.pgp
Description: PGP signature


Re: Use of C99 extra long double math functions after r236148

2012-05-28 Thread Peter Jeremy
On 2012-May-28 13:31:59 -0700, Steve Kargl s...@troutmask.apl.washington.edu 
wrote:
On Mon, May 28, 2012 at 11:01:24AM -0500, Stephen Montgomery-Smith wrote:
 One thing that could be done is to have a math/cephes port that adds 
 the extra C99 math functions.  This is already done in the math/sage 
 port, using a rather clever patch due to Peter Jeremy, that applies to 
 the cephes code.
...
This is a horrible, horrible, horrible idea.  Have you
looked at the cephes code, particularly the complex.h
functions?

The cephes code is somewhat a mess layout-wise.  Algorithmically,
it seems somewhat variable - some functions are implemented (hopefully
correctly) using semi-numerical techniques, whereas others just use
mathematical identities which will result in precision loss - though
most of the functions include accuracy information.

I agree it would be far preferable to have a properly validated C99
libm with all functions having maximum errors of no more than a few
LSB over their complete domain, as well as correct support for signed
zeroes, infinities and signalling and non-signalling NaNs but that is
a non-trivial undertaking.

In the interim, how should FreeBSD handle apps that want a C99 libm?
1) Fail to build them
2) Provide possibly imperfect fallbacks for the unimplemented bits.

If someone (I don't have the expertise) wants to identify the cephes
functions that are sub-standard, we can include link-time warnings
(as done for eg gets(3)) when they are used.

-- 
Peter Jeremy


pgpcG5SKNkFm9.pgp
Description: PGP signature


Re: Use of C99 extra long double math functions after r236148

2012-05-28 Thread Peter Jeremy
On 2012-May-28 15:54:06 -0700, Steve Kargl s...@troutmask.apl.washington.edu 
wrote:
Given that cephes was written years before C99 was even
conceived, I suspect all functions are sub-standard.

Well, most of cephes was written before C99.  The C99 parts of
cephes were written to turn it into a complete C99 implementation.

  For
example, AFAIK, none of the long double functions are
appropriate for any platform that has a 128-bit long double;
as cephes was written for an Intel 80-bit format.

FreeBSD currently supports:
64-bit long doubles on ARM, MIPS and PowerPC;
80-bit long doubles on amd64, i386 and iA64;
128-bit long doubles on SPARC.

The lack of LD128 in cephes therefore only affects one (not widely
used) platform.  The lack of even de facto standards for long
double mean that any applications wanting to use them already need
to cope with at least a 2:1 precision range.
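
To put numbers on that 2:1 range, here is a small worked calculation.
It assumes the three long double sizes listed above correspond to IEEE
binary64, the x87 extended format and IEEE binary128 respectively; the
table and helper name are made up for this example.

```python
import math

# Significand precision in bits for each assumed long double format.
FORMATS = {
    "64-bit (IEEE binary64)": 53,
    "80-bit (x87 extended)": 64,
    "128-bit (IEEE binary128)": 113,
}


def decimal_digits(precision_bits):
    """Decimal digits guaranteed to survive: floor((p - 1) * log10(2))."""
    return math.floor((precision_bits - 1) * math.log10(2))


for name, bits in FORMATS.items():
    print(name, bits, "bits,", decimal_digits(bits), "decimal digits")

# Ratio between the widest and narrowest significand.
ratio = FORMATS["128-bit (IEEE binary128)"] / FORMATS["64-bit (IEEE binary64)"]
print(round(ratio, 2))
```

The widest format carries a little over twice the significand bits of
the narrowest (113 against 53), so portable code cannot assume much
beyond double precision from long double.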

If portmgr or a port maintainer wants to use a library with
untested implementations of missing libm functions, please do
not put it into /usr/local/lib and call it libm.

There is some test code in cephes.  Can you point me to a suitable test
suite for LD80 and LD128?  The reason for calling it libm is to avoid
having to hack every consumer to add an additional library.

On 2012-May-28 16:30:35 -0700, Steve Kargl s...@troutmask.apl.washington.edu 
wrote:
Who's writing the code to test the implementations?  That is
better much the problem.  Without testing, one might get an
implementation that appears to work until it doesn't!

That is equally true of the rest of FreeBSD.  The list of open PRs
suggests that FreeBSD still has a fair way to go before reaching
perfection.  And, most of this thread has been about using this code
in ports - where the bar is much lower.  Who is writing the code to
test all the other ports?  What is so special about this particular
proposed port that it needs to come with solid-gold credentials?

  It took
me 3+ years to get sqrtl() into libm, but bde and das (and
myself) wanted to make sure the code worked.

Last time I checked (a couple of years ago), FreeBSD was missing 65
C99 libm functions.  At 3 years per function, we should have C99
support available early in the 23rd century - which may be a bit late.

On 2012-May-28 22:03:43 -0500, Stephen Montgomery-Smith step...@missouri.edu 
wrote:
1.  By being so picky about being so precise, FreeBSD is behind the time 
line in rolling out a usable set of C99 functions.

And at the current rate, we'll all be long dead before they are
available.  Whilst I'd far prefer to have a properly verified library
function, I think we are better off with an implementation that has
some caveats regarding edge-case behaviour than having nothing.

In the end, I do think it is good to ultimately settle on good C99 
compliant code.  But having something intermediate that mostly works is 
better than nothing.  Especially if it exists only in the ports, and not 
in the base code.

I agree with this sentiment.

What do people do on other free OSs?  Does a tested open source C99
libm exist anywhere?  glibc implements cpow(x,y) as cexp(y*clog(x))
and cephes does better than that.  Is FreeBSD wasting its time writing
correct C99 code because all the libm consumers expect no better
than what glibc offers?

I agree that writing correct libm functions is hard.  I think a lot of
the problem is that it's a mix of lots of boilerplate code testing for
special conditions and edge cases that is boring to write and fiddly
to get right, together with a kernel that is a pile of polynomial
evaluations full of magic numbers that needs specialist skills to
write.  If we could get someone with the relevant skills to formally
list all the special conditions & edge cases for each function, it
should be possible to generate both the library C code and test cases
from that - which would remove a lot of the tedium.

-- 
Peter Jeremy


pgpUnZGDcc79l.pgp
Description: PGP signature


Re: UFS+J panics on HEAD

2012-05-24 Thread Peter Jeremy
On 2012-May-24 12:04:21 +0400, Lev Serebryakov l...@freebsd.org wrote:
  I afraid, that after real hardware failure (like real HDD death,
not these pseudo-broken-hardware situations, when HDDs is perfectly
alive and in good condition), all data will be lost. I could restore
data from remains of FFS by hands (format is straightforward and
well-known), but ZFS is different story...

If your disk dies then you need a redundant copy of your data - either
via backups or via RAID.  Normally, you'd run ZFS with some level of
redundancy so that disk failures did not result in data loss.  That
said, ZFS is touchier about data - if it can't verify the checksums in
your data, it will refuse to give it to you - whereas UFS will hand
you back a pile of bytes that may or may not be the same as what you gave it
to store.  And you can't necessarily get _any_ data off a failed disk.

 Yes, backups is solution, but I don't have money to buy (reliable)
hardware to backup 4Tb of data :(

4TB disks are available but not really economical at present.  2TB
disks still seem to be the happy medium.  If your data will compress
down to 2TB then save it to a disk, otherwise split your backups
across a pair of disks.  A 2TB disk with enclosure is USD150.  If
you don't trust that, buy a second set.  (And if you value your data,
get a trusted friend to store one copy at their house in case anything
happens at your house).

 I attended Solaris internals 5-days training four years ago (when I
worked for Sun Microsystems), and instructor says same words...

I have had lots of problems at $work with Solaris UFS quietly
corrupting data following crashes.  At least with ZFS, you have a
better chance of knowing when your data has been corrupted.

-- 
Peter Jeremy


pgpk4t2qrNnV7.pgp
Description: PGP signature


make delete-old performance.

2012-05-16 Thread Peter Jeremy
I recently ran make delete-old on a -current box and felt it was
rather slow.  That prompted me to do some more careful experiments.

On one box where I have both 8-stable and 9-stable available, there
was a ~30x slowdown (based on 5 runs, ignoring the first).  I don't
have a -current world on that box so I can't directly compare but on
another pair of fairly similar boxes, I get a ~180x slowdown between
8-stable and -current (and that figure is probably optimistic since
the -current box was idle whereas the 8-stable box was fairly busy).

I realise that make delete-old isn't something you need to do every
day but going from sub-second to multi-minute duration is quite
noticeable.  Can anyone suggest what has caused the change?

-- 
Peter Jeremy


pgpedlZA6ISMi.pgp
Description: PGP signature

