Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

2020-05-03 Thread Mark Millard
[The bit argument to bitmap_unset seems to be way
too large.]
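
A quick arithmetic check on the values in the backtrace quoted below (bit=167842154, ptr=0x50088b50, binind=0) may help show how far off the index is. This is only a sketch: it assumes that bin index 0 holds 8-byte regions on this build and that the region index is computed as (ptr - slab base) / region size. Both are assumptions about jemalloc's layout here, not something the backtrace states.

```python
# Values copied from the gdb backtrace.
bit = 167842154          # bit / regind passed to bitmap_unset
ptr = 0x50088b50         # pointer being freed (frame #1)

# Assumption: bin index 0 uses 8-byte regions on this build.
REG_SIZE = 8

# If regind = (ptr - slab_base) // REG_SIZE, then the slab base implied
# by the observed index is:
implied_slab_base = ptr - bit * REG_SIZE

print(hex(bit * REG_SIZE))      # byte offset implied by the index
print(hex(implied_slab_base))   # slab base implied by the index
```

Under those assumptions the index is exactly ptr / 8, which is what would result if the slab's base address had been read as 0. That is only an inference from the numbers; more problem isolation would be needed to confirm it.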

On 2020-May-3, at 11:08, Mark Millard  wrote:

> [At around 4AM local time dhclient got a signal 11,
> despite the jemalloc revert. The other examples
> have not happened.]
> 
> On 2020-May-2, at 18:46, Mark Millard  wrote:
> 
>> [I'm only claiming the new jemalloc is involved and that
>> reverting avoids the problem.]
>> 
>> I've been reporting to some lists problems with:
>> 
>> dhclient
>> sendmail
>> rpcbind
>> mountd
>> nfsd
>> 
>> getting SIGSEGV (signal 11) crashes and some core
>> dumps on the old 2-socket (1 core per socket) 32-bit
>> PowerMac G4 running head -r360311.
>> 
>> Mikaël Urankar sent a note suggesting that I try
>> testing reverting head -r360233 for my head -r360311
>> context. He got it right . . .
>> 
>> 
>> Context:
>> 
>> The problem was noticed by an inability to have
>> other machines do a:
>> 
>> mount -onoatime,soft OLDPOWERMAC-LOCAL-IP:/... /mnt
>> 
>> sort of operation and have it succeed. By contrast, on
>> the old PowerMac G4 I could initiate mounts against
>> other machines just fine.
>> 
>> I do not see any such problems on any of (all based
>> on head -r360311):
>> 
>> powerpc64 (old PowerMac G5 2-sockets with 2 cores each)
>> armv7 (OrangePi+ 2ed)
>> aarch64 (Rock64, RPi4, RPi3,
>> OverDrive 1000,
>> Macchiatobin Double Shot)
>> amd64 (ThreadRipper 1950X)
>> 
>> So I expect something 32-bit powerpc specific
>> is somehow involved, even if jemalloc is only
>> using whatever it is.
>> 
>> (A kyua run with a debug kernel did not find other
>> unexpected signal 11 sources on the 32-bit PowerMac
>> compared to past kyua runs, at least that I noticed.
>> There were a few lock order reversals that I do not
>> know if they are expected or known-safe or not.
>> I've reported those reversals to the lists as well.)
>> 
>> 
>> Recent experiments based on the suggestion:
>> 
>> Doing the buildworld, buildkernel and installing just
>> the new kernel and rebooting made no difference.
>> 
>> But then installing the new world and rebooting did
>> make things work again: I no longer get core files
>> for the likes of (old cores from before the update):
>> 
>> # find / -name "*.core" -print
>> /var/spool/clientmqueue/sendmail.core
>> /rpcbind.core
>> /mountd.core
>> /nfsd.core
>> 
>> Nor do I see the various notices for sendmail
>> signal 11's that did not leave behind a core file
>> --or for dhclient (no core file left behind).
>> And I can mount the old PowerMac's drive from
>> other machines just fine.
>> 
>> 
>> Other notes:
>> 
>> I do not actively use sendmail but it was left
>> to do its default things, partially to test if
>> such default things are working. Unfortunately,
>> PowerMacs have a problematical status under
>> FreeBSD and my context has my historical
>> experiments with avoiding various problems.
> 
> Looking, I see that I got a:
> 
> pid 572 (dhclient), jid 0, uid 0: exited on signal 11 (core dumped)
> 
> notice under the reverted build. No instances
> of the other examples. This is the first that a
> dhclient example has produced a .core file.
> 
> gdb indicates 0x5180936c for r7 in:
> 
> lwz r8,36(r7)
> 
> as leading to the failure. This was in
> arena_dalloc_bin_locked_impl (where
> arena_slab_reg_dalloc and bitmap_unset
> were apparently inlined).
> 
> The chain for the example seems to be:
> fork_privchld -> dispatch_imsg -> jemalloc
> 
> For reference . . .
> 
> # gdb dhclient /dhclient.core 
> GNU gdb (GDB) 9.1 [GDB v9.1 for FreeBSD]
> Copyright (C) 2020 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later 
> . . .
> Reading symbols from dhclient...
> Reading symbols from /usr/lib/debug//sbin/dhclient.debug...
> [New LWP 100089]
> Core was generated by `dhclient: gem0 [priv]'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  bitmap_unset (bitmap=0x50407164, binfo=, bit=167842154) at 
> /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h:341
> 341   /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h: 
> No such file or directory.
> (gdb) bt -full
> #0  bitmap_unset (bitmap=0x50407164, binfo=, bit=167842154) at 
> /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h:341
>goff = 
>gp = 0x51809390
>propagate = 
>g = 
>i = 
> #1  arena_slab_reg_dalloc (slab=0x50407140, slab_data=0x50407164, 
> ptr=0x50088b50) at jemalloc_arena.c:273
>bin_info = 
>binind = 0
>regind = 167842154
> #2  arena_dalloc_bin_locked_impl (tsdn=0x5009f018, arena=, 
> slab=, ptr=, junked=) at 
> jemalloc_arena.c:1540
>slab_data = 
>binind = 
>bin_info = 
>bin = 
>nfree = 
> #3  0x502916a8 in __je_arena_dalloc_bin_junked_locked (tsdn=, 
> arena=, extent=, ptr=) at 
> jemalloc_arena.c:1559
> No locals.
> #4  0x50250d2c in __je_tcache_bin_flush_small (tsd=0x5009f018, 
> tcache=, 

Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

2020-05-03 Thread Mark Millard
[At around 4AM local time dhclient got a signal 11,
despite the jemalloc revert. The other examples
have not happened.]

On 2020-May-2, at 18:46, Mark Millard  wrote:

> [I'm only claiming the new jemalloc is involved and that
> reverting avoids the problem.]
> 
> I've been reporting to some lists problems with:
> 
> dhclient
> sendmail
> rpcbind
> mountd
> nfsd
> 
> getting SIGSEGV (signal 11) crashes and some core
> dumps on the old 2-socket (1 core per socket) 32-bit
> PowerMac G4 running head -r360311.
> 
> Mikaël Urankar sent a note suggesting that I try
> testing reverting head -r360233 for my head -r360311
> context. He got it right . . .
> 
> 
> Context:
> 
> The problem was noticed by an inability to have
> other machines do a:
> 
> mount -onoatime,soft OLDPOWERMAC-LOCAL-IP:/... /mnt
> 
> sort of operation and have it succeed. By contrast, on
> the old PowerMac G4 I could initiate mounts against
> other machines just fine.
> 
> I do not see any such problems on any of (all based
> on head -r360311):
> 
> powerpc64 (old PowerMac G5 2-sockets with 2 cores each)
> armv7 (OrangePi+ 2ed)
> aarch64 (Rock64, RPi4, RPi3,
> OverDrive 1000,
> Macchiatobin Double Shot)
> amd64 (ThreadRipper 1950X)
> 
> So I expect something 32-bit powerpc specific
> is somehow involved, even if jemalloc is only
> using whatever it is.
> 
> (A kyua run with a debug kernel did not find other
> unexpected signal 11 sources on the 32-bit PowerMac
> compared to past kyua runs, at least that I noticed.
> There were a few lock order reversals that I do not
> know if they are expected or known-safe or not.
> I've reported those reversals to the lists as well.)
> 
> 
> Recent experiments based on the suggestion:
> 
> Doing the buildworld, buildkernel and installing just
> the new kernel and rebooting made no difference.
> 
> But then installing the new world and rebooting did
> make things work again: I no longer get core files
> for the likes of (old cores from before the update):
> 
> # find / -name "*.core" -print
> /var/spool/clientmqueue/sendmail.core
> /rpcbind.core
> /mountd.core
> /nfsd.core
> 
> Nor do I see the various notices for sendmail
> signal 11's that did not leave behind a core file
> --or for dhclient (no core file left behind).
> And I can mount the old PowerMac's drive from
> other machines just fine.
> 
> 
> Other notes:
> 
> I do not actively use sendmail but it was left
> to do its default things, partially to test if
> such default things are working. Unfortunately,
> PowerMacs have a problematical status under
> FreeBSD and my context has my historical
> experiments with avoiding various problems.

Looking, I see that I got a:

pid 572 (dhclient), jid 0, uid 0: exited on signal 11 (core dumped)

notice under the reverted build. No instances
of the other examples. This is the first that a
dhclient example has produced a .core file.

gdb indicates 0x5180936c for r7 in:

lwz r8,36(r7)

as leading to the failure. This was in
arena_dalloc_bin_locked_impl (where
arena_slab_reg_dalloc and bitmap_unset
were apparently inlined).
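
The register value and the gp local in the backtrace below appear to be consistent with each other, which the following sketch checks. It assumes 32-bit bitmap groups (4-byte words holding 32 bits each) on this 32-bit target; that is an assumption about the build, not something shown in the gdb output.

```python
bitmap = 0x50407164   # bitmap argument from frame #0
bit = 167842154       # bit argument from frame #0
r7 = 0x5180936c       # register used by: lwz r8,36(r7)
gp = 0x51809390       # gp local reported by gdb

GROUP_BITS = 32       # assumed bits per bitmap group
GROUP_BYTES = 4       # assumed bytes per bitmap group

goff = bit // GROUP_BITS                  # index of the group holding the bit
group_addr = bitmap + goff * GROUP_BYTES  # address of that group

print(hex(group_addr))   # address bitmap_unset would dereference
print(hex(r7 + 36))      # effective address of the faulting load
```

Both values come out equal to gp, so the faulting load looks like the group lookup itself running far past the bitmap because of the oversized bit index.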

The chain for the example seems to be:
fork_privchld -> dispatch_imsg -> jemalloc

For reference . . .

# gdb dhclient /dhclient.core 
GNU gdb (GDB) 9.1 [GDB v9.1 for FreeBSD]
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
. . .
Reading symbols from dhclient...
Reading symbols from /usr/lib/debug//sbin/dhclient.debug...
[New LWP 100089]
Core was generated by `dhclient: gem0 [priv]'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  bitmap_unset (bitmap=0x50407164, binfo=, bit=167842154) at 
/usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h:341
341 /usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h: 
No such file or directory.
(gdb) bt -full
#0  bitmap_unset (bitmap=0x50407164, binfo=, bit=167842154) at 
/usr/powerpc32_src/contrib/jemalloc/include/jemalloc/internal/bitmap.h:341
goff = 
gp = 0x51809390
propagate = 
g = 
i = 
#1  arena_slab_reg_dalloc (slab=0x50407140, slab_data=0x50407164, 
ptr=0x50088b50) at jemalloc_arena.c:273
bin_info = 
binind = 0
regind = 167842154
#2  arena_dalloc_bin_locked_impl (tsdn=0x5009f018, arena=, 
slab=, ptr=, junked=) at 
jemalloc_arena.c:1540
slab_data = 
binind = 
bin_info = 
bin = 
nfree = 
#3  0x502916a8 in __je_arena_dalloc_bin_junked_locked (tsdn=, 
arena=, extent=, ptr=) at 
jemalloc_arena.c:1559
No locals.
#4  0x50250d2c in __je_tcache_bin_flush_small (tsd=0x5009f018, 
tcache=, tbin=0x5009f1c0, binind=, rem=24) at 
jemalloc_tcache.c:149
ptr = 
i = 0
extent = 0x50407140
bin_arena = 0x50400380
bin = 
ndeferred = 0
merged_stats = 
arena = 0x50400380
nflush = 75
__vla_expr0 = 
item_extent = 0xd1f0
#5  

Re: sysutils/screen-ncurses port

2020-05-03 Thread Cy Schubert
In message <20200430130449.cwsf3x42o6w67...@ivaldir.net>, Baptiste Daroussin writes:
> On Thu, Apr 30, 2020 at 05:56:54AM -0700, Cy Schubert wrote:
> > In message <20200430075337.3wdzglshhorcd...@ivaldir.net>, Baptiste Daroussin writes:
> > > On Wed, Apr 29, 2020 at 11:41:46AM -0700, Cy Schubert wrote:
> > > > Would people be open to the idea of a sysutils/screen-ncurses port that
> > > > depends on devel/ncurses instead of ncurses in base? The reason for this
> > > > is there are screen.* terminfo entries in devel/ncurses that don't exist in
> > > > termcap(5). People who want that extra functionality would be advised to
> > > > install the alternative pkg or build the sysutils/screen port with the
> > > > appropriate option.
> > > >
> > > > Or, simply change the default from whatever ncurses is available to always
> > > > install devel/ncurses. People could always select one of the other options.
> > > > Personally, I'm not enamoured with this approach.
> > >
> > > I think it is a terrible idea, and we should fix the initial problem instead of
> > > working around it.
> > >
> > > 1/ Why are those not in our termcap(5)? They should be added if they are
> > > missing, and MFCed asap (prior to 11.4 and 12.2 would be nice).
> >
> > I came to this conclusion last night after sending this email thread out
> > and will test it some time today.
> >
> > > 2/ We should allow our base ncurses to get information from a newer termcap(5) if
> > > needed.
> > > So far the default TERMCAP is
> > > ${HOME}/.termcap{,.db}:/etc/termcap{,.db}:/usr/share/misc/termcap{,.db}
> > >
> > > First, the user can be advised to configure ${HOME}/.termcap; this is the
> > > quick fix for now.
>
> that is in your scope via a pkg-message :D
>
> > > Second, as a later future-proof mechanism, we could modify our termcap reader (we
> > > use our own, not the one provided by ncurses) to be able to fetch termcap
> > > capabilities from /usr/local/share/misc/termcap/*.conf, for example.
> > >
> > > This way ports with random termcap info to add would be able to do it without
> > > the requirement to wait for a commit in base and an MFC.
> >
> > This is probably outside of my scope at the moment but, yes, agreed.
> >
> I will then.
> I added that to my TODO

There's already a utility in devel/ncurses called infotocap (and its 
corresponding captoinfo) that does this. Both are links to tic. Our 
ncurses import includes tic. It looks like all that's needed is to add it to 
buildworld.

I can look at it later tonight. Seems like a quick win.


-- 
Cheers,
Cy Schubert 
FreeBSD UNIX: Web:  https://FreeBSD.org
NTP:   Web:  https://nwtime.org

The need of the many outweighs the greed of the few.


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

2020-05-03 Thread Mark Millard


On 2020-May-3, at 01:26, nonameless at ukr.net wrote:




> --- Original message ---
> From: "Mark Millard" 
> Date: 3 May 2020, 04:47:14
> 
> 
> 
>> [I'm only claiming the new jemalloc is involved and that
>> reverting avoids the problem.]
>> 
>> I've been reporting to some lists problems with:
>> 
>> dhclient
>> sendmail
>> rpcbind
>> mountd
>> nfsd
>> 
>> getting SIGSEGV (signal 11) crashes and some core
>> dumps on the old 2-socket (1 core per socket) 32-bit
>> PowerMac G4 running head -r360311.
>> 
>> Mikaël Urankar sent a note suggesting that I try
>> testing reverting head -r360233 for my head -r360311
>> context. He got it right . . .
>> 
>> 
>> Context:
>> 
>> The problem was noticed by an inability to have
>> other machines do a:
>> 
>> mount -onoatime,soft OLDPOWERMAC-LOCAL-IP:/... /mnt
>> 
>> sort of operation and have it succeed. By contrast, on
>> the old PowerMac G4 I could initiate mounts against
>> other machines just fine.
>> 
>> I do not see any such problems on any of (all based
>> on head -r360311):
>> 
>> powerpc64 (old PowerMac G5 2-sockets with 2 cores each)
>> armv7 (OrangePi+ 2ed)
>> aarch64 (Rock64, RPi4, RPi3,
>> OverDrive 1000,
>> Macchiatobin Double Shot)
>> amd64 (ThreadRipper 1950X)
>> 
>> So I expect something 32-bit powerpc specific
>> is somehow involved, even if jemalloc is only
>> using whatever it is.
>> 
>> (A kyua run with a debug kernel did not find other
>> unexpected signal 11 sources on the 32-bit PowerMac
>> compared to past kyua runs, at least that I noticed.
>> There were a few lock order reversals that I do not
>> know if they are expected or known-safe or not.
>> I've reported those reversals to the lists as well.)
>> 
>> 
>> Recent experiments based on the suggestion:
>> 
>> Doing the buildworld, buildkernel and installing just
>> the new kernel and rebooting made no difference.
>> 
>> But then installing the new world and rebooting did
>> make things work again: I no longer get core files
>> for the likes of (old cores from before the update):
>> 
>> # find / -name "*.core" -print
>> /var/spool/clientmqueue/sendmail.core
>> /rpcbind.core
>> /mountd.core
>> /nfsd.core
>> 
>> Nor do I see the various notices for sendmail
>> signal 11's that did not leave behind a core file
>> --or for dhclient (no core file left behind).
>> And I can mount the old PowerMac's drive from
>> other machines just fine.
>> 
>> 
>> Other notes:
>> 
>> I do not actively use sendmail but it was left
>> to do its default things, partially to test if
>> such default things are working. Unfortunately,
>> PowerMacs have a problematical status under
>> FreeBSD and my context has my historical
>> experiments with avoiding various problems.
>> 
>> ===
>> Mark Millard
>> marklmi at yahoo.com
>> ( dsl-only.net went
>> away in early 2018-Mar)
>> 
> 
> Hi Mark,
> 
> It should be fixed, but not by reverting to the old version. We can't stay stuck on
> an old version because of ancient hardware. I think upstream is not interested
> in supporting such hardware. So it has to be patched locally.

Observing and reporting the reverting result is an initial
part of problem isolation. I made no request for FreeBSD
to give up on using the updated jemalloc. (Unfortunately,
I'm not sure what a good next step of problem isolation
might be for the dual-socket PowerMac G4 context.)

Other than reverting, no patch is known for the issue at
this point. More problem isolation is needed first.

While I do not have access, https://wiki.freebsd.org/powerpc
lists more modern 32-bit powerpc hardware as supported:
MPC85XX evaluation boards and AmigaOne A1222 (powerpcspe).
(The AmigaOne A1222 seems to be dual-core/single-socket.)

So folks with access to one of those may want to see
if they also see the problem(s) with head -r360233 or
later.

Another interesting context to test could be single-socket
with just one core. (I might be able to do that on another
old PowerMac, booting the same media after moving the
media.)

If I understand right, the most common 32-bit powerpc
tier 2 hardware platforms may still be old PowerMacs.
They are considered supported and "mature", instead of
just "stable". See https://wiki.freebsd.org/powerpc .
However, the reality is that there are various problems
for old PowerMacs (32-bit and 64-bit, at least when
there is more than one socket present). The wiki page
does not hint at such. (I'm not sure about
single socket/multi-core PowerMacs: no access to
such.)

It is certainly possible for some problem to happen
that would lead to dropping the supported-status
for some or all old 32-bit PowerMacs, even as tier 2.
But that has not happened yet and I'd have no say in
such a choice.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



Re: lock order reversal and poudriere

2020-05-03 Thread Mark Linimon
On Sun, May 03, 2020 at 10:04:04AM -0600, Ian Lepore wrote:
> That LOR site hasn't been updated in years.  Many many years.

If someone wants to help me set up a page on the wiki, let me know.
(I have too much on my plate to do it myself.)

mcl


Re: beadm no longer able to destroy, maybe since using OpenZFS

2020-05-03 Thread Graham Perrin

On 03/05/2020 06:05, Graham Perrin wrote:
After reverting from OpenZFS to ZFS, I was able to destroy two of the 
boot environments that previously (below, 1st May, OpenZFS) could not 
be destroyed.


I'm left with at least one BE 'r360237c' that can not be destroyed. Is 
it ever normal to find a snapshot described as '-'?


root@momh167-gjp4-8570p:~ # kldstat | grep zfs
 2    1 0x82109000   3a8b40 zfs.ko
root@momh167-gjp4-8570p:~ # date
Sun May  3 05:59:06 BST 2020


…


root@momh167-gjp4-8570p:~ # beadm destroy r360237c
Are you sure you want to destroy 'r360237c'?
This action cannot be undone (y/[n]): y
Boot environment 'r360237c' was created from existing snapshot
Destroy '-' snapshot? (y/[n]): y
cannot destroy 'copperbowl/ROOT/r360237c': filesystem has dependent clones

use '-R' to destroy the following datasets:
copperbowl/ROOT/r360237e@2020-05-01-17:58:33
copperbowl/ROOT/r360237e
copperbowl/ROOT/r357746h
copperbowl/ROOT/Waterfox


…

I temporarily activated (but did not boot from) an older boot environment, 
r357746h, then created and activated r360237f.


Then the previously indestructible r360237c was destroyed successfully:
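
The sequence above looks like what ZFS clone promotion would produce. As a rough illustration only, here is a toy Python model of the dependency rule. It assumes that activating a boot environment promotes its dataset, which is not confirmed anywhere in this thread, and the class and function names are made up for the sketch; they are not part of beadm or zfs.

```python
class Dataset:
    """Toy stand-in for a ZFS dataset; not a real API."""
    def __init__(self, name, origin=None):
        self.name = name
        self.origin = origin     # (parent Dataset, snapshot name) or None
        self.snapshots = []      # snapshot names, oldest first

def has_dependent_clones(target, datasets):
    # A dataset cannot be destroyed while another dataset's origin
    # is one of its snapshots.
    return any(d.origin and d.origin[0] is target for d in datasets)

def promote(clone, datasets):
    # Move the origin snapshot (and any older ones) from parent to clone,
    # then reverse the parent/clone relationship.
    parent, snap = clone.origin
    cut = parent.snapshots.index(snap) + 1
    moved = parent.snapshots[:cut]
    parent.snapshots = parent.snapshots[cut:]
    clone.snapshots = moved + clone.snapshots
    # Other clones of the moved snapshots now depend on the promoted dataset.
    for d in datasets:
        if (d is not clone and d.origin
                and d.origin[0] is parent and d.origin[1] in moved):
            d.origin = (clone, d.origin[1])
    clone.origin = parent.origin
    parent.origin = (clone, snap)

# Hypothetical two-dataset pool: new_be was cloned from old_be@snap1.
old_be = Dataset("old_be")
old_be.snapshots.append("snap1")
new_be = Dataset("new_be", origin=(old_be, "snap1"))
pool = [old_be, new_be]

print(has_dependent_clones(old_be, pool))   # blocked before promotion
promote(new_be, pool)
print(has_dependent_clones(old_be, pool))   # no longer blocked afterwards
```

In this model the snapshot moves to the promoted dataset, so the older dataset stops being anyone's origin and can be destroyed, much as the snapshots in the listing below all end up under r360237f.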



root@momh167-gjp4-8570p:~ # beadm list -as
BE/Dataset/Snapshot                            Active Mountpoint  Space Created

Waterfox
  copperbowl/ROOT/Waterfox                     -      -          314.0M 2020-03-10 18:24
    r360237f@2020-03-20-06:19:45               -      -            1.0G 2020-03-20 06:19

r357746h
  copperbowl/ROOT/r357746h                     -      -            3.4M 2020-04-08 09:28
    r360237f@2020-04-09-17:59:32               -      -            1.1G 2020-04-09 17:59

r360237c
  copperbowl/ROOT/r360237c                     -      -           17.0M 2020-04-29 13:24
    r360237f@2020-04-29-13:24:52               -      -          229.0M 2020-04-29 13:24

r360237e
  copperbowl/ROOT/r360237e                     N      /          680.0K 2020-05-01 17:58
    r360237f@2020-05-03-18:30:25               -      -          440.0K 2020-05-03 18:30

r360237f
  copperbowl/ROOT/r360237f@2020-03-20-06:19:45 -      -            1.0G 2020-03-20 06:19
  copperbowl/ROOT/r360237f@2020-04-09-17:59:32 -      -            1.1G 2020-04-09 17:59
  copperbowl/ROOT/r360237f@2020-04-29-13:24:52 -      -          229.0M 2020-04-29 13:24
  copperbowl/ROOT/r360237f@2020-05-01-17:58:33 -      -          162.0M 2020-05-01 17:58
  copperbowl/ROOT/r360237f                     R      -           81.7G 2020-05-03 18:30
  copperbowl/ROOT/r360237f@2020-05-03-18:30:25 -      -          440.0K 2020-05-03 18:30

root@momh167-gjp4-8570p:~ # date ; uname -v ; uptime
Sun May  3 18:32:31 BST 2020
FreeBSD 13.0-CURRENT #54 r360237: Fri Apr 24 09:10:37 BST 2020 root@momh167-gjp4-8570p:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG
 6:32PM  up 12:54, 5 users, load averages: 0.48, 0.44, 0.41
root@momh167-gjp4-8570p:~ # kldstat | grep zfs
 2    1 0x82109000   3a8b40 zfs.ko
root@momh167-gjp4-8570p:~ # beadm destroy r360237c
Are you sure you want to destroy 'r360237c'?
This action cannot be undone (y/[n]): y
Destroyed successfully
root@momh167-gjp4-8570p:~ # zfs destroy copperbowl/ROOT/r360237f@2020-03-20-06:19:45
cannot destroy 'copperbowl/ROOT/r360237f@2020-03-20-06:19:45': snapshot has dependent clones
use '-R' to destroy the following datasets:
copperbowl/ROOT/Waterfox
root@momh167-gjp4-8570p:~ # zfs destroy copperbowl/ROOT/r360237f@2020-04-09-17:59:32
cannot destroy 'copperbowl/ROOT/r360237f@2020-04-09-17:59:32': snapshot has dependent clones
use '-R' to destroy the following datasets:
copperbowl/ROOT/r357746h
root@momh167-gjp4-8570p:~ # zfs destroy copperbowl/ROOT/r360237f@2020-05-01-17:58:33

root@momh167-gjp4-8570p:~ # beadm list -as
BE/Dataset/Snapshot                            Active Mountpoint  Space Created

Waterfox
  copperbowl/ROOT/Waterfox                     -      -          314.0M 2020-03-10 18:24
    r360237f@2020-03-20-06:19:45               -      -            1.0G 2020-03-20 06:19

r357746h
  copperbowl/ROOT/r357746h                     -      -            3.4M 2020-04-08 09:28
    r360237f@2020-04-09-17:59:32               -      -            1.1G 2020-04-09 17:59

r360237e
  copperbowl/ROOT/r360237e                     N      /          756.0K 2020-05-01 17:58
    r360237f@2020-05-03-18:30:25               -      -          440.0K 2020-05-03 18:30

r360237f
  copperbowl/ROOT/r360237f@2020-03-20-06:19:45 -      -            1.0G 2020-03-20 06:19
  copperbowl/ROOT/r360237f@2020-04-09-17:59:32 -      -            1.1G 2020-04-09 17:59
  copperbowl/ROOT/r360237f                     R      -           80.9G 2020-05-03 18:30
  copperbowl/ROOT/r360237f@2020-05-03-18:30:25 -      -          440.0K 2020-05-03 18:30

root@momh167-gjp4-8570p:~ # bectl list -as
BE/Dataset/Snapshot          Active Mountpoint Space Created

Waterfox
  copperbowl/ROOT/Waterfox   -      -          314M  2020-03-10 18:24

r357746h
  copperbowl/ROOT/r357746h   -      -          3.44M 2020-04-08 09:28

r360237e
  copperbowl/ROOT/r360237e   N      /          764K  2020-05-01 17:58


r360237f
  copperbowl/ROOT/r360237f  

Re: lock order reversal and poudriere

2020-05-03 Thread Ian Lepore
On Sat, 2020-05-02 at 20:36 +0200, Kurt Jaeger wrote:
> Hi!
> 
> > > > I am compiling some packages with poudriere on 13-current kernel. I
> > > > noticed some strange messages printed into the terminal and dmesg:
> > > > 
> > > > lock order reversal:
> > > 
> > > [...]
> > > > Are those the debug messages that aren't visible on non-current kernel
> > > > and should they be reported?
> > > 
> > > Yes, they should be checked and reported.
> > > 
> > > For more details see:
> > > 
> > > http://sources.zabbadoz.net/freebsd/lor.html
> > > 
> > > There's a webpage with a list of all known LORs and a way to
> > > report new LORs.
> > Thanks Kurt. I can't find those two specific LORs in the list on that
> > page. The page also says to report them using a link, which leads to 404
> > :-), or on this mailing list, which I did. I am not sure what else should
> > I do.
> 
> I don't know, either 8-} bz@ is in Cc:, so he'll probably know what
> to do.
> 

That LOR site hasn't been updated in years.  Many many years.

The sad truth appears to be that nobody cares about LORs anymore.  The
same ones have been there for years.  Nobody fixes them, nobody does
anything to suppress reporting them.  We just keep pointing new users
to a dead website because that has always been the only response
available.

> > How do I know if I have got a backtrace?
> > 
> > Are those errors:
> > 
> > pid 43297 (conftest), jid 5, uid 0: exited on signal 11
> > 
> > related or it's a different issue?
> 
> I think that's a different issue.
> 

Segfaults and other problems with a program named "conftest" while
building ports is normal.  Autotools' configure script writes and runs
programs named conftest to detect the presence or absence of features
or bugs.  That doesn't mean every failure of a program named conftest
is normal and expected, but in general it's not a thing to worry about.
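
As a rough illustration of that pattern (not Autotools' actual code), a probe can be run as a child process and a crash simply recorded as "feature absent". The sketch below uses Python's subprocess module and assumes a POSIX system; the run_probe name is hypothetical.

```python
import signal
import subprocess
import sys

def run_probe(source):
    """Run a small test program; report whether it exited cleanly."""
    result = subprocess.run([sys.executable, "-c", source])
    if result.returncode < 0:
        # On POSIX a negative return code means the child died from a signal.
        return False, signal.Signals(-result.returncode).name
    return result.returncode == 0, None

# A probe that deliberately kills itself with SIGSEGV, standing in for a
# conftest program that exercises a missing or buggy feature.
crashing = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
ok, sig = run_probe(crashing)
print(ok, sig)
```

The parent carries on after the child dies; the kernel's "exited on signal 11" notice is just a side effect of the probe failing.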

-- Ian



Re[2]: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

2020-05-03 Thread nonameless


 
 --- Original message ---
 From: "Mark Millard" 
 Date: 3 May 2020, 17:38:14
  


> 
> 
> On 2020-May-3, at 01:26, nonameless at ukr.net wrote:
> 
> 
> 
> 
> > --- Original message ---
> > From: "Mark Millard" 
> > Date: 3 May 2020, 04:47:14
> > 
> > 
> > 
> >> [I'm only claiming the new jemalloc is involved and that
> >> reverting avoids the problem.]
> >> 
> >> I've been reporting to some lists problems with:
> >> 
> >> dhclient
> >> sendmail
> >> rpcbind
> >> mountd
> >> nfsd
> >> 
> >> getting SIGSEGV (signal 11) crashes and some core
> >> dumps on the old 2-socket (1 core per socket) 32-bit
> >> PowerMac G4 running head -r360311.
> >> 
> >> Mikaël Urankar sent a note suggesting that I try
> >> testing reverting head -r360233 for my head -r360311
> >> context. He got it right . . .
> >> 
> >> 
> >> Context:
> >> 
> >> The problem was noticed by an inability to have
> >> other machines do a:
> >> 
> >> mount -onoatime,soft OLDPOWERMAC-LOCAL-IP:/... /mnt
> >> 
> >> sort of operation and have it succeed. By contrast, on
> >> the old PowerMac G4 I could initiate mounts against
> >> other machines just fine.
> >> 
> >> I do not see any such problems on any of (all based
> >> on head -r360311):
> >> 
> >> powerpc64 (old PowerMac G5 2-sockets with 2 cores each)
> >> armv7 (OrangePi+ 2ed)
> >> aarch64 (Rock64, RPi4, RPi3,
> >> OverDrive 1000,
> >> Macchiatobin Double Shot)
> >> amd64 (ThreadRipper 1950X)
> >> 
> >> So I expect something 32-bit powerpc specific
> >> is somehow involved, even if jemalloc is only
> >> using whatever it is.
> >> 
> >> (A kyua run with a debug kernel did not find other
> >> unexpected signal 11 sources on the 32-bit PowerMac
> >> compared to past kyua runs, at least that I noticed.
> >> There were a few lock order reversals that I do not
> >> know if they are expected or known-safe or not.
> >> I've reported those reversals to the lists as well.)
> >> 
> >> 
> >> Recent experiments based on the suggestion:
> >> 
> >> Doing the buildworld, buildkernel and installing just
> >> the new kernel and rebooting made no difference.
> >> 
> >> But then installing the new world and rebooting did
> >> make things work again: I no longer get core files
> >> for the likes of (old cores from before the update):
> >> 
> >> # find / -name "*.core" -print
> >> /var/spool/clientmqueue/sendmail.core
> >> /rpcbind.core
> >> /mountd.core
> >> /nfsd.core
> >> 
> >> Nor do I see the various notices for sendmail
> >> signal 11's that did not leave behind a core file
> >> --or for dhclient (no core file left behind).
> >> And I can mount the old PowerMac's drive from
> >> other machines just fine.
> >> 
> >> 
> >> Other notes:
> >> 
> >> I do not actively use sendmail but it was left
> >> to do its default things, partially to test if
> >> such default things are working. Unfortunately,
> >> PowerMacs have a problematical status under
> >> FreeBSD and my context has my historical
> >> experiments with avoiding various problems.
> >> 
> >> ===
> >> Mark Millard
> >> marklmi at yahoo.com
> >> ( dsl-only.net went
> >> away in early 2018-Mar)
> >> 
> > 
> > Hi Mark,
> > 
> > It should be fixed, but not by reverting to the old version. We can't stay stuck on
> > an old version because of ancient hardware. I think upstream is not interested
> > in supporting such hardware. So it has to be patched locally.
> 
> Observing and reporting the reverting result is an initial
> part of problem isolation. I made no request for FreeBSD
> to give up on using the updated jemalloc. (Unfortunately,
> I'm not sure what a good next step of problem isolation
> might be for the dual-socket PowerMac G4 context.)
> 
> Other than reverting, no patch is known for the issue at
> this point. More problem isolation is needed first.
> 
> While I do not have access, https://wiki.freebsd.org/powerpc
> lists more modern 32-bit powerpc hardware as supported:
> MPC85XX evaluation boards and AmigaOne A1222 (powerpcspe).
> (The AmigaOne A1222 seems to be dual-core/single-socket.)
> 
> So folks with access to one of those may want to see
> if they also see the problem(s) with head -r360233 or
> later.
> 
> Another interesting context to test could be single-socket
> with just one core. (I might be able to do that on another
> old PowerMac, booting the same media after moving the
> media.)
> 
> If I understand right, the most common 32-bit powerpc
> tier 2 hardware platforms may still be old PowerMacs.
> They are considered supported and "mature", instead of
> just "stable". See https://wiki.freebsd.org/powerpc .
> However, the reality is that there are various problems
> for old PowerMacs (32-bit and 64-bit, at least when
> there is more than one socket present). The wiki page
> does not hint at 

Re: panic: Assertion lock == sq->sq_lock failed at /usr/src-13/sys/kern/subr_sleepqueue.c:371

2020-05-03 Thread Grzegorz Junka



On 03/05/2020 15:13, Gary Jennejohn wrote:

On Sun, 3 May 2020 14:11:09 +0100
Grzegorz Junka  wrote:


I don't have a partition that I could use for swap. I have two whole
disks added to ZFS. Maybe on the boot drive but that would require
repartitioning and I have Windows/FreeBSD there, so not so straightforward.


As the dumpon man page states, by the time a crash dump is needed the
file systems are dead.  No way to dump to a ZFS file system.  That's
why a raw partition is required.

The other option would be netdump.  See the dumpon man page.



I will consider a separate partition next time I partition my disk. For 
now I will have to ignore panics and dumps. I tried netdump and it 
didn't work - it couldn't ARP the netdumpd server.


--GrzegorzJ



Re: lock order reversal and poudriere

2020-05-03 Thread Grzegorz Junka


On 03/05/2020 15:00, Niclas Zeising wrote:

On 2020-05-02 20:36, Kurt Jaeger wrote:


I don't know, either 8-} bz@ is in Cc:, so he'll probably know what
to do.


How do I know if I have got a backtrace?

Are those errors:

pid 43297 (conftest), jid 5, uid 0: exited on signal 11

related or it's a different issue?


I think that's a different issue.



conftest is when configure scripts do things.  Configure works a lot 
by compiling (and sometimes running) small snippets of code to figure 
out what's going on.  Sometimes those snippets core dump. It's all 
normal.




Good to know. It's mostly conftest but sometimes others too:

pid 37407 (cc), jid 9, uid 0: exited on signal 6
pid 95358 (conftest), jid 3, uid 0: exited on signal 11
pid 70242 (conftest), jid 9, uid 0: exited on signal 11
pid 27480 (ngc27183), jid 3, uid 0: exited on signal 11

Regards

--GrzegorzJ



Re: panic: Assertion lock == sq->sq_lock failed at /usr/src-13/sys/kern/subr_sleepqueue.c:371

2020-05-03 Thread Gary Jennejohn
On Sun, 3 May 2020 14:11:09 +0100
Grzegorz Junka  wrote:

> On 03/05/2020 08:05, Gary Jennejohn wrote:
> > On Sat, 02 May 2020 16:28:46 -0700
> > Chris  wrote:
> >  
> >>  
> >>>  
> > Another thing is that I don't quite understand why the crash couldn't
> > be dumped.
> >
> > root@crayon2:~ # swapinfo
> > Device                1K-blocks     Used    Avail Capacity
> > /dev/zvol/tank3/swap   33554432        0 33554432     0%
> >
> > There is no entry in /etc/fstab though, should it be there too?  
>  How about your rc.conf(5) ?
> 
>  You need to define a dumpdev within it as:
> 
>  # Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
>  dumpdev="YES"
> 
>  Which defaults to the location of:
> 
>  /var/crash
>  
> >>> Yes, of course I have 'dumpdev="AUTO"'. Should it be "YES" instead?  
> >> Yes, it should of course be AUTO. I was distracted at the time of writing.
> >> Sorry.
> >> Does /var/crash exist?
> >>
> >> That _should_ be enough. Assuming /var/crash is writable.
> >>  
> > Sorry, but read the man page for rc.conf.
> >
> > This is the entry for dumpdev:
> >
> >   dumpdev (str) Indicates the device (usually a swap partition) to
> >   which a crash dump should be written in the event of a system
> >   crash.  If the value of this variable is "AUTO", the first
> >   suitable swap device listed in /etc/fstab will be used as
> >   dump device.  Otherwise, the value of this variable is passed
> >   as the argument to dumpon(8).  To disable crash dumps, set
> >   this variable to "NO".
> >
> > If there are no swap devices in /etc/fstab then "AUTO" will not work.  But
> > a partition can be specified.  I have dumpdev="/dev/ada0p5" in my rc.conf.
> >
> > /var/crash is the target for crash dumps after the system is re-booted.
> >  
> 
> /var/crash existed but might not have had the right permissions. I think 
> it was 755 whereas the handbook recommends 700. Shouldn't matter though.
> 

/var/crash is irrelevant when a crash dump is being written out.

> I don't have anything about swap in fstab since I am using Root on ZFS. 
> swapinfo correctly recognizes the swap partition and uses it. This is the
> typical usage while I am compiling ports:
> 
> last pid: 85116;  load averages:  8.95,  8.50, 8.34 up 0+18:06:31  13:02:32
> 72 processes:  14 running, 57 sleeping, 1 zombie
> CPU:  0.0% user, 90.5% nice,  9.5% system,  0.0% interrupt,  0.0% idle
> Mem: 993M Active, 594M Inact, 6400K Laundry, 12G Wired, 2225M Free
> ARC: 6160M Total, 3093M MFU, 2657M MRU, 214M Anon, 100M Header, 193M Other
>   5300M Compressed, 5861M Uncompressed, 1.11:1 Ratio
> Swap: 32G Total, 61M Used, 32G Free
> 
> The crash happened in similar conditions so there should be nothing 
> preventing dumping the crash to the zfs swap, unless dumpon isn't smart 
> enough to use zfs swap.
> 
> I don't have a partition that I could use for swap. I have two whole 
> disks added to ZFS. Maybe on the boot drive but that would require 
> repartitioning and I have Windows/FreeBSD there, so not so straightforward.
> 

As the dumpon man page states, by the time a crash dump is needed the
file systems are dead.  No way to dump to a ZFS file system.  That's
why a raw partition is required.

The other option would be netdump.  See the dumpon man page.

-- 
Gary Jennejohn


Re: lock order reversal and poudriere

2020-05-03 Thread Niclas Zeising

On 2020-05-02 20:36, Kurt Jaeger wrote:

Hi!


I am compiling some packages with poudriere on 13-current kernel. I
noticed some strange messages printed into the terminal and dmesg:

lock order reversal:

[...]

Are those the debug messages that aren't visible on non-current kernel
and should they be reported?

Yes, they should be checked and reported.

For more details see:

http://sources.zabbadoz.net/freebsd/lor.html

There's a webpage with a list of all known LORs and a way to
report new LORs.



Thanks Kurt. I can't find those two specific LORs in the list on that
page. The page also says to report them using a link, which leads to 404
:-), or on this mailing list, which I did. I am not sure what else I
should do.


I don't know, either 8-} bz@ is in Cc:, so he'll probably know what
to do.


How do I know if I have got a backtrace?

Are those errors:

pid 43297 (conftest), jid 5, uid 0: exited on signal 11

related or it's a different issue?


I think that's a different issue.



conftest is when configure scripts do things.  Configure works a lot by 
compiling (and sometimes running) small snippets of code to figure out 
what's going on.  Sometimes those snippets core dump.  It's all normal.
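
To illustrate why such crashes are harmless, here is a small sh sketch of a
configure-style probe. It is an illustration under assumptions, not code
taken from autoconf: the helper name probe_runs is made up, and a shell
command that sends itself SIGSEGV stands in for a compiled conftest binary.
The point is only that a crashing probe is recorded as a "no" answer and the
script carries on; the kernel would still log an "exited on signal 11" line
for the crashed process.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command and report "yes" if it exits
# with status 0, "no" otherwise (including when a signal kills it).
probe_runs() {
    if "$@" >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}

# Stand-in for a conftest binary that runs cleanly.
probe_runs true                        # prints: yes

# Stand-in for a crashing conftest: the child shell kills itself with
# SIGSEGV, so the probe fails and the answer is simply "no".
probe_runs sh -c 'kill -s SEGV $$'     # prints: no
```

A real configure script does the same thing with a freshly compiled test
program, which is why the "conftest ... exited on signal 11" messages can
show up during an otherwise successful build.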

Regards
--
Niclas


Re: lock order reversal and poudriere

2020-05-03 Thread Grzegorz Junka


On 02/05/2020 10:08, Grzegorz Junka wrote:
I am compiling some packages with poudriere on 13-current kernel. I 
noticed some strange messages printed into the terminal and dmesg:


lock order reversal:
 1st 0xf8010ca78250 zfs (zfs) @ /usr/src-13/sys/kern/vfs_mount.c:1005
 2nd 0xf8010cd37250 devfs (devfs) @ /usr/src-13/sys/kern/vfs_mount.c:1016

stack backtrace:
#0 0x80c2d5f1 at witness_debugger+0x71
#1 0x80b92f18 at lockmgr_lock_flags+0x188
#2 0x80cae744 at _vn_lock+0x54
#3 0x80c90756 at vfs_domount+0xd16
#4 0x80c8efd1 at vfs_donmount+0x871
#5 0x80c8e729 at sys_nmount+0x69
#6 0x81060c40 at amd64_syscall+0x140
#7 0x810370a0 at fast_syscall_common+0x101
pid 17216 (conftest), jid 6, uid 0: exited on signal 11
pid 51159 (conftest), jid 6, uid 0: exited on signal 11
pid 23833 (conftest), jid 3, uid 0: exited on signal 11
pid 4916 (conftest), jid 3, uid 0: exited on signal 11

(... then there is a bunch of similar ones, then ...)

pid 14504 (conftest), jid 3, uid 0: exited on signal 11
pid 27466 (conftest), jid 6, uid 0: exited on signal 11
pid 43297 (conftest), jid 5, uid 0: exited on signal 11
lock order reversal:
 1st 0xfe00bc68c030 filedesc structure (filedesc structure) @ /usr/src-13/sys/kern/sys_generic.c:1557
 2nd 0xf803baeddbd8 tmpfs (tmpfs) @ /usr/src-13/sys/kern/vfs_vnops.c:1553

stack backtrace:
#0 0x80c2d5f1 at witness_debugger+0x71
#1 0x80b946b5 at lockmgr_xlock+0x55
#2 0x80cae744 at _vn_lock+0x54
#3 0x80cad0da at vn_poll+0x3a
#4 0x80c33e19 at kern_poll+0x419
#5 0x80c340df at sys_ppoll+0x6f
#6 0x81060c40 at amd64_syscall+0x140
#7 0x810370a0 at fast_syscall_common+0x101
pid 37533 (conftest), jid 5, uid 0: exited on signal 11
pid 43474 (conftest), jid 5, uid 0: exited on signal 11




I restarted the compilation and am again seeing similar LORs:

lock order reversal:
 1st 0xf80115d32068 zfs (zfs) @ /usr/src-13/sys/kern/vfs_mount.c:1005
 2nd 0xf800243d6808 devfs (devfs) @ /usr/src-13/sys/kern/vfs_mount.c:1016

stack backtrace:
#0 0x80c2d5f1 at witness_debugger+0x71
#1 0x80b92f18 at lockmgr_lock_flags+0x188
#2 0x80cae744 at _vn_lock+0x54
#3 0x80c90756 at vfs_domount+0xd16
#4 0x80c8efd1 at vfs_donmount+0x871
#5 0x80c8e729 at sys_nmount+0x69
#6 0x81060c40 at amd64_syscall+0x140
#7 0x810370a0 at fast_syscall_common+0x101

lock order reversal:
 1st 0xfe00a7aa49b0 filedesc structure (filedesc structure) @ /usr/src-13/sys/kern/sys_generic.c:1557
 2nd 0xf800aa2cdbd8 zfs (zfs) @ /usr/src-13/sys/kern/vfs_vnops.c:1553

stack backtrace:
#0 0x80c2d5f1 at witness_debugger+0x71
#1 0x80b946b5 at lockmgr_xlock+0x55
#2 0x80cae744 at _vn_lock+0x54
#3 0x80cad0da at vn_poll+0x3a
#4 0x80c33e19 at kern_poll+0x419
#5 0x80c339f0 at sys_poll+0x50
#6 0x81060c40 at amd64_syscall+0x140
#7 0x810370a0 at fast_syscall_common+0x101


The page to report still returns 404 :)

--

GrzegorzJ



Re: panic: Assertion lock == sq->sq_lock failed at /usr/src-13/sys/kern/subr_sleepqueue.c:371

2020-05-03 Thread Grzegorz Junka


On 03/05/2020 08:05, Gary Jennejohn wrote:

On Sat, 02 May 2020 16:28:46 -0700
Chris  wrote:






Another thing is that I don't quite understand why the crash couldn't
be dumped.

root@crayon2:~ # swapinfo
Device                1K-blocks     Used    Avail Capacity
/dev/zvol/tank3/swap   33554432        0 33554432     0%

There is no entry in /etc/fstab though, should it be there too?

How about your rc.conf(5) ?

You need to define a dumpdev within it as:

# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="YES"

Which defaults to the location of:

/var/crash
  

Yes, of course I have 'dumpdev="AUTO"'. Should it be "YES" instead?

Yes, it should of course be AUTO. I was distracted at the time of writing.
Sorry.
Does /var/crash exist?

That _should_ be enough. Assuming /var/crash is writable.


Sorry, but read the man page for rc.conf.

This is the entry for dumpdev:

  dumpdev (str) Indicates the device (usually a swap partition) to
  which a crash dump should be written in the event of a system
  crash.  If the value of this variable is "AUTO", the first
  suitable swap device listed in /etc/fstab will be used as
  dump device.  Otherwise, the value of this variable is passed
  as the argument to dumpon(8).  To disable crash dumps, set
  this variable to "NO".

If there are no swap devices in /etc/fstab then "AUTO" will not work.  But
a partition can be specified.  I have dumpdev="/dev/ada0p5" in my rc.conf.

/var/crash is the target for crash dumps after the system is re-booted.



/var/crash existed but might not have had the right permissions. I think 
it was 755 whereas the handbook recommends 700. Shouldn't matter though.


I don't have anything about swap in fstab since I am using Root on ZFS. 
swapinfo correctly recognizes the swap partition and uses it. This is the
typical usage while I am compiling ports:


last pid: 85116;  load averages:  8.95,  8.50, 8.34 up 0+18:06:31  13:02:32
72 processes:  14 running, 57 sleeping, 1 zombie
CPU:  0.0% user, 90.5% nice,  9.5% system,  0.0% interrupt,  0.0% idle
Mem: 993M Active, 594M Inact, 6400K Laundry, 12G Wired, 2225M Free
ARC: 6160M Total, 3093M MFU, 2657M MRU, 214M Anon, 100M Header, 193M Other
 5300M Compressed, 5861M Uncompressed, 1.11:1 Ratio
Swap: 32G Total, 61M Used, 32G Free

The crash happened in similar conditions so there should be nothing 
preventing dumping the crash to the zfs swap, unless dumpon isn't smart 
enough to use zfs swap.


I don't have a partition that I could use for swap. I have two whole 
disks added to ZFS. Maybe on the boot drive but that would require 
repartitioning and I have Windows/FreeBSD there, so not so straightforward.


--GrzegorzJ





Re: Trying to compiles today Current

2020-05-03 Thread Marco Steinbach
On Sun, 29 Mar 2020 15:36:22 +0200
Willem Jan Withagen  wrote:

> I keep getting errors in building my ports from portsbuilder.
> But it is with Current/Clang10.
> 
> So I'm trying to get a server at that level, but building world
> keeps giving me:
> --- all_subdir_cddl ---
> ld:
> error: /usr/obj/usr/srcs/head/amd64.amd64/tmp/usr/lib/libuutil.so:
> undefined reference to __assfail cc: error: linker command failed
> with exit code 1 (use -v to see invocation) *** [zfs.full] Error code
> 1
> 
> 
> Completely cleared src and obj, but the error persists.
> 
> Current on this system is:
> FreeBSD 13.0-CURRENT #0 r358358: Thu Feb 27 04:40:39 UTC 2020 
> r...@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
> 
> How to fix this?

In my case a leftover CFLAGS=-O0 in /etc/make.conf was the culprit
('=' instead of '+=' not being a typo). As another side effect,
buildworld took about four times as long as usual before it errored out.

While at it, I also ran into what's described in
https://lists.freebsd.org/pipermail/freebsd-current/2018-December/072531.html
before removing DEBUG_FLAGS from /etc/make.conf.


Conditionalizing the flags to only be pulled in for the directories I
actually wanted them to be relevant for would certainly have helped.

MfG CoCo


Re[2]: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

2020-05-03 Thread nonameless


 
 --- Original message ---
 From: "Mark Millard" 
 Date: 3 May 2020, 04:47:14
  


> [I'm only claiming the new jemalloc is involved and that
> reverting avoids the problem.]
> 
> I've been reporting to some lists problems with:
> 
> dhclient
> sendmail
> rpcbind
> mountd
> nfsd
> 
> getting SIGSEGV (signal 11) crashes and some core
> dumps on the old 2-socket (1 core per socket) 32-bit
> PowerMac G4 running head -r360311.
> 
> Mikaël Urankar sent a note suggesting that I try
> testing reverting head -r360233 for my head -r360311
> context. He got it right . . .
> 
> 
> Context:
> 
> The problem was noticed by an inability to have
> other machines do a:
> 
> mount -onoatime,soft OLDPOWERMAC-LOCAL-IP:/... /mnt
> 
> sort of operation and to have it succeed. By contrast, on
> the old PowerMac G4 I could initiate mounts against
> other machines just fine.
> 
> I do not see any such problems on any of (all based
> on head -r360311):
> 
> powerpc64 (old PowerMac G5 2-sockets with 2 cores each)
> armv7 (OrangePi+ 2ed)
> aarch64 (Rock64, RPi4, RPi3,
> OverDrive 1000,
> Macchiatobin Double Shot)
> amd64 (ThreadRipper 1950X)
> 
> So I expect something 32-bit powerpc specific
> is somehow involved, even if jemalloc is only
> using whatever it is.
> 
> (A kyua run with a debug kernel did not find other
> unexpected signal 11 sources on the 32-bit PowerMac
> compared to past kyua runs, at least that I noticed.
> There were a few lock order reversals that I do not
> know if they are expected or known-safe or not.
> I've reported those reversals to the lists as well.)
> 
> 
> Recent experiments based on the suggestion:
> 
> Doing the buildworld, buildkernel and installing just
> the new kernel and rebooting made no difference.
> 
> But then installing the new world and rebooting did
> make things work again: I no longer get core files
> for the likes of (old cores from before the update):
> 
> # find / -name "*.core" -print
> /var/spool/clientmqueue/sendmail.core
> /rpcbind.core
> /mountd.core
> /nfsd.core
> 
> Nor do I see the various notices for sendmail
> signal 11's that did not leave behind a core file
> --or for dhclient (no core file left behind).
> And I can mount the old PowerMac's drive from
> other machines just fine.
> 
> 
> Other notes:
> 
> I do not actively use sendmail but it was left
> to do its default things, partially to test if
> such default things are working. Unfortunately,
> PowerMacs have a problematical status under
> FreeBSD and my context has my historical
> experiments with avoiding various problems.
> 
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
> 

Hi Mark,

It should be fixed, but not by reverting to the old version. We can't stay
stuck on an old version because of ancient hardware. I think upstream is not
interested in supporting such hardware, so it will have to be patched locally.

Thanks.


Re: panic: Assertion lock == sq->sq_lock failed at /usr/src-13/sys/kern/subr_sleepqueue.c:371

2020-05-03 Thread Gary Jennejohn
On Sat, 02 May 2020 16:28:46 -0700
Chris  wrote:

> On Sun, 3 May 2020 00:15:48 +0100 Grzegorz Junka li...@gjunka.com said
> 
> > On 02/05/2020 20:43, Chris wrote:  
> > > On Sat, 2 May 2020 20:19:56 +0100 Grzegorz Junka li...@gjunka.com said
> > >  
> > >> On 02/05/2020 14:56, Grzegorz Junka wrote:  
> > >> >
> > >> > On 02/05/2020 14:15, Grzegorz Junka wrote:  
> > >> >> cpuid = 3
> > >> >>
> > >> >> time = 1588422616
> > >> >>
> > >> >> KDB: stack backtrace:
> > >> >>
> > >> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00b27e86b0
> > >> >>
> > >> >> vpanic() at vpanic+0x182/frame 0xfe00b27e8700
> > >> >>
> > >> >> panic() at panic+0x43/frame ...
> > >> >>
> > >> >> sleepq_add()
> > >> >>
> > >> >> ...
> > >> >>
> > >> >> I see
> > >> >>  
> > >> >> db>  
> > >> >>
> > >> >> in the terminal. I tried "dump" but it says, Cannot dump: no dump 
> > >> >> device specified.
> > >> >>
> > >> >> Is there a guide on how to deal with those, i.e. to gather information
> > >> >> required to investigate issues?
> > >> >  
> > >>
> > >> Another thing is that I don't quite understand why the crash couldn't 
> > >> be dumped.
> > >>
> > >> root@crayon2:~ # swapinfo
> > >> Device                1K-blocks     Used    Avail Capacity
> > >> /dev/zvol/tank3/swap   33554432        0 33554432     0%
> > >>
> > >> There is no entry in /etc/fstab though, should it be there too?  
> > >
> > > How about your rc.conf(5) ?
> > >
> > > You need to define a dumpdev within it as:
> > >
> > > # Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
> > > dumpdev="YES"
> > >
> > > Which defaults to the location of:
> > >
> > > /var/crash
> > >  
> > 
> > Yes, of course I have 'dumpdev="AUTO"'. Should it be "YES" instead?  
> Yes, it should of course be AUTO. I was distracted at the time of writing.
> Sorry.
> Does /var/crash exist?
> 
> That _should_ be enough. Assuming /var/crash is writable.
> 

Sorry, but read the man page for rc.conf.

This is the entry for dumpdev:

 dumpdev (str) Indicates the device (usually a swap partition) to
 which a crash dump should be written in the event of a system
 crash.  If the value of this variable is "AUTO", the first
 suitable swap device listed in /etc/fstab will be used as
 dump device.  Otherwise, the value of this variable is passed
 as the argument to dumpon(8).  To disable crash dumps, set
 this variable to "NO".

If there are no swap devices in /etc/fstab then "AUTO" will not work.  But
a partition can be specified.  I have dumpdev="/dev/ada0p5" in my rc.conf.

/var/crash is the target for crash dumps after the system is re-booted.
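
The "AUTO" lookup described in the man page entry can be sketched in sh.
This is only an illustration under assumptions: the function name
first_fstab_swap is made up for this example, and the real rc.d/dumpon
script does more than this (it also checks the device and actually calls
dumpon). The sketch merely reports the first fstab entry whose type field
is "swap".

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): print the first device in an
# fstab-style file whose third field (FStype) is "swap".
# Usage: first_fstab_swap [path-to-fstab]   (defaults to /etc/fstab)
first_fstab_swap() {
    fstab="${1:-/etc/fstab}"
    [ -r "$fstab" ] || return 1
    while read -r dev mp type rest; do
        case "$dev" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        if [ "$type" = "swap" ]; then
            printf '%s\n' "$dev"
            return 0
        fi
    done < "$fstab"
    return 1   # no swap entry found, so "AUTO" has nothing to pick
}
```

With no swap line in /etc/fstab, as in the Root on ZFS setup discussed in
this thread, the function prints nothing and returns non-zero, which
corresponds to the case where "AUTO" does not find a dump device.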

-- 
Gary Jennejohn