Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread Chavdar Ivanov
Clean build resolved the problem.

Thanks,

Chavdar

On Thu, 18 May 2023 at 15:10, Chavdar Ivanov  wrote:
>
> On Thu, 18 May 2023 at 15:06, Brad Spencer  wrote:
> >
> > Chavdar Ivanov  writes:
> >
> > > On Thu, 18 May 2023 at 11:31, Robert Swindells  wrote:
> > >>
> > >>
> > >> Chavdar Ivanov  wrote:
> > >> > The weird and suspicious thing is that /usr/bin/ftp is linked to both
> > >> > existing libcrypto.so versions:
> > >> >
> > >> > ldd /usr/bin/ftp
> > >> > /usr/bin/ftp:
> > > >> >   -ledit.3 => /usr/lib/libedit.so.3
> > > >> >   -lterminfo.2 => /usr/lib/libterminfo.so.2
> > > >> >   -lc.12 => /usr/lib/libc.so.12
> > > >> >   -lssl.15 => /usr/lib/libssl.so.15
> > > >> >   -lcrypto.14 => /usr/lib/libcrypto.so.14
> > > >> >   -lcrypt.1 => /lib/libcrypt.so.1
> > > >> >   -lcrypto.15 => /usr/lib/libcrypto.so.15
> > >>
> > >> I'm guessing you did an update build not a clean one. Also, did you use
> > >> the -j flag to build in parallel?
> > >
> > > I indeed usually do update builds; in this case however, I had deleted
> > > the entire obj (it is on a zfs, removing takes a lot of time, so
> > > subsequently I replaced it with a dedicated zfs so that I can just
> > > destroy and recreate it...). I also 'make cleandir' in [x]src in
> > > advance; that usually is enough.
> >
> > 
> > Snapshot the zfs fileset when it is empty and rollback when you want to
> > make it empty again.  Very quick, and it works best when obj is its own
> > fileset.
>
> It didn't come to my mind... I was doing destroy, create, set
> mountpoint and chown, so the above saves me three commands...
>
> > 
> >
> > >>
> > >> Do a clean build and everything should work.
> > >
> > > That's next.
> >
> >
> >
> > --
> > Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org

Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread Chavdar Ivanov
On Thu, 18 May 2023 at 15:06, Brad Spencer  wrote:
>
> Chavdar Ivanov  writes:
>
> > On Thu, 18 May 2023 at 11:31, Robert Swindells  wrote:
> >>
> >>
> >> Chavdar Ivanov  wrote:
> >> > The weird and suspicious thing is that /usr/bin/ftp is linked to both
> >> > existing libcrypto.so versions:
> >> >
> >> > ldd /usr/bin/ftp
> >> > /usr/bin/ftp:
> > >> >   -ledit.3 => /usr/lib/libedit.so.3
> > >> >   -lterminfo.2 => /usr/lib/libterminfo.so.2
> > >> >   -lc.12 => /usr/lib/libc.so.12
> > >> >   -lssl.15 => /usr/lib/libssl.so.15
> > >> >   -lcrypto.14 => /usr/lib/libcrypto.so.14
> > >> >   -lcrypt.1 => /lib/libcrypt.so.1
> > >> >   -lcrypto.15 => /usr/lib/libcrypto.so.15
> >>
> >> I'm guessing you did an update build not a clean one. Also, did you use
> >> the -j flag to build in parallel?
> >
> > I indeed usually do update builds; in this case however, I had deleted
> > the entire obj (it is on a zfs, removing takes a lot of time, so
> > subsequently I replaced it with a dedicated zfs so that I can just
> > destroy and recreate it...). I also 'make cleandir' in [x]src in
> > advance; that usually is enough.
>
> 
> Snapshot the zfs fileset when it is empty and rollback when you want to
> make it empty again.  Very quick, and it works best when obj is its own
> fileset.

It didn't come to my mind... I was doing destroy, create, set
mountpoint and chown, so the above saves me three commands...

> 
>
> >>
> >> Do a clean build and everything should work.
> >
> > That's next.
>
>
>
> --
> Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org




Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread Brad Spencer
Chavdar Ivanov  writes:

> On Thu, 18 May 2023 at 11:31, Robert Swindells  wrote:
>>
>>
>> Chavdar Ivanov  wrote:
>> > The weird and suspicious thing is that /usr/bin/ftp is linked to both
>> > existing libcrypto.so versions:
>> >
>> > ldd /usr/bin/ftp
>> > /usr/bin/ftp:
>> >   -ledit.3 => /usr/lib/libedit.so.3
>> >   -lterminfo.2 => /usr/lib/libterminfo.so.2
>> >   -lc.12 => /usr/lib/libc.so.12
>> >   -lssl.15 => /usr/lib/libssl.so.15
>> >   -lcrypto.14 => /usr/lib/libcrypto.so.14
>> >   -lcrypt.1 => /lib/libcrypt.so.1
>> >   -lcrypto.15 => /usr/lib/libcrypto.so.15
>>
>> I'm guessing you did an update build not a clean one. Also, did you use
>> the -j flag to build in parallel?
>
> I indeed usually do update builds; in this case however, I had deleted
> the entire obj (it is on a zfs, removing takes a lot of time, so
> subsequently I replaced it with a dedicated zfs so that I can just
> destroy and recreate it...). I also 'make cleandir' in [x]src in
> advance; that usually is enough.


Snapshot the zfs fileset when it is empty and rollback when you want to
make it empty again.  Very quick, and it works best when obj is its own
fileset.
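
A minimal sketch of that snapshot/rollback routine. The dataset name
`tank/obj' and the snapshot name `empty' are placeholders, not names
from this thread, and the DRY_RUN switch is only there so the commands
can be previewed before anything is touched:

```sh
#!/bin/sh
# Hypothetical dataset and snapshot names; adjust to the local pool layout.
OBJ_DS=${OBJ_DS:-tank/obj}
SNAP=${SNAP:-empty}

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One-time setup: take the snapshot while the dataset is still empty.
obj_snapshot() {
    run zfs snapshot "${OBJ_DS}@${SNAP}"
}

# Before each clean build: discard everything written since the snapshot.
obj_reset() {
    run zfs rollback "${OBJ_DS}@${SNAP}"
}

# Usage:
#   DRY_RUN=1 obj_reset    # preview only
#   obj_reset              # actually roll back
```

A plain `zfs rollback' only goes back to the most recent snapshot, so
this assumes no later snapshots are taken on the obj dataset.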


>>
>> Do a clean build and everything should work.
>
> That's next.



-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org


Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread Chavdar Ivanov
On Thu, 18 May 2023 at 13:16, RVP  wrote:
>
> On Thu, 18 May 2023, Chavdar Ivanov wrote:
>
> > Yes indeed, with SIGILL passed I get:
> >
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0xf03114c97890 in EC_GROUP_order_bits () from /usr/lib/libcrypto.so.14
> > (gdb) bt
> > #0  0xf03114c97890 in EC_GROUP_order_bits () from 
> > /usr/lib/libcrypto.so.14
> > #1  0xf031154898a4 in engine_unlocked_init () from 
> > /usr/lib/libcrypto.so.15
> > #2  0xf03115489ab0 in ENGINE_init () from /usr/lib/libcrypto.so.15
> > #3  0xf031153d11f0 in ?? () from /usr/lib/libcrypto.so.15
> > #4  0xf03115694c30 in ssl_setup_sig_algs () from /usr/lib/libssl.so.15
> > #5  0xf031156a85c4 in SSL_CTX_new_ex () from /usr/lib/libssl.so.15
> > #6  0x0f1be6d8 in fetch_start_ssl ()
> > #7  0x0f1b0dfc in fetch_url ()
> > #8  0x0f1b3128 in auto_fetch ()
> > #9  0x0f1bf944 in main ()
> >
>
> You can see the cause right in that stack trace:
>
> EC_GROUP_order_bits is from libcrypto.so.14, but,
> engine_unlocked_init etc., are from libcrypto.so.15
>
> This is our old friend, library interposition, and it happens due to
> this:
>
> $ readelf -d /mnt/usr/bin/ftp | f NEEDED
>   0x0001 NEEDED   Shared library: [libedit.so.3]
>   0x0001 NEEDED   Shared library: [libterminfo.so.2]
>   0x0001 NEEDED   Shared library: [libssl.so.14]
>   0x0001 NEEDED   Shared library: [libcrypto.so.14]
>   0x0001 NEEDED   Shared library: [libc.so.12]
> $ readelf -d /mnt/usr/lib/libssl.so.14 | f NEEDED
>   0x0001 NEEDED   Shared library: [libcrypto.so.14]
>   0x0001 NEEDED   Shared library: [libc.so.12]
>
>
> So, my ftp binary explicitly needs `libcrypto.so.14', and `libssl' also has
> _the same version_ as a dependency. But in your case, the ftp binary will
> show `libcrypto.so.15', while libssl will need `libcrypto.so.14'. I.e., the
> compiler linked in the newer version explicitly (`cc ... -lcrypto') and the
> other one was brought in implicitly via libssl.

Indeed:

# readelf -d /usr/bin/ftp | grep NEEDED
 0x0001 (NEEDED) Shared library: [libedit.so.3]
 0x0001 (NEEDED) Shared library: [libterminfo.so.2]
 0x0001 (NEEDED) Shared library: [libssl.so.15]
 0x0001 (NEEDED) Shared library: [libcrypto.so.15]
 0x0001 (NEEDED) Shared library: [libc.so.12]
# readelf -d /usr/lib/libssl.so.15.0 | grep NEEDED
 0x0001 (NEEDED) Shared library: [libcrypto.so.14]
 0x0001 (NEEDED) Shared library: [libc.so.12]
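
The comparison above can be scripted. This is only a sketch: the
function names needed_libs and check_needed are made up for
illustration, and it assumes `readelf -d' prints NEEDED entries with
the library name in square brackets, as in the output above:

```sh
# Print the library names from the NEEDED entries of "readelf -d" output
# read on stdin.
needed_libs() {
    sed -n 's/.*NEEDED.*\[\(.*\)\].*/\1/p'
}

# Report every NEEDED entry of dependency $2 that has the same base name
# as a NEEDED entry of binary $1 but a different version.
check_needed() {
    bin_needed=$(readelf -d "$1" | needed_libs)
    readelf -d "$2" | needed_libs | while read -r dep; do
        base=${dep%%.so.*}
        for lib in $bin_needed; do
            if [ "${lib%%.so.*}" = "$base" ] && [ "$lib" != "$dep" ]; then
                echo "$1 needs $lib, but $2 needs $dep"
            fi
        done
    done
}

# Usage:
#   check_needed /usr/bin/ftp /usr/lib/libssl.so.15.0
```

No output means the two files agree on every library they both name.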

Rebuilding clean now.

Thanks,

Chavdar

>
> -RVP
>



Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread RVP

On Thu, 18 May 2023, Chavdar Ivanov wrote:


Yes indeed, with SIGILL passed I get:


Program received signal SIGSEGV, Segmentation fault.
0xf03114c97890 in EC_GROUP_order_bits () from /usr/lib/libcrypto.so.14
(gdb) bt
#0  0xf03114c97890 in EC_GROUP_order_bits () from /usr/lib/libcrypto.so.14
#1  0xf031154898a4 in engine_unlocked_init () from /usr/lib/libcrypto.so.15
#2  0xf03115489ab0 in ENGINE_init () from /usr/lib/libcrypto.so.15
#3  0xf031153d11f0 in ?? () from /usr/lib/libcrypto.so.15
#4  0xf03115694c30 in ssl_setup_sig_algs () from /usr/lib/libssl.so.15
#5  0xf031156a85c4 in SSL_CTX_new_ex () from /usr/lib/libssl.so.15
#6  0x0f1be6d8 in fetch_start_ssl ()
#7  0x0f1b0dfc in fetch_url ()
#8  0x0f1b3128 in auto_fetch ()
#9  0x0f1bf944 in main ()



You can see the cause right in that stack trace:

EC_GROUP_order_bits is from libcrypto.so.14, but,
engine_unlocked_init etc., are from libcrypto.so.15

This is our old friend, library interposition, and it happens due to
this:

$ readelf -d /mnt/usr/bin/ftp | f NEEDED
 0x0001 NEEDED   Shared library: [libedit.so.3]
 0x0001 NEEDED   Shared library: [libterminfo.so.2]
 0x0001 NEEDED   Shared library: [libssl.so.14]
 0x0001 NEEDED   Shared library: [libcrypto.so.14]
 0x0001 NEEDED   Shared library: [libc.so.12]
$ readelf -d /mnt/usr/lib/libssl.so.14 | f NEEDED
 0x0001 NEEDED   Shared library: [libcrypto.so.14]
 0x0001 NEEDED   Shared library: [libc.so.12]


So, my ftp binary explicitly needs `libcrypto.so.14', and `libssl' also has
_the same version_ as a dependency. But in your case, the ftp binary will
show `libcrypto.so.15', while libssl will need `libcrypto.so.14'. I.e., the
compiler linked in the newer version explicitly (`cc ... -lcrypto') and the
other one was brought in implicitly via libssl.

-RVP



Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread Chavdar Ivanov
On Thu, 18 May 2023 at 11:31, Robert Swindells  wrote:
>
>
> Chavdar Ivanov  wrote:
> > The weird and suspicious thing is that /usr/bin/ftp is linked to both
> > existing libcrypto.so versions:
> >
> > ldd /usr/bin/ftp
> > /usr/bin/ftp:
> > >   -ledit.3 => /usr/lib/libedit.so.3
> > >   -lterminfo.2 => /usr/lib/libterminfo.so.2
> > >   -lc.12 => /usr/lib/libc.so.12
> > >   -lssl.15 => /usr/lib/libssl.so.15
> > >   -lcrypto.14 => /usr/lib/libcrypto.so.14
> > >   -lcrypt.1 => /lib/libcrypt.so.1
> > >   -lcrypto.15 => /usr/lib/libcrypto.so.15
>
> I'm guessing you did an update build not a clean one. Also, did you use
> the -j flag to build in parallel?

I indeed usually do update builds; in this case however, I had deleted
the entire obj (it is on a zfs, removing takes a lot of time, so
subsequently I replaced it with a dedicated zfs so that I can just
destroy and recreate it...). I also 'make cleandir' in [x]src in
advance; that usually is enough.

>
> Do a clean build and everything should work.

That's next.




Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread Chavdar Ivanov
On Thu, 18 May 2023 at 11:33, RVP  wrote:
>
> On Thu, 18 May 2023, Chavdar Ivanov wrote:
>
> > This turned out to be /usr/bin/ftp crashing:
> >
> > #  /usr/bin/ftp -o node-v20.2.0.tar.xz
> > 'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'
> > Trying 104.20.23.46:443 ...
> > [1]7100 segmentation fault  /usr/bin/ftp -o node-v20.2.0.tar.xz
> > 
> >
> > If I run it under gdb, I get:
> >
> > (gdb) run -o node-v20.2.0.tar.xz
> > 'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'
> > Starting program: /usr/bin/ftp -o node-v20.2.0.tar.xz
> > 'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'
> >
> > Program received signal SIGILL, Illegal instruction.
> > 0xf7db5d54be70 in _armv8_sha512_probe () from /usr/lib/libcrypto.so.14
> > (gdb) bt
> > #0  0xf7db5d54be70 in _armv8_sha512_probe () from 
> > /usr/lib/libcrypto.so.14
> > #1  0xf7db5d54c23c in OPENSSL_cpuid_setup () from 
> > /usr/lib/libcrypto.so.14
> > #2  0xef643398 in _rtld_call_init_function () from
> > /usr/libexec/ld.elf_so
> > #3  0xef6436a4 in _rtld_call_init_functions () from
> > /usr/libexec/ld.elf_so
> > #4  0xef643f74 in _rtld () from /usr/libexec/ld.elf_so
> > #5  0xef640b10 in _rtld_start () from /usr/libexec/ld.elf_so
> > Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> >
>
> You should ignore SIGILL when it's in libcrypto on some archs, e.g. ARM,
> PPC & SPARC. On x86 systems, libcrypto uses the CPUID instruction to
> determine which optimized assembly routines can be used for speedup. On
> ARM etc, it installs a SIGILL handler and just runs test instructions. The
> handler being called means _those_ instructions are not available.
>
> So, on ARM, you have to tell gdb to pass through SIGILL to the program:
>
> ```
> (gdb) handle SIGILL nostop noprint pass
> ```
>
> > The weird and suspicious thing is that /usr/bin/ftp is linked to both
> > existing libcrypto.so versions:
> >
> > ldd /usr/bin/ftp
> > /usr/bin/ftp:
> > >   -ledit.3 => /usr/lib/libedit.so.3
> > >   -lterminfo.2 => /usr/lib/libterminfo.so.2
> > >   -lc.12 => /usr/lib/libc.so.12
> > >   -lssl.15 => /usr/lib/libssl.so.15
> > >   -lcrypto.14 => /usr/lib/libcrypto.so.14
> > >   -lcrypt.1 => /lib/libcrypt.so.1
> > >   -lcrypto.15 => /usr/lib/libcrypto.so.15
> >
>
> I would say this is the real reason for the crash (SIGSEGV).

Yes indeed, with SIGILL passed I get:


Program received signal SIGSEGV, Segmentation fault.
0xf03114c97890 in EC_GROUP_order_bits () from /usr/lib/libcrypto.so.14
(gdb) bt
#0  0xf03114c97890 in EC_GROUP_order_bits () from /usr/lib/libcrypto.so.14
#1  0xf031154898a4 in engine_unlocked_init () from /usr/lib/libcrypto.so.15
#2  0xf03115489ab0 in ENGINE_init () from /usr/lib/libcrypto.so.15
#3  0xf031153d11f0 in ?? () from /usr/lib/libcrypto.so.15
#4  0xf03115694c30 in ssl_setup_sig_algs () from /usr/lib/libssl.so.15
#5  0xf031156a85c4 in SSL_CTX_new_ex () from /usr/lib/libssl.so.15
#6  0x0f1be6d8 in fetch_start_ssl ()
#7  0x0f1b0dfc in fetch_url ()
#8  0x0f1b3128 in auto_fetch ()
#9  0x0f1bf944 in main ()


>
> -RVP




Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread RVP

On Thu, 18 May 2023, Chavdar Ivanov wrote:


This turned out to be /usr/bin/ftp crashing:

#  /usr/bin/ftp -o node-v20.2.0.tar.xz
'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'
Trying 104.20.23.46:443 ...
[1]7100 segmentation fault  /usr/bin/ftp -o node-v20.2.0.tar.xz


If I run it under gdb, I get:

(gdb) run -o node-v20.2.0.tar.xz
'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'
Starting program: /usr/bin/ftp -o node-v20.2.0.tar.xz
'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'

Program received signal SIGILL, Illegal instruction.
0xf7db5d54be70 in _armv8_sha512_probe () from /usr/lib/libcrypto.so.14
(gdb) bt
#0  0xf7db5d54be70 in _armv8_sha512_probe () from /usr/lib/libcrypto.so.14
#1  0xf7db5d54c23c in OPENSSL_cpuid_setup () from /usr/lib/libcrypto.so.14
#2  0xef643398 in _rtld_call_init_function () from
/usr/libexec/ld.elf_so
#3  0xef6436a4 in _rtld_call_init_functions () from
/usr/libexec/ld.elf_so
#4  0xef643f74 in _rtld () from /usr/libexec/ld.elf_so
#5  0xef640b10 in _rtld_start () from /usr/libexec/ld.elf_so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)



You should ignore SIGILL when it's in libcrypto on some archs, e.g. ARM,
PPC & SPARC. On x86 systems, libcrypto uses the CPUID instruction to
determine which optimized assembly routines can be used for speedup. On
ARM etc, it installs a SIGILL handler and just runs test instructions. The
handler being called means _those_ instructions are not available.

So, on ARM, you have to tell gdb to pass through SIGILL to the program:

```
(gdb) handle SIGILL nostop noprint pass
```
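
The same setting can be given on the gdb command line so it is in
place before the program starts. A small sketch using gdb's `-ex' and
`--args' options; the wrapper name debug_pass_sigill is made up for
illustration:

```sh
# Start gdb on a program with SIGILL passed straight through to it, so
# libcrypto's CPU feature probes do not stop the session.
# $1 is the program to debug, the remaining arguments are its arguments.
debug_pass_sigill() {
    gdb -ex 'handle SIGILL nostop noprint pass' --args "$@"
}

# Usage:
#   debug_pass_sigill /usr/bin/ftp -o node-v20.2.0.tar.xz \
#       'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'
```

At the (gdb) prompt, `run' then starts the program with those
arguments.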


The weird and suspicious thing is that /usr/bin/ftp is linked to both
existing libcrypto.so versions:

ldd /usr/bin/ftp
/usr/bin/ftp:
   -ledit.3 => /usr/lib/libedit.so.3
   -lterminfo.2 => /usr/lib/libterminfo.so.2
   -lc.12 => /usr/lib/libc.so.12
   -lssl.15 => /usr/lib/libssl.so.15
   -lcrypto.14 => /usr/lib/libcrypto.so.14
   -lcrypt.1 => /lib/libcrypt.so.1
   -lcrypto.15 => /usr/lib/libcrypto.so.15



I would say this is the real reason for the crash (SIGSEGV).

-RVP


Re: /usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread Robert Swindells


Chavdar Ivanov  wrote:
> The weird and suspicious thing is that /usr/bin/ftp is linked to both
> existing libcrypto.so versions:
>
> ldd /usr/bin/ftp
> /usr/bin/ftp:
>   -ledit.3 => /usr/lib/libedit.so.3
>   -lterminfo.2 => /usr/lib/libterminfo.so.2
>   -lc.12 => /usr/lib/libc.so.12
>   -lssl.15 => /usr/lib/libssl.so.15
>   -lcrypto.14 => /usr/lib/libcrypto.so.14
>   -lcrypt.1 => /lib/libcrypt.so.1
>   -lcrypto.15 => /usr/lib/libcrypto.so.15

I'm guessing you did an update build not a clean one. Also, did you use
the -j flag to build in parallel?

Do a clean build and everything should work.


/usr/bin/ftp crash on -current (10.00.4) aarch64

2023-05-18 Thread Chavdar Ivanov
Hi,

After having upgraded my aarch64 host to

(NetBSD narvi 10.99.4 NetBSD 10.99.4 (GENERIC64) #0: Sun May 14
19:13:18 BST 2023
sysbu...@ymir.lorien.lan:/dumps/sysbuild/evbarm64/obj/home/sysbuild/src/sys/arch/evbarm/compile/GENERIC64
evbarm)

I found out I can no longer fetch some packages:
...
 cd /usr/pkgsrc/lang/nodejs
➜  nodejs make fetch
=> Bootstrap dependency digest>=20211023: found digest-20220214
=> Fetching node-v20.2.0.tar.xz
=> Total size: 41778040 bytes
Trying 104.20.22.46:443 ...
[1]   Segmentation fault  (cd ${fetchdir}; if ${TEST} -n "${resume}"; th...
fetch: Unable to fetch expected file node-v20.2.0.tar.xz
Trying 151.101.61.6:80 ...
...

This turned out to be /usr/bin/ftp crashing:

#  /usr/bin/ftp -o node-v20.2.0.tar.xz
'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'
Trying 104.20.23.46:443 ...
[1]7100 segmentation fault  /usr/bin/ftp -o node-v20.2.0.tar.xz


If I run it under gdb, I get:

(gdb) run -o node-v20.2.0.tar.xz
'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'
Starting program: /usr/bin/ftp -o node-v20.2.0.tar.xz
'https://nodejs.org/dist/v20.2.0/node-v20.2.0.tar.xz'

Program received signal SIGILL, Illegal instruction.
0xf7db5d54be70 in _armv8_sha512_probe () from /usr/lib/libcrypto.so.14
(gdb) bt
#0  0xf7db5d54be70 in _armv8_sha512_probe () from /usr/lib/libcrypto.so.14
#1  0xf7db5d54c23c in OPENSSL_cpuid_setup () from /usr/lib/libcrypto.so.14
#2  0xef643398 in _rtld_call_init_function () from
/usr/libexec/ld.elf_so
#3  0xef6436a4 in _rtld_call_init_functions () from
/usr/libexec/ld.elf_so
#4  0xef643f74 in _rtld () from /usr/libexec/ld.elf_so
#5  0xef640b10 in _rtld_start () from /usr/libexec/ld.elf_so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The weird and suspicious thing is that /usr/bin/ftp is linked to both
existing libcrypto.so versions:

ldd /usr/bin/ftp
/usr/bin/ftp:
   -ledit.3 => /usr/lib/libedit.so.3
   -lterminfo.2 => /usr/lib/libterminfo.so.2
   -lc.12 => /usr/lib/libc.so.12
   -lssl.15 => /usr/lib/libssl.so.15
   -lcrypto.14 => /usr/lib/libcrypto.so.14
   -lcrypt.1 => /lib/libcrypt.so.1
   -lcrypto.15 => /usr/lib/libcrypto.so.15

whereas on amd64, built a few hours earlier, I get:
# ldd =ftp
/usr/bin/ftp:
   -ledit.3 => /usr/lib/libedit.so.3
   -lterminfo.2 => /usr/lib/libterminfo.so.2
   -lc.12 => /usr/lib/libc.so.12
   -lssl.15 => /usr/lib/libssl.so.15
   -lcrypto.15 => /usr/lib/libcrypto.so.15
   -lcrypt.1 => /lib/libcrypt.so.1

I will obviously rebuild the aarch64 system just in case, but thought
it worth mentioning.

Chavdar
