daily CVS update output
Updating src tree: P src/distrib/notes/arc/prep P src/distrib/notes/cats/prep P src/distrib/notes/macppc/prep.OPENFIRMWARE P src/distrib/sets/lists/base/shl.mi P src/distrib/sets/lists/comp/stl.mi P src/distrib/sets/lists/debug/shl.mi P src/doc/3RDPARTY P src/doc/CHANGES P src/doc/CHANGES.prev P src/external/bsd/elftoolchain/Makefile P src/external/bsd/elftoolchain/lib/Makefile P src/external/bsd/libevent/Makefile.inc P src/external/bsd/libevent/libevent2netbsd P src/external/bsd/libevent/dist/Doxyfile P src/external/bsd/libevent/dist/buffer.c P src/external/bsd/libevent/dist/bufferevent-internal.h P src/external/bsd/libevent/dist/bufferevent.c P src/external/bsd/libevent/dist/bufferevent_openssl.c P src/external/bsd/libevent/dist/bufferevent_ratelim.c cvs update: `src/external/bsd/libevent/dist/compile' is no longer in the repository cvs update: `src/external/bsd/libevent/dist/config.guess' is no longer in the repository cvs update: `src/external/bsd/libevent/dist/config.sub' is no longer in the repository cvs update: `src/external/bsd/libevent/dist/depcomp' is no longer in the repository P src/external/bsd/libevent/dist/evbuffer-internal.h cvs update: `src/external/bsd/libevent/dist/evconfig-private.h' is no longer in the repository P src/external/bsd/libevent/dist/evdns.c P src/external/bsd/libevent/dist/event-internal.h P src/external/bsd/libevent/dist/event.c P src/external/bsd/libevent/dist/event_tagging.c P src/external/bsd/libevent/dist/evmap.c P src/external/bsd/libevent/dist/evrpc.c P src/external/bsd/libevent/dist/evthread-internal.h P src/external/bsd/libevent/dist/evthread.c P src/external/bsd/libevent/dist/evutil.c P src/external/bsd/libevent/dist/evutil_rand.c P src/external/bsd/libevent/dist/http.c cvs update: `src/external/bsd/libevent/dist/install-sh' is no longer in the repository P src/external/bsd/libevent/dist/kqueue.c P src/external/bsd/libevent/dist/log-internal.h P src/external/bsd/libevent/dist/log.c cvs update: 
`src/external/bsd/libevent/dist/ltmain.sh' is no longer in the repository P src/external/bsd/libevent/dist/minheap-internal.h cvs update: `src/external/bsd/libevent/dist/missing' is no longer in the repository P src/external/bsd/libevent/dist/select.c cvs update: `src/external/bsd/libevent/dist/test-driver' is no longer in the repository P src/external/bsd/libevent/dist/util-internal.h P src/external/bsd/libevent/dist/include/event2/rpc.h P src/external/bsd/libevent/dist/include/event2/util.h P src/external/bsd/libevent/dist/test/regress.c P src/external/bsd/libevent/dist/test/regress.h P src/external/bsd/libevent/dist/test/regress_buffer.c P src/external/bsd/libevent/dist/test/regress_bufferevent.c P src/external/bsd/libevent/dist/test/regress_dns.c P src/external/bsd/libevent/dist/test/regress_et.c P src/external/bsd/libevent/dist/test/regress_http.c P src/external/bsd/libevent/dist/test/regress_listener.c P src/external/bsd/libevent/dist/test/regress_main.c P src/external/bsd/libevent/dist/test/regress_rpc.c P src/external/bsd/libevent/dist/test/regress_ssl.c P src/external/bsd/libevent/dist/test/regress_thread.c P src/external/bsd/libevent/dist/test/regress_util.c P src/external/bsd/libevent/dist/test/tinytest_macros.h P src/external/bsd/libevent/include/event2/event-config.h P src/external/cddl/osnet/lib/Makefile P src/sbin/modstat/main.c P src/sys/arch/mips/mips/db_disasm.c P src/usr.bin/audiocfg/audiocfg.1 Updating xsrc tree: Killing core files: Updating file list: -rw-rw-r-- 1 srcmastr netbsd 41115825 Apr 8 03:03 ls-lRA.gz
Re: regarding the changes to kernel entropy gathering
On Sun, Apr 04, 2021 at 11:02:02PM +, Taylor R Campbell wrote:
>
> Lots of SoCs have on-board RNGs these days; there are Intel and ARM
> CPU instructions (no ARMv8.5 hardware yet that I know of, but we're
> ready for its RNG!); some crypto decelerators like tpm(4), ubsec(4),
> and hifn(4) have RNGs; and there are some dedicated RNG devices like
> ualea(4).

Can we actually use the TPM RNG from in-kernel? Whether we should is a
different, interesting question, given how it is typically implemented.

--
Thor Lancelot Simon t...@panix.com
"Whether or not there's hope for change is not the question. If you
want to be a free person, you don't stand up for human rights because
it will work, but because it is right." --Andrei Sakharov
Re: regarding the changes to kernel entropy gathering
On Tue, Apr 06, 2021 at 10:54:51AM -0700, Greg A. Woods wrote: > At Mon, 5 Apr 2021 23:18:55 -0400, Thor Lancelot Simon wrote: > > > But what you're missing is that neither does what you > > think. When rndctl -L runs after the system comes up multiuser, all > > entropy samples that have been added (which are in the per-cpu pools) > > are propagated to the global pool. Every stream RNG on the system then > > rekeys itself - they are _not_ just using the entropy from the seed on > > disk. Even if nothing does so earlier, when rndctl -S runs as the system > > shuts down, again all entropy samples that have been added (which, again, > > are accumulating in the per-cpu pools) are propagated to the global pool; > > all the stream RNGs rekey themselves again; then the seed is extracted. > > That's all great, and more or less what I've assumed from all the > previous discussion > > Except it seems to be useless in practice without an initial seed, Again there's really little I can do other than suggest you read the code. You are certainly competent to do so, and the code does not do what you keep claiming it does. Read the code, all of it -- it's only a few hundred lines -- and have a think. When rndctl -L runs, or you perform a sufficiently long write to /dev/random, all the per-CPU pools, which, counter to what you keep claiming, *do* accumulate samples from all the same sources they used to, are coalesced into the global pool. When rndctl -S runs, all the per-CPU pools, which, counter to what you keep claiming, *do* accumulate samples from all the same sources they used to, are coalesced into the global pool. If you'd like those samples coalesced into the global pool more frequently, you can use the sysctl to do so. Thor
Re: regarding the changes to kernel entropy gathering
At Wed, 7 Apr 2021 22:47:39 +0200, Martin Husemann wrote:
Subject: Re: regarding the changes to kernel entropy gathering
>
> When you create a custom setup like that, you will have to replace
> etc/rc.d/entropy with a custom solution (e.g. mounting some flash storage).

No storage means "NO storage.".

> Or you ignore the issue and do the dd at each boot - hopefully not generating
> any strong keys on that machine then (but you would have no good storage
> for those anyway).

Or I don't ignore the issue and instead I fix the code so that it's
still possible to get entropy estimates from non-hardware-RNG devices
and then things keep working the way they used to, and there's still
some possibility of _real_ entropy being used to seed the PRNGs.

From what I've seen here so far I'm far from alone in wanting that
ability.

What's most confusing is why there's such animosity and stubborn
unwillingness to even consider that the old way of getting some entropy
from a few less-than-perfect sources was good enough for many, or even
most, of us. It's better than no entropy when there are no "perfect"
sources, and that's also a situation that includes many of us.

It doesn't have to be the default.

--
Greg A. Woods
Kelowna, BC +1 250 762-7675 RoboHack
Planix, Inc. Avoncote Farms
Re: mail/sendmail not relaying on netbsd-9/sparc, problem with OpenSSL update?
On Wed, 7 Apr 2021, Manuel Bouyer wrote: > It should not be fatal. The library traps sigill specially to test for > instructions. > > Does the program really exit if you hit 'continue' in ddb ? Not 'ddb' but 'gdb'... I sent a message earlier with a more thorough analysis, but it has not made it to anyone yet (at least not the mailing list archive). I was confusing the earlier, caught ILL_COPROC instructions for the later, uncaught ILL_ILLOPC that occurs in "gcm_ghash_4bit()" in "libcrypto.so.14". -- |/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X |\ / jdbaker[snail]consolidated[flyspeck]net OpenBSDFreeBSD | X No HTML/proprietary data in email. BSD just sits there and works! |/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
Re: mail/sendmail not relaying on netbsd-9/sparc, problem with OpenSSL update?
On Wed, 7 Apr 2021, Martin Husemann wrote: > > In any case, while one may be able to do that in 'gdb', when running > > normally, it is fatal and there is no recourse. Odd that it doesn't > > dump core. > > That would be exactly explained by the above - SIGILL gets caught by lib, > no core dump. Your real problem happens later. Except that there is no "later". The sendmail process terminates on the first SIGILL, as shown by the output of 'sendmail -odi -v -q' and a 'ktrace' of the same. I'm updating my -current install and will try again from there and see what happens. -- |/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X |\ / jdbaker[snail]consolidated[flyspeck]net OpenBSDFreeBSD | X No HTML/proprietary data in email. BSD just sits there and works! |/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
Re: mail/sendmail not relaying on netbsd-9/sparc, problem with OpenSSL update?
Dropping pkgsrc-users@ as it appears not to be a pkgsrc problem. On Wed, 7 Apr 2021, Martin Husemann wrote: > On Wed, Apr 07, 2021 at 11:26:05AM -0500, John D. Baker wrote: > > > > (gdb) run -odi -v -q > > Starting program: /usr/sbin/sendmail -odi -v -q > > process 867 is executing new program: /usr/pkg/libexec/sendmail/sendmail > > > > Program received signal SIGILL, Illegal instruction. > > 0xedd6d40c in _sparcv9_vis1_probe () from /usr/lib/libcrypto.so.14 > > (gdb) bt > > This is normal, you should be able to "continue" from it. > The library catches the SIGILL and avoids the instruction. ISTR that I tried that and simply got the SIGILL again. Maybe that was from a later sparcV9 instruction... In any case, while one may be able to do that in 'gdb', when running normally, it is fatal and there is no recourse. Odd that it doesn't dump core. -- |/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X |\ / jdbaker[snail]consolidated[flyspeck]net OpenBSDFreeBSD | X No HTML/proprietary data in email. BSD just sits there and works! |/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
Re: regarding the changes to kernel entropy gathering
On Wed, Apr 07, 2021 at 12:14:58PM -0700, Greg A. Woods wrote:
> > You run it once. Manually. And never again.
>
> Nope, sorry, that's not a good enough answer.

It is for the typical and default installs.

> It doesn't solve the
> problem of dealing with a lack of mutable storage.

When you create a custom setup like that, you will have to replace
etc/rc.d/entropy with a custom solution (e.g. mounting some flash storage).

Or you ignore the issue and do the dd at each boot - hopefully not generating
any strong keys on that machine then (but you would have no good storage
for those anyway).

Martin
Re: mail/sendmail not relaying on netbsd-9/sparc, problem with OpenSSL update?
On Tue, 6 Apr 2021, John Nemeth wrote:

> What happens if you do 'sudo sendmail -odi -q -v'.

Finally had a chance to check this. With OpenSSL 1.1.1k as pulled up to
netbsd-9, I see the following:

  $ sudo sendmail -odi -v -q

  Running /var/spool/mqueue/137FSpnA000465 (sequence 1 of 1)
  ... Connecting to mail.consolidated.net. port 587 via relay...
  220 mail26c25.carrierzone.com ESMTP Sendmail 8.14.9 ready at Wed, 7 Apr 2021 15:30:05 +
  >>> EHLO
  250-mail26c25.carrierzone.com Hello [], pleased to meet you
  250-ENHANCEDSTATUSCODES
  250-8BITMIME
  250-SIZE 52428800
  250-DSN
  250-AUTH
  250-STARTTLS
  250-DELIVERBY
  250 HELP
  >>> STARTTLS
  220 Ready to start TLS
  Illegal instruction

With OpenSSL rolled back to before the pullup, there is no "Illegal
instruction" message and it proceeds to relay the mail.

'ktrace' showed the program getting SIGILL, but wasn't otherwise helpful.
Running under 'gdb' showed:

  (gdb) run -odi -v -q
  Starting program: /usr/sbin/sendmail -odi -v -q
  process 867 is executing new program: /usr/pkg/libexec/sendmail/sendmail

  Program received signal SIGILL, Illegal instruction.
  0xedd6d40c in _sparcv9_vis1_probe () from /usr/lib/libcrypto.so.14
  (gdb) bt
  #0 0xedd6d40c in _sparcv9_vis1_probe () from /usr/lib/libcrypto.so.14
  #1 0xedd173e8 in OPENSSL_cpuid_setup () at /x/netbsd-9/src/crypto/external/bsd/openssl/dist/crypto/sparcv9cap.c:239
  #2 0xedc2ef48 in _init () from /usr/lib/libcrypto.so.14
  #3 0xeded45d8 in _rtld_call_function_void (obj=0xedefdc00, addr=) at /x/netbsd-9/src/libexec/ld.elf_so/rtld.h:490
  #4 _rtld_call_initfini_function (obj=, mask=, func=) at /x/netbsd-9/src/libexec/ld.elf_so/rtld.c:143
  #5 _rtld_call_init_function (cur_objgen=1, mask=0xe8c0, obj=0xedefdc00) at /x/netbsd-9/src/libexec/ld.elf_so/rtld.c:242
  #6 _rtld_call_init_functions (mask=mask@entry=0xe8c0) at /x/netbsd-9/src/libexec/ld.elf_so/rtld.c:327
  #7 0xeded4f2c in _rtld (sp=, relocbase=) at /x/netbsd-9/src/libexec/ld.elf_so/rtld.c:782
  #8 0xeded0e20 in _rtld_start () from /usr/libexec/ld.elf_so
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)

I can't 'list' the address in frame #0. The 'gdb' prompt simply returns
immediately. I can list the address in frame #1:

  (gdb) list *0xedd173e8
  0xedd173e8 is in OPENSSL_cpuid_setup (/x/netbsd-9/src/crypto/external/bsd/openssl/dist/crypto/sparcv9cap.c:240).
  235         OPENSSL_sparcv9cap_P[0] &= ~SPARCV9_TICK_PRIVILEGED;
  236     }
  237
  238     if (sigsetjmp(common_jmp, 1) == 0) {
  239         _sparcv9_vis1_probe();
  240         OPENSSL_sparcv9cap_P[0] |= SPARCV9_VIS1 | SPARCV9_BLK;
  241         /* detect UltraSPARC-Tx, see sparccpud.S for details... */
  242         if (_sparcv9_vis1_instrument() >= 12)
  243             OPENSSL_sparcv9cap_P[0] &= ~(SPARCV9_VIS1 | SPARCV9_PREFER_FPU);
  244         else {

I can arrange to boot -current on this machine and see how it behaves.

> Also are you using host status?

I guess not, since I don't know what that is.

--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]consolidated[flyspeck]net OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
Re: Partial reads on unix domain sockets
On Wed 07 Apr 2021 at 13:43:52 +0200, Tom Ivar Helbekkmo wrote:
> While there is no guarantee of a one to one relationship between writes
> and reads, it seems that some applications expect this. In my case, it
> was jack (pkgsrc/audio/jack) that failed. It comes with, among other
> things, a daemon, jackd, and a library for use by clients wishing to
> connect to it. Communication between jackd and its clients became
> impossible with this change, because the code in jack expects to be able
> to exchange C structs between server and clients. The jackd server has
> a thread that uses poll() to wait for available packets from clients,
> and when something arrives, it is read with code like this example:

Shouldn't code that expects that open a SOCK_SEQPACKET socket instead of
SOCK_STREAM? (Or SOCK_DGRAM perhaps, since socket(2) seems to say that
SOCK_SEQPACKET doesn't exist for PF_LOCAL)

-Olaf.
--
___ Q: "What's an anagram of Banach-Tarski?" -- Olaf "Rhialto" Seibert
\X/ A: "Banach-Tarski Banach-Tarski." -- rhialto at falu dot nl
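The datagram suggestion above can be sketched as follows. This is only an illustration under stated assumptions: `struct demo_req` and `make_message_pair()` are hypothetical names made up for the example, not anything from jack, and whether jack could actually switch socket types is not established in this thread. The point it shows is that on a local SOCK_DGRAM socket each write() is delivered to a single read() as one whole message.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical request layout, used only for this illustration. */
struct demo_req {
	int type;
	char name[32];
};

/*
 * Create a connected pair of local datagram sockets.
 * Each write() on one end arrives as one complete message at the
 * other end, so a reader never sees half of a struct.
 * Returns 0 on success, -1 on error (errno set by socketpair()).
 */
static int
make_message_pair(int sv[2])
{
	return socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
}
```

With such a pair, two structs written back to back are returned by two separate read() calls even when the reader's buffer could hold both. The trade-off is that datagram sockets do not behave like a stream in other respects, so this is a design option to evaluate rather than a drop-in replacement.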
Re: mail/sendmail not relaying on netbsd-9/sparc, problem with OpenSSL update?
On Wed, Apr 07, 2021 at 11:47:10AM -0500, John D. Baker wrote: > Dropping pkgsrc-users@ as it appears not to be a pkgsrc problem. > > On Wed, 7 Apr 2021, Martin Husemann wrote: > > > On Wed, Apr 07, 2021 at 11:26:05AM -0500, John D. Baker wrote: > > > > > > (gdb) run -odi -v -q > > > Starting program: /usr/sbin/sendmail -odi -v -q > > > process 867 is executing new program: /usr/pkg/libexec/sendmail/sendmail > > > > > > Program received signal SIGILL, Illegal instruction. > > > 0xedd6d40c in _sparcv9_vis1_probe () from /usr/lib/libcrypto.so.14 > > > (gdb) bt > > > > This is normal, you should be able to "continue" from it. > > The library catches the SIGILL and avoids the instruction. > > ISTR that I tried that and simply got the SIGILL again. Maybe that > was from a later sparcV9 instruction... > > In any case, while one may be able to do that in 'gdb', when running > normally, it is fatal and there is no recourse. Odd that it doesn't > dump core. It should not be fatal. The library traps sigill specially to test for instructions. Does the program really exit if you hit 'continue' in ddb ? -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: regarding the changes to kernel entropy gathering
At Wed, 7 Apr 2021 09:52:29 +0200, Martin Husemann wrote:
Subject: Re: regarding the changes to kernel entropy gathering
>
> On Tue, Apr 06, 2021 at 03:12:45PM -0700, Greg A. Woods wrote:
> > > Isn't it as simple as:
> > >
> > > dd bs=32 if=/dev/urandom of=/dev/random
> >
> > No, that still leaves the question of _when_ to run it. (And, at least
> > at the moment, where to put it. /etc/rc.local?)
>
> Of course not!
>
> You run it once. Manually. And never again.

Nope, sorry, that's not a good enough answer. It doesn't solve the
problem of dealing with a lack of mutable storage.

A system _MUST_ be able to be booted and with no user intervention be
able to (eventually) get to the state where /dev/random and getrandom(2)
WILL NOT block, and it _MUST_ be able to do so without the help of any
hardware RNG, and without the ability to store (and read) a seed from a
file or other storage device.

I.e. we _MUST_ be _ABLE_ to choose to use other devices as sources for
entropy, even if they are not perfect.

We had this, it works fine, we still need it.

--
Greg A. Woods
Kelowna, BC +1 250 762-7675 RoboHack
Planix, Inc. Avoncote Farms
Re: mail/sendmail not relaying on netbsd-9/sparc, problem with OpenSSL update?
On Tue, 6 Apr 2021, John D. Baker wrote: > So, I'm now doing a non-update build of my regular netbsd-9 tree to > see if some cruft or other was hanging around and causing trouble. So that didn't help. Updating the system with sets from a non-update build of the main netbsd-9 branch resulted in mail queued on the hub but not relayed. Re-extracting "base.tgz" from my test tree with the OpenSSL update rolled back and restarting 'sendmail', the queued mail and all subsequent mail are relayed as expected. Now to figure out what's happening and why. -- |/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X |\ / jdbaker[snail]consolidated[flyspeck]net OpenBSDFreeBSD | X No HTML/proprietary data in email. BSD just sits there and works! |/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645
Re: mail/sendmail not relaying on netbsd-9/sparc, problem with OpenSSL update?
On Wed, Apr 07, 2021 at 11:47:10AM -0500, John D. Baker wrote:
> ISTR that I tried that and simply got the SIGILL again. Maybe that
> was from a later sparcV9 instruction...

Yes, I think it probes for like three different instructions.

> In any case, while one may be able to do that in 'gdb', when running
> normally, it is fatal and there is no recourse. Odd that it doesn't
> dump core.

That would be exactly explained by the above - SIGILL gets caught by lib,
no core dump. Your real problem happens later.

Martin
Re: Partial reads on unix domain sockets
In article , Tom Ivar Helbekkmo wrote:
>Some time last year, probably late summer or autumn, a change was made
>that caused transfer of small chunks of data over unix domain sockets to
>have a higher chance of resulting in a read() getting only part of the
>chunk.
>
>While there is no guarantee of a one to one relationship between writes
>and reads, it seems that some applications expect this. In my case, it
>was jack (pkgsrc/audio/jack) that failed. It comes with, among other
>things, a daemon, jackd, and a library for use by clients wishing to
>connect to it. Communication between jackd and its clients became
>impossible with this change, because the code in jack expects to be able
>to exchange C structs between server and clients. The jackd server has
>a thread that uses poll() to wait for available packets from clients,
>and when something arrives, it is read with code like this example:
>
>    if (read (client_fd, &req, sizeof(req)) != sizeof(req)) {
>        jack_error ("cannot read ACK connection request from client");
>        return -1;
>    }
>
>The client_fd is an open unix domain stream socket, and it is *not* in
>non-blocking mode. The structs being transferred are of various sizes,
>and can, from a casual inspection of the header files, be up to a couple
>of hundred bytes long.
>
>Data is written to the sockets using code like this:
>
>    if (write (reply_fd, &req, sizeof(req)) < (ssize_t)sizeof(req)) {
>        jack_error ("cannot write request result to client");
>        return -1;
>    }
>
>Meanwhile, in the client library, the code at the other end of this
>communication is simply:
>
>    if (write (fd, &req, sizeof(req)) != sizeof(req)) {
>        jack_error ("cannot write event connect request to server (%s)",
>            strerror (errno));
>        close (fd);
>        return -1;
>    }
>
>    if (read (fd, &res, sizeof(res)) != sizeof(res)) {
>        jack_error ("cannot read event connect result from server (%s)",
>            strerror (errno));
>        close (fd);
>        return -1;
>    }
>
>Obviously, poll() will return, with information about available data,
>before the entire chunk written by the other end is available.
>
>I haven't filed a PR on this, as it isn't technically an error in
>NetBSD. However, if there is a wide-spread belief out there that code
>such as this will "just work" (I'm guessing it "just works" on Linux,
>just like it does on NetBSD < 10), and it's not otherwise detrimental to
>have the data from a single write() call all be available to the reader
>of the socket before triggering a select() or poll() that's waiting for
>it, then maybe such an adjustment should be considered.

Can you please file a PR with an example to keep track of it. It is a
behavior change after all and we should understand why it happened.

christos
Re: mail/sendmail not relaying on netbsd-9/sparc, problem with OpenSSL update?
On Wed, Apr 07, 2021 at 11:26:05AM -0500, John D. Baker wrote:
> Running under 'gdb' showed:
>
>
> (gdb) run -odi -v -q
> Starting program: /usr/sbin/sendmail -odi -v -q
> process 867 is executing new program: /usr/pkg/libexec/sendmail/sendmail
>
> Program received signal SIGILL, Illegal instruction.
> 0xedd6d40c in _sparcv9_vis1_probe () from /usr/lib/libcrypto.so.14
> (gdb) bt

This is normal, you should be able to "continue" from it.
The library catches the SIGILL and avoids the instruction.

Martin
Re: regarding the changes to kernel entropy gathering
On Wed, Apr 07, 2021 at 07:53:07AM -0400, matthew sporleder wrote:
> So on a brand new installation/first boot why isn't the clock a
> sufficiently random thing? (anymore?)

Because it isn't random?

> Hung and unusable systems are a big problem. Happening on the first
> boot is not a good first impression. :)

It does not happen on first boot. It needs a one-time fix after installation,
which is why I added the fix to the installer.

Martin
Re: regarding the changes to kernel entropy gathering
On Wed, Apr 7, 2021 at 7:10 AM Martin Husemann wrote:
>
> On Wed, Apr 07, 2021 at 07:05:12AM -0400, matthew sporleder wrote:
> > Is the issue gaw saw exclusive to xen first boots? Are there other
> > ways to end up in his situation?
>
> It happens on all new installations for machines with no RNG, which is
> the far majority of everything but "newish" amd64 and a few arm and mips
> boards/SoC.
>
> It is unrelated to Xen.
>
> Martin

So on a brand new installation/first boot why isn't the clock a
sufficiently random thing? (anymore?)

Hung and unusable systems are a big problem. Happening on the first
boot is not a good first impression. :)
Partial reads on unix domain sockets
Some time last year, probably late summer or autumn, a change was made
that caused transfer of small chunks of data over unix domain sockets to
have a higher chance of resulting in a read() getting only part of the
chunk.

While there is no guarantee of a one to one relationship between writes
and reads, it seems that some applications expect this. In my case, it
was jack (pkgsrc/audio/jack) that failed. It comes with, among other
things, a daemon, jackd, and a library for use by clients wishing to
connect to it. Communication between jackd and its clients became
impossible with this change, because the code in jack expects to be able
to exchange C structs between server and clients. The jackd server has
a thread that uses poll() to wait for available packets from clients,
and when something arrives, it is read with code like this example:

    if (read (client_fd, &req, sizeof(req)) != sizeof(req)) {
        jack_error ("cannot read ACK connection request from client");
        return -1;
    }

The client_fd is an open unix domain stream socket, and it is *not* in
non-blocking mode. The structs being transferred are of various sizes,
and can, from a casual inspection of the header files, be up to a couple
of hundred bytes long.

Data is written to the sockets using code like this:

    if (write (reply_fd, &req, sizeof(req)) < (ssize_t)sizeof(req)) {
        jack_error ("cannot write request result to client");
        return -1;
    }

Meanwhile, in the client library, the code at the other end of this
communication is simply:

    if (write (fd, &req, sizeof(req)) != sizeof(req)) {
        jack_error ("cannot write event connect request to server (%s)",
            strerror (errno));
        close (fd);
        return -1;
    }

    if (read (fd, &res, sizeof(res)) != sizeof(res)) {
        jack_error ("cannot read event connect result from server (%s)",
            strerror (errno));
        close (fd);
        return -1;
    }

Obviously, poll() will return, with information about available data,
before the entire chunk written by the other end is available.

I haven't filed a PR on this, as it isn't technically an error in NetBSD. However, if there is a wide-spread belief out there that code such as this will "just work" (I'm guessing it "just works" on Linux, just like it does on NetBSD < 10), and it's not otherwise detrimental to have the data from a single write() call all be available to the reader of the socket before triggering a select() or poll() that's waiting for it, then maybe such an adjustment should be considered. -tih -- Most people who graduate with CS degrees don't understand the significance of Lisp. Lisp is the most important idea in computer science. --Alan Kay
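Since a stream socket gives no guarantee that one read() returns everything one write() sent, the usual application-side defence is to loop until the whole struct has arrived. The following is a minimal sketch of that idea; `read_full()` is a hypothetical helper written for this note, not a function from jack or from NetBSD.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * read_full: hypothetical helper, not part of jack or NetBSD.
 * Keeps calling read() until len bytes have arrived, the peer
 * closes the descriptor, or an error occurs.
 * Returns the number of bytes read (less than len only on EOF),
 * or -1 on error with errno set by read().
 */
static ssize_t
read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, p + done, len - done);

		if (n == 0)
			break;			/* peer closed the socket */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

With such a helper, a check along the lines of `read_full(client_fd, &req, sizeof(req)) != (ssize_t)sizeof(req)` would tolerate short reads on a blocking descriptor, independent of how the kernel chooses to wake up poll().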
Re: regarding the changes to kernel entropy gathering
On Tue, 6 Apr 2021, RVP wrote:

> On Tue, 6 Apr 2021, Taylor R Campbell wrote:
>
> > Why do you say that? We do incorporate many sources that are not
> > well-studied -- every keystroke, for example, and the CPU cycle
> > counter at the time of the keystroke, affects the output of
> > /dev/urandom.
>
> Is the output of /dev/random also influenced like this?

Ah, no need to answer this: it's already in rnd(4):

    The operating system continuously makes observations of hardware
    devices, such as network packet timings, disk seek delays, and
    keystrokes. The observations are combined into a seed for a
    cryptographic pseudorandom number generator (PRNG) which is used to
    generate the outputs of both /dev/random and /dev/urandom.

-RVP
Re: regarding the changes to kernel entropy gathering
On Wed, Apr 07, 2021 at 07:05:12AM -0400, matthew sporleder wrote:
> Is the issue gaw saw exclusive to xen first boots? Are there other
> ways to end up in his situation?

It happens on all new installations for machines with no RNG, which is
the far majority of everything but "newish" amd64 and a few arm and mips
boards/SoC.

It is unrelated to Xen.

Martin
Re: regarding the changes to kernel entropy gathering
> On Apr 6, 2021, at 8:09 AM, Taylor R Campbell wrote:
>
>
>> Date: Mon, 05 Apr 2021 10:58:58 +0700
>> From: Robert Elz
>> I understand that some people desire highly secure systems (I'm not
>> convinced that anyone running NetBSD can really justify that desire,
>> but that's beside the point) and that's fine - make the system be able
>> to be as secure as possible, just don't require me to enable it, and
>> don't make it impossible or even difficult to disable it - and allow
>> some kind of middle ground, not just "perfectly secure" and "hopeless".
>
> The main issue that hits people is that the traditional mechanism by
> which the OS reports a potential security problem with entropy is for
> it to make applications silently hang -- and the issue is getting
> worse now that getrandom() is more widely used, e.g. in Python when
> you do `import multiprocessing'.
>
> Based on experience over the past year with a meaningful criterion for
> _detecting_ potential problems, I don't think that's a useful
> mechanism for _reporting_ them, which is why I added several other
> mechanisms -- a line in the /etc/security report, an `entropy' knob in
> /etc/rc.conf to wait or fail to single-user (default: neither) -- and
> proposed to remove the blocking behaviour of getrandom() in favour of
> focusing on feedback in system integration:
>
> https://mail-index.netbsd.org/tech-userlevel/2021/01/11/msg012807.html
>
> (main discussion after all the noise starts here:
> https://mail-index.netbsd.org/tech-userlevel/2021/01/15/msg012846.html)
>
> But I ran out of steam to continue the discussion at the time.

Is the issue gaw saw exclusive to xen first boots? Are there other
ways to end up in his situation?
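To make the "applications silently hang" point concrete: an application that prefers an error over a hang can ask getrandom(2) not to block. The sketch below assumes the GRND_NONBLOCK flag from <sys/random.h>, where getrandom() fails with EAGAIN instead of waiting; `try_random()` is a hypothetical wrapper written for this note, and what a caller should do when it returns 0 is a policy question that this thread leaves open.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/random.h>

/*
 * try_random: hypothetical wrapper, for illustration only.
 * Tries to fill buf with len random bytes without ever blocking.
 * Returns 1 on success, 0 if the kernel reports it would have to
 * block (EAGAIN), and -1 on any other failure.
 * Keep len small (256 bytes or less) so the request is not split.
 */
static int
try_random(void *buf, size_t len)
{
	ssize_t n;

	do {
		n = getrandom(buf, len, GRND_NONBLOCK);
	} while (n == -1 && errno == EINTR);

	if (n == (ssize_t)len)
		return 1;
	if (n == -1 && errno == EAGAIN)
		return 0;
	return -1;
}
```

A caller could report the 0 case to the user or retry later, which turns a silent hang into visible feedback.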
Re: regarding the changes to kernel entropy gathering
On Tue, Apr 06, 2021 at 06:24:38PM +, Koning, Paul wrote:
> > Isn't it as simple as:
> >
> > dd bs=32 if=/dev/urandom of=/dev/random
> >
> > ?
>
> That runs the risk of people thinking it adds entropy. I'd be more
> comfortable with this:
>
> dd bs=32 if=/dev/zero of=/dev/random
>
> because it makes the security implications more obvious.

Both ways are equally unclear to anyone not looking deep enough.
Your method could be read like "we start with empty state".

Martin
Re: regarding the changes to kernel entropy gathering
On Tue, Apr 06, 2021 at 03:12:45PM -0700, Greg A. Woods wrote:
> > Isn't it as simple as:
> >
> > dd bs=32 if=/dev/urandom of=/dev/random
>
> No, that still leaves the question of _when_ to run it. (And, at least
> at the moment, where to put it. /etc/rc.local?)

Of course not!

You run it once. Manually. And never again.

Martin