Re: Concurrent trie-hash map
> On Oct 16, 2018, at 1:00 PM, Mindaugas Rasiukevicius wrote: > > Hi, > > I recently implemented thmap [1] -- a concurrent trie-hash map, combining > the elements of hashing and radix trie. It is supports lock-free lookups > and concurrent inserts/deletes. It is designed to be optimal as a general > purpose *concurrent* associative array [2][3]. I think this is a great thing to add to kern/ as a general facility. There are applications beyond just the network stack, as well -- any read-mostly hash table that has a lock or an array-of-locks around it is a good candidate for being replaced by this. We already have some objects in the kernel that use the "get-ref" / "put-ref" pattern for indicating they are being referenced; this is ideal for hooking in your G/C mechanism. We should probably, as a separate exercise, extend that pattern to most of the objects managed in our kernel (even if the put-ref is a no-op). -- thorpej
Re: libnv
> On Aug 27, 2018, at 7:49 AM, Mindaugas Rasiukevicius wrote: > I do not think there was any conclusion either, or a conclusion that > we always want to have static serialisation with a schema (IMO, there > are pros and cons in both cases). Worth noting that there are multiple > libraries already: we already imported libnvpair (with tons of other > code for ZFS and DTrace) and nobody complained; we still have XDR. XDR has a separate reason for existing -- it is enshrined in wire protocols. > While libnv is not a panacea, it solves some actual problems: it has > intuitive reference counting (symmetric API), it supports accumulated > error handling, it provides more efficient binary encoding, it does not > have shared structures with internal locking, it does not have awkward > API. As a side note, it also makes code sharing with FreeBSD (as well > as illumos) much easier. Meanwhile, there has been little action on > improving proplib or proposing a tangible replacement of it. Let me try to address these points one by one: ==> It has intuitive reference counting (symmetric API) I just sent you a separate mail on this subject. Actually, the libprop reference counting rules are very simple: "If you create an object, you implicitly own the reference and you are responsible for disposing of it. In all other cases, references must be explicitly retained." That's it. It's really not that complicated. Containers (arrays and dictionaries) retain their own references explicitly (because they want the object the live on not matter what the caller does with its own reference). When you "get" an object from the container, you did not create it, and thus need to explicitly retain your own (that you, of course, are responsible for disposing of later). Please explain to me how this referencing counting rule is non-intuitive. ==> It supports accumulated error handling I personally find APIs like this kind of annoying, but if you find it valuable, it could be added to libprop with a utility wrapper API. ==> It provides more efficient binary encoding I'll grant this one, but with a caveat -- I always intended to to add the "binary plist" serialization format to libprop, but never got around to it. It would require a refactor of the serialization stuff that's already in libprop, but this is a solvable problem and have been thinking of doing it soonish anyway because of some other ideas I've been toying with in my head. ==> It does not have shared structures with internal locking I'll grant that this was probably a mistake initially with dictionaries and arrays. Serialization for these should be handled at the caller's level, because in many instances you would need that anyway if you expect multiple threads to be accessing the container (to provide correct semantics at the caller's layer). So, as such, it would probably be worth having a discussion about whether removing the internal locking would be a safe thing to do. ==> It does not have an awkward API The individual object APIs themselves can be kind of awkward to use, but that was the price to pay for flexibility. There are already several utility APIs that provide a simpler interface, and those could be extended. > > Anyway, I am not proposing that everybody should use libnv. However, > it is is quite lightweight and self-contained. I am already using libnv > in the upstream NPF. The conversion to libnv made the code simpler, > shorter and easier to read. And I am not planning to maintain proplib > version: it causes too much headache and the differences are too vast. > > -- > Mindaugas -- thorpej
Re: libnv
> On Aug 27, 2018, at 2:25 PM, David Holland wrote: > > On Mon, Aug 27, 2018 at 08:41:43AM -0700, Jason Thorpe wrote: >>> [proplib] >> >> Let me try to address these points one by one: > > Now that you're back, can you explain a few more basic points about > proplib? I'll do my best to remember :-) > (a) What's its mission? Is it supposed to be a data *storage* library, > that is, the data lives in proplib and you do proplib accesses to > extract the bits of it you want as you need them? Or is it supposed to > be a data *transfer* library, that is, the data lives somewhere else > and the expectation is that when you receive a proplib bundle that you > immediately unpack it into its own data structures? (Note that saving > to disk and loading back later is roughly comparable to sending and > receiving among entities -- that's not what this question is about.) It can be either. Consider the case of properties on device_t's ... it's the storage medium for that information. Or consider the case of a configuration file ... this isn't really the same thing as "saving state out to disk and reading it back in later", it's one application serializing the information in a standard way to be consumed by another ... and how the consumer deals with it is really up to them. Or consider the case of wanting to collect some information from the kernel for consumption by userland (for display purposes, or whatever) ... but you want to isolate the receiver from the implementation details of the kernel sender. Some additional utility functions could make it easier to use in various situations, but it's intent is to be "useful wherever" > (b) What's its data model? Is it supposed to handle hierarchical > tree-structured data of the kind you might find in an ordinary > programming language's native data types? Or is it supposed to support > graph-structured data where many locations can share references to > common values or sub-objects? The former is how it's generally used, but it shares references to common sub-objects as a memory saving measure (either by assuming you use the API that manipulates the objects directly and do this yourself, or by de-duping certain immutable objects, as with prop_number_t). There is not attempt to detect cycles, however, so y'know, exercise caution. > (c) And also, what's its data model? What was the criterion used to > determine which atom types it supports? The data types supported by Apple XML property lists, with the following caveats: -- No attempt to support or . There is no reason those could not be added, but their use in standalone and kernel environments is not feasible; they could certainly be excluded based on #ifdef... -- I serialized unsigned numbers as hex (and assume that hex numbers that are read in are unsigned), and serialize signed as decimal. I believe macOS will read both decimal and hex, but always writes out in decimal. I think. I should double-check. I captured bool as a separate object type, as in CoreFoundation (CFBoolean), rather than in the number object as in Foundation (NSNumber). -- thorpej
Re: ATF t_mlock() babylon5 kernel panics
> On Mar 12, 2019, at 9:09 PM, Robert Elz wrote: > > The first issue I noticed, is that t_mlock() apparently belives > the malloc(3) man page, which states: > > The malloc() function allocates size bytes of uninitialized memory. The > allocated space is suitably aligned (after possible pointer coercion) for > storage of any type of object. > > and in particular, those last few words. The "any type of object" that > t_mlock wants to store is a "page" - that is a hardware page. The test employs a bogus understanding of how malloc() is specified. On x86, malloc() should return memory that is 16-byte aligned because that is the maximum alignment requirement of the fundamental types used by the compiler. > It obtains > the size of that using: > > page = sysconf(_SC_PAGESIZE); > > and then does > >buf = malloc(page); > > and if buf is not NULL (which it does check) assumes that it now > has a correctly page aligned page sized block of memory, in which > it can run mlock() related tests. > > Something tells me that the "any type of object" does not include this > one, and that t_mlock should be using posix_memalign() instead to allocate > its page, so it can specify that it needs a page aligned page. Correct. Or mmap() (which always returns page-aligned pointers). > Again, I am not proposing fixing the test until the kernel issues > are corrected, but it would be good for someone who knows what alignment > malloc() really promises to return (across all NetBSD architectures) > to rewrite the man page to say something more specific than "any type of > object" ! I've also seen the term "fundamental object" used. One has to remember that malloc() is specified by the C standard, and C has no notion of "pages" or any other such silliness that we Unix people assume are fundamental :-) > NetBSD's mlock() rounds down, so regardless of the alignment of the > space allocated, the mlock() tests should be working (the locking might > not be exactly what the test is expecting, but all it is doing is keeping > pages locked in memory - which pages exactly this test does not really > care). POSIX specifically states that mlock() //may// require that the address is page-aligned ... Our implementation does not require this: /* * align the address to a page boundary and adjust the size accordingly */ pageoff = (addr & PAGE_MASK); addr -= pageoff; size += pageoff; size = (vsize_t)round_page(size); That is to say, the intent of our implementation is to wire the page where the range begins through the page where the range ends. Note that internally, UVM represents all ranges as [start, start+size) (assuming start and size are page aligned / rounded). > On my test setup, the kernel did not panic. It does however experience > some other weirdness, some of which is also apparent in the bablylon5 > tests, and others which might be. > > My test system is an amd64 XEN DomU - configired with no swap space, and > just 1GB RAM. It typically has a tmpfs mounted limitted to 1/2 GB > (actually slightly more than that - not sure where I got the number from, > there may have been a typo... the -s param is -s=537255936 in fstab. > That oddity should be irrelevant. > > The first thing I noticed was that when I run the t_mlock test in this > environment, it ends up failing when /tmp has run out of space. And I > mean really run out of space, in that it is gone forever, and nothing I > have thought of so far to try gets any of that space back again. And there are no files there? Even an open-unliked file should disappear when the offending process exits. > I assume that jemalloc() (aka malloc() in the test) is doing some kind > of mmap() that is backed by space on /tmp and continually grabbing more > until it eventually runs out, and that the kernel never releases that > space (even after the program that mapped it has exited). That seems > sub-optimal, and needs fixing in the kernel, anonymous mmap's (or whatever > kind jemalloc() is doing) need to be released when there are no more > processes that can possibly use them. Well, note that tmpfs also uses anonymous memory. Is it that "df" on the tmpfs is really showing a bunch of space allocated to the tmpfs? > I did not try umount -f (easier to just reboot...) but a regular umount > failed (EBUSY) even though there was nothing visibly using anything on > /tmp (and I killed every possible program, leaving only init - and yes, > that did include the console shell I use to test things). > > Umounting the tmpfs before running the t_mlock test worked fine (which also > illustrates that none of the very few daemon processes, nor the shell, etc, > from my login, are just happening to be using /tmp - and that it is the > results of the malloc() calls from t_mlock that must be the culprit. > (While ATF is running, it would be using /tmp as both its working > directory,
Re: Importing aiomixer to src
> On May 6, 2021, at 6:44 AM, nia wrote: > > I think this version is suitable for importing into NetBSD base > for continued development. This will also allow us to easily > keep it in sync if there are any changes to the audio stack. This is really cool, I would support this! -- thorpej
Re: draft mcount(3) man page
> On May 21, 2021, at 5:06 PM, David Holland wrote: > > The workings of the various *mcount functions are confusing and > currently undocumented. After wading into it just now for I think the > third time recently, I wrote the following to ease the process. This looks awesome, David, thanks for writing it! Might it be worth noting some of the historical legacy symbol naming issues that were are the root of the bug report we were just talking about a little while ago? > > Comments? > > -- > .\" > .\" Copyright (c) 2021 The NetBSD Foundation, Inc. > .\" All rights reserved. > .\" > .\" This code is derived from software contributed to The NetBSD Foundation > .\" by David A. Holland. > .\" > .\" Redistribution and use in source and binary forms, with or without > .\" modification, are permitted provided that the following conditions > .\" are met: > .\" 1. Redistributions of source code must retain the above copyright > .\"notice, this list of conditions and the following disclaimer. > .\" 2. Redistributions in binary form must reproduce the above copyright > .\"notice, this list of conditions and the following disclaimer in the > .\"documentation and/or other materials provided with the distribution. > .\" > .\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS > .\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > LIMITED > .\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR > .\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS > .\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR > .\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF > .\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS > .\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN > .\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) > .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE > .\" POSSIBILITY OF SUCH DAMAGE. > .\" > .Dd May 21, 2021 > .Dt MCOUNT 3 > .Os > .Sh NAME > .Nm mcount , > .Nm _mcount , > .Nm __mcount > .Nd profiling hooks > .Sh LIBRARY > .Lb libc > .Sh SYNOPSIS > .Fn _mcount "/* asmlinkage */" > .Fn __mcount "u_long frompc" "u_long selfpc" > .Sh DESCRIPTION > The > .Fl pg > option to > .Xr cc 1 > causes the compiler to insert calls to a profiling hook > at the entry to each function it compiles. > This allows code in the standard library to collect call graph > information. > .Pp > The compiler inserts a call to > .Fn _mcount , > typically written in assembler, which is responsible for saving > registers and calling > .Fn __mcount . > .Pp > Note however that on some targets, e.g. vax, these names are > reversed. > The name of what this manual page refers to as > .Fn _mcount > is determined by the compiler, and the name of what this manual > page refers to as > .Fn __mcount > is determined by the macro > .Dv MCOUNT_DECL > in > .In machine/profile.h > (q.v.). > .Pp > The > .Ar frompc > argument to > .Nm __mcount > is the address the call came from, that is, the original return > address; the > .Ar selfpc > argument is the address of the callee, normally expressed as the > return address of the call to > .Nm _mcount . > Note that > .Nm _mcount > itself does not have arguments in the normal sense; it receives stack > and register contents consistent with the call location inserted by > the compiler, which is usually at the beginning of the original > function but may vary by target. > It is responsibile for extracting the arguments to > .Nm __mcount > from that state, in addition to preserving register contents as > needed. > .Pp > The code for > .Nm _mcount > is on some machines contained in the macro > .Dv MCOUNT > in > .In machine/profile.h , > and on others in its own source file. > In the former case, > .Nm __mcount > can be and often is declared static. > .Sh RETURN VALUES > None. > .Sh RESTRICTIONS > Both > .Nm _mcount > and > .Nm __mcount > are internal interfaces and must not be called directly by application > code. > The symbols are not even guaranteed to exist unless the > .Fl pg > option was given to > .Xr cc 1 . > This manual page is provided to document the internal workings for > developers who need to adjust them. > -- > > -- > David A. Holland > dholl...@netbsd.org -- thorpej
Re: fuse_opt.h
> On Jun 3, 2021, at 7:19 AM, Emmanuel Dreyfus wrote: > > Is there any reason not to fix it? Seems like it would be a bad idea NOT to fix it. I assume one of the goals of librefuse was to be ABI compatible? -- thorpej
Re: CVS commit: src/share/misc
> On Mar 30, 2021, at 8:03 PM, Christos Zoulas wrote: > > There are 3 x 'sizeof(' in the tree compared to 'sizeof ' in '*.c' and > I am counting 'sizeof (' as 'sizeof ': > > 191337 'sizeof(' > 63508 'sizeof ' > > I think that it is better to bless the prevailing majority as the rule, > but we should let others express their opinions first. "sizeof(...)" is definitely my preference, in no small part for symmetry with "offsetof(...)" (or, at least, the historical definition thereof before the compiler had a built-in...) -- thorpej
Re: proposal: remove traditional C support from lint
> On Mar 16, 2021, at 4:55 PM, Roland Illig wrote: > > Any objections to removing the -t flag and everything that belongs to it? No objections from me. -- thorpej
Re: Q about pthread condition variable usage - mixing interlocks
> On Apr 8, 2021, at 11:15 AM, Mouse wrote: > > I'm curious: why do you care? If, of course, it's anything you can > talk about. I have a set of futex-based synchronization objects for libpthread I’ve been hacking on off-and-on for a while. -- thorpej
Re: Q about pthread condition variable usage - mixing interlocks
> On Apr 8, 2021, at 11:15 AM, Mouse wrote: > >> Is there ANY situation where, for a given pthread condition variable, that u$ > > For what value of "legitimate"? > > I see no reason why this would/should be forbidden, and, indeed, at > least the 5.2 manpage for pthread_cond_wait seems to imply that it is > permitted, provided the condvar is idle: Ah, yes, I should clarify, I am speaking about only when the condition variable is NOT idle. And now that I read IEEE Std 1003.1-2017 again, I see: “”” During this time, the effect of an attempt by any thread to wait on that condition variable using a different mutex is undefined. “”” I.e. I was wrong that there was language missing from the spec. I’ll save myself some work and, rather than attempt to gracefully handle this sort of thing, I’ll just return EPERM when the situation is detected. Thanks, folks. -- thorpej
Q about pthread condition variable usage - mixing interlocks
As far as I can tell, POSIX has no language that covers this situation, so I’m tossing it out here for the hive mind... Is there ANY situation where, for a given pthread condition variable, that using a different mutex as the interlock in two different calls to pthread_cond_wait() would EVER be legitimate usage? I cannot come up with any scenario in which this would be appropriate, but who knows what some crazy applications might do. POSIX for whatever dumb reason allows pthread_cond_signal() / pthread_cond_broadcast() without holding the interlock, but at least there’s explicit language about this in the spec (and thankfully the language contains a mild finger-wag about it). But as I said, the spec is silent on this particular question, and hopefully it’s because the answer is “NO THAT IS A COMPLETELY STUPID THING TO DO AND ANY SELF RESPECTING THREAD LIBRARY WILL IMMEDIATELY CAST YOUR APPLICATION INTO THE NINTH CIRCLE OF HELL IF YOU TRY IT”. What say you, smart people? Merci d’avance. -- thorpej
Native timerfd and eventfd
[Cross-posted to tech-userlevel because this is primarily about an API exposed to user-space.] This past winter, I wrote NetBSD native implementations of the Linux timerfd[1] and eventfd[2] APIs, mainly to fill out gaps in COMPAT_LINUX. Of course, I wanted to write ATF unit tests for them, so I needed them to be native syscalls, too. The code to implement these is quite small, and the APIs are not unreasonable (eventfd is certainly less cumbersome to use for its intended purpose than a pair of pipes, and timerfd is a much better interface than EVFILT_TIMER and essentially serves as a descriptor-oriented interface to the regular POSIX timer support). Because of this, I think they should be included as native syscalls. If the consensus is that these are reasonable things to add as native syscalls, I’ll whip up man pages for them and do that. Otherwise, I’ll just relegate them to COMPAT_LINUX and toss the ATF unit tests (I see no mechanism to run ATF unit tests against COMPAT_* syscalls, and I’m not particularly inclined to invent one). [1] https://man7.org/linux/man-pages/man2/timerfd_create.2.html [2] https://man7.org/linux/man-pages/man2/eventfd.2.html -- thorpej
Re: Anyone still using YP support on NetBSD?
> On Sep 28, 2021, at 7:35 AM, Izumi Tsutsui wrote: > > FYI, > https://mail-index.netbsd.org/netbsd-bugs/2018/09/24/msg059139.html I guess old habits die hard :-) Ok, well, then I won’t ask if it should be G/C’d :-) -- thorpej
Anyone still using YP support on NetBSD?
Anyone? -- thorpej
Re: Libc support for old BSD-style "sigcontext" signal handlers is broken, let's just kill it.
> On Oct 26, 2021, at 8:27 AM, Valery Ushakov wrote: > > I'm not sure why we left the sigcontext version in the tree. I would > guess for reference only, so it doesn't really need any fixing, as far > as I understand. Presumably it was to keep old binaries that used the sigcontext style would keep working. If it was for reference only, one would think it would have been commented as such and the binary objects not actually built. The commit that broke it had the log message “make these consistent”, which doesn’t really convey to me that it was intended to disable them. -- thorpej
Re: Libc support for old BSD-style "sigcontext" signal handlers is broken, let's just kill it.
> On Oct 26, 2021, at 6:56 AM, Jason Thorpe wrote: > > Obviously, the practical impact of this is nil, since no one apparently > noticed (and I guess we didn’t break any programs that people were using). We > haven’t documented that style of handler for a VERY long time but presumably > the intent was for it to continue to work because the code remained and was > modified several times over the years. (it was certainly my original intent > when I first added support for multiple signal trampolines in a single > process ~20 years ago, and it did in fact work for the first few years that > code existed). Oh, I suppose I should also point out that kernel support for the older trampoline is in the COMPAT_16 module (the new “siginfo” stuff was added in NetBSD 2.0). -- thorpej
Libc support for old BSD-style "sigcontext" signal handlers is broken, let's just kill it.
Hey folks — I was looking into the request for a libc function that determined if a program counter was inside a signal trampoline (see https://mail-index.netbsd.org/tech-kern/2021/10/15/msg027703.html), and the first thing I decided to do as part of this was unify all of the __libc_sigaction14() functions … each architecture has its own copy, and they’re mostly identical, except for in their support (or not) for old-style “sigcontext” handlers and their trampoline ABI versions (VAX had actually rev’d its sigcontext trampoline ABI before “siginfo” support was added). Anyway, a few defines in with the ability to override values in will do the trick. However, as I was adding an #ifdef conditional to the sigcontext handling in the new unified __sigaction14_sigtramp.c (because new architectures added after “siginfo” support was added never really needed it, and didn’t define a “struct sigcontext”), I realized that in 2006, support for sigcontext handling was broken on platforms that previously supported it (each copy gained a __LIBC12_SOURCE__ conditional around the sigcontext support, but nothing ever defined __LIBC12_SOURCE__ for those files). I went ahead and listed it out by architecture in my own notes, but the TL;DR is that “sigcontext” style signal handlers don’t work on any architecture. For those not familiar, there are 3 basic types of signal handlers: 1. void handler(int sig); // simple traditional handler 2. void handler(int sig, int code, struct sigcontext *scp); // sigcontext handler 3. void handler(int sig, struct siginfo *info, ucontext_t *ctx); // siginfo handler Originally, the code in libc to select the correct trampoline for each type selected the “siginfo” trampoline if SA_SIGINFO was set in the sigaction, otherwise it used the “sigcontext” trampoline. The trampoline is responsible to restoring the previous program flow, and it needs to know what format the saved registers are in to do so, which is why there are two types. Since Feb 2006 (with one minor exception; don’t worry, it was broken a few months later), the “siginfo” trampoline has always been chosen, meaning that any old program dynamically linked against libc that uses style “2” above would get garbage in the “code” and “scp” arguments. Obviously, the practical impact of this is nil, since no one apparently noticed (and I guess we didn’t break any programs that people were using). We haven’t documented that style of handler for a VERY long time but presumably the intent was for it to continue to work because the code remained and was modified several times over the years. (it was certainly my original intent when I first added support for multiple signal trampolines in a single process ~20 years ago, and it did in fact work for the first few years that code existed). At this point, I don’t see any value in going back and actually fixing sigcontext style handlers. My preference here is to simply remove that trampoline from libc. I do think kernel support should remain, because we generally are sticklers for binary compatibility, but we should completely hide the “struct sigcontext” definitions from user-space in . Thoughts? -- thorpej
Re: Libc support for old BSD-style "sigcontext" signal handlers is broken, let's just kill it.
> On Oct 26, 2021, at 11:07 AM, Robert Elz wrote: > >Date:Tue, 26 Oct 2021 18:27:20 +0300 >From:Valery Ushakov >Message-ID: > > | OTOH the old > | *binaries* (using old dynamic libc, or linked with old static libc) > | still need the kernel support. > > I think the point is that the "old dynamic libc" is the current libc, > unless an old binary is so old that it linked against libc.11 or earlier > (which means way back into the last millenium sometime). > > So not only is kernel support needed, libc (12) support is as well. Once > the fabled libc version bump happens this will all change. I mean, I could certainly fix it (and rev the symbol), just don’t know if it’s worth the bother since it’s been broken for so long. -- thorpej
Re: crypt_r()?
> On Feb 15, 2022, at 3:30 PM, Joerg Sonnenberger wrote: > > Am Wed, Feb 16, 2022 at 12:04:16AM +0100 schrieb Niclas Rosenvik: >> do you mean that the interface should be >> crypt_r(const char *key, const char setting, char * storage, size_t >> *storage_len) >> where storage can be set to NULL to return the needed storage size in >> storage_len? > > No. There are two sensible interface contracts here: > > (1) Verification only, which takes the password and the expected hash > and returns a bool. No separate settings necessary as the hash already > contains all the necessary parameters. > > (2) Hashing password, which takes the password and the settings and > returns an allocated string with the resulting hash. This is essentially > the same interface as crypt(3), but the caller is responsible for > free(3) the return value. Given that the goal of the crypt(3) interface > is to be slow, optimizing a memory allocation away saves nothing, except > making a more complicated interface. While we’re on the topic of “what we’d like the interfaces to do”… There really should be a function that takes a user name or ID and a cleartext password string, and IPCs to another (trusted system) process to do the verification, so that programs that want to verify passwords don’t need root privileges. Same for changing passwords. Then there’s no need for any of this crypt(3) tomfoolery. -- thorpej
Re: crypt_r()?
> On Feb 15, 2022, at 5:13 PM, Mouse wrote: > >> There really should be a function that takes a user name or ID and a clearte$ > > Maybe. But then you have a lot more failure modes and a lot more > possible attack surface. It would also mean that you can't check or > change passwords in single-user mode without starting the magic daemon; > that would be a substantial regression as far as user experience goes, > if nothing else. And what about checking the root password for > single-user boot with insecure console? You put the fallback logic into the function libc that can satisfy the request using the Old Way if the helper isn’t available. Obviously, to satisfy it the Old Way, the process would need to have root privileges, but this would be OK in the scenario you’re describing. > It _is_, however, very much in keeping with the "encapsulate > single-purpose code into a single place" attitude that has brought a > lot of benefits. I wonder if there isn't some better way I'm missing. It’s certainly a lot better than having a big complex program (that exposes itself to the network, potentially) require root privileges just to check passwords. -- thorpej
Re: Proposal: remove usr.bin/mkstr
> On Apr 9, 2022, at 8:49 AM, Robert Elz wrote: > Just stop suggesting removing things for no better reason than > that you see no point keeping them. If the existence of something > which seems not to be all that necessary is being a roadblock > to getting other work done, then by all means, request assistance, > from which sometimes the solution might be to simply remove the > roadblock -- other times a suitable alternative might be found. > But while it is doing no harm, leave it alone. +1 I’m definitely guilty of asking for things to be removed, but in the case of my suggestions, it’s because their presence were actual impediments to progress (e.g. keeping round the Token Ring and FDDI code, which no one had apparently used in many years, was impeding progress on network stack improvements). I’m all for keeping old stuff around. -- thorpej
postinstall(8): Add opensslcerts item to regen /etc/openssl/certs.
> On Aug 26, 2023, at 1:59 AM, Taylor R Campbell wrote: > > Module Name: src > Committed By: riastradh > Date: Sat Aug 26 05:59:00 UTC 2023 > > Modified Files: > src/usr.sbin/postinstall: postinstall.in > > Log Message: > postinstall(8): Add opensslcerts item to regen /etc/openssl/certs. > > Works only with destdir /, since it relies on running openssl(1), > which is not available as a tool or required in the cross-build > environment. Maybe there should be a boot-time check in an rc script for an out-of-date trust cache? -- thorpej
Re: installboot option for u-boot bootloader paths
> On Jul 5, 2022, at 8:28 PM, Brook Milligan wrote: > > Installboot can install U-Boot boot blocks directly into a system image. > Normally, the U-Boot files are searched for in /usr/pkg/share/u-boot, under > the expectation that most people will build them with pkgsrc. However, it is > also possible to set an environment variable to a colon-separated search path > to accommodate other situations. > > I would like to add a command line option (tentatively -u) to set the search > path and override the environment or default. > > Please see the attached patch. I would appreciate comments and improvements. Looks good to me! > Thanks a lot. Thank *you*! -- thorpej
Re: Reading I2C HID descriptor from userland
See responses further down in-line: > On Aug 20, 2022, at 8:51 AM, Emmanuel Dreyfus wrote: > > Hello > > I have a ihidev device that fails to find its HID descriptor. In order to > debug that, I try to retreive it from userspace. > > At kernel boot, I have: > ihidev1 at iic2 addr 0x5d > > I try the code at the end of this message, inspired from sys/dec/i2c/ihidev.c. > I am supposed to find the HID descriptor at address sc_hid_desc_addr, which > should come frome ACPI. > > The HID over I2C specification is at > https://docs.microsoft.com/en-us/previous-versions/windows/hardware/design/dn642101%28v=vs.85%29?redirectedfrom=MSDN > Section 10 deals with ACPI bindings. It says the HID descriptor > address is to be find in a _DSM object that "defines a structure > that contains the HID Descriptor address (2 bytes)" > > acpdump -dt gives me this: >Method (_DSM, 4, Serialized) // _DSM: Device-Specific Method >{ >If ((Arg0 == HIDG)) >{ >Return (HIDD (Arg0, Arg1, Arg2, Arg3, HID2)) >} > >Return (Buffer (One) >{ > 0x00 // . >}) >} > > How should this be readen? I found the HIDG definition earlier in the > ACPI dump, and my device has Name (HID2, One). >Name (HIDG, ToUUID ("3cdff6f7-4267-4555-ad05-b30a3d8938de") /* HID I2C > Device */) > (...) >Method (HIDD, 5, Serialized) >{ >If ((Arg0 == HIDG)) >{ >If ((Arg2 == Zero)) >{ >If ((Arg1 == One)) >{ >Return (Buffer (One) >{ > 0x03 > // . >}) >} >} > >If ((Arg2 == One)) >{ >Return (Arg4) >} >} > > Hence 1 or 3? Both return garbage. In fact I tried to brute-force the thing, > I never get anything relevant. It’s definitely not going to be 3, because the arguments being passed to the HIDD method from the _DSM are: Arg0 = 3cdff6f7-4267-4555-ad05-b30a3d8938de Arg1 = 1 Arg2 = 1 Arg3 = NULL Arg4 = One (I guess?) What does the ihidev driver think the descriptor address is? I.e. what’s the result of the _DSM evaluation? Can you provide your entire ACPI dump? -- thorpej
Re: dynamic linker change to handle multiple PT_LOAD segments
> On Jan 5, 2023, at 8:53 AM, Christos Zoulas wrote: > > Hello, > > Our dynamic linker ld_elf.so in map_object.c currently can only handle 2 > PT_LOAD segments (one for text and one for data); the kernel elf loader does > not have this limitation, it can load multiple PT_LOAD segment. The following > patch (from FreeBSD) removes this limitation from the dynamic linker. The > reason I made this patch now is that the latest binutils (2.39) for amd64 > sets a maxinum pagesize of 2Mb, and when using relocation read only binaries > (ld -z relro) this will cause the binaries to have an extra 2Mbytes for > alignment. This can be fixed by building binutils to set a separate code > segment by default (ld -z separate-code). Binutils then sets the maximum page > size to 4Kb. Setting this option creates 4 PT_LOAD segments, two for text > (r--, r-x) and two for data (r--, rw-), which also improves security. This is > the default for linux on x86 and the patch also makes it the default for > NetBSD x86. The patch also adds -z noseparate-code to the kernel builds so > that we don't need to fix the boot loaders. I am planning to commit this > soon, so please let me know if you hsve any objections. > > Best, > > christos Is this ld.elf_so change also going to be put into NetBSD 10? (Seems like it could / should be?). That way, the configuration changes for the toolchain (which you will presumable want to upstream) can match against NetBSD 10 for the new defaults. -- thorpej
Re: style(5) proposal: forbid extern in .c
> On Mar 15, 2023, at 4:23 AM, Taylor R Campbell > wrote: > > Proposal: Forbid extern declarations in .c files. > > extern declarations in .c files invite easily avoided bugs where the > definition and use have mismatched types, because the compiler doesn't > have an opportunity to check them. Fix: Always put the extern > declaration in a .h file shared by the .c file defining it and the .c > files using it. > > Pretty simple. Any objections? LGTM. Would be nice if there was a way to get the compiler to enforce this. -- thorpej
Re: style(5) proposal: forbid extern in .c
> On Mar 28, 2023, at 5:53 PM, Roland Illig wrote: > > Am 15.03.2023 um 16:40 schrieb Jason Thorpe: >> >>> On Mar 15, 2023, at 4:23 AM, Taylor R Campbell >>> wrote: >>> >>> Proposal: Forbid extern declarations in .c files. >>> >>> extern declarations in .c files invite easily avoided bugs where the >>> definition and use have mismatched types, because the compiler doesn't >>> have an opportunity to check them. Fix: Always put the extern >>> declaration in a .h file shared by the .c file defining it and the .c >>> files using it. >>> >>> Pretty simple. Any objections? >> >> LGTM. Would be nice if there was a way to get the compiler to enforce this. > > I implemented this check as a lint warning, for both objects and functions. > > Building NetBSD/amd64 results in 5054 new warnings, which makes this > warning the second-most popular, right between the 11193 'conversion may > lose accuracy' and the 4974 'argument unused in function'. This is great, thanks Roland! I suspected you would make this happen :-) -- thorpej
Re: epoll exposure
> On Jul 31, 2023, at 6:25 PM, Rin Okuyama wrote: > > epoll(2) for COMPAT_LINUX is implemented in as similar manner to > NetBSD's. Therefore, it should also have this compatible issue. > Is is still useful for real Linux applications? Or do you have a > plan to lift this limitation? When I first looked at epoll() (back when I wrote eventfd and timerfd), I looked at the FreeBSD implementation and specifically chose not to use it because it was based on kqueue, and thus has the “kq descriptors not inherited across forks” wart. The extent to which this will impact real Linux implementations … honestly I have no idea. Perhaps the BEST solution is to fix this stupid limitation in kqueue (obviously version the system call for backward BSD compatibility). -- thorpej
Re: epoll exposure
> On Jul 31, 2023, at 6:53 PM, Theodore Preduta wrote: >> >> epoll(2) for COMPAT_LINUX is implemented in as similar manner to >> NetBSD's. Therefore, it should also have this compatible issue. > > The epoll implementation is literally a direct port of the one from > FreeBSD's Linux compatibility layer, and so has the same limitations. > >> Is is still useful for real Linux applications? Or do you have a >> plan to lift this limitation? > > As for if it is still useful, yes! I've tested a bunch of Linux > binaries (admittedly mostly ones written in Go) and they all worked as > expected. So from what I can tell, the fork() limitation is not that > big of a deal in practice. I want to make clear — I’m not trying to knock on the work that’s been done here… in fact, kudos to you, Theo, for doing this! I only wanted to raise the concern about it as a *native* call… If it’s exposed natively, application developers might expect it to behave the same way, including the inherit-across-fork thing… I think if there’s a minor glitch in emulation, it’s a little easier to shrug off. When I did timerfd and eventfd, I made sure the behavioral differences between the native NetBSD implementation and Linux were limited to things that didn’t make sense on NetBSD anyway, and would only be noticed by things that are elbow-deep in weird Linux-specific sh*t for other reasons :-) Anyway, like I said, I think the best way forward is to make it possible for kq descriptors to be inherited… it’s a little tricky because of some of the wacky stuff kqueue can track, but I think NetBSD can lead on this and define a set of semantics that makes sense. -- thorpej
Re: epoll exposure
> On Jul 31, 2023, at 1:38 AM, nia wrote: > > Hey, I regret that epoll was committed without further discussion with > pkgsrc developers. We have a lot of experience with this already > (illumos/SmartOS exposes a compatibility epoll) and the situation is > not entirely great and requires lots of workarounds. What’s even worse … I think our epoll() emulation isn’t 100% compatible w/ Linux’s. It wouldn’t bother me so much if it were, but I’m pretty sure it’s not. (IIRC, epoll descriptors are inherited across fork, and if ours is emulated with kqueue under the covers, then that can’t work…) -- thorpej
Re: Trivial program size inflation
> On Jun 30, 2023, at 11:30 PM, tlaro...@polynum.com wrote: > > Le Fri, Jun 30, 2023 at 01:37:10PM -0400, Mouse a écrit : >> Based on something at work, I was looking at executable sizes. I >> eventually tried a program stripped about as far down as I could: >> >> int main(void); >> int main(void) >> { >> return(0); >> } >> >> and built it -static. size on the resulting binary: Ok, so: the-ripe-vessel:thorpej 10$ cat nop.c int main(void) { return 0; } the-ripe-vessel:thorpej 11$ cc -static nop.c the-ripe-vessel:thorpej 12$ size a.out textdata bss dec hex filename 581799 28928 2176553 2787280 2a87d0 a.out the-ripe-vessel:thorpej 13$ cc --version cc (nb1 20210411) 10.3.0 Copyright (C) 2020 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. the-ripe-vessel:thorpej 14$ Should be pretty trivial to see what’s in there… the-ripe-vessel:thorpej 15$ nm a.out | wc -l 1073 the-ripe-vessel:thorpej 16$ nm a.out | grep printf 00470176 T __vfprintf_unlocked_l 0046e96d T _snprintf 0046e9fc T _snprintf_l 00410319 T _snprintf_ss 00474706 T _vfprintf_l 0046e95b T _vsnprintf 0046e720 T _vsnprintf_l 0046d440 T _vsnprintf_ss 004370cf T je_malloc_cprintf 00437151 T je_malloc_printf 00436fcc T je_malloc_snprintf 0043704e T je_malloc_vcprintf 00434da1 T je_malloc_vsnprintf 0046e96d W snprintf 0046e9fc W snprintf_l 00410319 W snprintf_ss 004746b1 T vfprintf 00474706 W vfprintf_l 0046e95b W vsnprintf 0046e720 W vsnprintf_l 0046d440 W vsnprintf_ss the-ripe-vessel:thorpej 17$ Hmmm, who is calling stdio routines here? the-ripe-vessel:thorpej 19$ nm a.out | grep "je_" | wc -l 392 the-ripe-vessel:thorpej 20$ Oh look, the entirety of jemalloc seems to be included in the binary. WTF knows why that’s happening, but apparently it is, and jemalloc pulls in a ton of additional stuff. Someone should maybe take a look into that. -- thorpej
Re: Trivial program size inflation
> On Jul 1, 2023, at 8:51 AM, Mouse wrote: > > Why TLS?? Is there some networking going on under the hood? Thread Local Storage -- thorpej
Re: Trivial program size inflation
> On Jul 1, 2023, at 8:20 AM, tlaro...@polynum.com wrote: > > This is also what I meant by "static seems to be considered deprecated". Honestly, I find the obsession with static linking hilariously quaint. NetBSD already bends backwards to an extreme degree by ensuring that old version of *system calls* work correctly *in the kernel*. Some other systems provide the ABI contract at the libc boundary, and let the libc <-> kernel interface be fluid (keep the compatibility stuff in user-space where it belongs!) Obviously this is not feasible to do with static binaries, and we have one platform that ONLY supports static, but for the rest, I honestly think we should officially deprecate static linking in the general case (obviously it has some uses in super-constrained environments, but in those cases we are often also using crunch’d binaries as well). -- thorpej
Re: Trivial program size inflation
> On Jul 1, 2023, at 1:41 PM, Michael van Elst wrote: > > thor...@me.com (Jason Thorpe) writes: > >> Obviously this is not feasible to do with static binaries > > In the world of go, you skip libc and static binaries are the norm. Go sounds poorly designed, then. -- thorpej
Re: Trivial program size inflation
> On Jul 1, 2023, at 6:20 PM, Emmanuel Dreyfus wrote: > > On Sat, Jul 01, 2023 at 02:25:03PM +, RVP wrote: >> Not to forget the code for C++ support. And, of course even static >> binaries may call dlopen() and friends. So that dl*() and the ELF bits >> right there. > > At least in 9.3, dlopen() in a static binary does not work. Try using > a NSS module from a statically lunked binary to check that. Right, because the dynamic linker is not loaded into static binaries. For dynamic binaries, the kernel maps it as the program’s interpreter. -- thorpej
Re: NetBSD 10 and NetBSD 11...
> On May 17, 2023, at 11:47 AM, tlaro...@polynum.com wrote: > > But the fact that the advertised list of changes for 10 stops at > February 2023 is probably something the webmaster(s) should look > at: it's a bit confusing/disturbing... Those are just the major changes. There has been a focus on bug fixes and other minor improvements since then. Alas, often those don't warrant calling out in the CHANGES file. Of course, major, potentially-destabilizing changes that are trickling into -current now are not going to be pulled into netbsd-10 until they've had more bake time, and not at all if they involve changes to the ABI. -- thorpej
Re: NetBSD 10 and NetBSD 11...
> On May 17, 2023, at 11:21 AM, tlaro...@polynum.com wrote: > > I don't know on what mailing list to ask this... > > I have seen on the web site that changes to 10 have stopped in february > and that changes are now for 11... > > Does this mean that 10 will never be released and that the focus is on > 11 now? There have been a steady stream of bug fixes from the trunk being pulled into the netbsd-10 release branch. I don't have knowledge of releng@'s plans vis a vis a release date. -- thorpej
Re: Issues with lseek(2) on a block device
> On Feb 21, 2024, at 2:52 AM, Sad Clouds wrote: > > Hello, for most operating systems determining the size of a block > device can be done with: > > lseek(fd, 0, SEEK_END); On what operating systems does this do what you claim? > However, on NetBSD this does not seem to work. It doesn’t work on macOS, either: thorpej-mbp:thorpej 541$ sudo ./lseek /dev/disk4 Password: lseek(fd, 4096, SEEK_SET) = 4096 bytes lseek(fd, 0, SEEK_END)= 0 bytes thorpej-mbp:thorpej 542$ -- thorpej
Re: Issues with lseek(2) on a block device
(This should really be on tech-kern…) > On Feb 21, 2024, at 1:20 PM, Taylor R Campbell > wrote: > >> Date: Wed, 21 Feb 2024 10:52:55 + >> From: Sad Clouds >> >> Hello, for most operating systems determining the size of a block >> device can be done with: >> >> lseek(fd, 0, SEEK_END); >> >> However, on NetBSD this does not seem to work. > > Internally, this is happens for more or less the same reason that > stat(2) doesn't return the size of a block device in st_size. > > Each file system on which device nodes can reside implements its own > VOP_GETATTR (used by both lseek(2) and stat(2)), and most of them use > the _inode_ size (more or less) rather than querying the block device > that the device node represents for its physical size. Add a d_stat devsw entry point and implement spec_getattr()? Except you don’t want ALL of the results from spec_getattr(), just some of them - rather, specfs can’t possibly provide all of the info, just some of it. So the file system that owns the vnode (UFS, in this case) would need to call spec_getattr() to get the stuff that specfs can provide, then do its own stuff to overlay the usual UFS stuff over it. -- thorpej
Re: Issues with lseek(2) on a block device
> On Feb 25, 2024, at 12:29 AM, Robert Elz wrote: > > ps: as a semi-related anecdote, the first system I ever personally > installed any unix version onto (way back in the mid 70's) booted from > a reel to reel (1/2" I think) tape - and by booted, I mean ran with its > filesystems (or filesystem, everything was in one) on the tape (read only, > naturally). HP-UX install media on QIC-24 back in the HP-UX 7/8/9 days (earliest I have experience with) was also mounted as the file system. It’s quite a trip to see (well, hear, anyway) a tape drive get randomly-accessed like that. And, yah, it’s not fast. -- thorpej