Re: Issues with lseek(2) on a block device
On Sat, Feb 24, 2024 at 10:54:56PM +, Taylor R Campbell wrote:
> > Date: Sat, 24 Feb 2024 16:21:42 -0500
> > From: Thor Lancelot Simon
> >
> > On Wed, Feb 21, 2024 at 09:20:55PM +, Taylor R Campbell wrote:
> > > I think this is a bug, and it would be great if stat(2) just returned
> > > the physical medium's size in st_size -- currently doing this reliably
> > > takes at least three different ioctls to handle all storage media, if
> > > I counted correctly in sbin/fsck/partutil.c's getdiskinfo.
> >
> > I am not sure this can be done for all block devices. Tapes have block
> > devices, and open-reel tape drives do not really know the length of the
> > loaded media, while on any other tape drive supporting compression, there
> > may really be no such size.
>
> Sure, it's fine if it doesn't give an answer (or, returns st_size=0)
> for devices where there's no reasonable answer.
>
> But for a medium which does have a definite size that is known up
> front, it should just be returned by stat(2) in st_size instead of
> requiring a dance of multiple different NetBSD-specific ioctls to
> guess at which one will work.

We need to be careful with the definition of "definite size".

In simple terms, we can think of examples such as:

A physical HDD has a fixed size(*), so stat should report that, whereas a
tape drive with removable media can potentially present different sizes,
so stat should return zero or indicate that situation in some other way.

But in a world of virtual machines with block devices that are dynamically
re-sizable on the host, it becomes more difficult.

Consider such a device which _could_ be resized but in practice almost
never is:

If stat returns what it thinks is the current or stable size of the device
and userland software retrieves and stores that value for future use
rather than using it immediately, then bad things will likely happen if
the real value changes later on.

On the other hand, if stat returns zero for the device because it could
possibly change size (even though it almost certainly won't), then the
userland code will not function in conjunction with that device even
though it otherwise could.

(*) Ignoring the fact that the drive firmware can often be configured to
clip the size and report a smaller disk than is physically present.

In most cases, unless we are talking about a low-level disk utility, if
userland code is trying to find out the size of a raw block device then it
seems like a design error.
Re: Issues with lseek(2) on a block device
On Thu, Feb 22, 2024 at 08:13:28AM -0500, Mouse wrote:
> >>> lseek(fd, 0, SEEK_END);
> [on a disk device]
>
> >> [...]
> > [...]
> > This is such a buggy behaviour that [...]
>
> I wouldn't call it buggy, not unless there is a spec that it's supposed
> to conform to that says otherwise (even if the "spec" is just an
> author's description of intent), which is something I so far haven't
> seen reason to think exists. It looks to me like "we didn't bother
> making it do anything in particular, so you get whatever it happens to
> give you".

In general:

If a file descriptor references anything other than a regular file, then
the assumptions that portable code can make about it are constrained.
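
As a rough illustration of that constraint, a cautious caller might only trust a size when fstat(2) says the descriptor is a regular file, and treat everything else as "unknown". This is a minimal sketch and not code from the thread; the helper name regular_file_size is hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): store the size of the
 * file behind fd in *size and return 0, but only if fd refers to a
 * regular file.  For anything else (block device, tape, pipe, ...)
 * return -1, since neither st_size nor the result of
 * lseek(fd, 0, SEEK_END) is something portable code can rely on there.
 */
int
regular_file_size(int fd, off_t *size)
{
	struct stat sb;

	if (fstat(fd, &sb) == -1)
		return -1;
	if (!S_ISREG(sb.st_mode))
		return -1;
	*size = sb.st_size;
	return 0;
}
```

A caller that gets -1 back for a device node would then have to fall back to whatever device-specific mechanism the platform offers, or simply refuse to guess.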
Re: [patch] cat -n bug from 33 years ago
On Sat, Nov 18, 2023 at 02:24:48PM -0500, Mouse wrote:
> >>> Numbering currently starts over at 1 for each input file [...]
> >> If you want to have continuing (non-restarting) numbering for
> >> multiple input files, one could use "cat file1 file2 | cat -n".
> > True, that would be a workaround.
>
> > But shouldn't the current behaviour still be fixed?
>
> First, I think, we should decide whether "fixed" is an appropriate
> word. Perhaps it's just me, but I don't consider Linux cat to be a
> reference implementation. (Indeed, I don't consider the Linux
> implementation of pretty much _anything_ to be a reference, except for
> Linux-specific things.)

Well, I said 'fixed' because it would be reverting to the way it worked in
the oldest BSD versions that I've looked at.

Generally before 4.3, the line number counter was a global variable 'lno'
and it was only initialised to 1 once at the start.

4.3-Reno seems to be where it was changed to a local variable 'line' in
cook_buf(), and that version has survived into the modern BSDs.

The lack of any mention in the manual makes me think that it wasn't an
intentional change.

I've since noticed another related issue, this time with -s, which is
supposed to replace multiple empty lines with a single one, but does not
do so across a file boundary:

$ echo "foo\n\n\n" > 1
$ echo "\n\n\nbar" > 2
$ cat -s 1 2
foo


bar

Whereas on 4.2BSD, the empty line at the end of 1 and the empty line at
the start of 2 are in fact collapsed:

$ cat -s 1 2
foo

bar

If nothing else, it's interesting that functionality can change after ten
years and then go 30 years with nobody noticing :-/.
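
To make the -s difference concrete, here is a minimal sketch of one way to get empty lines collapsed across a file boundary: keep the "previous character" and "already emitted an empty line" state outside the per-file function. It is not the 4.2BSD code nor a proposed patch; struct squeeze_state, SQUEEZE_INIT and squeeze() are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* State that must outlive a single input file (hypothetical names). */
struct squeeze_state {
	int prev;	/* last character seen, across all inputs */
	int gobble;	/* nonzero once one empty line has been emitted */
};

/* Start as if a newline had just been seen, with nothing gobbled yet. */
#define SQUEEZE_INIT { '\n', 0 }

/*
 * Copy in to out, collapsing each run of empty lines into one.
 * Because the state lives in *st rather than in locals, a run of
 * empty lines that spans the end of one file and the start of the
 * next is collapsed as well.
 */
void
squeeze(FILE *in, FILE *out, struct squeeze_state *st)
{
	int ch;

	while ((ch = getc(in)) != EOF) {
		if (st->prev == '\n') {
			if (ch == '\n') {
				if (st->gobble) {
					/* Second or later empty line: drop it. */
					st->prev = ch;
					continue;
				}
				st->gobble = 1;
			} else
				st->gobble = 0;
		}
		putc(ch, out);
		st->prev = ch;
	}
}
```

Calling squeeze() once per input file with the same state object gives the single-empty-line result; re-initialising the state before each file gives the two-empty-line result shown first.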
Re: [patch] cat -n bug from 33 years ago
On Sat, Nov 18, 2023 at 05:59:15PM +0100, Rhialto wrote:
> On Wed 15 Nov 2023 at 08:04:42 -0300, Crystal Kolipe wrote:
> > The attached patch fixes a bug in /bin/cat when using -n with multiple input
> > files. This bug seems to have been introduced in 4.3BSD-Reno.
> >
> > Numbering currently starts over at 1 for each input file, here is a simple
> > reproducer:
>
> If you want to have continuing (non-restarting) numbering for multiple
> input files, one could use "cat file1 file2 | cat -n".

True, that would be a workaround.

But shouldn't the current behaviour still be fixed?

The restarting for each file has never been mentioned in the manual as a
feature, and it isn't what most people would expect.
[patch] cat -n bug from 33 years ago
Hi,

The attached patch fixes a bug in /bin/cat when using -n with multiple input
files. This bug seems to have been introduced in 4.3BSD-Reno.

Numbering currently starts over at 1 for each input file, here is a simple
reproducer:

$ echo "foo\nbar" > /tmp/1
$ echo "foobar\nbar\nbaz" > /tmp/2
$ cat -n /tmp/1 /tmp/2
     1	foo
     2	bar
     1	foobar
     2	bar
     3	baz

Historic BSD behaviour (confirmed in at least 2.9-BSD, 4.1c-BSD, 4.2BSD,
and 4.3BSD) and current GNU coreutils behaviour is to number the lines of
the output as a single entity.

For example, with GNU cat:

$ gcat -n /tmp/1 /tmp/2
     1	foo
     2	bar
     3	foobar
     4	bar
     5	baz

Here is a proposed patch to fix it, by making 'line' a static:

--- cat_netbsd.c.dist	Wed Nov 15 07:43:09 2023
+++ cat_netbsd.c	Wed Nov 15 07:46:48 2023
@@ -170,9 +170,10 @@
 void
 cook_buf(FILE *fp)
 {
-	int ch, gobble, line, prev;
+	int ch, gobble, prev;
+	static int line;
 
-	line = gobble = 0;
+	gobble = 0;
 	for (prev = '\n'; (ch = getc(fp)) != EOF; prev = ch) {
 		if (prev == '\n') {
 			if (sflag) {
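
For anyone who wants to see the effect of the static in isolation, the following is a stripped-down sketch of the same idea and not the real cook_buf(): it only handles numbering, and the function name number_lines and the explicit output stream are assumptions made for illustration.

```c
#include <stdio.h>

/*
 * Simplified stand-in for cook_buf() that only does -n style
 * numbering.  The counter is static, so it keeps its value between
 * calls and the second file continues where the first one stopped.
 */
void
number_lines(FILE *in, FILE *out)
{
	static int line;	/* zero-initialised once, never reset */
	int ch, prev;

	for (prev = '\n'; (ch = getc(in)) != EOF; prev = ch) {
		/* A new line starts right after every newline. */
		if (prev == '\n')
			fprintf(out, "%6d\t", ++line);
		putc(ch, out);
	}
}
```

Calling number_lines() once for /tmp/1 and once for /tmp/2 from the reproducer would number the five lines 1 through 5. With an ordinary local counter reset to 0 on entry, the second call would start again at 1.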