bug#45648: `dd` seek/skip which way is up?
On 1/4/21 7:44 PM, Bela Lubkin wrote:
> TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document
> them as pure synonyms for 'skip' and 'seek'.

Thanks for doing all that research. It's compelling, and I think your patch (or something like it) should go in. I'll wait for a bit to hear other opinions.
bug#45648: `dd` seek/skip which way is up?
TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document them as pure synonyms for 'skip' and 'seek'.

The implementation where I encountered it was SCO OpenServer. Like Solaris, there was a distinction between 'iseek' and 'skip' ('skip' reads, 'iseek' seeks); no distinction between 'oseek' and 'seek'.

I consulted with freebsd.org/cgi/man.cgi?query=dd -- this shows that *many* OSes support these keywords. The current default display is FreeBSD 12.2, which says:

  'iseek=n Seek on the input file n blocks. This is synonymous with skip=n.'
  'oseek=n Seek on the output file n blocks. This is synonymous with seek=n.'

Identical text exists since FreeBSD 4.0 (2000-03); Darwin 5.0.1; HP-UX 11.1; NetBSD 6.0; DEC OSF/1 4.0. These are *ancient* OSes.

IRIX 6.5.30 actually documents 'seek' as 'Identical to oseek, retained for backward compatibility.', i.e. 'oseek' is the real flag in this man page's mind.

The man pages from Plan 9 & Inferno 4th edition (AT&T research OSes) document 'skip', 'iseek', 'oseek', but not 'seek' at all!

Regarding the actual implementation, being able to manually control seeking vs. actually doing useless I/O does not seem useful to me in 2021. The distinction exist(ed) for the benefit of things like tape drives, which of course do still exist. But back then, information about what was or was not seekable was poorly plumbed up from drivers to userland. Today, it should be clear whether a file (whatever its fundamental implementation is) is, or is not, seekable; `dd` should always attempt to seek if possible, slog through the corresponding I/O only if the underlying file cannot seek.

In fact, the pointed-to Open Group specification precisely supports that position:

  'skip' says, 'Skip n input blocks ... On seekable files, ... read the blocks or seek past them; on non-seekable files, ... read and ... [discard]';
  'seek' says, 'Skip n [output] blocks ... On non-seekable files, [read] existing blocks ...; on seekable files, ... seek ... or read ...'

i.e. 'do I/O if not seekable; implementer's choice if seekable'.

The Solaris page is the only one where there is a possible implication that 'oseek' is different from 'seek', but only because the 'oseek' description is vestigial. (Exact same text persists from Solaris 2.5.1 through the 11.2 pointed to above.)

Should coreutils `dd` insist that if one uses 'oseek' and the file isn't seekable, it should fail? This violates least surprise. 'iseek' and 'oseek' should seek if possible, read if not. Whereas 'skip' and 'seek' *may* seek if possible, read if not. This distinction is uninteresting since the implementation *should* take advantage of the *may*.

Both the Solaris and Open Group man pages describe 'seek' as 'Skip[s] n blocks', again showing that the words are not at all bound to a particular direction.

>Bela<

On Mon, Jan 4, 2021 at 6:06 PM Paul Eggert wrote:

> On 1/4/21 3:07 PM, Bernhard Voelker wrote:
> >> I previously encountered a `dd` implementation which also accepted
> >> 'oseek=N' and 'iseek=N', which I found far more natural and easy to
> >> remember.
> > What 'dd' implementation was this specifically?
>
> Solaris dd has iseek and oseek. However, they are not aliases for skip
> and seek. If coreutils dd were to add these features I expect we should
> do them the Solaris way, instead of making them aliases for skip and
> seek. This would take more work than the proposed patches.
>
> https://docs.oracle.com/cd/E36784_01/html/E36871/dd-1m.html
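
The "seek if possible, read if not" policy argued for in this message can be illustrated with a minimal sketch. This is not coreutils code: the function name `skip_bytes`, its return value, and the chunk size are hypothetical, and it assumes a POSIX system where `lseek` on a pipe fails with `ESPIPE`. Unlike a real `dd`, it does not check whether a seek lands past end-of-file.

```python
import errno
import os


def skip_bytes(fd: int, count: int, chunk: int = 65536) -> str:
    """Advance fd by count bytes.

    Returns "seek" or "read" to show which path was taken
    (hypothetical helper, for illustration only).
    """
    try:
        # Preferred path: reposition the descriptor without doing any I/O.
        os.lseek(fd, count, os.SEEK_CUR)
        return "seek"
    except OSError as exc:
        # ESPIPE means the descriptor is not seekable (pipe, FIFO, socket).
        if exc.errno != errno.ESPIPE:
            raise
    # Fallback path: read the bytes and throw them away.
    remaining = count
    while remaining > 0:
        data = os.read(fd, min(chunk, remaining))
        if not data:  # end of input reached before count bytes were consumed
            break
        remaining -= len(data)
    return "read"
```

With this shape the caller never has to choose between a "seeking" keyword and a "reading" keyword; the same request works on regular files and on pipes.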
bug#45648: `dd` seek/skip which way is up?
On 1/5/21 3:06 AM, Paul Eggert wrote:
> On 1/4/21 3:07 PM, Bernhard Voelker wrote:
>> What 'dd' implementation was this specifically?
>
> Solaris dd has iseek and oseek. However, they are not aliases for skip
> and seek. If coreutils dd were to add these features I expect we should
> do them the Solaris way, instead of making them aliases for skip and
> seek. This would take more work than the proposed patches.
>
> https://docs.oracle.com/cd/E36784_01/html/E36871/dd-1m.html

That would make the situation even more confusing for the user ... and more complex because such an implementation would interfere with GNU dd's seek/skip and iflag=skip_bytes and oflag=seek_bytes functionality. Doesn't sound like a good idea.

Have a nice day,
Berny
bug#45648: `dd` seek/skip which way is up?
On 1/4/21 3:07 PM, Bernhard Voelker wrote:
>> I previously encountered a `dd` implementation which also accepted
>> 'oseek=N' and 'iseek=N', which I found far more natural and easy to
>> remember.
>
> What 'dd' implementation was this specifically?

Solaris dd has iseek and oseek. However, they are not aliases for skip and seek. If coreutils dd were to add these features I expect we should do them the Solaris way, instead of making them aliases for skip and seek. This would take more work than the proposed patches.

https://docs.oracle.com/cd/E36784_01/html/E36871/dd-1m.html
bug#45648: `dd` seek/skip which way is up?
On 1/4/21 4:03 AM, Bela Lubkin wrote:
> I constantly confuse 'seek=N' and 'skip=N'. The two words have no natural
> affinity to one I/O direction or the other.

While the words 'seek' and 'skip' may not be strong enough for everyone to be clear about whether they apply on input or output - e.g. for a non-native English speaker like myself - they are well documented in usage() and more places:

  $ dd --help | grep -E ' (skip|seek)=N '
    seek=N          skip N obs-sized blocks at start of output
    skip=N          skip N ibs-sized blocks at start of input

FWIW these terms are required by POSIX:
  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/dd.html

> I previously encountered a `dd` implementation which also accepted
> 'oseek=N' and 'iseek=N', which I found far more natural and easy to
> remember.

What 'dd' implementation was this specifically?

> Here is a small patch implementing the same for coreutils `dd`.

In my opinion: if the word chosen for an option is not clear enough to distinguish from another one, then adding yet another alias would just increase confusion.

Adding options to coreutils programs has to be carefully chosen. The only reason I'd see to add such an alias would be existing behavior in one of the other major implementations.

Have a nice day,
Berny
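
The two help lines quoted in this message can be read as a small model: `skip=N` drops blocks from the front of the input, while `seek=N` leaves blocks at the front of the output untouched. The sketch below is a toy in-memory model, not how `dd` is implemented; the name `dd_model` and its parameters are hypothetical, and it assumes the default behavior without `conv=notrunc`, where anything in the old output beyond the seek offset is discarded.

```python
def dd_model(src: bytes, old_dst: bytes, ibs: int = 512, obs: int = 512,
             skip: int = 0, seek: int = 0) -> bytes:
    """Return the output contents after a dd-like copy (toy model only)."""
    # skip=N: ignore the first N ibs-sized blocks of the input.
    copied = src[skip * ibs:]
    # seek=N: keep the first N obs-sized blocks of the existing output.
    offset = seek * obs
    # If the old output is shorter than the offset, the gap reads as zeros.
    kept = old_dst[:offset].ljust(offset, b"\0")
    return kept + copied
```

For example, with one-byte blocks, `dd_model(b"abcdef", b"XYZ", ibs=1, obs=1, skip=2)` yields `b"cdef"`, whereas `seek=2` yields `b"XYabcdef"`: the same number, applied to opposite ends of the copy.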
bug#45648: `dd` seek/skip which way is up?
On Jan 03 2021, Bela Lubkin wrote:

> diff --git a/doc/coreutils.texi b/doc/coreutils.texi
> index e9dd21c4e..417857c5e 100644
> --- a/doc/coreutils.texi
> +++ b/doc/coreutils.texi
> @@ -9100,6 +9100,15 @@ Skip @var{n} @samp{obs}-byte blocks in the output
> file before copying.
>  if @samp{oflag=seek_bytes} is specified, @var{n} is interpreted
>  as a byte count rather than a block count.
>
> +@item oseek
> +@item iseek

The second @item needs to be @itemx.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."