Re: du enhancement
Dragan Simic writes: > On 2023-10-20 15:55, Arsen Arsenović wrote: >> Dragan Simic writes: >> >>> On 2023-10-20 15:18, Pádraig Brady wrote: >>>> On 20/10/2023 00:22, Rusty Duplessis wrote: >>>>> Would be nice to have an option to append a / to the end of directory >>>>> names, >>>>> so that you can distinguish between a file and directory when using -a. >>>>> Something like -F option to ls. >>>> It's a good suggestion. >>>> I generally only use du with single files / dirs, >>>> or otherwise I use a wrapper that makes dirs obvious (though coloring): >>>> http://www.pixelbeat.org/scripts/dutop >>>> I.e. the default output from du -a is hard to parse. >>> How about making the output of du(1) colored the same way as it currently is >>> in >>> ls(1)? I'd be willing to work on implementing that. >> I was about to post the same. I see no reason why du couldn't (or >> shouldn't) do that based on normal coloring logic already present in ls. >> Are there other tools that could use the treatment, too? > > On a somewhat unrelated note, I've been thinking already about adding coloring > to md5sum(1) and the related utilities, green for "OK" and red for any errors. That'd be nice. Presumably, 'success' and 'failure' can also be added to LS_COLORS? -- Arsen Arsenović
Re: du enhancement
Dragan Simic writes: > On 2023-10-20 15:18, Pádraig Brady wrote: >> On 20/10/2023 00:22, Rusty Duplessis wrote: >>> Would be nice to have an option to append a / to the end of directory names, >>> so that you can distinguish between a file and directory when using -a. >>> Something like -F option to ls. >> It's a good suggestion. >> I generally only use du with single files / dirs, >> or otherwise I use a wrapper that makes dirs obvious (though coloring): >> http://www.pixelbeat.org/scripts/dutop >> I.e. the default output from du -a is hard to parse. > > How about making the output of du(1) colored the same way as it currently is > in > ls(1)? I'd be willing to work on implementing that. I was about to post the same. I see no reason why du couldn't (or shouldn't) do that based on normal coloring logic already present in ls. Are there other tools that could use the treatment, too? TIA, have a lovely day :-) -- Arsen Arsenović signature.asc Description: PGP signature
Re: coreutils/man/rm.x - fails to mention POSIX "Refuse to remove path/. and path/.., as well as `.' and `..'
Dragan Simic writes: [...snip...] >> Here's an example session with which you can find, for instance, >> warn_unused_result in the GCC manual: >> $ info gcc >> i warn_unused_result RET >> This is something you couldn't do with a man-page and a pager, as >> searching for 'warn_unused_result' will produce false matches. This is >> far worse with something like '-g' for obvious reasons. To reach the -g >> flag in the GCC manual, you can: >> i g RET >> which will immediately bring you to it. In case it does not, you can >> continue the index search using ','. > > Wow, this is AMAZINGLY GOOD, thank you very much! :) It reminds me of using > vim's built-in help, which is very enjoyable. Even works as expected in > index searches in GNU info, which is awesome! > >> When opening info, you should be able to hit 'h' to get access to a >> walkthrough on how to use info. Unfortunately, this might be obscured >> on your system, as many distributors package that manual as part of >> Emacs rather than Texinfo (which is, again, due to the standalone viewer >> suffering by trying to emulate Emacs, and so, trying to reuse its >> manual). My TODO list has filing bugs with a bunch of distributors to >> fix this, but so far I have not tended to that. > > Thank you once again, I'll start using info for sure! I'll also recommend it > to other people. Glad to be of assistance! Indices are certainly quite useful, and are the primary motivation behind my insistence on Info's continued usefulness (and the primary driver behind my will to further try and improve the surrounding ecosystem). Cross references and the ability to structure manuals in a tree rather than as a collection of flat pages are also up there. Have a lovely day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: Feature Request: Removal of '-f' Short Option in 'rm' Command
Owen Chia writes: > Dear GNU Coreutils Team, > > I hope this email finds you well. I am writing to propose a feature request > regarding the 'rm' command in GNU Coreutils. Specifically, I would like to > suggest the removal of the '-f' short option, while retaining the '--force' > long option. > > The rationale behind this proposal stems from the observation that the '-f' > short option has led to numerous accidental deletions due to its ease of use > and associated muscle memory. By removing the short option and encouraging the > use of the longer '--force' option, we can mitigate such unintended > consequences and promote safer usage of the 'rm' command. > > I believe that requiring users to type the full long option '--force' each > time > they intend to forcefully remove files or directories would serve as an > effective safeguard against inadvertent deletions. The additional effort > involved in typing the longer option will create a deliberate pause, allowing > users to consciously confirm their intent before proceeding with potentially > irreversible actions. > > By implementing this change, we can prevent accidental data loss and promote a > culture of cautious file management practices. The modification aligns with > the > principles of user-friendly design and prioritizes the safety and integrity of > user data. > > I kindly request the GNU Coreutils Team to consider this proposal for the 'rm' > command. I understand that any changes to a widely used utility like Coreutils > require careful consideration and testing. I am available to provide further > insights, conduct additional testing, or assist in any way necessary to > support > the implementation of this feature. The idea is nice, and understandable, but POSIX requires '-f', so it's a non-starter. Have a lovely day. > Thank you for your attention to this matter, and I look forward to your > response. Your efforts in maintaining and improving the GNU Coreutils project > are greatly appreciated. 
> > Sincerely, > Owen Chia > aptx...@gmail.org -- Arsen Arsenović signature.asc Description: PGP signature
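Since POSIX rules out dropping '-f', users who want that deliberate pause can opt into one locally. A sketch using GNU rm's own -I option, as a shell alias — a local configuration change, not a change to coreutils:

```shell
# Opt-in safety net: GNU rm's -I prompts once before removing more than
# three files or before removing recursively, restoring a deliberate
# pause without prompting for every single file the way -i does.
alias rm='rm -I'
```

Note the alias only affects interactive shell use; scripts that invoke rm directly are unaffected, which is usually what you want.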
Re: coreutils/man/rm.x - fails to mention POSIX "Refuse to remove path/. and path/.., as well as `.' and `..'
Dragan Simic writes: > On 2023-09-25 12:58, Rob Landley wrote: >> On 9/24/23 01:37, James Feeney via GNU coreutils General Discussion wrote: >>> Sorry, that was probably a bit harsh. >> No, people used to regularly boggle at why info still exists: >> https://www.reddit.com/r/gnu/comments/240mle/why_does_gnu_cling_to_info/ >> https://unix.stackexchange.com/questions/77514/what-is-gnu-info-for >> https://unix.stackexchange.com/questions/159859/why-didnt-gnu-info-succeed-man >> https://www.reddit.com/r/linuxadmin/comments/27dxrr/does_anybody_use_gnu_info/ >> These days, info seems so dead nobody talks about it at all anymore. Not at all. It is, unfortunately, far more structured and usable than man-pages. The standalone viewer just happens to suffer from emulating Emacs. Info is not perfect (in fact, I consider the on-disk format rather terrible), but it has source material which can do better, unlike pages written in roff. See my other emails which cover how Info can be made far more accessible. > Here's a brief insight into what happened about 15-20 years ago when I tried > using GNU info for the first time... I failed to see how is it supposed to be > used, and how the actual information is to be reached, after trying that for > ~10 minutes or so, maybe even a few times, IIRC. Mind you, I _wanted_ to use > info, and I did learn to use vim beforehand, which seems to be a posterboy for > hard to use utilities. Vi(m) being hard to use is somewhere between an urban legend and an inside joke. It's a ubiquitous tool with a very slight learning curve. Here's an example session with which you can find, for instance, warn_unused_result in the GCC manual: $ info gcc i warn_unused_result RET This is something you couldn't do with a man-page and a pager, as searching for 'warn_unused_result' will produce false matches. This is far worse with something like '-g' for obvious reasons. 
To reach the -g flag in the GCC manual, you can: i g RET which will immediately bring you to it. In case it does not, you can continue the index search using ','. When opening info, you should be able to hit 'h' to get access to a walkthrough on how to use info. Unfortunately, this might be obscured on your system, as many distributors package that manual as part of Emacs rather than Texinfo (which is, again, due to the standalone viewer suffering by trying to emulate Emacs, and so, trying to reuse its manual). My TODO list has filing bugs with a bunch of distributors to fix this, but so far I have not tended to that. >> Here's a patch I used to apply to binutils 11 years ago: >> https://github.com/landley/aboriginal/blob/master/sources/patches/binutils-screwinfo.patch >> Rob -- Arsen Arsenović signature.asc Description: PGP signature
Re: man pages & info prefer HTML format
Dragan Simic writes: > On 2023-09-24 15:47, Bernhard Voelker wrote: >> On 9/24/23 14:37, Dennis German wrote: >>> After the years and fine tuning of basic HTML, why aren't the man pages >>> standardized to HTML format? >>> Perhaps some users don't frequently enough reference man pages as they >>> should and fewer use info , but (nearly) everyone uses a browser. >> Sorry, I don't get the point. >> HTML might be nice to read, but it's cumbersome to write. >> Authors do not want to care about formatting too much. >> Therefore, the documentation - both the man pages and the Texinfo manual - >> is maintained in formats which are easy to write and track. > > Good point. Another popular format, markdown, just confirms that the ease of > writing is very important, together with the ability to view the source as-is, > with no rendering applied, and to still be able to read it. Sadly, Markdown takes the opposite extreme, where it lacks basic features like indexing or cross-references, or footnotes, ... Things like reST and AsciiDoc do better in that regard. >> Besides viewing them with their principal reader (man, info, pinfo, ..), >> those formats allow the conversion to several other nice formats ... >> among them HTML. >> The Texinfo manual is rendered into various formats: HTML, PDF, DVI, ASCII. >> https://www.gnu.org/software/coreutils/manual/ >> And also the man pages are available online in HTML format: >> https://man7.org/linux/man-pages/ > > ... and on many other web sites. > >> Have a nice day, >> Berny -- Arsen Arsenović signature.asc Description: PGP signature
Re: coreutils/man/rm.x - fails to mention POSIX "Refuse to remove path/. and path/.., as well as `.' and `..'
James Feeney writes: > On Sun, 2023-09-24 at 02:49 +0200, Arsen Arsenović wrote: >> >> Many standards come and go. >> >> Note that I agree that a better info viewer (and a better info on-disk >> format) are necessary, but groff -Tutf8 -mtty-char | less -R is not >> better. It lacks the ability to navigate or reflow (the latter '.info' >> also lacks today, unfortunately). >> >> The solution to this is not to downgrade to man-pages, but to make the >> 'info' format better (the source material, i.e. the .texi, for that is >> already there) and to provide a better browser. ... > > Wikipedia tells us that "The Unix Programmer's Manual was first published on > November 3, 1971. The first actual man pages were written by Dennis Ritchie > and > Ken Thompson at the insistence of their manager Doug McIlroy in 1971." > > GNU Info is a de facto documentation standard for UNIX-like operating systems > in the same way that GNU Hurd is a de facto kernel standard for UNIX-like > operating systems - which is to say, not at all. Forcing the user to "jump > through hoops" - "Full documentation ... available locally via: info ..." - to > gain a reasonable overview of the coreutils system commands is little more > than > a juvenile disparaging of the traditional Unix Manual. > > As for Info itself, I will always use "zless > /usr/share/info/coreutils.info.gz", rather than "info '(coreutils) rm > invocation'", just to avoid dealing with info's arcane navigation commands. Try 'info --vi-keys' or pinfo. > Arguments about the de facto documentation standard for UNIX-like operating > systems is not going to be resolved here and now, and adding a couple of > sentences to coreutils/man/rm.x is not a big ask. -- Arsen Arsenović signature.asc Description: PGP signature
Re: man pages & info prefer HTML format
Dennis German writes: > After the years and fine tuning of basic HTML, why aren't the man pages > standardized to HTML format? > > Perhaps some users don't frequently enough reference man pages as they should > and fewer use info , but (nearly) everyone uses a browser. > > And I don't mean programmatically simply wrapping the man page in HTML. > > The ability to > > 1) wrap text ( not truncate a line with a hyphen and place the >last 4 or 5 characters on the next line) and let the user decide the >width of the window > > 2) embed links (rather than "see also") > > 3) use basic fonts to render variables, command name, keywords, >description and clarify optional and alternatives (rather than the >noisy apostrophes, *<* and*>* which also take of space) Note, again, that this is by no means intrinsic to Texinfo-based manuals, but to the Info format. See https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html in comparison with (gcc) Optimize Options, for instance. > 4) show the user much more information the first and every page >(rather than needing multiple lines for the simplest keyword >description) I can't speak to man-pages, since I only use them for lack of better manual formats (which is nearly any other manual format), but this is a well-known potential point of improvement to Texinfo, to the point where the Texinfo devs have been toying with a webkit-based info viewer. See https://git.savannah.gnu.org/cgit/texinfo.git/tree/TODO.HTML I've also been considering altering Gentoo to install HTML manuals and generate an HTML directory file for the manuals, as well as inserting sane styles. This wouldn't be as complete a solution as https://www.gnu.org/software/texinfo/manual/texinfo-html/index.html but it would be a start. > HTML is superior than the current format and easily customizable in width, font > size and colors by the reader. ... but, it is significantly harder to render. I believe there's a better middle-ground, as I've mentioned in my other email. 
Note that HTML manuals would have some of the current issues man has (with it being too 'raw', e.g. lacking index entries, and being too focused on format rather than content, hard to restyle without altering the actual manuals, ...), but it's reasonable to render manuals in HTML. > Thank you for your consideration, > > Dennis German -- Arsen Arsenović signature.asc Description: PGP signature
Re: coreutils/man/rm.x - fails to mention POSIX "Refuse to remove path/. and path/.., as well as `.' and `..'
Mike Hodson writes: > On Sat, Sep 23, 2023, 12:47 Arsen Arsenović wrote: > >> Hi, >> >> James Feeney writes: >> >> > coreutils 9.3 >> > coreutils/man/rm.x >> > >> > Even though "rm" was modified, > > >> the rm(1) man page completely fails to mention this particular POSIX >> > promise. > > >> Language should be added to the GNU rm man page explicitly stating the >> > POSIX behavior: > > > > This is specified in the manual: > > > >> See (coreutils)rm invocation, paragraph 5. >> > > ... but this is not the 'manual', its the 'info' page. 'infopage' ? is that > a term? > But I digress over semantics.. No, the correct term is 'manual'. The info format is one of the output formats for Texinfo manuals. > This brings up once again a problem I have had, and I believe is in fact > the reason that I subscribed to this mailing list in the first place years > ago. > The info pages, quite unfortunately, DO NOT match the man pages. Quite the opposite, if man pages, which are unstructured flat pages with some convention on headings, matched the manuals, they'd be hard to access. See, for instance, the ffmpeg manuals. > There is at the same time both more, and less verbosity in certain sections > between the two. For one, I appreciate that the 'manpage' directs you to > other related 'see also' pages. It also contains full copyright and > authorship information. I have not yet figured out where these > corresponding values exist in 'info' for the individual 'rm' command. Hmm, normally these are contained in a node in the manual, but they do, indeed, seem to be missing from (coreutils). Strange. SEE ALSO is simply not a useful construct in any documentation system but 'man', as all other documentation systems (that I am aware of) have marked-up inline links. > I suspect [and history keeps indicating this to me] that a rather small > group of people understand how to [properly] use the GNU 'info' command, or > even know of its existence, vs the UNIX standard 'man' command. 
> > The Unix 'man' command is well known and understood, and has been the > defacto standard since 1971. 15 years prior to GNU Info. Many standards come and go. Note that I agree that a better info viewer (and a better info on-disk format) are necessary, but groff -Tutf8 -mtty-char | less -R is not better. It lacks the ability to navigate or reflow (the latter '.info' also lacks today, unfortunately). The solution to this is not to downgrade to man-pages, but to make the 'info' format better (the source material, i.e. the .texi, for that is already there) and to provide a better browser. Texinfo developers have toyed with the idea of using HTML for that, but I'm personally more partial to a simpler format that's a 'lossless' encoding that could instead quickly be translated either to '-Tutf8 -mtty-chars'-style output *or* to HTML, for GUI viewers like the KDE help center (which currently uses info2html, which isn't too effective due to the on-disk format). (FWIW, I also think man could use more purpose-made pagers.. but there are intrinsic difficulties due to the source format) > This one case seems particularly egregious, considering that the POSIX > specific information is likely not understood by someone trying to > understand why their command does not work, and this same person likely > will neither know of 'info' nor read to the very bottom of the manpage to > see there is indeed a second (info) manual. > > In the particular case of 'rm', (albeit on an older ubuntu version of > coreutils, 9.1, as an example) there is only 16 80-column textual lines > difference between 'man' and 'info' outputs. [infopage is 16 lines longer] > 'man' output furthermore resizes to the size of my terminal, making its > actual vertical output less if the terminal is > 80 columns. > 'info' output does not seem to resize, which to me makes it harder to read. 
> > Furthermore, between 'info' and 'man' the output of the pages seems to be > nicer in 'manpage' format as the different sections are all nicely > indented, have proper section headers, and it is visually less bothersome > without having so many single quotes everywhere. This is another unfortunate quirk of the on-disk format. I have a low-priority task of replacing it (overridden by various other projects, sadly). I intended to do that this summer, but could not fit it into my schedule. Contributions welcome (feel free to ask for context on the Texinfo ML). However, I d
Re: coreutils/man/rm.x - fails to mention POSIX "Refuse to remove path/. and path/.., as well as `.' and `..'
Hi, James Feeney writes: > coreutils 9.3 > coreutils/man/rm.x > > Even though "rm" was modified, > > Sat Apr 20 00:03:09 1991 David J. MacKenzie (djm at geech.gnu.ai.mit.edu) > ... > * rm.c (rm): Refuse to remove path/. and path/.., as well as `.' and > `..', for POSIX. > ... > > the rm(1) man page completely fails to mention this particular POSIX > promise. > > If the user already knows how GNU rm will respond to "rm -rf *", they > don't need to look at the man page. > > If the user does *not* know how GNU rm will respond to "rm -rf *", the > GNU rm man page is not going to help them. > > Language should be added to the GNU rm man page explicitly stating the > POSIX behavior: > > If either of the files dot or dot-dot are specified as the basename > portion of an operand (that is, the final pathname component) or if an > operand resolves to the root directory, rm shall write a diagnostic > message to standard error and do nothing more with such operands. > > including the additional requirement with respect to the root > directory. This is specified in the manual: >Any attempt to remove a file whose last file name component is ‘.’ > or ‘..’ is rejected without any prompting, as mandated by POSIX. See (coreutils)rm invocation, paragraph 5. Have a lovely night. -- Arsen Arsenović signature.asc Description: PGP signature
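The refusal is easy to demonstrate in a scratch directory (a sketch; the exact diagnostic wording may vary between coreutils versions):

```shell
# GNU rm rejects '.' and '..' operands outright, as POSIX mandates;
# nothing under the current directory is touched, and rm exits nonzero.
mkdir -p scratch && cd scratch
touch keep
rm -rf . 2>&1           # prints a "refusing to remove" diagnostic
echo "exit status: $?"  # nonzero, and 'keep' still exists
```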
Re: Compiling Coreutils with masm=intel
Jay writes: > Hello, thank you for your reply. > > I am currently writing an assembly rewriting tool, and with information > gathered throughout my toolchain, Intel syntax, at least for my purpose, > provides many intuitive ways to rewrite the assembly. Therefore, I was > hoping to remain consistent with the Intel syntax as I wish to generate an > assembly file of Coreutils and reassemble it using my tool. > > When you mention >> Coreutils (and gnulib) could work around them, but I'm not sure >> that's useful. > > Would it be possible if you could let me know the workaround? If it is > impossible to use the Intel syntax, then I will probably need to work on > porting my tool over to AT&T, but I would like to keep many options open if > possible. [entire email cited above as the ML was dropped from CC, with previous emails dropped] Hi, Please keep the mailing list in CC, as it is useful to archive discussion (this usually involves using Reply All or such in your mail client). > When you mention >> Coreutils (and gnulib) could work around them, but I'm not sure >> that's useful. > > Would it be possible if you could let me know the workaround? If it is > impossible to use the Intel syntax, then I will probably need to work on > porting my tool over to AT&T, but I would like to keep many options open if > possible. No problem. The bug report you found mentions this issue, which is that global symbols with names that match up with instruction mnemonics in Intel syntax confuse the assembler. The workaround would be to rename those (e.g. 'or' in src/test.c), but that would confuse developers and break API. I'm more partial to the patch proposed in that PR (comment 23), which would add quoting to some pieces of ASM output. I'm not sure if it fixes this case, though, and I don't have time to test it now. If you do test it, please post feedback on the PR. Note that -masm=intel is not very widely used, so YMMV. 
Good luck, thank you for your interest, have a lovely day :-) -- Arsen Arsenović signature.asc Description: PGP signature
Re: Compiling Coreutils with masm=intel
Jay writes: > Hello, > > I am trying to compile coreutils with the following commands: > > CC=gcc CFLAGS="-O0 -gdwarf-2 -save-temps=obj -Wno-error > -fno-asynchronous-unwind-tables -fno-exceptions" ../configure --prefix > /home/Documents/coreutils/intelbuild && make -j8 > > However, it fails with the following message: > lib/mktime.s: Assembler messages: > lib/mktime.s:95: Error: invalid use of operator "shr" > lib/mktime.s:285: Error: invalid use of operator "shr" > lib/mktime.s:291: Error: invalid use of operator "shr" > ... > > After searching through, I found out that this is some bug? > https://gcc.gnu.org/bugzilla//show_bug.cgi?id=53929 but I could not find > any remedy to it. I am wondering whether this has been fixed or it is not > possible to compile Coreutils with the masm=intel option. You, indeed, seem to have run into that bug. Coreutils (and gnulib) could work around them, but I'm not sure that's useful. Why do you need the GCC<->AS interface to use Intel syntax, though? > Thank you in advance. Have a lovely day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: mv command usability: rename file and create dest dir
rce or destination each time) then another approach is to >>>> just >>>> enable the bash 'direxpand' option, define some short envars in your >>>> .bash_profile or .bashrc, and use those to facilitate commandline (and >>>> script) operations, e.g. >>>> >>>> export p1=/long/path/to/some/frequently/accessed/directory >>>> export p2=/another/long/path/to/a/frequently/accessed/directory >>>> >>>> Then, for cmdline ops, just typing >>>> >>>> $ mv $a/ >>>> >>>> immediately expands $a (inline on the commandline) to >>>> >>>> $ mv /long/path/to/some/frequently/accessed/directory/ >>>> >>>> and you can then tack on "$b" (or any other destination). >>>> >>>> The 'direxpand' option provides nice immediate feedback that the envar >>>> you >>>> selected is the correct one (among, presumably, several 1-letter envars >>>> you've defined like this for various long paths of interest.) >>>> >>>> I use this approach frequently in my own workflow when dealing with >>>> annoyingly long but consistent paths. >>>> >>>> Glenn >>>> >>>> >>>> >>> >>> -- >>> Sergey Ponomarev <https://linkedin.com/in/stokito>, >>> stokito.com >>> >> >> >> -- >> Sergey Ponomarev <https://linkedin.com/in/stokito>, >> stokito.com >> -- Arsen Arsenović signature.asc Description: PGP signature
Re: cp -n: now exits yet without error message
Hi there, Bernhard Voelker writes: > Hi, > > I saw a discussion about 'cp -n A B' now exiting with 9.2 when B exists. > That seems to have changed with v9.1-133-g7a69df889. > > I don't question that change, but shouldn't the tool output an error > diagnostic > if it exits with an error? > > $ cp -n A B; echo $? > 1 > > It doesn't even give a hint with -v or --debug: > > $ cp -nvvv A B; echo $? > 1 > > $ cp -n --debug A B; echo $? > 1 > > Is that okay? > > Have a nice day, > Berny See discussion at https://debbugs.gnu.org/62572 Have a lovely evening! -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Morning Padraig, Pádraig Brady writes: > I'm squashing in the following to handle illumos and macOS. > Also it decouples the code from the pollfd structure layout, > by using C99 designated initializers. Thanks. I read over the patches you attached and they seem reasonable. Thanks for working on getting this merged, have a great day! > cheers, > Pádraig > > diff --git a/src/iopoll.c b/src/iopoll.c > index 916241f89..ceb1b43ad 100644 > --- a/src/iopoll.c > +++ b/src/iopoll.c > @@ -49,7 +49,10 @@ > extern int > iopoll (int fdin, int fdout) > { > - struct pollfd pfds[2] = {{fdin, POLLIN, 0}, {fdout, 0, 0}}; > + struct pollfd pfds[2] = { /* POLLRDBAND needed for illumos, macOS. */ > +{ .fd = fdin, .events = POLLIN | POLLRDBAND, .revents = 0 }, > +{ .fd = fdout, .events = POLLRDBAND, .revents = 0 }, > + }; > >while (poll (pfds, 2, -1) > 0 || errno == EINTR) > { -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] fold: Add '-c' (cut) option to cut off lines at a specific WIDTH
Hi Julian, jhx writes: > Hello everyone, > > I personally often use 'fold' to break up some long lines, which works > well. Lately I have been in the need to cut of a line at a specific length - > removing the rest of the line. I made a small patch for 'fold' to do just > that. The line gets cut off at WIDTH (specified via -w WIDTH) and three dots > will be printed for a more appealing output. The new option added is '-c' for > 'cut'. > I checked out the newest code via Git and compiled 'fold' with the patch > attached to this mail. (No errors/warnings were output). If I understood your intention and patch correctly, cut should already do what you need: ~$ perl -e 'print "A" x 800, "\n";' | cut -c -50 AA ~$ perl -e 'print "A" x 8, "\n";' | cut -c -50 | cat --show-ends $ Check out ``(coreutils)cut invocation''. > Attached you will find the patch for 'fold'. > > Apologies if there is something missing/wrong - Never contributed to any GNU > software before. :) Welcome! Great having you. :-D > Greetings > > Julian "jhx" > > [2. text/x-patch; fold-cut-line.patch]... Hope that helps! Have a most wonderful day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Pádraig Brady writes: > Yes definitely. > This is the top of my list to merge. Lovely, thanks! -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Hi Padraig, I saw that you are planning on making a coreutils release soon. Can these patches be included in it? Thanks in advance, have a lovely day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: Feature Request: 'du' command allow filter by user name
Hi, SCOTT FIELDS writes: > The problem is that doesn't provide you a summary, since it gives you usage > for each file. You can still compute a grand total with -c. There might be an argument for -s also producing a summary when given --files0-from, though. Hope that helps, have a great evening! -- Arsen Arsenović signature.asc Description: PGP signature
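For the per-user filtering itself, the pieces already exist outside du; a sketch combining find's owner test with du (assumes GNU find and du):

```shell
# Disk usage of files under . owned by the current user, with a grand
# total: find filters by owner and emits NUL-delimited names, du reads
# them via --files0-from=-, and -c appends a "total" line at the end.
find . -user "$(id -un)" -print0 | du --files0-from=- -c
```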
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Hi Padraig, Pádraig Brady writes: > Really nice work on this. > Only very small syntax tweaks (attached) at this point. > I'm going to do testing with this and will add an appropriate test case. I spotted a slightly less minor error, and notified Carl off-list, but you beat us to resubmitting a fixed patchset ;) Namely, select (rfds, ...) would leave the state of rfds undefined. On Linux, this didn't cause errors, but I can definitely see it doing so on other platforms. I attached a patch that fixes that. I also attached the test case I mentioned. From 2e26d25475b1541ff6f03685c671c63277b837d5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arsen=20Arsenovi=C4=87?= Date: Tue, 3 Jan 2023 18:05:07 +0100 Subject: [PATCH 1/2] iopoll: Fix select fd_set UB in iopoll * src/iopoll.c (iopoll): Reinitialize rfds fd_set on each select iteration. --- src/iopoll.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/src/iopoll.c b/src/iopoll.c index 9424af6fa..e1714728c 100644 --- a/src/iopoll.c +++ b/src/iopoll.c @@ -66,19 +66,22 @@ iopoll(int fdin, int fdout) #else /* fall back to select()-based implementation */ extern int -iopoll(int fdin, int fdout) +iopoll (int fdin, int fdout) { int nfds = (fdin > fdout ? fdin : fdout) + 1; - fd_set rfds; - FD_ZERO (&rfds); - FD_SET (fdin, &rfds); - FD_SET (fdout, &rfds); + int ret = 0; /* If fdout has an error condition (like a broken pipe) it will be seen as ready for reading. Assumes fdout is not actually readable. 
*/ - while (select (nfds, &rfds, NULL, NULL, NULL) > 0 || errno == EINTR) + while (ret >= 0 || errno == EINTR) { - if (errno == EINTR) + fd_set rfds; + FD_ZERO (&rfds); + FD_SET (fdin, &rfds); + FD_SET (fdout, &rfds); + ret = select (nfds, &rfds, NULL, NULL, NULL); + + if (ret < 0) continue; if (FD_ISSET (fdin, &rfds)) /* input available or EOF; should now */ return 0; /* be able to read() without blocking */ -- 2.39.0 From 28abc6c347e74a0e61dc3dfac40f09b186fab65f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arsen=20Arsenovi=C4=87?= Date: Tue, 3 Jan 2023 18:06:45 +0100 Subject: [PATCH 2/2] tests: Add test for tee -p * tests/misc/tee.sh: Add tee -p test. --- tests/misc/tee.sh | 17 + 1 file changed, 17 insertions(+) diff --git a/tests/misc/tee.sh b/tests/misc/tee.sh index 0bd91b6cb..6945865ff 100755 --- a/tests/misc/tee.sh +++ b/tests/misc/tee.sh @@ -63,6 +63,23 @@ if test -w /dev/full && test -c /dev/full; then test $(wc -l < err) = 1 || { cat err; fail=1; } fi +# This is a testcase for the iopoll-powered read-write loop in tee. In +# essence, the test checks for sleep exiting as soon as all it's outputs die. +# With the presence of some bashisms, this test could be more complete, so that +# it includes tests for outputting to named pipes too, but the handling of +# outputs in tee is sufficiently elegant to make it hopefully identical. +# +# Component breakdown of this pipeline: +# - sleep emulates misbehaving input. +# - The timeout is our failure safety-net. +# - We ignore stderr from tee, and should have no stdout anyway. +# - If the tee succeeds, we print TEST_PASSED into FD 8 to grep for later. +# (FD 8 was selected by a D20 roll, or rather, a software emulation of one) +# - The colon is the immediately closed output process. +# - We redirect 8 back into stdout to grep it. 
+( sleep 5 | (timeout 3 tee -p 2>/dev/null && echo TEST_PASSED >&8) | : ) 8>&1 \ +| grep -x TEST_PASSED >/dev/null || fail=1 + # Ensure tee honors --output-error modes mkfifo_or_skip_ fifo -- 2.39.0 I originally wanted to include these squashed into the original two commits, which is why I held off from posting an amended patchset. Oh, I also just noticed: In the non-poll case, a warning will be emitted because an undefined macro value is used in an #if. Please also add a #else # define IOPOLL_USES_POLL 0 ... branch. Thanks, and happy holidays! -- Arsen Arsenović signature.asc Description: PGP signature
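The suggested #else branch for the undefined-macro warning might look like the following sketch. This is an assumption about the eventual code, not the committed fix; the platform condition mirrors the one in the earlier patch, with __linux__ standing in for the HAVE_INOTIFY "proxy for Linux" discussed later in the thread.

```c
/* Sketch: always give IOPOLL_USES_POLL a value, so that a later
   "#if IOPOLL_USES_POLL" never tests an undefined macro (which warns
   under -Wundef).  The platform list here is an assumption.  */
#if defined _AIX || defined __sun || defined __APPLE__ || defined __linux__
# define IOPOLL_USES_POLL 1
#else
# define IOPOLL_USES_POLL 0
#endif
```

Either way, the macro then has a definite 0/1 value on every platform.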
Re: trying to contribute but got an error during the setup
Evening, Stéphane Archer writes: > Dear Gnu Coreutils community, > > I tried to compile the repo on my mac to contribute to the project. > I have the following error: > clang: error: no such file or directory: './lib/parse-datetime.c' > clang: error: no input files > I have the file test-parse-datetime.c but not parse-datetime.c, I'm not > sure how to get it. > Does anyone know? > > I'm currently trying to contribute to mv so maybe I can just call something > like make mv and start playing around? I think you missed a call to ./bootstrap. Hope that helps! -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Hi Carl, Padraig, Thanks for the ACK. I've sent the signed copyright assignment form; I'll keep you posted on that. On Tue, 13 Dec 2022, Pádraig Brady wrote: >> Re HAVE_INOTIFY, that's really a proxy for a linux kernel, and so would be >> most appropriately changed to: >> >> defined __linux__ || defined __ANDROID__ >> >> I'm thinking these hardcoded defines would be best for now at least as it >> covers the vast majority of systems, and avoids complicated (cross) compile >> time checks. It might also be good to give a quick test on FreeBSD, since it has some popularity too. >> A modularised iopoll.c would be better, given the potential uses by other >> tools, though we'd probably release for just tee initially. >> >> As for interface to this functionality I'm wondering if we could just have >> the existing tee {-p,--output-error} imply the use of poll() on output. >> >> I.e. from a high level -p just means to deal more appropriately with non file >> outputs, and as part of that, dealing immediately with closed outputs would >> be an improvement. That seems reasonable to me. >> Note also tail(1) enables this functionality by default. I'm not sure about >> other utilities, but we can deal with that later if needed. Carl Edquist via GNU coreutils General Discussion writes: > > Thanks Pádraig for the feedback - that all sounds good. > > Will try to follow-up sometime this week... If you prefer, I'll have some time in the latter part of this week too. Let's not forget to include the testcase posted previously (with -p instead of -P, since it was suggested to enable polling for -p): ( sleep 5 | (timeout 3 tee -p 2>/dev/null && echo TEST_PASSED >&8) | : ) 8>&1 | grep -qx TEST_PASSED To annotate it, and let's include this info in a comment: - sleep emulates misbehaving input. - The timeout is our failure safety-net. - We ignore stderr from tee, and should have no stdout anyway. - If that succeeds, we print TEST_PASSED into FD 8 to grep for later. 
(FD 8 was selected by a D20 roll, or rather, a software emulation) - The colon is the immediately closed output process. - We redirect 8 back into stdout to grep it. If tee fails, for instance because it times out, or it fails to recognize -p for some reason, the echo simply won't run. The grep options are in POSIX (or, at least, in POSIX.1-2017). Thank you both, have a great night. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Hi Carl, Carl Edquist writes: > [2. text/x-diff; 0001-tee-only-fstat-outputs-if-pipe_check-is-active.patch]... > > [3. text/x-diff; > 0002-tee-skip-pipe-checks-if-input-is-always-ready-for-re.patch]... Thanks for writing these, and the other patches. I've once again been stripped of time, but I think we've nailed the concept down for the most part. I think we should wait for Pádraig to voice his opinion at this point. Details can be ironed out later, and pretty easily too. Thank you again, have a great day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Hi Rob, Rob Landley writes: > This sort of thing is why I added -i to toybox's "timeout" command: > > -i Only kill for inactivity (restart timeout when command produces > output) > > It runs the command's stdout through a pipe and does a poll() with the -i > seconds value, and signals the program if the poll() expires. > > The android guys found it useful, but I was waiting to hear back about "cut > -DF" > before bringing it up here... That's interesting; it might be worth adding to GNU timeout. However, it's not appropriate for what I'm using tee for, since compiler processes could appear idle for a long time, for instance when doing LTO. Thanks, have a great day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Carl Edquist writes: > On Thu, 8 Dec 2022, Arsen Arsenović wrote: > >> Apologies for my absence, Tuesdays and Wednesdays are long workdays for me. > > No need for apologies - I feel like i am the one who should apologize for my > high volume of email to the list. People have lives after all! :) > > The timing of this thread caught my attention because I had recently been > wrestling with a similar issue, trying to use shell utils to talk over a > socket > with the help of bash's /dev/tcp/host/port interface. > > Similar to the situation here, i was seeing things annoyingly look like they > are still 'alive' longer than they ought to be when providing input from the > terminal. Huh, I never tried that, honestly. Did polling help your use-case? > >>> Biggest item is making a new configure macro based on whether poll() is >>> present and works as intended for pipes. With 0 timeout, polling the >>> write-end of a pipe that is open on both ends for errors does not indicate a >>> broken pipe; but polling the write-end of a pipe with the read-end closed >>> does indicate a broken pipe. >> >> This might be a bit problematic when cross compiling (which is why I imagine >> systems were hard-coded before). > > Oh interesting. That wasn't on my radar at all. I guess this means that when > cross-compiling, the configure script is run on the cross-compiling host, > rather than on the target platform; so any test programs in configure.ac with > AC_RUN_IFELSE don't necessarily check the target platform functionality (?) Or worse, it's unable to run at all (and always fails) if the binary is for a different kernel or architecture. > That's too bad. I had hoped to come up with a better way to indicate a > working > poll() for this feature than maintaining a list of platforms. > > >>>> So I guess (on Linux at least) that means a "readable event on STDOUT" is >>>> equivalent to (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP | POLLERR). 
>>>> >>>> So, it'll catch the relevant poll() errors (POLLHUP | POLLERR), but the >>>> inclusion of POLLIN results in the gotcha that it will be a false positive >>>> if stdout is already open for RW (eg a socket) and there is actually data >>>> ready. >> >> Ah - yes. tail.c guards against this by checking the type of the file >> descriptor before selecting it, and makes sure it's among the "one-way" >> file descriptors: >> >> if (forever && ignore_fifo_and_pipe (F, n_files)) >>{ >> /* If stdout is a fifo or pipe, then monitor it >> so that we exit if the reader goes away. */ >> struct stat out_stat; >> if (fstat (STDOUT_FILENO, &out_stat) < 0) >>die (EXIT_FAILURE, errno, _("standard output")); >> monitor_output = (S_ISFIFO (out_stat.st_mode) >>|| (HAVE_FIFO_PIPES != 1 && isapipe (STDOUT_FILENO))); >> >> Good catch! It completely slipped by my mind. > > Ah, yeah if we know it's a pipe we shouldn't have to worry about an output > being open for RW. > > Originally i had imagined (or hoped) that this broken-pipe detection could > also > be used for sockets (that was how the issue came up for me), but it seems the > semantics for sockets are different than for pipes. This might require POLLPRI or POLLRDHUP or such. Can you try adding those to the set of events in pollfd? > Experimentally, it seems that once the remote read end of the socket is > shutdown, poll() does not detect a broken pipe - it will wait indefinitely. > But at this point if a write() is done on the local end of the socket, the > first write() will succeed, and then _after_ this it will behave like a broken > pipe - poll() returns POLLERR|POLLHUP, and write() results in SIGPIPE/EPIPE. > > It's fairly confusing. But it seems due to the difference in semantics with > sockets, likely this broken-pipe detection will only really work properly for > pipes. 
> > So yeah, back to your point, there is a little room for improvement here by > fstat()ing the output and only doing the iopoll() waiting if the output is a > pipe. > > A quick note, this check only needs to be done a total of once per output, it > shouldn't be done inside iopoll(), which would result in an additional > redundant fstat() per read(). Could this be handled by get_next_out? > ... Also, i suspect that the pipe_check option can be disabled if the _input_ > is a regular file (or block device), since (i think) these always show up as > ready for reading. (This check would only need to be done once for fd 0 at > program start.) Yes, there's no point poll-driving those, since it'll be always readable, up to EOF, and never hesitate to bring more data. It might just end up being a no-op if used in current form (but I haven't tried). > But ... one step at a time! :) > > > Carl Have a great day. -- Arsen Arsenović signature.asc Description: PGP signature
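The once-per-output fstat() check discussed above might look like the sketch below. The helper name is made up for illustration (it is not from the patches in this thread); the assumption, following tail.c, is that only pipes/FIFOs are worth poll-driving.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: decide once, per output file descriptor, whether
   it is worth monitoring for broken-pipe conditions.  Only pipes/FIFOs
   give a reliable indication; regular files, ttys and sockets behave
   differently, so they are skipped.  */
static bool
worth_pipe_checking (int fd)
{
  struct stat st;
  if (fstat (fd, &st) < 0)
    return false;
  return S_ISFIFO (st.st_mode) != 0;
}
```

tee could run this over each output once at startup (or the first time get_next_out selects it), caching the result, rather than fstat()ing on every read().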
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Hi Carl, Apologies for my absence, Tuesdays and Wednesdays are long workdays for me. Carl Edquist writes: > Alright, lest I be guilty of idle nay-saying, I've attached another patch to > address all of my complaints. > > (Apply it after Arsen's last one, which comes after my previous one. Otherwise > if desired I can send a single summary patch.) > > Biggest item is making a new configure macro based on whether poll() is > present > and works as intended for pipes. With 0 timeout, polling the write-end of > a pipe that is open on both ends for errors does not indicate a broken pipe; > but polling the write-end of a pipe with the read-end closed does indicate a > broken pipe. This might be a bit problematic when cross compiling (which is why I imagine systems were hard-coded before). > Beyond that, I revised the select()-based implementation of iopoll() to > address > my previous comments. Sorry I got my grubby hands all over it. > I do hope you'll like it though! :) Thanks :) >>> (4.) >>> >>>> + /* readable event on STDOUT is equivalent to POLLERR, >>>> + and implies an error condition on output like broken pipe. */ >>> >>> I know this is what the comment from tail.c says, but is it actually >>> documented to be true somewhere? And on what platforms? I don't see it >>> being documented in my select(2) on Linux, anyway. (Though it does seem >>> to work.) Wondering if this behavior is "standard". /me shrugs I cannot speak for many platforms, so I just opted to follow what tail.c already did. >> Ah! 
>> >> Well, it's not documented in my (oldish) select(2), but I do find the >> following in a newer version of that manpage: >> >> >>> Correspondence between select() and poll() notifications >>> >>> Within the Linux kernel source, we find the following definitions which >>> show the correspondence between the readable, writable, and exceptional >>> condition notifications of select() and the event notifications pro- >>> vided by poll(2) (and epoll(7)): >>> >>> #define POLLIN_SET (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP | >>> POLLERR) >>> /* Ready for reading */ >>> #define POLLOUT_SET (POLLWRBAND | POLLWRNORM | POLLOUT | POLLERR) >>> /* Ready for writing */ >>> #define POLLEX_SET (POLLPRI) >>> /* Exceptional condition */ >>> >> >> >> So I guess (on Linux at least) that means a "readable event on STDOUT" is >> equivalent to (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP | POLLERR). >> >> So, it'll catch the relevant poll() errors (POLLHUP | POLLERR), but the >> inclusion of POLLIN results in the gotcha that it will be a false positive if >> stdout is already open for RW (eg a socket) and there is actually data ready. Ah - yes. tail.c guards against this by checking the type of the file descriptor before selecting it, and makes sure it's among the "one-way" file descriptors: if (forever && ignore_fifo_and_pipe (F, n_files)) { /* If stdout is a fifo or pipe, then monitor it so that we exit if the reader goes away. */ struct stat out_stat; if (fstat (STDOUT_FILENO, &out_stat) < 0) die (EXIT_FAILURE, errno, _("standard output")); monitor_output = (S_ISFIFO (out_stat.st_mode) || (HAVE_FIFO_PIPES != 1 && isapipe (STDOUT_FILENO))); Good catch! It completely slipped by my mind. >> Also, the POLLEX_SET definition of (POLLPRI) doesn't seem relevant; so I >> might suggest removing the 'xfd' arg for the select()-based implementation: >> >>> POLLPRI >>> >>>There is some exceptional condition on the file descriptor. 
>>>Possibilities include: >>> >>> * There is out-of-band data on a TCP socket (see tcp(7)). >>> >>>* A pseudoterminal master in packet mode has seen a state >>> change on the slave (see ioctl_tty(2)). >>> >>>* A cgroup.events file has been modified (see cgroups(7)). Yes, adding POLLPRI and xfds is likely excessive. I did the former while quite tired (so, under a misunderstanding), and the latter was a translation of the former. In any case, iopoll appears to be the path ahead, regardless of implementation details. Thanks again. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Arsen Arsenović writes: > From 582e0a27b7995aac90cc360463ec8bde1a44cfe4 Mon Sep 17 00:00:00 2001 > From: Paul Eggert ^ Whoops, I forgot to fix this after committing with the wrong hash in --reuse-message. I don't want to confuse anyone: I authored the patch. Apologies. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Hi Carl, Carl Edquist writes: > Originally I had in mind to put the read() call inside the poll() loop. But if > we keep this feature as an option, it felt it was a bit easier just to add the > "if (pipe_check) {...}" block before the read(). Yes, I do agree that this is likely cleaner. > For Pádraig, I think the same function & approach here could be used in other > filters (cat for example). The stubborn part of me might say, for platforms > that do not natively support poll(2), we could simply leave out support for > this feature. If that's not acceptable, we could add a select(2)-based > fallback for platforms that do not have a native poll(2). There's no need to omit it. iopoll() seems sufficiently easy to implement via select(): From 582e0a27b7995aac90cc360463ec8bde1a44cfe4 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Mon, 5 Dec 2022 18:42:19 -0800 Subject: [PATCH] tee: Support select fallback path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 2022-12-06 Arsen Arsenović iopoll: Support select fallback path * src/tee.c (iopoll): Add logic to enable select usage. --- src/tee.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/src/tee.c b/src/tee.c index c17c5c788..7064ad27d 100644 --- a/src/tee.c +++ b/src/tee.c @@ -197,6 +197,7 @@ main (int argc, char **argv) static int iopoll(int fdin, int fdout) { +#if defined _AIX || defined __sun || defined __APPLE__ || HAVE_INOTIFY struct pollfd pfds[2] = {{fdin, POLLIN, 0}, {fdout, 0, 0}}; while (poll (pfds, 2, -1) > 0 || errno == EINTR) @@ -206,7 +207,41 @@ iopoll(int fdin, int fdout) if (pfds[1].revents)/* POLLERR, POLLHUP, or POLLNVAL */ return IOPOLL_BROKEN_OUTPUT; /* output error or broken pipe */ } +#else + int ret; + int bigger_fd = fdin > fdout ? 
fdin : fdout; + fd_set rfd; + fd_set xfd; + FD_ZERO(&xfd); + FD_ZERO(&rfd); + FD_SET(fdout, &rfd); + FD_SET(fdout, &xfd); + FD_SET(fdin, &xfd); + FD_SET(fdin, &rfd); + /* readable event on STDOUT is equivalent to POLLERR, + and implies an error condition on output like broken pipe. */ + while ((ret = select (bigger_fd + 1, &rfd, NULL, &xfd, NULL)) > 0 + || errno == EINTR) +{ + if (errno == EINTR) +continue; + + if (ret < 0) +break; + + if (FD_ISSET(fdout, &xfd) || FD_ISSET(fdout, &rfd)) +{ + /* Implies broken fdout. */ + return IOPOLL_BROKEN_OUTPUT; +} + else if (FD_ISSET(fdin, &xfd) || FD_ISSET(fdin, &rfd)) +{ + /* Something on input. Error handled in subsequent read. */ + return 0; +} +} +#endif return IOPOLL_ERROR; /* poll error */ } -- 2.38.1 Note that I also needed to replace the ``/* falls through */'' comment with [[fallthrough]]; to build your patch on gcc 12.2.1 20221008. I'd guess there's some way to pick the correct marking method. I tested both codepaths in the patch above on Linux. I suggest adding the test case I provided before to test on more platforms (and I'll give some VMs a shot when I get home; currently in a lecture). The API here seems quite general, I'd be surprised if other utils couldn't make use of it too, though, maybe it should be given a slightly more descriptive name (iopoll feels a bit broad, maybe select_inout () to signify that it makes a selection between one input or one output exactly). > Unique to tee is its multiple outputs. The new get_next_out() helper simply > advances to select the next remaining (ie, not-yet-removed) output. As > described last time, it's sufficient to track a single output at a time, and > perhaps it even simplifies the implementation. It also avoids the need for a > malloc() for the pollfd array before every read(). I think this is okay, I struggle to find a case where it couldn't work. 
Note that removing polled files from a pollfd array does not require any reallocation (just setting the fd to -1, as in the code I initially posted), so there's no malloc either way ;). Thanks for working on this, have a great day. -- Arsen Arsenović signature.asc Description: PGP signature
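The fd-negation trick mentioned above (poll(2) skips any pollfd entry whose fd is negative) could be sketched like this; the function name is made up for illustration and is not from any patch in this thread:

```c
#include <poll.h>

/* Probe a fixed array of output write-ends and "remove" any that have
   become broken pipes by negating their fd in place -- poll() ignores
   entries with fd < 0, so no reallocation is ever needed.
   Returns the number of outputs dropped in this pass.  */
static int
drop_broken_outputs (struct pollfd *pfds, nfds_t n)
{
  int dropped = 0;
  if (poll (pfds, n, 0) > 0)            /* zero timeout: probe only */
    for (nfds_t i = 0; i < n; i++)
      if (pfds[i].fd >= 0 && (pfds[i].revents & (POLLERR | POLLHUP)))
        {
          pfds[i].fd = -1;              /* skipped by poll() from now on */
          dropped++;
        }
  return dropped;
}
```

With events left at 0, only error conditions are reported, which is exactly the broken-pipe case of interest here.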
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Carl Edquist writes: > On the topic of implementation - I was thinking more about a general solution > for filter utils, and I am thinking the key thing is to provide a replacement > (wrapper) for read(2), that polls two fds together (one input and one output), > with no timeout. > > It would check for POLLIN on input (in which case do the read()). Otherwise if > there is an error (POLLERR or POLLHUP) on input, treat it as EOF. Otherwise > if > there's an error on output, remove this output, or handle it similar to > SIGPIPE/EPIPE. > > (Nothing is written to the output fd here, it's just used for polling.) I'm concerned with adding such a behavior change by default still. I can imagine this "lifetime extension" probably having been relied on in the last many decades it has been around for ;) > Although tee has multiple outputs, you only need to monitor a single output fd > at a time. Because the only case you actually need to catch is when the final > valid output becomes a broken pipe. (So I don't think it's necessary to > poll(2) all the output fds together.) That is technically true, but I think coupling this to two FDs might prove a bit inelegant in implementation (you'd have to decide which entry to pick from an unorganized list with holes, and any of them could spontaneously go away), so I'm not sure the implementation would be cleaner that way. Thanks, have a wonderful day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Carl Edquist writes: > It sounds like one way or another you want to copy your endless but > intermittent input to multiple output pipes, but you want to quit as soon as > all the output pipes become broken. Precisely. The most important requirement there is that the tee-based substitute imitates the lifetime of its longest-lived output. Now I'm thinking, maybe --pipe-check should also block SIGPIPE, to prevent the race between poll, process death and write (which would result in the process getting killed, as it'd happen right now; to see what I mean, try ``tee >(sleep 100) >(:)'' and press enter after a bit; a race could make --pipe-check behave like that). I'll keep this in mind for v2, which is currently waiting on me having some time to research the portability of this whole thing, and on a decision on whether to even include this feature. > To me, tee sounds like exactly the place to do that. Otherwise, you'd have to > add the broken-pipe detection (as in your patch) to your own program, along > with the rest of tee's basic functionality :) > > It would be one thing if you controlled the programs that consume the input > (you could have them handle 'heartbeats' in the input stream, and once these > programs terminate, the heartbeats would trip on the broken pipe). (However > (in)elegant that becomes to implement...) > > But if you don't have control over that, the fundamental problem is detecting > broken pipes *without writing to them*, and I don't think that can be solved > with any amount of extra pipes and fd redirection... I imagine that, technically, this is attainable by editing the process substitutions involved to also signal the original process back; however, this feels less elegant and generally useful than tee handling this, given that tee's use-case is redirecting data to many places. Thanks, have a nice day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
William Bader writes: > For the case of testing two compile runs, could you use something like the > bash > command below (replacing 'sleep ...' with 'gcc ...')? The issue here isn't the compilers hanging; it's tee living longer than all the compilers do because its stdin doesn't EOF (it'd be preferable for it to only live as long as the last of the compilers). I can imagine attempting to implement this with enough pipe and fd redirection magic, but I'm not sure how (in)elegant that becomes. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Pádraig Brady writes: > Right. Thanks for discussing the more general pattern. > I.e. that SIGPIPE doesn't cascade back up the pipeline, > only upon attempted write to the pipe. > So it's not really a tee issue, more of a general pattern. > > So it wouldn't be wrong to add this to tee (by default), > but I'm not sure how useful it is given this is a general issue for all > filters. > Also I'm a bit wary of inducing SIGPIPE as traditionally it hasn't been > handled well: > https://www.pixelbeat.org/programming/sigpipe_handling.html I hesitated making tee poll-driven by default because I can imagine someone relying on this behavior. I feel that this is especially useful in the case of teeing to pipes because the example I presented isn't very common with other tools; since tee is unique in copying to many outputs, managing the N arbitrary subprocesses can prove difficult. I wouldn't necessarily be opposed to making all the coreutils able to be poll-driven, but that'd definitely require some common cross-platform event loopy code ;) Thanks, have a great night. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Pádraig Brady writes: > To get a better handle on the generality of this > I went back about 16 years on the list, > and even though there have been many tee functionality requests over the > years, > this is the first time this has been requested. I'm glad not to have wasted time on a redundancy! I totally overlooked checking the MLs, so I only searched through the "rejected features" page; I'll remember to check archives too in the future. > Trying to understand your use case better, > I'm presuming the input generator is depending on the compiler runs > (written to by tee) to exit cleanly, before exiting / generating more? > Hence the hangs? > If that was the case then there still might be the potential for hangs > even if tee detected closed pipes. I.e. if the compiler runs hung rather > than exited, this would not be distinguishable from tee as the pipe outputs would > remain. Yes, but these are fine. The problem here isn't that the build process can hang; the problem is that it can hang in a way different to the compilers it's emulating. The work tee is involved in consists of running the configure stage of a package build against two compilers, universally, by essentially tee'ing to both, storing and comparing the results, and then returning the result of the "normal" compiler, in order to detect packages that misconfigure due to compiler changes. The bug we observed is that on occasion, for instance when running with a tty, or with a script that (for some reason) has a pipe on stdin, the tee-based "compiler" would hang. To replicate this, try: tee >(gcc test.c -o a.out.1) >(gcc test.c -o a.out.2) in a tty (here, the stdin is meant to be irrelevant). 
In an applied version of this scenario, this would be cc-old/normal and cc-new of some sort, through a wrapper function so that results are stored/retrieved/compared/etc, but that complexity isn't necessary to understand the issue, I think (but [1] is the script involved in the process; see the timeout line and the comment above it). The behavior we'd want out of tee here is to let it pass down as much data as possible to both compilers, and to be able to give up when the last of the compilers dies, hence waiting for POLLHUP events. This is useful in other instances where stdin is not necessarily relevant to what tee is sending data to, and where you can't deduce whether it would be (at least, without fairly complex code to replicate argument parsing). > If that's the case this has become a bit less generally useful in my mind. > > To keep tee data driven, perhaps your input could periodically > send a "clock" input through (say a newline), to check everything > is still running as expected. I.e. your periodic input generator > seems like it would be async already, so it would be better to > add any extra async logic there, and keep tee more simple and data driven. This is, sadly, not in our code. We control (distro) build scripts (ebuilds) and the compiler-diff-wrapper-thing, and the ability to run (normal program) build systems unmodified, with just a custom CC, in order to detect when they misbehave due to a compiler update is paramount here. > cheers, > Pádraig. > > p.s. while reading the tee docs in this area, they were a bit dispersed, > so the attached updates the texinfo for the -p option to summarise > its operation and how it differs from the default operation. Thanks! This seems nice. Have a great night. [1] https://gist.github.com/thesamesam/4ddaa95f3f42c2be312b676476cbf505 -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Hi, Pádraig Brady writes: > I vaguely remember on macos that POLLRDBAND needed to be set on read fds for > select, > though didn't check all combinations since we didn't actually need > the poll() replacement. Also select() (emulation of poll) doesn't work on > Solaris or AIX, so we needed to explicitly disable emulation there. Ah, hmm. I don't know how notification APIs work there, maybe some other one can be picked? (IIRC Windows also has its own set of notification APIs that Gnulib uses when on Windows) > Perhaps we could adjust poll() emulation to be compat everywhere, > but I'm not confident. I see. I'll try to dig around a bit for notes about these platforms (IIRC the libevent manual documented a bunch of weird notification API quirks across platforms) to see how to reliably wait on pipes becoming either readable, closed, or writable, if possible at all. > We can help test on esoteric systems, > especially if appropriate tests are in place. More of a reason to figure out the test then :). On that topic, I did come up with a testcase that should be appropriate for the Coreutils testsuite, but it takes a while to execute (5s), which is something to consider. Here it is: ( sleep 5 | (timeout 3 tee -P && echo g >&2) | : ) 2>&1 | grep -q g The 5s time is no coincidence ;). Maybe a better tool exists, that I'm unaware of, that would just idly wait on stdout to become unwritable (which sounds suspiciously like the issue this patch addresses ;). Expect might also be able to handle this test, but I'm not sure whether that's available in the testsuite. > Note https://www.nongnu.org/pretest/ which may be useful. I'll play around with this a bit later, too. Thanks, have a lovely evening. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Arsen Arsenović writes: > Looking at ``check_output_alive'', this logic should be fairly easy to > reuse. Worst case, I'd need to refactor some of the code to handle the > non-pipe-check case more elegantly so that it doesn't become too > unreadable when also adding the logic to pick between poll and select. Got some free time again. The more I think about how to do this, the more it feels like the solution involves a portable poll or poll-like function that gets translated to select() if need be on a given platform, which sounds an awful lot like Gnulib ;). The ``check_output_alive'' code checks for Gnulib poll and declares it incompatible, but I can't quite tell what incompatibility this is. I wonder if that can be fixed? If not, I can factor out the check_alive logic and have it also check for file descriptors with input data. Sadly, I don't have many non-GNU and non-Linux systems to test for poll behavior on; just some (modern) Solaris and BSD VMs. Thanks in advance, have a lovely day. -- Arsen Arsenović signature.asc Description: PGP signature
Re: [PATCH] scripts: commit-msg: recognize Git cut_lines
Pádraig Brady writes: > Ah a scissors :) Ha! I never noticed that :D > Pushed. Thanks. Have a great day. -- Arsen Arsenović signature.asc Description: PGP signature
[PATCH] scripts: commit-msg: recognize Git cut_lines
This prevents spurious failures from happening when someone sets commit.verbose or passes -v to commit. --- Hi, When working on my previous patch, I was running into commit-msg mistakenly triggering on lines that are part of the diff that can be optionally added to COMMIT_EDITMSG via commit.verbose (which I have set globally, to review patches as I commit them). This patch prevents that from happening. For the source of the cut_line, see wt-status.c:23 (cut_line) in the Git sources. Have a good day. scripts/git-hooks/commit-msg | 1 + 1 file changed, 1 insertion(+) diff --git a/scripts/git-hooks/commit-msg b/scripts/git-hooks/commit-msg index 8b06559..da094c9 100755 --- a/scripts/git-hooks/commit-msg +++ b/scripts/git-hooks/commit-msg @@ -120,6 +120,7 @@ sub check_msg($$) my $max_len = 72; foreach my $line (@line) { + last if $line =~ '.*-{24} >8 -{24}$'; my $len = length $line; $max_len < $len && $line =~ /^[^#]/ and return "line length ($len) greater than than max: $max_len"; -- 2.38.1
Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Pádraig Brady writes:

> Thanks a lot for the patch.
>
> There is similar functionality in tail.c (which should probably be
> reused if we do decide to implement this in tee as it's tricky to do
> portably).

Looking at ``check_output_alive'', this logic should be fairly easy to
reuse.  Worst case, I'd need to refactor some of the code to handle the
non-pipe-check case more elegantly so that it doesn't become too
unreadable when also adding the logic to pick between poll and select.

> Anyway the tail(1) case makes sense considering:
>
>   tail -f file.log | grep -q trigger && process_immediately
>
> So a similar use case with tee might be:
>
>   tee -p >(grep -q trigger_1) | grep -q trigger_2 &&
>     process_immediately
>
> However tee wouldn't be waiting for more input in that case.
> It would either consume the whole file, or exit when processing it.
>
> So a more appropriate case is:
>
>   intermittent_output |
>   tee -p >(grep -q trigger_1) | grep -q trigger_2 && process_immediately

This case is just about what we were using tee for when I wrote this
patch.  We use a compiler wrapper that compares the outputs of a few
compilers by running them in parallel (to detect some new behavior),
and we started noticing some builds would mysteriously hang forever.
I traced it back to ``tee'' occasionally waiting on stdin even after
the child processes die.

Really, that was an oversight on my part.  I didn't think of the (quite
normal) case of stdin being some long-lived, low-traffic pipe rather
than either a file, /dev/null, or a pipe that gets closed properly
after writing.

> Where intermittent_output is stdin, the use case is a bit contrived
> as any input will cause tee to exit.
> The more general non-stdin case above has some merit,
> though not as common as the tail example I think.

Right, but no user input could ever happen when the hangs described
above happened, since it'd happen in some automated code, so (AFAICT)
the only option was to teach tee to detect outputs going away more
promptly.

> I'll think a bit more about it.
>
> thanks!
> Pádraig

Thank you!  Have a great evening.
--
Arsen Arsenović
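[Editor's note: the behavior being asked of tee here — waking up when an output dies rather than only when stdin produces data — can be sketched as a single poll loop. This is an illustrative Python model with invented names, not the actual src/tee.c implementation.]

```python
import os
import select

def tee_poll_loop(stdin_fd, out_fds):
    """Model of the described tee --pipe-check behavior: block in
    poll() on stdin *and* every output, so a dying output wakes us up
    even when stdin never produces data."""
    outs = set(out_fds)
    while outs:
        p = select.poll()
        p.register(stdin_fd, select.POLLIN)  # wake on input data or EOF
        for fd in outs:
            p.register(fd, 0)  # only interested in POLLERR/POLLHUP
        for fd, revents in p.poll():
            if fd == stdin_fd:
                data = os.read(stdin_fd, 65536)
                if not data:
                    return  # EOF on stdin
                for out in list(outs):
                    try:
                        os.write(out, data)
                    except BrokenPipeError:
                        outs.discard(out)
            elif revents & (select.POLLERR | select.POLLHUP):
                outs.discard(fd)  # output went away; stop watching it
    # All outputs are gone: exit without waiting for stdin.
```

Without registering the outputs, this loop would block in the stdin read forever when the consumers die, which matches the hang described above.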
[PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
This flag comes in useful in scripts in which tee is used to duplicate
output over process substitutions, so that dying processes and no stdin
won't lead to a hang.

Without this option, a command like ``tee >(compiler1) >(compiler2) |
compiler3'' would hang forever if stdin was a terminal and the
compilers didn't expect any data, but since tee -P would do polling,
the compilers dying could mean that tee terminates.

* src/tee.c (long_options): Add -P, --pipe-check.
(main): Parse out -P, --pipe-check.
(usage): Add -P, --pipe-check.
(tee_files): Enable polling inputs and outputs to drive the tee,
rather than exclusively using read and write blocking to control data
flow.
* doc/coreutils.texi (tee invocation): Document -P, --pipe-check.
---
Hi there,

While working on some compiler hacks, I found it necessary to copy data
in parallel to multiple compilers to compare their results.  Currently,
with ``tee'', that is only mostly possible, since tee always expects to
read some data on stdin before it can see whether it can terminate.
This breaks unless stdin is /dev/null, since stdin would never get EOF
or some data to wake tee up and have it terminate.

The polling implementation provided below will wait on and remove
terminating outputs from the list of files ``tee'' watches when an
error on them occurs, except in the special case of stdin, which will
never be handled by the poll loop, and only serves to break out of the
poll in the case of new data.

I couldn't figure out a decent test to write for this; the solutions I
came up with all relied on PIPESTATUS, which I wasn't sure is permitted
in tests, since they all share a #!/bin/sh bang, but if that's allowed,
I could add a test like the following:

  sleep 11 | timeout 10 tee -P | true
  [[ "${PIPESTATUS[1]}" -eq 0 ]] || fail=1

Note that the above does take 10-11 seconds in all cases, but the test
needs some long-enough-lived program that does not write data, and
sleep fits the bill.
The timeout could probably get knocked down a good bit.

Thanks in advance, have a great day.

Tested on x86_64-pc-linux-gnu.

 NEWS               |   3 ++
 doc/coreutils.texi |  11 +
 src/tee.c          | 103 ++---
 3 files changed, 110 insertions(+), 7 deletions(-)

diff --git a/NEWS b/NEWS
index b6b5201..3a6bbfe 100644
--- a/NEWS
+++ b/NEWS
@@ -62,6 +62,9 @@ GNU coreutils NEWS                         -*- outline -*-
   wc now accepts the --total={auto,never,always,only} option
   to give explicit control over when the total is output.

+  tee now accepts the --pipe-check flag, to enable polling input and output
+  file descriptors, rather than only relying on stdin for notifications.
+
 ** Improvements

   date --debug now diagnoses if multiple --date or --set options are

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index fca7f69..c372e2e 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -14075,6 +14075,17 @@ them.
 @opindex --ignore-interrupts
 Ignore interrupt signals.

+@item -P
+@itemx --pipe-check
+@opindex -P
+@opindex --pipe-check
+
+Polls file descriptors instead of just waiting on standard input, to
+allow dying pipes to be detected instantly, rather than waiting for
+standard input to write some data first.  This is especially useful to
+permit process substitutions to notify @command{tee} of completion, so
+that it stops waiting for input data when all outputs are closed.
+
 @item -p
 @itemx --output-error[=@var{mode}]
 @opindex -p

diff --git a/src/tee.c b/src/tee.c
index 971b768..25ab5b5 100644
--- a/src/tee.c
+++ b/src/tee.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include <poll.h>

 #include "system.h"
 #include "argmatch.h"
@@ -37,7 +38,7 @@
   proper_name ("Richard M. Stallman"), \
   proper_name ("David MacKenzie")

-static bool tee_files (int nfiles, char **files);
+static bool tee_files (int nfiles, char **files, bool pipecheck);

 /* If true, append to output files rather than truncating them.
 */
 static bool append;
@@ -61,6 +62,7 @@ static struct option const long_options[] =
   {"append", no_argument, NULL, 'a'},
   {"ignore-interrupts", no_argument, NULL, 'i'},
   {"output-error", optional_argument, NULL, 'p'},
+  {"pipe-check", no_argument, NULL, 'P'},
   {GETOPT_HELP_OPTION_DECL},
   {GETOPT_VERSION_OPTION_DECL},
   {NULL, 0, NULL, 0}
@@ -89,6 +91,7 @@ usage (int status)
 Copy standard input to each FILE, and also to standard output.\n\
 \n\
   -a, --append              append to the given FILEs, do not overwrite\n\
+  -P, --pipe-check          polls before reading, to detect closed pipes\n\
   -i, --ignore-interrupts   ignore interrupt signals\n\
 "), stdout);
   fputs (_("\
@@ -118,6 +121,7 @@ int
 main (int argc, char **argv)
 {
   bool ok;
+  bool pipecheck = false;
   int optc;

   initialize_main (&argc, &argv);
@@ -131,7 +135,7 @@ main (int argc, char **argv)
   append = false;
   ignore_interrupts =
Re: [PATCH] dircolors: consider COLORTERM sufficient for color
[ Reposting since I didn't reply-all the first time; apologies ]

> Your change could break this setup if there were
> entries in "specific config" that weren't overridden later,
> and even if they were, it would result in larger LS_COLORS env vars.

Ah, I see.  I lacked a full understanding of this format, so I
overlooked that; sorry.

> How about the attached patch to allow matching COLORTERM
> just like TERM, and set the pattern in the default config
> to match against any COLORTERM value.
> This also has the advantage of handling specific COLORTERM values.

This looks fine, and works on my machine.

> We would have to add the default COLORTERM entry to distro config
> (when supported by the installed dircolors), but that should be easy
> enough to do.

Don't distros just generally use the compiled-in default?

Thanks again,
--
Arsen Arsenović
Re: [PATCH] dircolors: consider COLORTERM sufficient for color
On 22/02/13 14:15, Pádraig Brady wrote:

> Though it was pointed out that COLORTERM may not be preserved across
> ssh etc.  Now doing this would be no worse, so I'm inclined to apply
> this.

I could follow up with OpenSSH at some point, too, to include
$COLORTERM in the default environment set.  Though, operating over ssh
also presents other considerable problems, such as the remote side
lacking the matching terminfo (at least by default), which is why I
mostly overlooked it.

> BTW alacritty was just added as an explicitly matched terminal.

Ah, nice; I didn't realize.
--
Arsen Arsenović
[PATCH] dircolors: consider COLORTERM sufficient for color
COLORTERM is an environment variable usually used to expose truecolor
support in terminal emulators.  If a terminal emulator supports
truecolor, it is surely reasonable to assume it also supports 8/16/256
colors.

This implicitly supports foot, alacritty and any other truecolor
terminal emulator with unmatched $TERM.
---
Good evening,

I've noticed dircolors does not work in foot and alacritty, and on
previously raised patches about adding a TERM entry for them the
concern of a nongeneric solution was brought up.  This is a valid
concern, and as many terminals (including these two) export COLORTERM
to advertise 24-bit color support, it'd appear to be a good way to pick
up on color support.

 src/dircolors.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/dircolors.c b/src/dircolors.c
index b8cd203..2d20486 100644
--- a/src/dircolors.c
+++ b/src/dircolors.c
@@ -243,6 +243,7 @@ dc_parse_stream (FILE *fp, char const *filename)
   size_t input_line_size = 0;
   char const *line;
   char const *term;
+  char const *colorterm;
   bool ok = true;

   /* State for the parser.  */
@@ -253,6 +254,14 @@ dc_parse_stream (FILE *fp, char const *filename)
   if (term == NULL || *term == '\0')
     term = "none";

+  /* Check for $COLORTERM */
+  colorterm = getenv ("COLORTERM");
+  if (colorterm == NULL)
+    colorterm = "";
+
+  if (*colorterm != '\0')
+    state = ST_TERMSURE;
+
   while (true)
     {
       char *keywd, *arg;
--
2.34.1
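[Editor's note: the decision the patch encodes can be modeled in a few lines — an illustrative sketch with an invented function name; the real change flips the dircolors parser into ST_TERMSURE rather than returning a boolean.]

```python
def assume_color_support(environ):
    """Model of the patch's reasoning: a non-empty COLORTERM (set by
    truecolor terminals such as foot and alacritty) is taken as
    sufficient evidence of color support, regardless of $TERM."""
    colorterm = environ.get("COLORTERM") or ""
    return colorterm != ""

# A truecolor terminal with an unmatched $TERM still gets colors:
assert assume_color_support({"TERM": "foot", "COLORTERM": "truecolor"})
# Absent or empty COLORTERM falls back to the usual TERM matching:
assert not assume_color_support({"TERM": "dumb"})
```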