Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs

2022-12-06 Thread Carl Edquist

On Tue, 6 Dec 2022, Carl Edquist wrote:


(4.)


 +  /* readable event on STDOUT is equivalent to POLLERR,
 + and implies an error condition on output like broken pipe.  */


I know this is what the comment from tail.c says, but is it actually 
documented to be true somewhere?  And on what platforms?  I don't see it 
being documented in my select(2) on Linux, anyway.  (Though it does seem 
to work.)  Wondering if this behavior is "standard".


Ah!

Well, it's not documented in my (oldish) select(2), but I do find the 
following in a newer version of that manpage:




Correspondence between select() and poll() notifications

 Within the Linux kernel source, we find the following definitions which
 show the correspondence between the readable, writable, and exceptional
 condition  notifications  of  select() and the event notifications pro-
 vided by poll(2) (and epoll(7)):

 #define POLLIN_SET (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP |
 POLLERR)
/* Ready for reading */
 #define POLLOUT_SET (POLLWRBAND | POLLWRNORM | POLLOUT | POLLERR)
/* Ready for writing */
 #define POLLEX_SET (POLLPRI)
/* Exceptional condition */




So I guess (on Linux at least) that means a "readable event on STDOUT" is 
equivalent to (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP | POLLERR).


So, it'll catch the relevant poll() errors (POLLHUP | POLLERR), but the 
inclusion of POLLIN results in the gotcha that it will be a false positive 
if stdout is already open for RW (eg a socket) and there is actually data 
ready.


Also, the POLLEX_SET definition of (POLLPRI) doesn't seem relevant; so I 
might suggest removing the 'xfd' arg for the select()-based 
implementation:



POLLPRI

  There is some exceptional  condition  on  the  file  descriptor.
  Possibilities include:

  *  There is out-of-band data on a TCP socket (see tcp(7)).

  *  A  pseudoterminal  master  in  packet  mode  has seen a state
 change on the slave (see ioctl_tty(2)).

  *  A cgroup.events file has been modified (see cgroups(7)).



Of course, those definitions live in the Linux kernel source, and on Linux 
we'd get to use poll() instead of select().


Don't know if the select() <-> poll() correspondence differs at all on 
other platforms .. ?



(But, this does give me a bit more confidence about select() being a 
(mostly) viable alternative.)



Carl



Re: Feature request: Rename files with a decent name if two files have the same name.

2022-12-06 Thread Stéphane Archer
Thank you Gregory for your answer,

This is doing the rename to avoid the overwriting which is great!
The only problem is the "with a decent name" part that is missing, this is
changing the file extension.
gmv --backup=t foo.tar bar.tar
will result in bar.tar  bar.tar.~1~
is there any way to be a bit smarter than that?

Best regards

Stephane Archer

On Mon, Dec 5, 2022 at 1:05 PM Gregory Heytings 
wrote:

>
> >
> > I'm using the shell every day but there is one thing that terribly
> > annoys me.
> >
> > When moving files, if two files have the same name, you can overwrite it
> > with the default behavior, not move the file with -n or choose between
> > the two with -i, from my understanding the is no way to tell mv to
> > rename the file to avoid the overwrite with a decent name like img_2.jpg
> > for img.jpg
> >
> > Can you add this feature to mv?
> >
>
> You can use the --backup option.  For example,
>
> mv --force --backup=t foo bar
>
> will make a backup of bar in bar.~1~ before overwriting bar with foo.
>


-- 
Best Regards,

Stephane Archer


Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs

2022-12-06 Thread Carl Edquist via GNU coreutils General Discussion

Hi Arsen, thanks for your feedback!

On Tue, 6 Dec 2022, Arsen Arsenović wrote:

The stubborn part of me might say, for platforms that do not natively 
support poll(2), we could simply leave out support for this feature. 
If that's not acceptable, we could add a select(2)-based fallback for 
platforms that do not have a native poll(2).


There's no need to omit it.  iopoll() seems sufficiently easy to 
implement via select():


So, you're right...

I think the stubborn part of me was reacting (badly) to what looked to be 
a messy situation in tail.c's mix of poll() and select().  I got further 
disenchanted when i saw gnulib's select()-based emulation of poll()  :(


But hey, if it works...


Anyway a few comments:

(1.)


+#if defined _AIX || defined __sun || defined __APPLE__ || HAVE_INOTIFY


I know this is what tail.c has, but I would like to seriously rethink 
this.  What do these macros have to do with poll()?  (And just as 
importantly, how can you tell by looking at it?)


First of all, I think poll(2) (the native system one, not the gnulib 
emulation) should be used wherever possible.  Which is to say, select() 
should only be used where native poll() is unavailable (or if it's 
"available" but broken in some important way).


I believe the HAVE_INOTIFY bit is specific to tail.c, which relates to its 
inotify mode.  I don't think it should be part of the equation here in 
tee.c.


The remaining macros _AIX, __sun, __APPLE__ -- do these cover all systems 
that have a working native poll() -- eg, Linux and the BSDs?  (I suspect 
not?)  And they do not clearly communicate anything to the reader about 
why they were picked / what they represent in relation to the availability 
of poll().


*Ideally,* there might be some configure macro like HAVE_POLL, except one 
that refers specifically to the system one and not the gnulib emulation. 
Something like "HAVE_NATIVE_poll".  Then it would be clear what the 
preprocessor logic is trying to accomplish.



(2.) if we want to use #if preprocessor logic (and preferably "#if 
HAVE_NATIVE_poll") for this poll/select fallback mechanism, I'd suggest to 
include each entire version of the function (including the function header 
and comments) in separate preprocessor blocks.  (One complete function in 
#if, and the other in #else.) The iopoll() implementation I provided is 
straightforward by itself, but its readability gets trampled badly when 
trying to mix the body of the select()-based implementation in with it. 
Just my 2 cents.



(3.)


+  FD_SET(fdout, &xfd);
+  FD_SET(fdin, &xfd);


Is it documented somewhere that the 'exceptfds' argument (xfd) to 
select(2) can be used in a useful way here?


The only thing I've seen (from select_tut(2)) is that these can be used to 
detect out-of-band data for reading on a socket, or that control status 
information is available for a pty in packet mode (tty_ioctl(4)).  I 
believe both of these would be false positives if encountered on input, 
and it's not clear what they would mean for output.



(4.)


+  /* readable event on STDOUT is equivalent to POLLERR,
+ and implies an error condition on output like broken pipe.  */


I know this is what the comment from tail.c says, but is it actually 
documented to be true somewhere?  And on what platforms?  I don't see it 
being documented in my select(2) on Linux, anyway.  (Though it does seem 
to work.)  Wondering if this behavior is "standard".



I guess there is also a little gotcha.  In tee, all the outputs *except* 
stdout are explicitly open()ed, but stdout itself is not, and therefore it 
inherits the open mode that fd 1 had when tee was exec()ed.


In particular, if fd 1 / stdout is a socket, open for RW, then a "readable 
event on STDOUT" could be a legitimate indication that data is ready for 
reading.  So that would be a false positive causing stdout to be removed 
even though successful writes are still possible.


(Maybe nobody will care about this, but it's an interesting corner-case.)

...

"But other than that", sure!  I guess I'm all for it  :)

Carl


Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs

2022-12-06 Thread Arsen Arsenović

Arsen Arsenović  writes:

> From 582e0a27b7995aac90cc360463ec8bde1a44cfe4 Mon Sep 17 00:00:00 2001
> From: Paul Eggert 

^ Whoops, I forgot to fix this after committing with the wrong hash in
--reuse-commit.  I don't want to confuse anyone, I authored the patch.

Apologies.
-- 
Arsen Arsenović


signature.asc
Description: PGP signature


Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs

2022-12-06 Thread Arsen Arsenović
Hi Carl,

Carl Edquist  writes:

> Originally I had in mind to put the read() call inside the poll() loop. But if
> we keep this feature as an option, it felt it was a bit easier just to add the
> "if (pipe_check) {...}" block before the read().

Yes, I do agree that this is likely cleaner.

> For Pádraig, I think the same function & approach here could be used in other
> filters (cat for example).  The stubborn part of me might say, for platforms
> that do not natively support poll(2), we could simply leave out support for
> this feature.  If that's not acceptable, we could add a select(2)-based
> fallback for platforms that do not have a native poll(2).

There's no need to omit it.  iopoll() seems sufficiently easy to
implement via select():

From 582e0a27b7995aac90cc360463ec8bde1a44cfe4 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Mon, 5 Dec 2022 18:42:19 -0800
Subject: [PATCH] tee: Support select fallback path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

2022-12-06  Arsen Arsenović  

	iopoll: Support select fallback path
	* src/tee.c (iopoll): Add logic to enable select usage.
---
 src/tee.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/src/tee.c b/src/tee.c
index c17c5c788..7064ad27d 100644
--- a/src/tee.c
+++ b/src/tee.c
@@ -197,6 +197,7 @@ main (int argc, char **argv)
 static int
 iopoll(int fdin, int fdout)
 {
+#if defined _AIX || defined __sun || defined __APPLE__ || HAVE_INOTIFY
   struct pollfd pfds[2] = {{fdin, POLLIN, 0}, {fdout, 0, 0}};
 
   while (poll (pfds, 2, -1) > 0 || errno == EINTR)
@@ -206,7 +207,41 @@ iopoll(int fdin, int fdout)
   if (pfds[1].revents)/* POLLERR, POLLHUP, or POLLNVAL */
 return IOPOLL_BROKEN_OUTPUT;  /* output error or broken pipe   */
 }
+#else
+  int ret;
+  int bigger_fd = fdin > fdout ? fdin : fdout;
+  fd_set rfd;
+  fd_set xfd;
+  FD_ZERO(&xfd);
+  FD_ZERO(&rfd);
+  FD_SET(fdout, &rfd);
+  FD_SET(fdout, &xfd);
+  FD_SET(fdin, &xfd);
+  FD_SET(fdin, &rfd);
 
+  /* readable event on STDOUT is equivalent to POLLERR,
+ and implies an error condition on output like broken pipe.  */
+  while ((ret = select (bigger_fd + 1, &rfd, NULL, &xfd, NULL)) > 0
+	 || errno == EINTR)
+{
+  if (errno == EINTR)
+continue;
+
+  if (ret < 0)
+break;
+
+  if (FD_ISSET(fdout, &xfd) || FD_ISSET(fdout, &rfd))
+{
+  /* Implies broken fdout.  */
+  return IOPOLL_BROKEN_OUTPUT;
+}
+  else if (FD_ISSET(fdin, &xfd) || FD_ISSET(fdin, &rfd))
+{
+  /* Something on input.  Error handled in subsequent read.  */
+  return 0;
+}
+}
+#endif
   return IOPOLL_ERROR;  /* poll error */
 }
 
-- 
2.38.1


Note that I also needed to replace the ``/* falls through */'' comment
with [[fallthrough]]; to build your patch on gcc 12.2.1 20221008.
I'd guess there's some way to pick the correct marking method.

I tested both codepaths in the patch above on Linux.  I suggest adding
the test case I provided before to test on more platforms (and I'll give
some VMs a shot when I get home; currently in a lecture).

The API here seems quite general, I'd be surprised if other utils
couldn't make use of it too, though, maybe it should be given a slightly
more descriptive name (iopoll feels a bit broad, maybe select_inout ()
to signify that it makes a selection between one input or one output
exactly).

> Unique to tee is its multiple outputs.  The new get_next_out() helper simply
> advances to select the next remaining (ie, not-yet-removed) output. As
> described last time, it's sufficient to track a single output at a time, and
> perhaps it even simplifies the implementation.  It also avoids the need for a
> malloc() for the pollfd array before every read().

I think this is okay, I struggle to find a case where it couldn't work.

Note that removing polled files from a pollfd array does not require any
reallocation (just setting the fd to -1, as in the code I initially
posted), so there's no malloc either way ;).

Thanks for working on this, have a great day.
-- 
Arsen Arsenović


signature.asc
Description: PGP signature