Re: "here strings" and tmpfiles

2019-04-12 Thread konsolebox
On Fri, Apr 12, 2019 at 10:32 PM Chet Ramey  wrote:
>
> On 4/11/19 3:08 PM, konsolebox wrote:
>
> >>> It has slightly inconvenient semantics, in that you can't open it more
> >>> than once, and if you can't do that, you can't convert it from read-write
> >>> to readonly.
> >>
> >>
> >> Perhaps it can be reopened via /dev/fd.
> >
> > Also file sealing maybe.  The way it restricts writing is just a
> > little different.
>
> It seems like if you don't mmap the file, you won't create the shared
> writable mapping that would restrict you from sealing the file against
> writes.

Ok, thanks for the reply.  I'll check these theories myself next weekend.

-- 
konsolebox



Re: "here strings" and tmpfiles

2019-04-12 Thread Chet Ramey
On 4/11/19 3:08 PM, konsolebox wrote:

>>> It has slightly inconvenient semantics, in that you can't open it more
>>> than once, and if you can't do that, you can't convert it from read-write
>>> to readonly.
>>
>>
>> Perhaps it can be reopened via /dev/fd.
> 
> Also file sealing maybe.  The way it restricts writing is just a
> little different.

It seems like if you don't mmap the file, you won't create the shared
writable mapping that would restrict you from sealing the file against
writes.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-11 Thread Chet Ramey
On 4/11/19 3:15 AM, Robert Elz wrote:

>   (Substitute cat if you're that kind of weirdo!).

We're really going to throw down right here?


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-11 Thread Chet Ramey
On 4/11/19 12:12 AM, Jason A. Donenfeld wrote:
> I keep forgetting things. The other thing I wanted to bring up is that
> I suspect bash's actual implementation of temporary files is
> problematic and might have some of the classic /tmp and TOCTOU style
> attacks. 

It's a peripheral issue, since the here-document implementation uses a
different function that (usually) calls mkstemp.

But since this function is used for making non-regular files (named pipes),
you pretty much have to use a function that returns a name. If you'd like
to take a run at a better implementation, I'd be glad to take a look at it,
as long as it's portable.

> The first one there uses mktemp(3), which is known to be racy and
> insecure. The GNU man page has a pretty strong warning about it. Maybe
> that's not used in GNU environments though?

Read

http://lists.gnu.org/archive/html/bug-bash/2016-05/msg00062.html

for a different perspective.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-11 Thread Chet Ramey
On 4/11/19 12:02 AM, Jason A. Donenfeld wrote:
> Hi Chet,

Hi.

> I hope that can shed light on the motivation a bit. Pass got hit by
> this a bit ago:
> https://git.zx2c4.com/password-store/commit/?id=367efa5846492e1b0898aad8a2c26ce94163ba24

I note that the pipe-for-small-enough-heredocs works for this case.

> Anyway, the more interesting thing is discussing what a proper fix
> would be. Do you see anything conceptually wrong with the NONBLOCK
> approach I suggested? In theory, would that work? 

I'd prefer the fork-a-child-and-let-it-do-the-writing approach. The
question is where to place it on a list of issues.

> Another thing I was
> curious about is - what about internally treating "x <<< y" as
> "echo y | x"? Are these somehow not quite equivalent because x is in a
> subshell in one but not the other, or something like that? And if that's
> so, would my NONBLOCK suggestion incur similar issues?

They're quite semantically different -- subshells, pipes, different
expansion semantics, among others -- and result in additional, possibly
unexpected, issues.

For instance, consider what happens in your script when someone runs
it on a bash version that has been compiled for strict posix conformance,
including xpg_echo being on by default.
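
A minimal sketch of the xpg_echo point, assuming bash; the variable
names and the sample string are made up for illustration. With xpg_echo
enabled, echo expands backslash sequences, so a pipeline built on echo
can deliver different bytes than a here string does:

```bash
# Hypothetical payload containing a backslash sequence.
secret='a\tb'

shopt -u xpg_echo                  # usual bash default: echo keeps backslashes
plain=$(echo "$secret" | cat)

shopt -s xpg_echo                  # XPG behaviour: echo expands \t, \n, ...
expanded=$(echo "$secret" | cat)

herestr=$(cat <<< "$secret")       # a here string never interprets backslashes

shopt -u xpg_echo                  # restore the default for the rest of the script
```

Here plain and herestr hold the literal four characters, while expanded
holds a real tab between a and b.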

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-11 Thread konsolebox
On Thu, Apr 11, 2019 at 5:31 AM konsolebox  wrote:
>
> On Thu, Apr 11, 2019, 4:45 AM Chet Ramey  wrote:
>>
>> On 4/10/19 4:33 PM, konsolebox wrote:
>> > On Wed, Apr 10, 2019 at 11:15 PM Chet Ramey  wrote:
>> >> If we're going to go off into hypotheticals and speculation, it would be
>> >> nice if memfd_create were available universally.
>> >
>> > I see many parts in lib/* that adapt to available system features
>> > like mmap and MAP_ANONYMOUS.  I don't see why memfd_create should be
>> > an exception.  Using a volatile file is much better than forking, and
>> > this only requires a one-time implementation of a wrapper library
>> > function that returns -1 if the feature is not available in the
>> > system.  It should be easy to integrate with the current code since it
>> > returns an fd.
>>
>> It has slightly inconvenient semantics, in that you can't open it more
>> than once, and if you can't do that, you can't convert it from read-write
>> to readonly.
>
>
> Perhaps it can be reopened via /dev/fd.

Also file sealing maybe.  The way it restricts writing is just a
little different.

I just realized that memfd_create has the potential to allow
optimization of capturing output to a variable as well.  Rather than
transferring data through pipes, it can simply use a ram-based file.
In some way I think the file's data can be directly accessed and
allocated or assigned to the variable.  Maybe with the help of mmap.

It can even introduce a new general set of "String-IO" features to
bash, which is quite an exciting idea.  The compromise is to use
temporary files in systems that don't support anonymous RAM-based
files, just like how temporary files are currently used in here docs.
I can think of the following syntaxes for starters:

cmd >>> var       # Trailing newlines are kept and no implicit subshell.
var=${{echo a}}   # No implicit subshell, but trailing newlines are still
                  # trimmed.  Not really useful to me.
var >>| cmd       # Trailing newlines are kept.  Better than <<< if used with
                  # lastpipe, unless bash defaults to no fork unless necessary.

There should be more.

I bet scripters who like doing functional-like abomination in shells
where functions return values through echo or printf would be happy
with it because no more subshells.
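
For comparison, a sketch of how trailing newlines are usually preserved
with existing syntax: a sentinel character appended inside the command
substitution and stripped afterwards. It still forks a subshell, which
is the cost the syntaxes above are meant to avoid. The emit function is
hypothetical.

```bash
# Hypothetical producer whose output ends in three newlines.
emit() { printf 'line\n\n\n'; }

trimmed=$(emit)             # command substitution strips trailing newlines
kept=$(emit; printf x)      # the sentinel "x" protects them...
kept=${kept%x}              # ...and is removed once the capture is done

printf '%d %d\n' "${#trimmed}" "${#kept}"
```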

-- 
konsolebox



Re: "here strings" and tmpfiles

2019-04-11 Thread Chet Ramey
On 4/10/19 7:18 PM, Daniel Kahn Gillmor wrote:
> On Wed 2019-04-10 16:16:44 -0400, Chet Ramey wrote:
>> Is it just that people have not realized all along that most shells,
>> certainly all historical shells, that implement here documents use temp
>> files to do it? It's really only the ash-based shells (not an insignificant
>> portion of the shells in use, for sure) that use pipes exclusively.
> 
> Yes, i think most people have not realized that, all along.  I certainly
> didn't, and i've been using bash for decades now.

It's not just bash.

>> I'd love to see these concerns articulated concretely, so we can analyze
>> the risk in terms of the temp-file-on-disk implementation.
> 
> I tried to do that at the start of this thread, but i'm happy to try
> again.
> 
> data written to the local filesystem can be discovered by someone
> analyzing the disk controller data path, or by someone with access to
> the underlying storage medium.

If you have someone who can access that data path, or someone with access
to physical storage, I suggest that you have much greater concerns than
whether or not you can read data in a shell script. This is data that is,
for the most part, already in the script itself.

> So anyone who puts anything sensitive into a here document or herestring
> using bash is triggering a potential leak of that sensitive data.  While
> the contents of many here documents are clearly not sensitive, or are
> destined for the filesystem anyway, *some* data placed in heredocs *are*
> sensitive, and are intended to be ephemeral. 

Here's the thing. If this had been a concern, it would never have happened.
Shells have always used temp files for here documents. You can't just chalk
it up to people not knowing -- the risk has been deemed acceptable.


>> I'm sure Jason has been advocating that nobody use here documents since the
>> Bourne shell introduced them, since that's the position he advocates above.
> 
> AFAICT, Jason only recently became aware of this, so he certainly hasn't
> been advocating no one use here documents since the Bourne
> shell introduced them:

I'm exaggerating for effect, with maybe some snark. His position is, in
effect, that nobody should use here documents or here strings, since, by
his definition, every shell except ash descendents does it insecurely. I'm
sure it's more nuanced than that, and I admit I'm exaggerating for effect.
The intent is to show that people have known about this for 40 years and
to incorporate that into the risk calculation.


> He's not the only person who has used them in an attempt at simplicity
> or cleanliness only to find that this is a risk if sensitive data is
> included in them.
> 
>> Unnecessary hyperbole diminishes the force of your argument.
> 
> i wasn't intending to use unnecessary hyperbole

(It was the original message from Jason.)

> -- if you want to give
> guidance that most people can follow, it's necessary for the guidance to
> be clear and simple.  Complicated guidance gets ignored, or people get
> confused and can't follow it closely.

Starting off with things like "buggy, vulnerable, scary" and "don't use
herestrings" (or, by extension, here-documents) is unnecessary hyperbole
in my book. That's why I'm trying to solicit genuine use cases and have a
risk management discussion.

> fwiw, i'm not here to have an argument with you or to "win" at anything
> -- i'm trying to help improve bash and the ecosystem around it, the same
> as you are.  I certainly have no interest in having this be a contest.
> If you see this discussion as an argument, please let me know and i'll
> stop following up on this thread and adjust my expectations accordingly.

There's no "winning" here. It's a question of analyzing risk and deciding
where to place this issue in a priority-ordered list. An implementation has
a cost, even if I'm not the one who does it, and should have a
corresponding benefit.


> 
>> The real question is why the ephemerality of heredocs and herestrings is
>> important -- what is the use case that requires it, as opposed to it simply
>> being an implementation detail?
> 
> This is an implementation detail that isn't well-documented. 

Implementation details are rarely documented. I would argue that they
don't need to be.


>>> Either of these implementations would be great, though i note that how
>>> bash handles the extra fork (whether the parent or the child does the
>>> pipe writing) might complicate the use of the wait builtin (or any
>>> wait(-1) syscall) in the spawned process (e.g. similar to this bug
>>> report: https://bugs.debian.org/920455)
>>
>> It wouldn't really affect that. The reason `wait' waits for process
>> substitution processes is that they set $!, making them "known to the
>> shell" and subject to wait without arguments.
>>
>> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wait.html#tag_20_153
> 
> This behavior is actually different between bash 4.4.18 and 5.0, but i
> think this is a separate discussion, so i'll defer it to a different
> thread to avoid confusion here :)



Re: "here strings" and tmpfiles

2019-04-11 Thread Daniel Kahn Gillmor
On Thu 2019-04-11 10:04:02 +0200, Andreas Schwab wrote:
> On Apr 10 2019, Daniel Kahn Gillmor  wrote:
>
>> data written to the local filesystem can be discovered by someone
>> analyzing the disk controller data path, or by someone with access to
>> the underlying storage medium.
>
> Do you have swap enabled?

The machines i use that have swap have it enabled via dmcrypt with an
ephemeral key, so no cleartext RAM is ever written to disk.

This is pretty standard practice afaict.

  --dkg



Re: "here strings" and tmpfiles

2019-04-11 Thread konsolebox
On Thu, Apr 11, 2019, 10:42 PM Andreas Kusalananda Kähäri <
andreas.kah...@abc.se> wrote:

> On Thu, Apr 11, 2019 at 09:01:50PM +0800, konsolebox wrote:
> > On Thu, Apr 11, 2019, 4:04 PM Andreas Schwab  wrote:
> >
> > > On Apr 10 2019, Daniel Kahn Gillmor  wrote:
> > >
> > > > data written to the local filesystem can be discovered by someone
> > > > analyzing the disk controller data path, or by someone with access to
> > > > the underlying storage medium.
> > >
> > > Do you have swap enabled?
> > >
> >
> > It's 2019.
> >
> > --
> > konsolebox
>
> The point of Andreas' comment is, I presume, that if you have swap
> enabled, sensitive data may be written to that swap, either in low
> memory situations or when hibernating your laptop.  Discussion about
> whether temporary files are used or not for certain operations becomes
> less interesting if the data anyway runs the risk of being written to an
> unencrypted swap.
>

I know but then again that's no longer just about bash and should be
corrected on system level.

> It implicitly also gives the hint that using an encrypted temporary
> storage area may be considered by those with such needs (because they
> would hopefully already have thought about enabling some form of
> encryption of their swap partition or swap files).
>

Same argument.

--
konsolebox


Re: "here strings" and tmpfiles

2019-04-11 Thread Andreas Kusalananda Kähäri
On Thu, Apr 11, 2019 at 09:01:50PM +0800, konsolebox wrote:
> On Thu, Apr 11, 2019, 4:04 PM Andreas Schwab  wrote:
> 
> > On Apr 10 2019, Daniel Kahn Gillmor  wrote:
> >
> > > data written to the local filesystem can be discovered by someone
> > > analyzing the disk controller data path, or by someone with access to
> > > the underlying storage medium.
> >
> > Do you have swap enabled?
> >
> 
> It's 2019.
> 
> --
> konsolebox

The point of Andreas' comment is, I presume, that if you have swap
enabled, sensitive data may be written to that swap, either in low
memory situations or when hibernating your laptop.  Discussion about
whether temporary files are used or not for certain operations becomes
less interesting if the data anyway runs the risk of being written to an
unencrypted swap.

It implicitly also gives the hint that using an encrypted temporary
storage area may be considered by those with such needs (because they
would hopefully already have thought about enabling some form of
encryption of their swap partition or swap files).

I'm sorry for adding to this overly long thread.

Regards,

-- 
Andreas Kusalananda Kähäri,
National Bioinformatics Infrastructure Sweden (NBIS),
Uppsala University, Sweden.



Re: "here strings" and tmpfiles

2019-04-11 Thread konsolebox
On Thu, Apr 11, 2019 at 9:06 PM Greg Wooledge  wrote:
> So... yes.  Because everyone in 2019 has a laptop and therefore has swap
> enabled because it's used for hibernation.

Sure captain. It was a joke.

-- 
konsolebox



Re: "here strings" and tmpfiles

2019-04-11 Thread Greg Wooledge
On Thu, Apr 11, 2019 at 09:01:50PM +0800, konsolebox wrote:
> On Thu, Apr 11, 2019, 4:04 PM Andreas Schwab  wrote:
> > Do you have swap enabled?
> 
> It's 2019.

So... yes.  Because everyone in 2019 has a laptop and therefore has swap
enabled because it's used for hibernation.

That was what you meant, right?  I'm sure it was.



Re: "here strings" and tmpfiles

2019-04-11 Thread konsolebox
On Thu, Apr 11, 2019, 4:04 PM Andreas Schwab  wrote:

> On Apr 10 2019, Daniel Kahn Gillmor  wrote:
>
> > data written to the local filesystem can be discovered by someone
> > analyzing the disk controller data path, or by someone with access to
> > the underlying storage medium.
>
> Do you have swap enabled?
>

It's 2019.

--
konsolebox


Re: "here strings" and tmpfiles

2019-04-11 Thread Greg Wooledge
On Thu, Apr 11, 2019 at 06:02:41AM +0200, Jason A. Donenfeld wrote:
> what about internally treating "x <<< y" as "echo y | x"? Are these
> somehow not quite equivalent because x is in a subshell in one but not
> the other, or something like that?

cmd <<< string   opens a temporary file for writing, dumps the string
plus a newline to the temporary file, reopens the temporary file for
reading as file descriptor 0 (stdin), and then runs cmd.  If you want
only string (no newline) to be written, you cannot use this approach.

printf %s string | cmd   forks two subshells (or one if lastpipe is
enabled) with an anonymous pipe between them.  The first subshell runs
printf, and the second subshell runs cmd.  If you want to add a newline,
you may use '%s\n' instead of %s.  The subshell(s) are run as a single
"job" (process group) for job control purposes, unless monitor mode is
disabled.

cmd < <(printf %s string)   runs printf in a background subshell, and cmd
as a foreground command.  Bash will use either a named pipe or a /dev/fd/
entry to connect them, depending on the platform.

And, of course, you may always explicitly create your own temporary file
and write whatever you like into it (newlines or not); or explicitly
create your own named pipe and launch processes reading/writing from it
however you wish.
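
A small sketch of the newline difference between the three forms,
assuming bash; the count helper is made up for illustration and simply
reports how many bytes arrive on stdin:

```bash
# Hypothetical helper: byte count of stdin, with wc's padding stripped.
count() { wc -c | tr -d ' '; }

n_herestring=$(count <<< "string")            # string plus a newline
n_pipe=$(printf %s "string" | count)          # string only
n_procsub=$(count < <(printf %s "string"))    # string only

printf '%s %s %s\n' "$n_herestring" "$n_pipe" "$n_procsub"
```

The here string delivers 7 bytes, the other two deliver 6.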

I do not believe it would be a wise decision to "internally treat <<<
like a pipeline", because that removes one of the choices in the
script writer's toolbox.  If you don't like the semantics or the
implementation of your chosen tool, choose a different one.

Reducing the toolbox because 3 people out of the entire planet decided
to pass passwords via <<< without knowing how <<< works seems like a
really poor idea to me.



Re: "here strings" and tmpfiles

2019-04-11 Thread Andreas Schwab
On Apr 10 2019, Daniel Kahn Gillmor  wrote:

> data written to the local filesystem can be discovered by someone
> analyzing the disk controller data path, or by someone with access to
> the underlying storage medium.

Do you have swap enabled?

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



Re: "here strings" and tmpfiles

2019-04-11 Thread Robert Elz
    Date:        Thu, 11 Apr 2019 06:02:41 +0200
    From:        "Jason A. Donenfeld"



  | Now, it might be the case that bash really isn't the
  | right tool for that kind of thing, and I shouldn't use bash for tasks
  | with security requirements as such. But I sort of love bash, and have
  | written a lot of it, and I want it to be suitable for this.

This is exactly the attitude that ruins things.

Think of it being like

I love my dog.   He (or she) is familiar, friendly, and does
whatever I ask of him/her.   (Substitute cat if you're that kind
of weirdo!).

Now I want to play tennis.   Here boy (girl).   Hold onto this
racquet and hit the ball back.

Oh, you need hands, well, let's see how I can make that happen
You're not tall enough to hit my lobs, well, leg extensions coming 
up...

Use the right tool for the job at hand, don't try and make every job
fit the tool that you like best.   That is fruitless, and likely to
ruin the tool while you're trying to bend it into the shape you need.

kre




Re: "here strings" and tmpfiles

2019-04-10 Thread Jason A. Donenfeld
I keep forgetting things. The other thing I wanted to bring up is that
I suspect bash's actual implementation of temporary files is
problematic and might have some of the classic /tmp and TOCTOU style
attacks. The current implementation is three-fold via ifdefs:

char *
sh_mktmpname (nameroot, flags)
     char *nameroot;
     int flags;
{
  char *filename, *tdir, *lroot;
  struct stat sb;
  int r, tdlen;
  static int seeded = 0;

  filename = (char *)xmalloc (PATH_MAX + 1);
  tdir = get_tmpdir (flags);
  tdlen = strlen (tdir);

  lroot = nameroot ? nameroot : DEFAULT_NAMEROOT;

#ifdef USE_MKTEMP
  sprintf (filename, "%s/%s.XXXXXX", tdir, lroot);
  if (mktemp (filename) == 0)
    {
      free (filename);
      filename = NULL;
    }
#else  /* !USE_MKTEMP */
  sh_seedrand ();
  while (1)
    {
      filenum = (filenum << 1) ^
                (unsigned long) time ((time_t *)0) ^
                (unsigned long) dollar_dollar_pid ^
                (unsigned long) ((flags & MT_USERANDOM) ? random () : ntmpfiles++);
      sprintf (filename, "%s/%s-%lu", tdir, lroot, filenum);
      if (tmpnamelen > 0 && tmpnamelen < 32)
        filename[tdlen + 1 + tmpnamelen] = '\0';
#  ifdef HAVE_LSTAT
      r = lstat (filename, &sb);
#  else
      r = stat (filename, &sb);
#  endif
      if (r < 0 && errno == ENOENT)
        break;
    }
#endif /* !USE_MKTEMP */

  return filename;
}

The first one there uses mktemp(3), which is known to be racy and
insecure. The GNU man page has a pretty strong warning about it. Maybe
that's not used in GNU environments though?

The second one uses sh_seedrand(), which uses the bogus and
predictable random() libc stuff, or does nothing:

static void
sh_seedrand ()
{
#if HAVE_RANDOM
  int d;
  static int seeded = 0;
  if (seeded == 0)
    {
      struct timeval tv;

      gettimeofday (&tv, NULL);
      srandom (tv.tv_sec ^ tv.tv_usec ^ (getpid () << 16) ^ (uintptr_t)&d);
      seeded = 1;
    }
#endif
}

It then goes on to include a bunch of other things xored together and
whatnot, none of which are actually unpredictable.

Finally, in some cases where there isn't lstat, it uses plain stat
instead, which has problems. But even in the case where there's lstat,
it's calling it on the filename before opening it, resulting in
TOCTOU.
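
For contrast, a sketch of the race-free pattern at the script level,
where choosing the name and creating the file happen in one step. It
assumes the mktemp(1) utility (GNU coreutils or BSD, not POSIX), which
creates the file itself rather than only returning a name; the
"sketch" template is arbitrary. This says nothing about what bash does
internally.

```bash
# mktemp(1) picks the name and creates the file exclusively in one step,
# so there is no window between "check the name" and "open the file".
tmp=$(mktemp "${TMPDIR:-/tmp}/sketch.XXXXXX") || exit 1
trap 'rm -f "$tmp"' EXIT        # clean up when the script exits

printf 'data\n' > "$tmp"
contents=$(cat "$tmp")
```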

I haven't spent the 10 minutes reading onward at context and whatnot
to actually see what exactly you're doing; it's possible that you've
thought through all this and its particular use is fine somehow. But
from a cursory birdseye look, I wonder if there are three CVEs lurking
in here (mktemp, random stuff, toctou).

So just FYI.

But we could also avoid this entire tempfile creation discussion by
getting rid of the need for those functions in the first place. :)

Jason



Re: "here strings" and tmpfiles

2019-04-10 Thread Jason A. Donenfeld
On Thu, Apr 11, 2019 at 6:02 AM Jason A. Donenfeld  wrote:
> curious about is - what about internally treating "x <

Re: "here strings" and tmpfiles

2019-04-10 Thread Jason A. Donenfeld
Hi Chet,

On Wed, Apr 10, 2019 at 3:07 PM Chet Ramey  wrote:
> This is unnecessary hyperbole. The existing file-based mechanism works
> just fine. We're talking about what's essentially an optimization.
> [...]
> This doesn't make any sense.
> [...]
> There isn't an "insecure path."

I'm a bit late to the thread, so apologies if I came with some
presumptions about what we were discussing. I certainly didn't mean
any hyperbole. Rather, I have one very specific thing in mind: in some
security contexts, it's important that certain data doesn't hit the
disk but rather remains in memory. Talk all you want about how this
shouldn't be a real requirement, but actually, in certain contexts it
very much is. Now, it might be the case that bash really isn't the
right tool for that kind of thing, and I shouldn't use bash for tasks
with security requirements as such. But I sort of love bash, and have
written a lot of it, and I want it to be suitable for this. So from my
perspective, we're not talking about a mere optimization, but instead
something that either makes herestrings available for usage in this
context, or keeps them unavailable in this context.

I hope that can shed light on the motivation a bit. Pass got hit by
this a bit ago:
https://git.zx2c4.com/password-store/commit/?id=367efa5846492e1b0898aad8a2c26ce94163ba24

Anyway, the more interesting thing is discussing what a proper fix
would be. Do you see anything conceptually wrong with the NONBLOCK
approach I suggested? In theory, would that work? Another thing I was
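curious about is - what about internally treating "x <<< y" as
"echo y | x"? Are these somehow not quite equivalent because x is in a
subshell in one but not the other, or something like that? And if that's
so, would my NONBLOCK suggestion incur similar issues?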

Re: "here strings" and tmpfiles

2019-04-10 Thread Daniel Kahn Gillmor
On Wed 2019-04-10 16:16:44 -0400, Chet Ramey wrote:
> Is it just that people have not realized all along that most shells,
> certainly all historical shells, that implement here documents use temp
> files to do it? It's really only the ash-based shells (not an insignificant
> portion of the shells in use, for sure) that use pipes exclusively.

Yes, i think most people have not realized that, all along.  I certainly
didn't, and i've been using bash for decades now.

> I'd love to see these concerns articulated concretely, so we can analyze
> the risk in terms of the temp-file-on-disk implementation.

I tried to do that at the start of this thread, but i'm happy to try
again.

data written to the local filesystem can be discovered by someone
analyzing the disk controller data path, or by someone with access to
the underlying storage medium.

Even with encrypted block devices (e.g., dmcrypt), it's possible that
the key to the block device becomes known to the attacker, which allows
them to read the contents of blocks on the free list.

So anyone who puts anything sensitive into a here document or herestring
using bash is triggering a potential leak of that sensitive data.  While
the contents of many here documents are clearly not sensitive, or are
destined for the filesystem anyway, *some* data placed in heredocs *are*
sensitive, and are intended to be ephemeral.  Obvious examples include
cryptographic keying material or passwords known to the shell but not
currently on disk anywhere already.  But i'm sure there are many
non-obvious cases as well.  Part of the problem with evaluating this is
that there are many uses of bash that none of us on this list will ever
see.

> I'm sure Jason has been advocating that nobody use here documents since the
> Bourne shell introduced them, since that's the position he advocates above.

AFAICT, Jason only recently became aware of this, so he certainly hasn't
been advocating no one use here documents since the Bourne
shell introduced them:

  
https://git.zx2c4.com/password-store/commit/?id=367efa5846492e1b0898aad8a2c26ce94163ba24

He's not the only person who has used them in an attempt at simplicity
or cleanliness only to find that this is a risk if sensitive data is
included in them.

> Unnecessary hyperbole diminishes the force of your argument.

i wasn't intending to use unnecessary hyperbole -- if you want to give
guidance that most people can follow, it's necessary for the guidance to
be clear and simple.  Complicated guidance gets ignored, or people get
confused and can't follow it closely.

fwiw, i'm not here to have an argument with you or to "win" at anything
-- i'm trying to help improve bash and the ecosystem around it, the same
as you are.  I certainly have no interest in having this be a contest.
If you see this discussion as an argument, please let me know and i'll
stop following up on this thread and adjust my expectations accordingly.

> The real question is why the ephemerality of heredocs and herestrings is
> important -- what is the use case that requires it, as opposed to it simply
> being an implementation detail?

This is an implementation detail that isn't well-documented.  in the
initial report, part (d) suggests that if this is an intended
implementation detail, it ought to be called out explicitly in the
documentation, so that users at least have a chance of avoiding this
kind of leakage.  (at least, those users who read the documentation, anyway)

>> Either of these implementations would be great, though i note that how
>> bash handles the extra fork (whether the parent or the child does the
>> pipe writing) might complicate the use of the wait builtin (or any
>> wait(-1) syscall) in the spawned process (e.g. similar to this bug
>> report: https://bugs.debian.org/920455)
>
> It wouldn't really affect that. The reason `wait' waits for process
> substitution processes is that they set $!, making them "known to the
> shell" and subject to wait without arguments.
>
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wait.html#tag_20_153

This behavior is actually different between bash 4.4.18 and 5.0, but i
think this is a separate discussion, so i'll defer it to a different
thread to avoid confusion here :) 

> What's gained by making that guarantee that was keeping people from using
> here documents before? I'm genuinely curious about use cases.

The security use case is for handling data that may be sensitive and
ephemeral.  Not every program using bash even knows whether the data it
has access to is intended to be ephemeral (and bash certainly doesn't
know), so unless the operator is explicitly aiming to write data to the
disk, it would be great to avoid touching non-volatile storage.

I hope this helps in understanding this different perspective somewhat.

all the best,

--dkg




Re: "here strings" and tmpfiles

2019-04-10 Thread konsolebox
On Thu, Apr 11, 2019, 4:45 AM Chet Ramey  wrote:

> On 4/10/19 4:33 PM, konsolebox wrote:
> > On Wed, Apr 10, 2019 at 11:15 PM Chet Ramey  wrote:
> >> If we're going to go off into hypotheticals and speculation, it would be
> >> nice if memfd_create were available universally.
> >
> > I see many parts in lib/* that adapt to available system features
> > like mmap and MAP_ANONYMOUS.  I don't see why memfd_create should be
> > an exception.  Using a volatile file is much better than forking, and
> > this only requires a one-time implementation of a wrapper library
> > function that returns -1 if the feature is not available in the
> > system.  It should be easy to integrate with the current code since it
> > returns an fd.
>
> It has slightly inconvenient semantics, in that you can't open it more
> than once, and if you can't do that, you can't convert it from read-write
> to readonly.


Perhaps it can be reopened via /dev/fd.
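
A rough sketch of the reopening idea, assuming Linux, where opening
/dev/fd/N creates a fresh open of the underlying file (on some other
systems it behaves like dup and this would not work). The shell has no
memfd_create, so an unlinked temporary file stands in for the anonymous
file here; the descriptor numbers are arbitrary.

```bash
tmp=$(mktemp) || exit 1
exec 3<> "$tmp"            # read-write descriptor on the new file
rm -f "$tmp"               # unlink it: the file is now reachable only via fd 3
printf 'payload\n' >&3

exec 4< /dev/fd/3          # Linux: a new read-only open, starting at offset 0
exec 3>&-                  # drop the writable descriptor

IFS= read -r line <&4      # read back through the read-only descriptor
exec 4<&-
```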


--
konsolebox


Re: "here strings" and tmpfiles

2019-04-10 Thread Chet Ramey
On 4/10/19 4:33 PM, konsolebox wrote:
> On Wed, Apr 10, 2019 at 11:15 PM Chet Ramey  wrote:
>> If we're going to go off into hypotheticals and speculation, it would be
>> nice if memfd_create were available universally.
> 
> I see many parts in lib/* that adapt to available system features
> like mmap and MAP_ANONYMOUS.  I don't see why memfd_create should be
> an exception.  Using a volatile file is much better than forking, and
> this only requires a one-time implementation of a wrapper library
> function that returns -1 if the feature is not available in the
> system.  It should be easy to integrate with the current code since it
> returns an fd.

It has slightly inconvenient semantics, in that you can't open it more
than once, and if you can't do that, you can't convert it from read-write
to readonly.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-10 Thread Peter & Kelly Passchier
On 10/4/2019 09:04, Greg Wooledge wrote:
> On Wed, Apr 10, 2019 at 11:59:19AM -0400, Daniel Kahn Gillmor wrote:
>> If we look at the problem from the perspective of the risk of
>> herestring/heredoc content leaking to non-ephemeral storage,
> 
> The content is already in the damned SHELL SCRIPT.
> 
> How much more "non-ephemeral" can it get?

Both herestring and heredoc often contain variables, or some other
process substitution, for them to be completely literal is a less
interesting case for this issue.

Peter



Re: "here strings" and tmpfiles

2019-04-10 Thread konsolebox
On Wed, Apr 10, 2019 at 11:15 PM Chet Ramey  wrote:
> If we're going to go off into hypotheticals and speculation, it would be
> nice if memfd_create were available universally.

I see many parts in lib/* that adapt to available system features
like mmap and MAP_ANONYMOUS.  I don't see why memfd_create should be
an exception.  Using a volatile file is much better than forking, and
this only requires a one-time implementation of a wrapper library
function that returns -1 if the feature is not available in the
system.  It should be easy to integrate with the current code since it
returns an fd.

-- 
konsolebox



Re: "here strings" and tmpfiles

2019-04-10 Thread konsolebox
On Tue, Apr 9, 2019 at 10:01 PM Jason A. Donenfeld  wrote:
> A real solution for this issue involves getting rid of the temporary file
> altogether. Since we're talking about a bash string, it's already in
> memory. Why not just fork() if the write() will block? A simple way would be
> to always fork(). A fancy way would be to set NONBLOCK mode, see if it
> returns EAGAIN, and only fork() if the write would block. Either way seems
> basically fine, with the critical part being that the temporary file is
> totally gone from the equation.

Except you now added forking.

-- 
konsolebox



Re: "here strings" and tmpfiles

2019-04-10 Thread Chet Ramey
On 4/10/19 11:59 AM, Daniel Kahn Gillmor wrote:
> On Wed 2019-04-10 09:07:27 -0400, Chet Ramey wrote:

> I think we all agree that avoiding the filesystem where possible is
> likely to be an optimization and improvement (it means the heredoc will
> work in some circumstances where it didn't work before, and it means not
> having to juggle tmpfile names).
> 
> But as you said upthread, writing to disk is also a minor risk:
> 
>>>> It's a risk that most shells and shell users accept, and have for many
>>>> years. That doesn't suggest there's no risk, but that it's minor.

OK, so take a risk management approach. That risk is so minor that it has
existed -- and been accepted -- for as long as here documents have existed.
That risk I mentioned is not just security-related, though I imagine that's
where Jason's perspective is concentrated, but concerns resource shortages
(full disks) and availability (no writable file systems) as well. There is
such a thing as negligible risk, or risk that's easy to mitigate.

> 
> I think Jason is concerned about the risk specifically, and maybe doesn't
> care as much about the optimization.  I happen to care about both :)

Is it just that people have not realized all along that most shells,
certainly all historical shells, that implement here documents use temp
files to do it? It's really only the ash-based shells (not an insignificant
portion of the shells in use, for sure) that use pipes exclusively.

> 
> In particular, i care about the heredoc/herestring persistence risk
> because i know for a fact that some people don't understand that
> heredocs and herestrings aren't likely to be ephemeral, and have been
> using them with the intent of ephemerality.  I've personally harbored
> that misunderstanding in the past, and i'm pretty sure that there are
> other people using bash who are even less sophisticated about what's
> going on under the hood than i am.  I'd love to have bash protect those
> users from themselves at some point in the future automatically :)

I'd love to see these concerns articulated concretely, so we can analyze
the risk in terms of the temp-file-on-disk implementation.

>>>   - The security of this language construct is now OS and runtime-
>>> configuration dependent. That means it's not that reliable, and so
>>> we're basically back at advising square one: "don't use herestrings".
>>
>> This doesn't make any sense.
> 
> If we look at the problem from the perspective of the risk of
> herestring/heredoc content leaking to non-ephemeral storage, then
> i think Jason's perspective makes sense.  he's asking "how can we advise
> users of bash about the ephemerality of herestrings/heredocs?"  And if
> the answer is "it depends…" then the safe+simple guidance is "you MUST
> NOT assume that herestrings/heredocs are ephemeral".

I'm sure Jason has been advocating that nobody use here documents since the
Bourne shell introduced them, since that's the position he advocates above.
Unnecessary hyperbole diminishes the force of your argument.

The real question is why the ephemerality of heredocs and herestrings is
important -- what is the use case that requires it, as opposed to it simply
being an implementation detail?


>> If someone would like to take the code in the devel branch and add this, I
>> would certainly look at it.
> 
> Either of these implementations would be great, though i note that how
> bash handles the extra fork (whether the parent or the child does the
> pipe writing) might complicate the use of the wait builtin (or any
> wait(-1) syscall) in the spawned process (e.g. similar to this bug
> report: https://bugs.debian.org/920455)

It wouldn't really affect that. The reason `wait' waits for process
substitution processes is that they set $!, making them "known to the
shell" and subject to wait without arguments.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wait.html#tag_20_153

> 
> Anyway, if we had either fix in place, then the safe+simple guidance
> about ephemerality would be "as long as you're using bash version >=
> $SOME_VERSION, you can count on heredocs/herestrings being ephemeral",
> which would be super nice.

What's gained by making that guarantee that was keeping people from using
here documents before? I'm genuinely curious about use cases.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: "here strings" and tmpfiles

2019-04-10 Thread Greg Wooledge
On Wed, Apr 10, 2019 at 11:59:19AM -0400, Daniel Kahn Gillmor wrote:
> If we look at the problem from the perspective of the risk of
> herestring/heredoc content leaking to non-ephemeral storage,

The content is already in the damned SHELL SCRIPT.

How much more "non-ephemeral" can it get?



Re: "here strings" and tmpfiles

2019-04-10 Thread Daniel Kahn Gillmor
On Wed 2019-04-10 09:07:27 -0400, Chet Ramey wrote:
> On 4/9/19 2:56 AM, Jason A. Donenfeld wrote:
>> Since originally raising this issue with dkg (leading to this email
>> thread), I've only followed along from a bit of a distance. But it does
>> look like there's been some good progress: there's now a commit that
>> fills the pipe up to the OS's maximum pipe size, and then falls back to
>> the old (buggy, vulnerable, scary) behavior. 
>
> This is unnecessary hyperbole. The existing file-based mechanism works
> just fine. We're talking about what's essentially an optimization.

I think we all agree that avoiding the filesystem where possible is
likely to be an optimization and improvement (it means the heredoc will
work in some circumstances where it didn't work before, and it means not
having to juggle tmpfile names).

But as you said upthread, writing to disk is also a minor risk:

>>> It's a risk that most shells and shell users accept, and have for many
>>> years. That doesn't suggest there's no risk, but that it's minor.

I think Jason is concerned about the risk specifically, and maybe doesn't
care as much about the optimization.  I happen to care about both :)

In particular, i care about the heredoc/herestring persistence risk
because i know for a fact that some people don't understand that
heredocs and herestrings aren't likely to be ephemeral, and have been
using them with the intent of ephemerality.  I've personally harbored
that misunderstanding in the past, and i'm pretty sure that there are
other people using bash who are even less sophisticated about what's
going on under the hood than i am.  I'd love to have bash protect those
users from themselves at some point in the future automatically :)

>>   - The security of this language construct is now OS and runtime-
>> configuration dependent. That means it's not that reliable, and so
>> we're basically back at advising square one: "don't use herestrings".
>
> This doesn't make any sense.

If we look at the problem from the perspective of the risk of
herestring/heredoc content leaking to non-ephemeral storage, then
i think Jason's perspective makes sense.  he's asking "how can we advise
users of bash about the ephemerality of herestrings/heredocs?"  And if
the answer is "it depends…" then the safe+simple guidance is "you MUST
NOT assume that herestrings/heredocs are ephemeral".
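
One way to keep such data off the filesystem regardless of how a given
bash implements here strings is to feed it through a pipeline instead.
A minimal sketch, assuming a hypothetical helper named feed_via_pipe
(it is not part of bash). printf is a builtin, so the data is not
passed as an argument to an external program, but the pipeline does
fork a writer:

```shell
#!/usr/bin/env bash
# Hypothetical helper: pass a string to a command's stdin through a pipe
# rather than through a here string.
#   $1    the data to send
#   $2... the command (and its arguments) that reads the data on stdin
feed_via_pipe() {
    local data=$1
    shift
    # The builtin printf runs in a forked subshell and writes to the pipe;
    # no temporary file is involved.
    printf '%s\n' "$data" | "$@"
}

# Example: upper-case a string without using <<<
feed_via_pipe 'some text' tr a-z A-Z
```

The cost is the extra fork for the writing side of the pipeline.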

>> A real solution for this issue involves getting rid of the temporary file
>> altogether. Since we're talking about a bash string, it's already in
>> memory. Why not just fork() if the write() will block? A simple way would be
>> to always fork(). A fancy way would be to set NONBLOCK mode, see if it
>> returns EAGAIN, and only fork() if the write would block. Either way seems
>> basically fine, with the critical part being that the temporary file is
>> totally gone from the equation.
>
> If someone would like to take the code in the devel branch and add this, I
> would certainly look at it.

Either of these implementations would be great, though i note that how
bash handles the extra fork (whether the parent or the child does the
pipe writing) might complicate the use of the wait builtin (or any
wait(-1) syscall) in the spawned process (e.g. similar to this bug
report: https://bugs.debian.org/920455)

Anyway, if we had either fix in place, then the safe+simple guidance
about ephemerality would be "as long as you're using bash version >=
$SOME_VERSION, you can count on heredocs/herestrings being ephemeral",
which would be super nice.

Thanks for all the work on bash!

  --dkg



Re: "here strings" and tmpfiles

2019-04-10 Thread Chet Ramey
On 4/9/19 12:19 AM, Robert Elz wrote:
> Date:Mon, 08 Apr 2019 17:04:41 -0700
> From:L A Walsh 
> Message-ID:  <5cabe199.9030...@tlinx.org>
> 
>   | On 4/8/2019 7:10 AM, Chet Ramey wrote:
> 
>   | > Pipes are objectively not the same as files. They
>   | >
>   | > 1. Do not have file semantics. For instance, they are not seekable.
>   | >   
>   | In the case of an object that is only meant to be read from,
>   | I would argue, "that's fine".
> 
> For stdin (or stdout/stderr), processes in general should not assume that
> seek will ever work, as terminals aren't seekable, nor are pipes

This is true, and not the same thing as saying that such programs do not
exist. They have in the past, and I've heard about them in the context of
making them work. I hope they've all been consigned to the dustbin of history.


> It is a filename that connects to an underlying file descriptor.
> Or I assume that is how bash does it.

It's a filename that exposes a pipe. The pipe can be anonymous and use
/dev/fd or a FIFO.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-10 Thread Chet Ramey
On 4/9/19 7:48 PM, L A Walsh wrote:


> 
> I am aware of that, however, if a pipe implementation
> *stops* on reaching a full condition from some 'tmp-storage-space'
> and waits for space to become available, a similar dynamic would
> apply.  That's all. 

Is there a pipe implementation out there that uses temporary files? And
if there is, should I worry about that less-than-the-lowest-common-
denominator case?

> 
> Example:  Suppose output from a program
> was buffered to disk files 64k in size.  The reader
> process would get input from those buffers on disk and
> free the files as they are read.  If the writer ran out of
> space, then sleeping and retrying the operation would make
> sense, as it would be expected that the reader would be
> freeing blocks on disk as it read them.  It's not always
> a safe assumption, but what else can it do?

This doesn't have anything to do with the issue being discussed.

> 
>>   | Using a file doesn't sequence -- the writer can still continue
>>   | execution past the point of bash possibly flagging an internal
>>   | error for a non-existent tmp file (writable media) and the
>>   | reader won't get that the "pipe" (file) had no successful writer,
>>   | but instead get an EOF indication and continue, not knowing that
>>   | a fatal error had just occurred.
>>
>> I doubt that is what happens.
>>   
> 
> That is what appeared to happen in the post mentioned by Chet.
> The boot process got a /dev/fd/99 not found and continued on
> seemingly as though there had been no input.

I'm not sure what this means. Failing to open a file isn't a fatal script
error.

>>   | However, that would
>>   | be code in the pipe implementation or an IO library on top
>>   | of some StdIO implementation using such.
>>
>> Pipes are implemented in the kernel - userland does nothing different
>> at all (except the way they are created.)
>>   
> 
> They usually are.  That doesn't prevent a stdlib implementation
> putting a wrapper around some "non-compliant" kernel call
> to implement a different 'view' to the users of that lib.

If we're going to go off into hypotheticals and speculation, it would be
nice if memfd_create were available universally.

> 
>>   | W/pipes, there is the race condition of the reader not being able
>>   | to read in the condition where the writer has already gone away.
>>
>> Huh?   That's nonsense.   It is perfectly normal for a reader
>> to read long after the writer has finished and exited.   Try this
>>
>>  printf %s\\n hello | { sleep 5; cat; }
>>   
> ===
> It may be normal in some cases, but:
> 
> https://superuser.com/questions/554855/how-can-i-fix-a-broken-pipe-error

That case is exactly the opposite of the one being discussed here: it's a
write on a pipe when the reader has exited.

>> You are still missing Chet's point.   There is no "< <()" operator.
>> That is two bash syntax elements being combined.  "<" (redirect stdin)
>> and "<()" (create a name to refer to the output from the command).
>>   
> 
> I've never seen <() used without '<', so I thought it was
> part of the syntax '< <()'.  

You can't claim ignorance here. You've been told many times that these
are two separate elements.



> We are talking tradeoffs of using pipes to communicate
> heredocs vs. using a temporary file (presumably in/on /tmp), no?
> The statement reflected my thinking about how, currently,
> the entire contents of the pipe is being "spilled to disk"
> (spilled in the sense of their being insufficient room in
> memory -- or, in this case of there being no 'in-memory'
> implementation at all).

This makes no sense the way you wrote it. If you mean the current
implementation of using temporary files for here documents, you're
correct.


>>   | If bash uses /tmp, it can have a pipe of size 4.7G.  If
>>   | it uses memory, it would have pipe of 79G.
>>
>> That's gibberish.
>>   
> Oh please, it's not that obtuse.  If bash currently writes the
> entire contents of "whatever" it is (the here doc), to a temporary
> file, then it is limited by the space on the temporary file system.

If you want to make that point, use `file' instead of `pipe'. What you
wrote doesn't make any sense otherwise. The word `pipe' has a specific
meaning in this context.


> 
> But the implementation of process substitution in bash
> isn't implemented that way in the currently released version.  It
> uses a tmp file on a disk of fixed size to store *all* of the output
> of the 'writer' before the reader is called.

This is absolutely not true. If you believe this, it might be a reason
you have made incorrect assumptions about other things. Process
substitution uses pipes: anonymous pipes exposed through /dev/fd or FIFOs.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-10 Thread konsolebox
On Wed, Apr 10, 2019, 1:09 AM Eli Schwartz  wrote:
> That being said, it seems like a rather odd place to configure and use a
> heavyweight shell merely to allow third parties to include
> downstream-specific bashisms. I think there is a great deal of wisdom in
> the fact that the referenced issue (
> https://github.com/OpenRC/openrc/issues/288 ) is not accepted (it is
> still under discussion).

No there doesn't have to be.  The patch simply allows users who
already use bash as their shell to easily and properly configure bash
as the host shell for OpenRC.  Politics on how people should be forced
to stay conformant to conservative standards are out of context.
Also, oddness doesn't count as an argument.  And pretty much everyone
who would want to use bash knows that they'll be using a heavier
shell.

Any ideas about breakage, compatibility, change in software
feature or performance, something unwarranted and futuristic that
could harm the ecosystem, etc., are common knowledge and need not
be emphasized.  Anyone who knows how to properly hack software knows
when and when not to write portable software.  And they know what they
gain or lose.  Too much conformity hinders innovation.

> The commit itself has nothing to do with bash, and is just as useful for
> changing openrc to use, for example, a statically compiled POSIX sh
> shell that is less likely to break, while /bin/sh is a less
> system-critical component -- or even a symlink to the heavyweight bash
> that you don't want slowing down your boot process.

The commit was explicitly made to "make it possible to use bash for
service scripts".  It wasn't about using another POSIXist shell that
is likely to break less.  I thought you saw the referenced issue.
And it's been "committed", though still experimental.  I'm not sure
what you mean by "not accepted", or how important it is for the idea
to be.

--
konsolebox



Re: "here strings" and tmpfiles

2019-04-10 Thread Chet Ramey
On 4/9/19 10:07 AM, konsolebox wrote:

> Perhaps bash can also look at /dev/shm. It's a common tmpfs, but I
> haven't checked if it's standard and what utility mounts it.  I don't
> really use it.

Another non-portable feature.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-10 Thread Chet Ramey
On 4/9/19 2:56 AM, Jason A. Donenfeld wrote:
> Since originally raising this issue with dkg (leading to this email
> thread), I've only followed along from a bit of a distance. But it does
> look like there's been some good progress: there's now a commit that
> fills the pipe up to the OS's maximum pipe size, and then falls back to
> the old (buggy, vulnerable, scary) behavior. 

This is unnecessary hyperbole. The existing file-based mechanism works
just fine. We're talking about what's essentially an optimization.


> Seems like there are several
> problems with this approach:
> 
>   - Determining the maximum pipe size at build time doesn't make sense
> for systems where such a thing is actually determined (and adjustable)
> at runtime.

The alternative is to use PIPE_BUF, which would be fine but throw away a
lot of possible uses. For instance, my Mac OS X system has a pipe capacity
of 64K, but PIPE_BUF is set to 512 bytes. There are a lot of scripts that
could take advantage of that difference to use pipes.
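
The two limits can be inspected from a shell. A minimal sketch,
assuming getconf is installed; the /proc path is Linux-specific and is
only read if it exists:

```shell
#!/usr/bin/env bash
# PIPE_BUF: the largest write that POSIX guarantees to be atomic on a pipe.
# It says nothing about how much data the pipe can hold in total.
pipe_buf=$(getconf PIPE_BUF /)
echo "PIPE_BUF: $pipe_buf bytes"

# Linux only: the largest capacity an unprivileged process may request
# for a single pipe.
if [ -r /proc/sys/fs/pipe-max-size ]; then
    echo "pipe-max-size: $(cat /proc/sys/fs/pipe-max-size) bytes"
fi
```

POSIX only requires PIPE_BUF to be at least 512 bytes, which is why it
is a conservative bound compared with the actual pipe capacity.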

>   - The security of this language construct is now OS and runtime-
> configuration dependent. That means it's not that reliable, and so
> we're basically back at advising square one: "don't use herestrings".

This doesn't make any sense.

> 
>   - If user-supplied input is used in a herestring, the user now controls
> whether the secure path or the insecure path is used.

There isn't an "insecure path."

> 
> A real solution for this issue involves getting rid of the temporary file
> altogether. Since we're talking about a bash string, it's already in
> memory. Why not just fork() if the write() will block? A simple way would be
> to always fork(). A fancy way would be to set NONBLOCK mode, see if it
> returns EAGAIN, and only fork() if the write would block. Either way seems
> basically fine, with the critical part being that the temporary file is
> totally gone from the equation.

If someone would like to take the code in the devel branch and add this, I
would certainly look at it.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-10 Thread Chet Ramey
On 4/9/19 1:03 AM, pepa65 wrote:

> I think Linda's main drive is to seek improvement in how bash works. Now
> that lack of memory is in no way a constraint for the vast majority of
> situations where bash is commonly used, it would be great if that memory
> could be used instead of writing to a file system -- whether a pipe, a
> here doc/string does that, or explicitly through redirection. Things
> could work without requiring the presence of a file system.

There is no portable way to turn an arbitrary block of memory into a file
descriptor. There are mmap-based approaches that can get you most of the
way there, but they require a file descriptor to start with.

You can always invent some kind of local IPC that uses memory buffers, but
the implementation cost of doing that outweighs the benefit.


> If temporary files are not created in all cases of here docs/strings, it
> would be great if the buffer size that bash allocates could be set.

Some systems allow the pipe buffer size to be set, but that's not portable
either.

> Bash not writing temporary files for here strings & docs would be a
> great feature to me.

Look at the devel branch.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-10 Thread Greg Wooledge
On Tue, Apr 09, 2019 at 04:48:30PM -0700, L A Walsh wrote:
> But the implementation of process substitution in bash
> isn't implemented that way in the currently released version.  It
> uses a tmp file on a disk of fixed size to store *all* of the output
> of the 'writer' before the reader is called.

You're getting confused.

Process substitutions  <(cmd)  >(cmd)   use either a named pipe, or
an entry in /dev/fd/, depending on the platform, as determined at
build time.  The cmd is run as a child process in the background.
The <() or >() syntax is replaced by the filename of the named pipe or
/dev/fd/ entry.

Here documents  <<  and here strings  <<<  use temporary files.  No
new processes are created.
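
The difference can be observed from a script. A minimal sketch,
assuming a hypothetical function named herestring_kind; what it reports
for a here string depends on the bash version in use, since the devel
branch discussed in this thread changes small here strings to use a
pipe:

```shell
#!/usr/bin/env bash
# A process substitution is replaced by a filename: a /dev/fd entry or a
# named pipe, depending on how bash was built.
psub_name=$(echo <(true))
echo "process substitution expands to: $psub_name"

# Hypothetical helper: report what backs stdin for a here string.
# bash's test primaries treat /dev/stdin as file descriptor 0, so this
# inspects the descriptor the here string was attached to.
herestring_kind() {
    if [[ -p /dev/stdin ]]; then
        echo pipe
    elif [[ -f /dev/stdin ]]; then
        echo file
    else
        echo other
    fi <<< "$1"
}

echo "here string is backed by: $(herestring_kind 'some text')"
```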



Re: "here strings" and tmpfiles

2019-04-10 Thread Greg Wooledge
On Tue, Apr 09, 2019 at 04:48:30PM -0700, L A Walsh wrote:
> > cp <(process) /tmp/foo
> ---
> *red face*  I'd never tried to copy something that
> looked like input redirection.  My apologies on my misconception.

One of the archetypal examples that we give when explaining process
substitution to new people is:

diff -u <(sort file1) <(sort file2)



Re: "here strings" and tmpfiles

2019-04-09 Thread L A Walsh



On 4/8/2019 9:19 PM, Robert Elz wrote:
>  
>   | Optionally, I would accept that
>   | an implementation would support forward seeking as some equivalent
>   | to having read the bytes.
>
> I suppose one could make pipes do that, but no implementation I have
> ever seen does, so I don't think you should hold your breath waiting for that 
> one to happen.
>   
Never seen it either, and was only stating that I could see
it being supported as one can skip input, however, it's
counter-intuitive that any mechanism seeking backwards would
make sense.
>   | > 2. Have limited capacity. Writers will sleep when the pipe becomes full.
>   | >   
>   | So does a read-only disk, except writer doesn't flag the error to
>   | the reader in the same way a broken pipe would.
>
> Broken pipe wasn't Chet's point, rather with pipes it is possible to
> deadlock - an obvious example where a shell needs to be careful is
> in something like
>
>   X=$( cat << FOO )
>   

I am aware of that, however, if a pipe implementation
*stops* on reaching a full condition from some 'tmp-storage-space'
and waits for space to become available, a similar dynamic would
apply.  That's all. 

Example:  Suppose output from a program
was buffered to disk files 64k in size.  The reader
process would get input from those buffers on disk and
free the files as they are read.  If the writer ran out of
space, then sleeping and retrying the operation would make
sense, as it would be expected that the reader would be
freeing blocks on disk as it read them.  It's not always
a safe assumption, but what else can it do?

[explanation of data piping elided -- seems to be similar
to using a tmp-space in a manner similar to my example].


> In general here docs (and here strings) are overused ...
>   
---
Often the choice is based on intent and a matter of
script formatting.

> ...
>
>   | since writing to a read-only tmp or reading from a non
>   | existent file should be regarded as writing to a pipe with no
>   | listeners (because no one will ever be able to read from that
>   | 'tmp' file since it doesn't exist).
>
> Sorry, that makes no sense.   The file cases have no valid fd
> (opening a non-existent file fails, opening a file for writing
> on a read only filesys fails).   A better analogy would be when
> writing to a file fails when the filesystem becomes full, or the
> user's quota is exceeded.
>   
Precisely, you are correct.  I was referring to an attempt of
mapping errors in using a file for tmp-space into types of errors
one would normally get from a real pipe.

That said, I could also imagine trying to open output to a
process on a process of a different security level on a
mandatory-access controlled OS where the writer doesn't
have permission to write or send information to the
'reader'.  If that happened, I would think it would have
equivalent error semantics as trying to open
a write-FD, on a RO file system.  This would especially be true
if the device's RO-state wasn't known about until attempting
to write to it (like an unwritable CD media in a CD-writer device).

>   | Using a file doesn't sequence -- the writer can still continue
>   | execution past the point of bash possibly flagging an internal
>   | error for a non-existent tmp file (writable media) and the
>   | reader won't get that the "pipe" (file) had no successful writer,
>   | but instead get an EOF indication and continue, not knowing that
>   | a fatal error had just occurred.
>
> I doubt that is what happens.
>   

That is what appeared to happen in the post mentioned by Chet.
The boot process got a /dev/fd/99 not found and continued on
seemingly as though there had been no input.
>   | However, that would
>   | be code in the pipe implementation or an IO library on top
>   | of some StdIO implementation using such.
>
> Pipes are implemented in the kernel - userland does nothing different
> at all (except the way they are created.)
>   

They usually are.  That doesn't prevent a stdlib implementation
putting a wrapper around some "non-compliant" kernel call
to implement a different 'view' to the users of that lib.

>   | W/pipes, there is the race condition of the reader not being able
>   | to read in the condition where the writer has already gone away.
>
> Huh?   That's nonsense.   It is perfectly normal for a reader
> to read long after the writer has finished and exited.   Try this
>
>   printf %s\\n hello | { sleep 5; cat; }
>   
===
It may be normal in some cases, but:

https://superuser.com/questions/554855/how-can-i-fix-a-broken-pipe-error

I've encountered this error when I've used pipes. You may
not be seeing it due to buffer sizes (the default buffer size
on Linux is 1M).
>   | "Various purposes"...  Ok, so how do I give that file name
>   | to 'cp' in the next line and copy it somewhere?
>
> You mean
>
>   cp <(process) /tmp/foo
>
> It is, it has to be to work.
>   
---
*red face*  I'd never tried to copy 

Re: "here strings" and tmpfiles

2019-04-09 Thread Eli Schwartz
On 4/9/19 10:25 AM, konsolebox wrote:
> On Mon, Apr 8, 2019 at 10:39 PM Greg Wooledge  wrote:
>> That's incorrect in this context.  We're talking about boot scripts here,
>> not interactive user shells.  In boot scripts, on every operating system
>> I've ever used, the shell being used is either POSIX sh or Bourne sh.
>>
>> Everyone who writes boot scripts knows this.  Except, apparently, you.
> 
> Not everyone who aren't distro slaves.
> https://github.com/OpenRC/openrc/commit/d64c9d205083ca82823f9f5ff178a5581f6c8b2a
> 
> A group of "popular" or historical distros don't define how a Linux
> system should be built.

Arch Linux has used bash as the default system /bin/sh for as long as I
know of, including since before the switch from sysvinit to systemd.
(Although I'm by no means the only person to replace it with a symlink
to dash.)

That being said, it seems like a rather odd place to configure and use a
heavyweight shell merely to allow third parties to include
downstream-specific bashisms. I think there is a great deal of wisdom in
the fact that the referenced issue (
https://github.com/OpenRC/openrc/issues/288 ) is not accepted (it is
still under discussion).

The commit itself has nothing to do with bash, and is just as useful for
changing openrc to use, for example, a statically compiled POSIX sh
shell that is less likely to break, while /bin/sh is a less
system-critical component -- or even a symlink to the heavyweight bash
that you don't want slowing down your boot process.

-- 
Eli Schwartz
Arch Linux Bug Wrangler and Trusted User




Re: "here strings" and tmpfiles

2019-04-09 Thread konsolebox
On Tue, Apr 9, 2019 at 11:28 PM Chet Ramey  wrote:
>
> On 4/9/19 11:25 AM, konsolebox wrote:
> > On Tue, Apr 9, 2019 at 10:28 PM Chet Ramey  wrote:
> >>
> >> On 4/9/19 10:10 AM, konsolebox wrote:
> >>> On Wed, Mar 20, 2019 at 8:19 PM Greg Wooledge  wrote:
> >>>>
> >>>> Just like that one time L. Walsh tried to write a bash boot script that
> >>>> used <() to populate an array, and it failed because she was running
> >>>> it too early in the boot sequence, and /dev/fd/ wasn't available yet.
> >>>
> >>> @Chet, Isn't bash supposed to use named pipes alternatively, and
> >>> dynamically?
> >>
> >> No. It's a build-time decision, and /dev/fd is preferred.
> >
> > Why not make it load-time at least?
>
> Maybe someday, but it's extremely low priority.

Yeah, and also perhaps lazy initialization is better. Using load-time
means it doesn't matter if /dev/fd gets fixed later through
initialization of udev, etc.

-- 
konsolebox



Re: "here strings" and tmpfiles

2019-04-09 Thread Chet Ramey
On 4/9/19 11:25 AM, konsolebox wrote:
> On Tue, Apr 9, 2019 at 10:28 PM Chet Ramey  wrote:
>>
>> On 4/9/19 10:10 AM, konsolebox wrote:
>>> On Wed, Mar 20, 2019 at 8:19 PM Greg Wooledge  wrote:

>>>> Just like that one time L. Walsh tried to write a bash boot script that
>>>> used <() to populate an array, and it failed because she was running
>>>> it too early in the boot sequence, and /dev/fd/ wasn't available yet.
>>>
>>> @Chet, Isn't bash supposed to use named pipes alternatively, and
>>> dynamically?
>>
>> No. It's a build-time decision, and /dev/fd is preferred.
> 
> Why not make it load-time at least?  

Maybe someday, but it's extremely low priority.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-09 Thread konsolebox
On Tue, Apr 9, 2019 at 10:28 PM Chet Ramey  wrote:
>
> On 4/9/19 10:10 AM, konsolebox wrote:
> > On Wed, Mar 20, 2019 at 8:19 PM Greg Wooledge  wrote:
> >>
> >> Just like that one time L. Walsh tried to write a bash boot script that
> >> used <() to populate an array, and it failed because she was running
> >> it too early in the boot sequence, and /dev/fd/ wasn't available yet.
> >
> > @Chet, Isn't bash supposed to use named pipes alternatively, and
> > dynamically?
>
> No. It's a build-time decision, and /dev/fd is preferred.

Why not make it load-time at least?  Not that I really care, since I
know when I can use process substitution in my scripts and when not.

-- 
konsolebox



Re: "here strings" and tmpfiles

2019-04-09 Thread Chet Ramey
On 4/9/19 10:10 AM, konsolebox wrote:
> On Wed, Mar 20, 2019 at 8:19 PM Greg Wooledge  wrote:
>>
>> Just like that one time L. Walsh tried to write a bash boot script that
>> used <() to populate an array, and it failed because she was running
>> it too early in the boot sequence, and /dev/fd/ wasn't available yet.
> 
> @Chet, Isn't bash supposed to use named pipes alternatively, and
> dynamically?  

No. It's a build-time decision, and /dev/fd is preferred.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-09 Thread konsolebox
On Mon, Apr 8, 2019 at 10:39 PM Greg Wooledge  wrote:
> That's incorrect in this context.  We're talking about boot scripts here,
> not interactive user shells.  In boot scripts, on every operating system
> I've ever used, the shell being used is either POSIX sh or Bourne sh.
>
> Everyone who writes boot scripts knows this.  Except, apparently, you.

Not everyone who aren't distro slaves.
https://github.com/OpenRC/openrc/commit/d64c9d205083ca82823f9f5ff178a5581f6c8b2a

A group of "popular" or historical distros don't define how a Linux
system should be built.

-- 
konsolebox



Re: "here strings" and tmpfiles

2019-04-09 Thread Greg Wooledge
On Tue, Apr 09, 2019 at 10:10:44PM +0800, konsolebox wrote:
> @Chet, Isn't bash supposed to use named pipes alternatively, and
> dynamically?  Or does it just decide what to use based on the current
> system?

The second thing.  On platform X, bash uses named pipes.  On platform Y,
bash uses /dev/fd/.  It's decided at compile time.
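
To see which mechanism a particular bash binary was built with, one can
print what a process substitution expands to. This is only a minimal
sketch: the variable name is arbitrary, and the exact path printed will
differ from system to system.

```shell
#!/usr/bin/env bash
# <(cmd) expands to a pathname; echo prints it without opening it.
path=$(echo <(true))
case $path in
  /dev/fd/*) echo "this bash uses /dev/fd: $path" ;;
  *)         echo "this bash uses a named pipe: $path" ;;
esac
```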



Re: "here strings" and tmpfiles

2019-04-09 Thread konsolebox
On Wed, Mar 20, 2019 at 8:19 PM Greg Wooledge  wrote:
>
> Just like that one time L. Walsh tried to write a bash boot script that
> used <() to populate an array, and it failed because she was running
> it too early in the boot sequence, and /dev/fd/ wasn't available yet.

@Chet, Isn't bash supposed to use named pipes alternatively, and
dynamically?  Or does it just decide what to use based on the current
system?

-- konsolebox



Re: "here strings" and tmpfiles

2019-04-09 Thread konsolebox
On Wed, Mar 20, 2019 at 9:05 AM Robert Elz  wrote:
> Note: I am not suggesting bash should change - using files for here docs
> is the way they were originally implemented (in the Bourne sh) (though it
> had bugs, which could leave the files lying around in some cases).
>
> However, using files for here docs makes here docs unusable in a shell
> running in single user mode with no writable filesystems (whatever is
> mounted is read only, until after file system checks are finished).

Here docs and here strings are rarely used in pre-rw boot scripts, and
in my opinion should be avoided, but if it's necessary, an initramfs
should be used.  Some users can also mount /tmp as tmpfs earlier if
they know what they are doing.

Perhaps bash can also look at /dev/shm. It's a common tmpfs, but I
haven't checked if it's standard and what utility mounts it.  I don't
really use it.

Again to be clear, I'm against here * being used or viewed as seekable files.

-- konsolebox



Re: "here strings" and tmpfiles

2019-04-09 Thread Jason A. Donenfeld
Since originally raising this issue with dkg (leading to this email
thread), I've only followed along from a bit of a distance. But it does
look like there's been some good progress: there's now a commit that
fills the pipe up to the OS's maximum pipe size, and then falls back to
the old (buggy, vulnerable, scary) behavior. Seems like there are several
problems with this approach:

  - Determining the maximum pipe size at build time doesn't make sense
for systems where such a thing is actually determined (and adjustable)
at runtime.

  - The security of this language construct is now OS and runtime-
configuration dependent. That means it's not that reliable, and so
we're basically back at advising square one: "don't use herestrings".

  - If user-supplied input is used in a herestring, the user now controls
whether the secure path or the insecure path is used.

A real solution for this issue involves getting rid of the temporary file
altogether. Since we're talking about a bash string, it's already in
memory. Why not just fork() if the write() will block? A simple way would be
to always fork(). A fancy way would be to set NONBLOCK mode, see if it
returns EAGAIN, and only fork() if the write would block. Either way seems
basically fine, with the critical part being that the temporary file is
totally gone from the equation.
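
At the shell level, the "always fork()" variant behaves much like feeding
the string through a pipeline: the writer runs in a child process, blocks
whenever the pipe is full, and no temporary file is involved. A rough
sketch, assuming a string larger than the usual 64 KiB default pipe buffer
on Linux (the size and variable names are illustrative only):

```shell
#!/usr/bin/env bash
# Build a 200000-byte string, well above a typical pipe buffer.
big=$(head -c 200000 /dev/zero | tr '\0' 'x')
# printf runs in a forked child as part of the pipeline, so it can
# stall on a full pipe while wc drains it; no temp file is created.
count=$(printf '%s' "$big" | wc -c | tr -d ' ')
echo "bytes delivered: $count"
```

This is not the proposed in-shell implementation, just the same data flow
expressed with constructs that already exist.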

Thoughts on this?

Thanks,
Jason



Re: "here strings" and tmpfiles

2019-04-09 Thread Chet Ramey
On 4/9/19 8:36 AM, Greg Wooledge wrote:

> Bash always forks for $() as far as I'm aware, which is why bash 3.1
> introduced printf -v var.  

It's not, but it was a nice side effect.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-09 Thread Greg Wooledge
On Tue, Apr 09, 2019 at 02:32:38PM +0700, Robert Elz wrote:
> The idea is basically just to do
> 
>   var=$( cmd )
> 
> right?   But without a fork.   That's something that can be done today,
> no new syntax needed (bash might even do it sometimes, I don't know, the
> FreeBSD shell does.)

wooledg:~$ strace -o log bash -c 'x=$(echo hi)'
...
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7f5166f16a10) = 19218
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigaction(SIGCHLD, {sa_handler=0x562d61250410, sa_mask=[], 
sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f5166f50940}, 
{sa_handler=0x562d61250410, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, 
sa_restorer=0x7f5166f50940}, 8) = 0
close(4)= 0
read(3, "hi\n", 128)= 3
read(3, "", 128)= 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=19218, si_uid=563, 
si_status=0, si_utime=0, si_stime=0} ---
...

Bash always forks for $() as far as I'm aware, which is why bash 3.1
introduced printf -v var.  That's the only way to get printf-formatted
output into a bash variable without using a temp file or a fork.
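
The two forms can be put side by side. A small sketch (the variable names
are made up for the example):

```shell
#!/usr/bin/env bash
# Command substitution: bash forks a child to run printf.
forked=$(printf '%05d' 42)
# printf -v: the builtin stores the result directly, with no child process.
printf -v direct '%05d' 42
echo "$forked $direct"
```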



Re: "here strings" and tmpfiles

2019-04-09 Thread Robert Elz
Date: Mon, 8 Apr 2019 23:36:39 -0700
From: pepa65
Message-ID:  

  | When in the past I proposed this syntax:
  |  cmd >>>var
  | the idea was to commit the output of a command into memory (in the form
  | of a variable), without requiring a pipe or file.

In general that cannot work, cmd and the shell are in separate
processes, even if some form of shared memory were used, cmd would
somehow have to be taught how to do that - but only to do it when
that particular form of output is being used (which since redirects
are handled by the shell, it normally knows nothing about).

Of course if cmd is built into the shell, then it would be easy,
but inventing new syntax which only works in very special cases is
not a good idea.

The idea is basically just to do

var=$( cmd )

right?   But without a fork.   That's something that can be done today,
no new syntax needed (bash might even do it sometimes, I don't know, the
FreeBSD shell does.)

When cmd is not built in, then the shell simply forks, after making
a pipe, and it works as you'd expect.   But when cmd is built in, and
executing it will do no (lasting) damage to the shell execution environment,
then there's no need to fork.   Since neither printf nor echo affect the
execution environment at all, they're perfect cases for that kind of
optimisation (this is also a frequent idiom, so can have real benefits.)
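
Whether a given shell forks for a builtin inside $( ) can be checked from
a script. The sketch below assumes bash 4 or later, where BASHPID reports
the PID of the process actually executing; if the two values differ, a
child was created for the substitution.

```shell
#!/usr/bin/env bash
parent=$BASHPID
# echo is a builtin, so any PID change comes from the substitution itself.
inner=$(echo "$BASHPID")
if [ "$parent" = "$inner" ]; then
  echo "no fork for \$(echo ...)"
else
  echo "forked: parent $parent, substitution ran in $inner"
fi
```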

  | What is the technique you are referring to?

Exactly the above, if cmd is built in, its output goes into memory,
more or less what would happen (inside the shell, where all this is
happening) just as if its output were read from a pipe for a non-builtin,
but with no pipe (or other I/O) involved. Then that data is simply assigned.

The same technique works for stdin, in a case like cmd1 | cmd2
where both are builtin - cmd1 writes into a memory buffer, and cmd2
reads from that same thing (this needs care as the shell needs to
handle any scheduling that's required, running cmd1 until it
ends or the buffer fills, then cmd2 until it has consumed all
available, then back to cmd1 again...)   Whether this is worth the
effort is questionable.   The same can be done for here docs (or
strings) being read by built in commands, which was the actual case
I had in mind in the previous message.

Of course, there are often also easier techniques - a lot of the
examples being tossed around have easier (if perhaps more verbose)
ways to be written.   If you want to assign some known data (such
that you could put it in a here doc/string - which includes values
of variables, etc, of course) then rather than

read a b c <<< 'word1 word2 word3'

which is admittedly very compact, and looks cute, you can just do

a=word1 b=word2 c=word3

and the same when you're filling in an array, you just
need to explicitly add the subscripts.
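
Spelled out as a runnable sketch (the second set of variable names and the
array name are only there so the results can be compared):

```shell
#!/usr/bin/env bash
# Here-string form: read splits the string into three variables.
read -r a b c <<< 'word1 word2 word3'
# Plain assignments give the same values without any redirection.
x=word1 y=word2 z=word3
# Array version with explicit subscripts.
arr[0]=word1 arr[1]=word2 arr[2]=word3
echo "$a $b $c / $x $y $z / ${arr[*]}"
```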

I suspect that some of this is because bash's "readarray" is
slightly different than "read" or a simple assignment (this is
a guess based entirely upon bits and pieces I have picked up
from this list) - which is an example of why adding new special
case "stuff" is not a great idea in general, if it works just the
same as the existing stuff, then it isn't needed (perhaps just as
a frill for simplicity), if it doesn't, then it tends to interact
badly with everything else.


The point is that this kind of thing can be done just using optimisation
techniques of the current syntax - and a script that uses it will
work anywhere (just perhaps not as fast) - inventing new stuff to
try and make things work better is rarely a good idea, it just makes
the whole system a gigantic mess of ad-hoc special cases.

Of course, if there's a problem with the way that $( ) is defined
to work (like the trailing \n stripping, or whatever) that can be
addressed, either by some new syntax "this is just like that, except
that in the new one " or by some shell options that modify the
way that things work, which can be set by a script that knows what
it is doing (perhaps set inside the cmd substitution itself, so it
only affects that one, or outside to affect all of them.)   Of course,
any of that loses portability.
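
For the trailing \n case specifically, a common workaround with today's
syntax is to append a sentinel character inside the substitution and strip
it afterwards. A minimal sketch (the sentinel "x" and the variable names
are arbitrary):

```shell
#!/usr/bin/env bash
# $( ) strips every trailing newline from the captured output.
stripped=$(printf 'line\n\n')
# Workaround: append a sentinel character, then remove it afterwards.
kept=$(printf 'line\n\n'; printf x)
kept=${kept%x}
echo "stripped: ${#stripped} chars, kept: ${#kept} chars"
```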

  | But when data gets passed between commands, it would be great if memory
  | could be used for that, for various good reasons. :-)

That's what a pipe is.   In general there needs to be some mechanism
that is general enough that any random command works with it, and that
means having the kernel involved to manage access to the data safely.

A file in a memory backed filesystem (a tmpfs or whatever) isn't that
much different.

If you have a very specialised set of commands that want to communicate
with each other, then you can write them to use shared memory, and have
them communicate that way - but there is little chance that all the
standard commands (or even any of them) are suddenly going to be modified
to make that work for general use.   And there is certainly no way to
make that happen by some 

Re: "here strings" and tmpfiles

2019-04-09 Thread pepa65
On 8/4/2019 22:53, Robert Elz wrote:
> [Aside: when the destination is a builtin, another strategy is to
>  simply write the here doc into mem, and have the builtin read directly
>  from the mem buffer - no actual I/O of any kind involved at all.]

When in the past I proposed this syntax: cmd >>>var
the idea was to commit the output of a command into memory (in the form
of a variable), without requiring a pipe or file.

What is the technique you are referring to?

>   | I think Linda's main drive is to seek improvement in how bash works.
> 
> That's a fine objective - but remember that the shell's primary purpose
> is to run other commands (interactively or via a script) - the real
> work should normally be done by the other commands, not by the shell.

But when data gets passed between commands, it would be great if memory
could be used for that, for various good reasons. :-)

Peter



Re: "here strings" and tmpfiles

2019-04-08 Thread Robert Elz
Date: Mon, 8 Apr 2019 22:03:25 -0700
From: pepa65
Message-ID:  

  | What is nice about here docs/strings is that there are no subshells
  | involved.

When they use files, that's correct, but when a pipe is used, unless
the data size is both known in advance, and limited, a sub-shell is
needed just to write the here doc text through the pipe, so it is
able to stall when needed without affecting anything else.

That's why Chet suggested the possibility of using a pipe for small
here docs, and a file for big ones (though to me, a pipe for all makes
more sense - the difference between small and large is whether the
shell simply writes into the kernel pipe buffer, and then exec's the
process to read it (or reads itself for a builtin) or whether it
forks a subshell whose job is merely to feed the here doc data
into the pipe (allowing stalls when the pipe buffer fills)).

[Aside: when the destination is a builtin, another strategy is to
 simply write the here doc into mem, and have the builtin read directly
 from the mem buffer - no actual I/O of any kind involved at all.]

Using a sub-shell also helps with the posix semantics which require
that the redirects be evaluated in a sub-shell context (generally that
of the command they're being used with).   There are ways to fake that,
but the simple way is much easier, and more reliable.

  | I think Linda's main drive is to seek improvement in how bash works.

That's a fine objective - but remember that the shell's primary purpose
is to run other commands (interactively or via a script) - the real
work should normally be done by the other commands, not by the shell.
So, optimising how commands get located, exec'd, ... is all a great idea
(avoiding forking wherever possible) but spending lots of time and
adding all kinds of trash to allow complete programs to just be written
as sh script is probably the wrong approach - there are much better
programming languages around for general purpose programming than sh.
(Which to use depends upon the nature of the program.)

Also, in my earlier e-mail I gave this example:

  | printf %s\\n hello | { sleep 5; cat; } 

I realised after that a better test case is ...

   { date +1:%c; date +2:%c;} | { sleep 20; date +3:%c; cat;}; date +4:%c

when I ran it (using bash, if it matters, it shouldn't)

3:Tue Apr  9 11:54:20 2019
1:Tue Apr  9 11:54:00 2019
2:Tue Apr  9 11:54:00 2019
4:Tue Apr  9 11:54:20 2019

(and yes, I contrived to start the sequence just as the time reached the
start of a minute, so that is real output, but required me to hit "return"
at just the right time to generate it...)

In that the '3' line (the first) shows the time just before cat
starts, the '1' and '2' lines are output from cat, through the pipe.
The point of the second one (the '2' line) is to show that the first
date command did not stall waiting for the cat to start - it wrote
its output and exited, allowing the 2nd date command to run at the
same apparent time - 20 seconds before cat started reading anything.
That second date command finished just as quickly.   The write side
of the pipe would be closed after that, with the sleep still running.
Then date '3', the cat which reads the output from date 1 & 2, and
a final date ('4') just to round things off.   Reading from a pipe
after the writer has finished works just fine...
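
A shorter variant of the same experiment, for anyone who wants to repeat
it without timing the start of a minute. This is just a sketch: echo
stands in for date, and the sleep is cut to one second.

```shell
#!/usr/bin/env bash
# Both writers finish at once; the reader only starts after the sleep.
out=$({ echo one; echo two; } | { sleep 1; cat; })
echo "$out"
```

The data sits in the kernel pipe buffer until cat gets around to reading
it, even though the writing side has already exited.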

kre




Re: "here strings" and tmpfiles

2019-04-08 Thread pepa65
On 8/4/2019 21:19, Robert Elz wrote:
> In general here docs (and here strings) are overused - it is always
> possible to simply write a pipe instead

What is nice about here docs/strings is that there are no subshells
involved.

I think Linda's main drive is to seek improvement in how bash works. Now
that lack of memory is in no way a constraint for the vast majority of
situations where bash is commonly used, it would be great if that memory
could be used instead of writing to a file system -- whether a pipe, a
here doc/string does that, or explicitly through redirection. Things
could work without requiring the presence of a file system.

> Some do, actually - in fact, I think all do, they start off with
> no memory allocated, and grab more as data is written.   But they
> all have a limit on how much they will buffer for one pipe, otherwise
> one stupid process could clog the system for everyone (having no
> available memory/swap is a much worse situation than a filesystem
> simply being full.)

If temporary files are not created in all cases of here docs/strings, it
would be great if the buffer size that bash allocates could be set.

Bash not writing temporary files for here strings & docs would be a
great feature to me.

Peter



Re: "here strings" and tmpfiles

2019-04-08 Thread Robert Elz
Date: Mon, 08 Apr 2019 17:04:41 -0700
From: L A Walsh
Message-ID:  <5cabe199.9030...@tlinx.org>

  | On 4/8/2019 7:10 AM, Chet Ramey wrote:

  | > Pipes are objectively not the same as files. They
  | >
  | > 1. Do not have file semantics. For instance, they are not seekable.
  | >   
  | In the case of an object that is only meant to be read from,
  | I would argue, "that's fine".

For stdin (or stdout/stderr), processes in general should not assume that
seek will ever work, as terminals aren't seekable, nor are pipes, so if
some command is run as:
cmd
or
whatever | cmd

it cannot expect to seek stdin, it won't work, and it cannot really
tell the difference (well, it can, if it insists, but shouldn't)
between those and
cmd << EOF  (or <<< if you insist)
or
cmd < filename

so even if those happen to be seekable (the second is, the first
might be) cmd should never rely upon that.
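
For the curious, this is roughly how a script could tell the cases apart
if it insisted. The helper name kind_of_stdin is hypothetical, and the
temp file exists only to have something to redirect from; what a here
string reports depends on how the running bash implements it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report what kind of object stdin is.
kind_of_stdin() {
  if [ -t 0 ]; then echo terminal
  elif [ -p /dev/stdin ]; then echo pipe
  elif [ -f /dev/stdin ]; then echo file
  else echo other
  fi
}
tmp=$(mktemp)
from_pipe=$(echo data | kind_of_stdin)
from_file=$(kind_of_stdin < "$tmp")
# A here string may show up as a file or a pipe, depending on the bash build.
from_herestring=$(kind_of_stdin <<< data)
rm -f "$tmp"
echo "$from_pipe $from_file $from_herestring"
```

Which is exactly why a command should not depend on the answer.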

  | Optionally, I would accept that
  | an implementation would support forward seeking as some equivalent
  | to having read the bytes.

I suppose one could make pipes do that, but no implementation I have
ever seen does, so I don't think you should hold your breath waiting for
that one to happen.

  | > 2. Have limited capacity. Writers will sleep when the pipe becomes full.
  | >   
  | So does a read-only disk, except writer doesn't flag the error to
  | the reader in the same way a broken pipe would.

Broken pipe wasn't Chet's point, rather with pipes it is possible to
deadlock - an obvious example where a shell needs to be careful is
in something like

X=$( cat << FOO )

(where the here doc text is also there in whatever place the shell
in question requires) - and particularly if the shell happens to have
cat builtin, and is able to implement simple command substitutions
without forking.

There the shell (if it uses a pipe for the here doc) would be writing
to the pipe, and immediately reading it again (as a built-in cat with
no options or args simply connects its stdin to stdout).

The point is that at some point the pipe buffer fills, and the writing
process stalls (at some point it must - there is no other way - the only
question is how much data gets buffered) until the reading process
consumes some, and makes space for more.  But where the reader and writer
are the same process, that never happens.

In general here docs (and here strings) are overused - it is always
possible to simply write a pipe instead

printf %s\\n 'data' | cmd ...

instead of

cmd ... <<'EOF'
data
EOF

(or using "data" if the here doc is 

Re: "here strings" and tmpfiles

2019-04-08 Thread L A Walsh



On 4/8/2019 7:10 AM, Chet Ramey wrote:
> On 4/7/19 4:21 PM, L A Walsh wrote:
>   
>> On 3/22/2019 6:49 AM, Chet Ramey wrote:
>> 
>>> Yes, that's how bash chooses to implement it. There are a few portable
>>> ways
>>> to turn a string into a file descriptor, and a temp file is one of them (a
>>> child process using a pipe is another, but pipes have other issues).
>>>   
>>>   
>> Such as?  That are more common than having no writeable tmp?
>> 
>
> Pipes are objectively not the same as files. They
>
> 1. Do not have file semantics. For instance, they are not seekable.
>   
In the case of an object that is only meant to be read from,
I would argue, "that's fine".  Optionally, I would accept that
an implementation would support forward seeking as some equivalent
to having read the bytes.
> 2. Have limited capacity. Writers will sleep when the pipe becomes full.
>   
So does a read-only disk, except writer doesn't flag the error to
the reader in the same way a broken pipe would.  Instead, execution
proceeds as though nothing had happened -- and if stderr was mixed
in with hundreds of other startup lines might be what the user
would see (nothing happened) and wouldn't know something didn't
get initialized or brought up properly. 
> 3. Have ordering constraints: you can't write a pipe with no
> reader, for instance.
>
> These, unlike a "no writeable tmp," have been around for as 
> long as pipes have existed in Unix.
>   
The fact that the pipe does execution sequencing is often
a bonus, since writing to a read-only tmp or reading from a non
existent file should be regarded as writing to a pipe with no
listeners (because no one will ever be able to read from that
'tmp' file since it doesn't exist).

Using a file doesn't sequence -- the writer can still continue
execution past the point of bash possibly flagging an internal
error for a non-existent tmp file (writable media) and the
reader won't get that the "pipe" (file) had no successful writer,
but instead get an EOF indication and continue, not knowing that
a fatal error had just occurred.
> There is a middle ground, which is to use pipes for here 
> documents that are shorter than the pipe capacity, but fall 
> back to temp files for others, which doesn't require a child
> process. I implemented that in the devel version.
>   
I can't say that's wrong, though I would _like_ for the pipe to
try expanding its buffer via memory allocation, which no pipe
implementation, that I'm aware of, does.  However, that would
be code in the pipe implementation or an IO library on top
of some StdIO implementation using such.

W/pipes, there is the race condition of the reader not being able
to read in the condition where the writer has already gone away.
To avoid that I've had the parent send some message (signal,
semaphore, etc) to the child to indicate the parent has finished
reading what the child has written.  If the child's last write
included an "EOF", then the parent's msg to the child causes
the child to close the pipe and exit.
>   
>> Then came along a way to do a process in background and end up
>> with being able to read & process its data in the main (foreground) 
>> process w/this syntax:
>>
>> readarray -t foregnd < <(echo  $'one\ntwo\nthree')
>>
>> Which I envisioned as 
>> as implemented something like (C-ish example
>> 
>
> I don't think you've ever really understood that these are two
> separate constructs: process substitution, which turns a 
> process into a filename you can write to and read from for
> various purposes, and input redirection.
>   
"Various purposes"...  Ok, so how do I give that file name
to 'cp' in the next line and copy it somewhere?

It's not really a filename is it?  It's a file descriptor --
a handle -- just like a pipe is a handle, but there's no name
associated with it.  It doesn't have 'name' semantics where the
'name' is associated with a data-stream that can be read later.
They are different types of objects. 

A Name-object doesn't have the data in it, but can be passed
around, "dataless', with its data stored elsewhere.  An open
call can connect a program with the data stored for a given name.
Whereas what "< <()" creates is a file descriptor to be READ from.
The parent can't write to it with useful effect.  What's in
parens needs to generate some output.  That is read from the
parent, which is what it is used for.

When I use '< <()', I've never wanted a filename.  I've wanted:

readarray dlines < $("ls /tmp" | )

So that 'dlines' ends up in the parent when done.
I realize that 'lastpipe' was added at some point that,
used with some syntax, would allow me to put the last
item in a pipe in the parent.  But changing what side of
a pipe ends up persisting after, vs. using the above which
does a 1 time read to ensure output in parents ends up
persisting makes me more nervous than the 1-time usage.
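
For comparison, both ways of getting the output into an array in the
parent shell can be sketched as follows. This assumes bash 4.2 or later
for lastpipe and a non-interactive script (lastpipe only takes effect when
job control is off); printf stands in for a real command such as ls, and
the array names are made up.

```shell
#!/usr/bin/env bash
# Process substitution plus redirection: readarray runs in the parent.
readarray -t via_procsub < <(printf '%s\n' one two three)

# lastpipe: the final pipeline element runs in the current shell.
shopt -s lastpipe
printf '%s\n' one two three | readarray -t via_lastpipe
shopt -u lastpipe

echo "${#via_procsub[@]} ${#via_lastpipe[@]}"
```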

>> So I didn't realize instead of doing it simply using
>> native pipes like above, it was implemented some other 

Re: "here strings" and tmpfiles

2019-04-08 Thread Greg Wooledge
On Mon, Apr 08, 2019 at 10:53:46AM -0400, Chet Ramey wrote:
> On 4/8/19 10:36 AM, Greg Wooledge wrote:
> 
> > That's incorrect in this context.  We're talking about boot scripts here,
> > not interactive user shells.  In boot scripts, on every operating system
> > I've ever used, the shell being used is either POSIX sh or Bourne sh.
> 
> This is clearly wrong in general, though it might be true on systems you've
> used (e.g., Debian and Ubuntu in Linuxland). If you have a system where
> bash is installed as /bin/sh (e.g., RHEL or Fedora), that is the shell you
> use to write boot scripts.

I've used more than just Linux systems.  On most commercial Unix
derivatives, /bin/sh is either a stripped-down Korn shell variant, or
a legacy POSIX or Bourne shell.

On some systems (e.g. HP-UX 10), boot scripts use /sbin/sh which is
a statically-linked POSIX-based shell, and are only able to use other
statically linked tools from the /sbin directory, because the shared
libraries aren't mounted yet.  At least, until you get past the point
where everything is mounted.  If you're writing a boot script, YOU need
to know when and how that happens, and therefore which tools you can
use in which script.

But you're right: we actually *do* have at least one Red Hat (CentOS)
based system where /bin/sh links to bash.  It's the minority, though.



Re: "here strings" and tmpfiles

2019-04-08 Thread Chet Ramey
On 4/8/19 10:36 AM, Greg Wooledge wrote:

> That's incorrect in this context.  We're talking about boot scripts here,
> not interactive user shells.  In boot scripts, on every operating system
> I've ever used, the shell being used is either POSIX sh or Bourne sh.

This is clearly wrong in general, though it might be true on systems you've
used (e.g., Debian and Ubuntu in Linuxland). If you have a system where
bash is installed as /bin/sh (e.g., RHEL or Fedora), that is the shell you
use to write boot scripts.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-08 Thread Greg Wooledge
On Sun, Apr 07, 2019 at 01:06:21PM -0700, L A Walsh wrote:
> On 3/20/2019 5:19 AM, Greg Wooledge wrote:
> > Just like that one time L. Walsh tried to write a bash boot script that
> > used <() to populate an array, and it failed because she was running
> > it too early in the boot sequence, and /dev/fd/ wasn't available yet.

> /dev/fd was available, and so was /proc that it symlinked to.
> What wasn't available was "/tmp" being mounted as a writeable
> file system. I.e. -- exactly the case we are talking about being
> a problem *AGAIN*.

Sorry, I didn't remember it correctly.

The original thread appears to be here:
https://lists.gnu.org/archive/html/bug-bash/2014-10/msg00056.html

And you started a second thread here:
https://lists.gnu.org/archive/html/bug-bash/2014-10/msg00064.html

> Various boot processes use /dev and /proc before any file systems
> are mounted.  Requiring a mounted, writeable file system to run a shell
> script during boot was the reason I had problems.

As I said back then, and as I said more recently, if you're writing a
boot script (which is a fundamental piece of an operating system), you
really need to know what you're doing.  That includes knowing which
shell syntax features rely on which operating system features, and which
operating system features are (un)available at which times during the
boot sequence.

Either move your script so that it runs after /tmp becomes writable,
or use $TMPDIR to tell it to use a different place that IS writable for
its temp files, or write it to not use temp files.
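
The $TMPDIR option can be sketched like this. The scratch directory comes
from mktemp purely for illustration; in a boot script it would be whatever
location is known to be writable at that point (an already mounted tmpfs,
for example).

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
# The child bash creates any here-string temp file under $TMPDIR.
out=$(TMPDIR=$scratch bash -c 'cat <<< "hello"')
echo "$out"
rm -rf "$scratch"
```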

> > 2) Don't use bash for scripts that run early in the boot sequence.
> ---
> unacceptable as bash is used as *THE* de facto linux shell.

That's incorrect in this context.  We're talking about boot scripts here,
not interactive user shells.  In boot scripts, on every operating system
I've ever used, the shell being used is either POSIX sh or Bourne sh.

Everyone who writes boot scripts knows this.  Except, apparently, you.

> > 3) Whatever features you *do* use in boot scripts, make sure they're
> >available at the point in the boot sequence when the script runs.
> 
> Pipes are available in the OS before any user scripts are run.

We keep talking past each other and I don't know how to fix that.

You're advocating a fundamental change in a shell that is not even being
used in boot scripts (other than, apparently, by YOU).  The changes you
propose would potentially break many shell scripts that other people use,
just to make it possible for you (and nobody else) to do something unique.

I am advocating that you use the tools you already have, which are the
same tools that everyone else has been using for the same job you're
doing, for the last several decades.

> > 4) Whatever features you use in scripts *in general*, make sure you
> >understand how they work.
> 
> No... Do you understand how your TV works to watch it?   Or
> your microwave, in order to heat food.

For the microwave: I know that it has to be plugged in.  I know that
the circuit breakers have to be not-tripped.  I know that it emits
heat and possibly radiation and therefore I should not stack a bunch of
flammable things on top of it, or block the ventilation, or sit on it,
or do anything else that would expose myself to said radiation.

I know that it has a specific wattage (1200 W), which may be higher
or lower than the wattage specified on the box that the food came in
(typically higher), and therefore I may have to adjust the cooking time
(typically downward).  I know how to set the cooking intensity to less
than 100%.  I know that with my particular microwave, this setting means
the radiation is toggled on and off intermittently, not actually reduced.

For the television: I know that it has to be plugged in, and the circuit
breakers not-tripped.  I know that it has multiple inputs which are fed
by cables of various kinds.  I know which kinds of cables go in which
holes, and what devices those cables are attached to, and how to operate
those devices.  I know how to use the remote control, or the OTHER remote
control in an emergency, to switch among the various input sources.
I know how to change the batteries in the remote(s), and how to program
the secondary (universal) remote to talk to this particular television.

And so on, and so on.

I have the knowledge that is appropriate to what I am doing.

You're conflating PROGRAMMING and BEING A USER.  Each role has specific
knowledge requirements.  Those requirements are different.

> This attitude is why so
> many people have resisted using computers -- because programmers who
> made "friendly user interfaces", were outnumbered in the 1990's by
> those who got liberal arts degrees [...]

You are writing BOOT SCRIPTS.  These are low-level pieces of
infrastructure.  You are acting in the role of a SYSTEM PROGRAMMER.

That role is entirely different from end user.

Comparing your struggles to write boot scripts with anyone else's 

Re: "here strings" and tmpfiles

2019-04-08 Thread Chet Ramey
On 4/7/19 4:21 PM, L A Walsh wrote:
> On 3/22/2019 6:49 AM, Chet Ramey wrote:
>> Yes, that's how bash chooses to implement it. There are a few portable
>> ways
>> to turn a string into a file descriptor, and a temp file is one of them (a
>> child process using a pipe is another, but pipes have other issues).
>>   
> Such as?  That are more common than having no writeable tmp?

Pipes are objectively not the same as files. They

1. Do not have file semantics. For instance, they are not seekable.

2. Have limited capacity. Writers will sleep when the pipe becomes full.

3. Have ordering constraints: you can't write a pipe with no reader, for
   instance.

These, unlike a "no writeable tmp," have been around for as long as pipes
have existed in Unix.

There is a middle ground, which is to use pipes for here documents that
are shorter than the pipe capacity, but fall back to temp files for
others, which doesn't require a child process. I implemented that in the
devel version.
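
A minimal sketch for checking which mechanism a given bash actually uses
for a here-string. It assumes that test's -p and -f operators can inspect
/dev/stdin (bash emulates that name inside test and [[ when the OS does not
provide it). The function name stdin_kind is hypothetical. What the
here-string case reports depends on the bash version and the size of the
string, so no output is shown for it.

```shell
#!/usr/bin/env bash
# stdin_kind is a hypothetical helper, not part of bash.
# It reports what kind of object is connected to standard input.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo pipe
    elif [ -f /dev/stdin ]; then
        echo "regular file"
    else
        echo other
    fi
}

# An anonymous pipe created by a pipeline is reported as a pipe.
printf 'hello\n' | stdin_kind

# A here-string: a temp file or a pipe, depending on the bash version.
stdin_kind <<<"hello there"
```

A script that needs seekable input would want the "regular file" answer;
a script that must run without a writable filesystem would want "pipe".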

> Then came along a way to do a process in background and end up
> with being able to read & process its data in the main (foreground) 
> process w/this syntax:
> 
> readarray -t foregnd < <(echo  $'one\ntwo\nthree')
> 
> Which I envisioned as implemented something like (C-ish example

I don't think you've ever really understood that these are two separate
constructs: process substitution, which turns a process into a filename
you can write to and read from for various purposes, and input redirection.
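
To make the distinction concrete, here is a small sketch (readarray needs
bash 4 or later; the array name "lines" is arbitrary). Process substitution
by itself expands to a filename, and "<" is ordinary input redirection that
opens whatever filename follows it.

```shell
#!/usr/bin/env bash
# 1. Process substitution alone: <(...) is replaced by a filename,
#    typically /dev/fd/N, or a named pipe on systems without /dev/fd.
printf 'procsub expanded to: %s\n' <(:)

# 2. Input redirection alone: "<" opens the filename that follows it.
# 3. Both together: "< <(...)" redirects stdin from the procsub's filename.
readarray -t lines < <(printf '%s\n' one two three)
printf 'read %d lines; second is %s\n' "${#lines[@]}" "${lines[1]}"
```

The space between the two "<" characters matters: "<<(" would not be parsed
as a redirection followed by a process substitution.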


> So I didn't realize instead of doing it simply using
> native pipes like above, it was implemented some other way.

And that's probably why.

> 
> didn't understand the complexity of the need
> for < <( to need a named pipe or fifo)

That, too.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-08 Thread Chet Ramey
On 4/7/19 4:06 PM, L A Walsh wrote:

>> Just like that one time L. Walsh tried to write a bash boot script that
>> used <() to populate an array, and it failed because she was running
>> it too early in the boot sequence, and /dev/fd/ wasn't available yet
>>   
> 
> ---
> /dev/fd was available, and so was /proc that it symlinked to.
> What wasn't available was "/tmp" being mounted as a writeable
> file system. I.e. -- exactly the case we are talking about being
> a problem *AGAIN*.

He's probably referring to

http://lists.gnu.org/archive/html/bug-bash/2014-03/msg00181.html

where you wanted to use process substitution before /dev/fd was mounted.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-04-07 Thread L A Walsh
On 3/22/2019 6:49 AM, Chet Ramey wrote:
> Yes, that's how bash chooses to implement it. There are a few portable
> ways
> to turn a string into a file descriptor, and a temp file is one of them (a
> child process using a pipe is another, but pipes have other issues).
>   
Such as?  That are more common than having no writeable tmp?

Pipes are the first thing that most unix programmers using
a unix-like shell on a unix-like OS think of.  From
Tuesday, 2015-Oct-13 13:51:03 (-0700) on this list:
Subject: my confusion on various I/O redirections syntaxes and indirect
methods

Chet Ramey wrote:
> On 10/12/15 7:39 PM, Linda Walsh wrote:
>> Does it also use a tmp file and use process-substitution, or is
>> that only when parens are present?
> 
> Here-documents and here-strings use temporary files and open them as
> the standard input (or specified file descriptor) for the command.
> 
>> read a < <( echo x)
>>
>> I'm under the impression, uses a tmp file.
> 
> Why would you think that? 



Well, we have
"<< xxx"
as a HERE DOC using a tmp file.  Some time ago, the ability to do
"multiple assignments" at the same time was added; when I asked how to
do that, I was told to use:

read x y z <<< "one two three"

   (which I initially equated to something like:
(x y z)=(one two three)

That would be like the regular assignment:
xyz=(one two three)

but with the array syntax on the left, would do word
splitting on the left and assign to the individual vars; 
as I was searching for a way to do multiple assignments 
in the same statement).

Then came along a way to do a process in background and end up
with being able to read & process its data in the main (foreground) 
process w/this syntax:

readarray -t foregnd < <(echo  $'one\ntwo\nthree')

Which I envisioned as implemented something like (C-ish example,
off the top of my head, using Perl code I wrote to do the same):

  int savein,saveout;
  int pid;
  dup2(0, savein);
  dup2(1, saveout);

  int inout[2];

  #define stdin inout[0]
  #define stdout inout[1]

  pipe(,O_NONBLOCK);
  dupto(stdin,0);
  dupto(stdout,1);

   setup_childsighandler(to close 0 when child exits);

  if ($pid=fork()) {  #parent

    dupto(saveout,1);
    shell("readarray -t uservar("xyz")");   #reads from pipe:inout[0]
    #child handler closes 0
    dupto(savein,0);

  } else if (pid==0) {  #child

    close(0);
    shell("echo $'a\nb\nc'");   #output goes out on pipe:inout[1]
    exit(0);

  }

  ##parent continues -- no tmpfiles or named fifo's needed.
---

So I didn't realize instead of doing it simply using
native pipes like above, it was implemented some other way.

didn't understand the complexity of the need
for < <( to need a named pipe or fifo)

These examples and concepts came up when I 
was trying to write a bash script [running in early boot] that threw
out some error cases like /dev/fd/99 not found... [or
/tmp/tmpx2341 not found...]

> The documentation clearly says it uses a named
> pipe or a file descriptor associated with a /dev/fd filename (which happens
> to be a pipe in this case).

yeah with the "clear and unambiguous"[sic] syntax
of :

   <<  xxx
   <<< xxx
   <<< $(echo 'xxx')
   < < (xxx)

I can't imagine why I'd ever have been confused -- or,
given the pipe example above -- why any of the above
had to use [diskfile] based io.

So the fact that I get confused about what extra-complex
is used for which syntax isn't that surprising to me --
is it that surprising to you that given the complexities
chosen for implementation, why some people might be
confused about remembering the details of each when
they all could have been done without any [diskfile]
confusions??

==
(end quoted email)

Using tmp files instead of pipes is what MSDOS used to do to 
emulate pipes that unix had.  It was slow, clunky and not reliable
because the underlying file system wasn't always writeable.



Re: "here strings" and tmpfiles

2019-04-07 Thread L A Walsh
On 3/20/2019 5:19 AM, Greg Wooledge wrote:
> On Wed, Mar 20, 2019 at 07:49:34AM +0700, Robert Elz wrote:
>   
>> However, using files for here docs makes here docs unusable in a shell
>> running in single user mode with no writable filesystems (whatever is
>> mounted is read only, until after file system checks are finished).
>> 
>
> Meanwhile, proposals based around /dev/fd/* would also make here docs
> unusable in a shell running early in the boot process, before all
> file systems are mounted.
>
> Just like that one time L. Walsh tried to write a bash boot script that
> used <() to populate an array, and it failed because she was running
> it too early in the boot sequence, and /dev/fd/ wasn't available yet.
>   

---
/dev/fd was available, and so was /proc that it symlinked to.
What wasn't available was "/tmp" being mounted as a writeable
file system. I.e. -- exactly the case we are talking about being
a problem *AGAIN*.

Various boot processes use /dev and /proc before any file systems
are mounted.  Requiring a mounted, writeable file system to run a shell
script during boot was the reason I had problems.



> So, my counterpoints are:
>
> 1) Leave it alone.  It's fine.
>   

No, it's not -- it's been biting people for the past 4
years or more.
> 2) Don't use bash for scripts that run early in the boot sequence.
>   
---
unacceptable as bash is used as *THE* de facto linux shell.
> 3) Whatever features you *do* use in boot scripts, make sure they're
>    available at the point in the boot sequence when the script runs.
>   

Pipes are available in the OS before any user scripts are run.

> 4) Whatever features you use in scripts *in general*, make sure you
>    understand how they work.
>   

No... Do you understand how your TV works to watch it?   Or
your microwave, in order to heat food.   This attitude is why so
many people have resisted using computers -- because programmers who
made "friendly user interfaces", were outnumbered in the 1990's by
those who got liberal arts degrees and thought that qualified them
as a software programmer.   They often could write programs that
worked, but required more support and user training because most
of them don't know how to design something friendly. 

The features should behave according to the documents.  That's
why in some cases, I've tried to get wording improved - like the
person recently who couldn't find documentation for '+=-?' alongside
':+ := :- :?', because it was buried in a passing sub-clause
in a prior section (I never could find it either and assumed it
was some old shell practice that will be supported to the end of
time, but is no longer 'in favor', like $[integer exp] vs. using
$((integer exp))).
> Even if Chet changed how here docs work in bash 5.1, nobody would
> be safe to use those features in their "I'm feeding a password with
> a here string" scripts for at least 20 years, because there will
> still be people running older versions of bash for at least that long.
>   
---
So untrue.  If the system boots on bash5.1, because that's what
ships on linux 5.x from vendors, then that's what will be there.
We aren't porting OS-boot scripts from linux to machines that can't
run current software requirements.

Your script doesn't have to support Bourne Shell 1.0.  It might
have to support some posix implementation -- but bash doesn't even
support aliases working in interactive mode by default -- as required
to be posix compatible.  That means anyone relying on aliases to work
because they are using the posix requirements as a minimum, will be
surprised when a user uses bash and aliases are broken (don't work,
not enabled) by default -- they work in any posix compatible shell
which bash claims to be, but disassociates its posix mode from
some ancient-no alias mode such that toggling the posix bit resets
multiple features and doesn't save and restore those features when
toggling it back.  It's like its posix mode was designed to
not play well with normal bash function -- like it was designed to
be broken.

If you enter an optional mode that you later exit, it's a
basic computer software 'given' that the previous mode should be
restored, by default.  Global effects are generally considered
a poor practice because of the tendency to cause unexpected effects
"at a distance" (far from the point they were changed).



> Thus, leave it alone.
>
>   



Re: "here strings" and tmpfiles

2019-03-24 Thread konsolebox
On Tue, Mar 19, 2019, 9:36 PM Greg Wooledge  wrote:

> On Tue, Mar 19, 2019 at 09:20:33AM -0400, Daniel Kahn Gillmor wrote:
> > On Tue 2019-03-19 08:25:50 -0400, Greg Wooledge wrote:
> > > On Mon, Mar 18, 2019 at 05:18:10PM -0400, Daniel Kahn Gillmor wrote:
> > >> strace -o tmp/bash.herestring.strace -f bash -c 'cat <<<"hello there"'
> > >> It turns out that this creates a temporary file, actually touching the
> > >> underlying filesystem:
> > >
> > > Yes, just like here documents do.  And have always done, in all shells.
> >
> > Apologies for being unaware of the history.  It looks like there are a
> > handful of possible approaches today that minimize these fixes, which
> > may not have been possible on older systems, which i listed upthread.
> > And they work on arbitrary file descriptors, not just stdin.
> >
> > Do you think that bash should not improve the situation, at least on
> > platforms that support these other approaches?
>
> There are scripts that *rely* on the seekability of the temporary files
> created by here-documents and here-strings.  "Improving" the "situation"
> would break backward compatibility.
>

That's broken practice. They should use a real temporary "file" explicitly
if they want seekability.

-- konsolebox


Re: "here strings" and tmpfiles

2019-03-22 Thread Greg Wooledge
On Fri, Mar 22, 2019 at 10:28:52AM -0400, Chet Ramey wrote:
> On 3/20/19 8:19 AM, Greg Wooledge wrote:
> 
> > Even if Chet changed how here docs work in bash 5.1, nobody would
> > be safe to use those features in their "I'm feeding a password with
> > a here string" scripts for at least 20 years, because there will
> > still be people running older versions of bash for at least that long.
> 
> Snark aside, this is a serious consideration. Linux distros are still
> shipping with bash-4.2 (and not with all the patches applied, either),
> which was originally released in late 2010.

And Mac OS X is still shipping bash 3.2.  Please believe me, this was
intended as a completely serious issue.  We deal with it in IRC all the
time.



Re: "here strings" and tmpfiles

2019-03-22 Thread Chet Ramey
On 3/20/19 8:19 AM, Greg Wooledge wrote:

> Even if Chet changed how here docs work in bash 5.1, nobody would
> be safe to use those features in their "I'm feeding a password with
> a here string" scripts for at least 20 years, because there will
> still be people running older versions of bash for at least that long.

Snark aside, this is a serious consideration. Linux distros are still
shipping with bash-4.2 (and not with all the patches applied, either),
which was originally released in late 2010.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-03-22 Thread Chet Ramey
On 3/19/19 9:07 AM, Daniel Kahn Gillmor wrote:
> Thanks for the feedback, Eduardo--
> 
> On Mon 2019-03-18 17:40:17 -0700, Eduardo A. Bustamante López wrote:
>> I don't think the implementation details of herestrings are documented
>> anywhere, and I'm not too sure if they should (i.e. IMO if you need that
>> degree of control over the implementation details, then you should use
>> something other than shell).
> 
> I hear you in general -- i also don't want the documentation to be as
> detailed as the source code.  But casually sending ephemeral data to
> disk is a risk that i think ought to be avoided or at least avoidable.

It's a risk that most shells and shell users accept, and have for many
years. That doesn't suggest there's no risk, but that it's minor.


> If bash was in the habit of writing the environment to disk, i think
> users would rightly complain.

This is a straw man. I assume you're exaggerating for effect.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-03-22 Thread Chet Ramey
On 3/20/19 7:36 AM, Daniel Kahn Gillmor wrote:
> On Tue 2019-03-19 09:31:55 -0400, Greg Wooledge wrote:
>> There are scripts that *rely* on the seekability of the temporary files
>> created by here-documents and here-strings.  "Improving" the "situation"
>> would break backward compatibility.
> 
> i hope you noticed that of my suggested improvements, only one of them
> (a) breaks seekability.  Do you have a preference among the other
> proposals?  I'm partial to memfd_create(2) on platforms that support it,
> though i'm not sure how to turn that file descriptor into O_RDONLY
> before the exec.

I can't see one by looking at the man page on the web, but I don't have
ready access to a system that implements memfd_create.


>> There is simply NO valid reason to write <<<"$secret" in a script, and
>> thus there is no need to "improve" anything other than the scripts
>> that are doing that.  Use a pipe instead.
> 
> Not all tools take their secret inputs on stdin.  indeed, some are
> explicitly designed to accept special values on other file descriptors.
> 
> How do you replicate 3<<<"$secret" with a pipeline?

This is the kind of thing process substitution is good for.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: "here strings" and tmpfiles

2019-03-22 Thread Chet Ramey
On 3/19/19 8:49 PM, Robert Elz wrote:
> Date: Tue, 19 Mar 2019 08:25:50 -0400
> From: Greg Wooledge
> Message-ID:  <20190319122550.khv5jp66iobjo...@eeg.ccf.org>
> 
>   | Yes, just like here documents do.  And have always done, in all shells.
> 
> That's not correct.   There are shells that don't use files for here
> docs.   Any application that relies on stdin being seekable is broken
> (unless it makes that happen for itself) - the most obvious example
> which is not seekable is when stdin is a terminal.

I'm not saying such an application isn't broken. I'm saying that such
applications exist and have worked with bash.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-03-22 Thread Chet Ramey
On 3/18/19 8:40 PM, Eduardo A. Bustamante López wrote:

> Having said that, have you tried process substitution as an option?
> 
> You should be able to do something like:
> 
> 
>   mycommand < <(printf %s 'super secret')
> 
> 
> That will:
> 
> - not write the 'super secret' string to the file-system, nor
> - show the mentioned string in the process tree (because printf is a bash
>   built-in command, and thus, does not require a fork).
> 

That works if the process substitution implementation uses /dev/fd. If it
uses a named pipe, it will touch the file system.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-03-22 Thread Chet Ramey
On 3/19/19 8:25 AM, Greg Wooledge wrote:
> On Mon, Mar 18, 2019 at 05:18:10PM -0400, Daniel Kahn Gillmor wrote:
>> strace -o tmp/bash.herestring.strace -f bash -c 'cat <<<"hello there"'
>> It turns out that this creates a temporary file, actually touching the
>> underlying filesystem:
> 
> Yes, just like here documents do.  And have always done, in all shells.

Not quite all shells. Historical shells, yes, and many modern shells. The
ash-derived shells use pipes.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: "here strings" and tmpfiles

2019-03-22 Thread Chet Ramey
On 3/18/19 5:18 PM, Daniel Kahn Gillmor wrote:
> hi bash developers--
> 
> I ran the following command to get a sense of how bash deals with here
> strings under the hood:
> 
> strace -o tmp/bash.herestring.strace -f bash -c 'cat <<<"hello there"'
> 
> (i'm testing with bash 5.0-2 on debian testing/unstable).
> 
> It turns out that this creates a temporary file, actually touching the
> underlying filesystem:

Yes, that's how bash chooses to implement it. There are a few portable ways
to turn a string into a file descriptor, and a temp file is one of them (a
child process using a pipe is another, but pipes have other issues).


> I could find no mention in the bash(1) manpage of any risk of either
> here documents or here strings touching the underlying filesystem.

Why would the man page mention this?

> 
> I know that some systems use heredocs or herestrings explicitly to avoid
> things like:
> 
>  * writing to the filesystem,
>  * invoking extra processes, or
>  * making sensitive data available to the process table.

These are making assumptions about the underlying implementation that are
not guaranteed.

> So writing this stuff to the filesystem, where it is likely to touch the
> underlying disk, seems particularly problematic. And of course there is
> the potential for weird race conditions around filename selection common
> to all tmpfile-style shenanigans.
> 
> A few possible options for trying to improve the situation:
> 
>  a) use socketpair(2) or pipe(2) instead of making a tmpfile.  this has
> the potential downside that the semantics of access to the remaining
> file descriptor would be subtly different from "regular file"
> semantics.

Correct, plus a general implementation would require a child process to
send the data through the pipe.

> 
>  b) On systems that support O_TMPFILE, try something like
> open("/dev/shm", O_RDWR|O_CREAT|O_EXCL|O_TMPFILE).  /dev/shm tends
> to be a globally-writable tmpfs, so that avoids touching any disk,
> and O_TMPFILE avoids tmpfile-style race conditions.  This might need
> a fallback (to the current tmpdir selection mechanics?) in case
> /dev/shm isn't available.
> 
>  c) Just use O_TMPFILE with the current tmpdir selection mechanics, if
> it's supported.  This isn't quite as clever as trying to use
> /dev/shm first, and it won't fix the herestrings hitting the disk,
> but it at least avoids tmpfile races.

I prefer to support more portable alternatives. If someone wants to take a
run at an implementation that uses a pipe, I'd be happy to take a look at it.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: "here strings" and tmpfiles

2019-03-20 Thread Greg Wooledge
On Wed, Mar 20, 2019 at 07:49:34AM +0700, Robert Elz wrote:
> However, using files for here docs makes here docs unusable in a shell
> running in single user mode with no writable filesystems (whatever is
> mounted is read only, until after file system checks are finished).

Meanwhile, proposals based around /dev/fd/* would also make here docs
unusable in a shell running early in the boot process, before all
file systems are mounted.

Just like that one time L. Walsh tried to write a bash boot script that
used <() to populate an array, and it failed because she was running
it too early in the boot sequence, and /dev/fd/ wasn't available yet.

So, my counterpoints are:

1) Leave it alone.  It's fine.

2) Don't use bash for scripts that run early in the boot sequence.

3) Whatever features you *do* use in boot scripts, make sure they're
   available at the point in the boot sequence when the script runs.

4) Whatever features you use in scripts *in general*, make sure you
   understand how they work.

Even if Chet changed how here docs work in bash 5.1, nobody would
be safe to use those features in their "I'm feeding a password with
a here string" scripts for at least 20 years, because there will
still be people running older versions of bash for at least that long.

Thus, leave it alone.



Re: "here strings" and tmpfiles

2019-03-20 Thread Greg Wooledge
On Wed, Mar 20, 2019 at 07:36:41AM -0400, Daniel Kahn Gillmor wrote:
> How do you replicate 3<<<"$secret" with a pipeline?

Not strictly a pipeline, but:

3< <(printf %s "$secret")

This is actually preferred in many cases, because it doesn't add a
newline.  <<< always adds a newline to the result, because it's
mimicking here documents, which always end in a newline due to
their syntax.
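
A quick way to see the newline difference, and the fd 3 form in use. This
is only a sketch: "secret" holds a placeholder value, and the byte counts
are treated as numbers because some wc implementations pad their output.

```shell
#!/usr/bin/env bash
secret='hunter2'    # placeholder value for illustration only

# Here-string: bash appends a newline, so one extra byte arrives.
herestring_bytes=$(wc -c <<<"$secret")

# Process substitution with printf %s: exactly the bytes of the string.
procsub_bytes=$(wc -c < <(printf %s "$secret"))

echo "here-string: $((herestring_bytes)) bytes, procsub: $((procsub_bytes)) bytes"

# The same idea on file descriptor 3 instead of stdin. The redirection is
# attached to the group, so fd 3 is open by the time cat duplicates it.
{ from_fd3=$(cat <&3); } 3< <(printf %s "$secret")
printf 'read from fd 3: %s\n' "$from_fd3"
```

A consumer that treats the trailing newline as part of a password would
see different input from the two forms.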



Re: "here strings" and tmpfiles

2019-03-20 Thread Daniel Kahn Gillmor
On Tue 2019-03-19 09:31:55 -0400, Greg Wooledge wrote:
> There are scripts that *rely* on the seekability of the temporary files
> created by here-documents and here-strings.  "Improving" the "situation"
> would break backward compatibility.

i hope you noticed that of my suggested improvements, only one of them
(a) breaks seekability.  Do you have a preference among the other
proposals?  I'm partial to memfd_create(2) on platforms that support it,
though i'm not sure how to turn that file descriptor into O_RDONLY
before the exec.

> There is simply NO valid reason to write <<<"$secret" in a script, and
> thus there is no need to "improve" anything other than the scripts
> that are doing that.  Use a pipe instead.

Not all tools take their secret inputs on stdin.  indeed, some are
explicitly designed to accept special values on other file descriptors.

How do you replicate 3<<<"$secret" with a pipeline?

Thanks for helping to think this through!

Regards,

--dkg




Re: "here strings" and tmpfiles

2019-03-19 Thread Robert Elz
Date: Tue, 19 Mar 2019 08:25:50 -0400
From: Greg Wooledge
Message-ID:  <20190319122550.khv5jp66iobjo...@eeg.ccf.org>

  | Yes, just like here documents do.  And have always done, in all shells.

That's not correct.   There are shells that don't use files for here
docs.   Any application that relies on stdin being seekable is broken
(unless it makes that happen for itself) - the most obvious example
which is not seekable is when stdin is a terminal.

The same applies to any other file descriptor which is opened when
an application is started.

POSIX (XCU section 2.7.4) explicitly says (of here docs):

    It is unspecified whether the file descriptor is opened as a
    regular file, a special file, or a pipe. Portable applications
    cannot rely on the file descriptor being seekable (see XSH lseek( )).

Note: I am not suggesting bash should change - using files for here docs
is the way they were originally implemented (in the Bourne sh) (though it
had bugs, which could leave the files lying around in some cases).

However, using files for here docs makes here docs unusable in a shell
running in single user mode with no writable filesystems (whatever is
mounted is read only, until after file system checks are finished).

kre




Re: "here strings" and tmpfiles

2019-03-19 Thread Daniel Kahn Gillmor
On Tue 2019-03-19 08:25:50 -0400, Greg Wooledge wrote:
> On Mon, Mar 18, 2019 at 05:18:10PM -0400, Daniel Kahn Gillmor wrote:
>> strace -o tmp/bash.herestring.strace -f bash -c 'cat <<<"hello there"'
>> It turns out that this creates a temporary file, actually touching the
>> underlying filesystem:
>
> Yes, just like here documents do.  And have always done, in all shells.

Apologies for being unaware of the history.  It looks like there are a
handful of possible approaches today that minimize these fixes, which
may not have been possible on older systems, which i listed upthread.
And they work on arbitrary file descriptors, not just stdin.

Do you think that bash should not improve the situation, at least on
platforms that support these other approaches?

--dkg




Re: "here strings" and tmpfiles

2019-03-19 Thread Greg Wooledge
On Tue, Mar 19, 2019 at 09:20:33AM -0400, Daniel Kahn Gillmor wrote:
> On Tue 2019-03-19 08:25:50 -0400, Greg Wooledge wrote:
> > On Mon, Mar 18, 2019 at 05:18:10PM -0400, Daniel Kahn Gillmor wrote:
> >> strace -o tmp/bash.herestring.strace -f bash -c 'cat <<<"hello there"'
> >> It turns out that this creates a temporary file, actually touching the
> >> underlying filesystem:
> >
> > Yes, just like here documents do.  And have always done, in all shells.
> 
> Apologies for being unaware of the history.  It looks like there are a
> handful of possible approaches today that minimize these fixes, which
> may not have been possible on older systems, which i listed upthread.
> And they work on arbitrary file descriptors, not just stdin.
> 
> Do you think that bash should not improve the situation, at least on
> platforms that support these other approaches?

There are scripts that *rely* on the seekability of the temporary files
created by here-documents and here-strings.  "Improving" the "situation"
would break backward compatibility.

I already showed how to use a pipeline to send information from a shell
variable to another process's stdin without using a here-string (which
is a bashism that is much less portable than the pipeline, in addition
to being less "safe" in the incredibly narrow niche situation being
described in this thread).

There is simply NO valid reason to write <<<"$secret" in a script, and
thus there is no need to "improve" anything other than the scripts
that are doing that.  Use a pipe instead.
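
The earlier message with the pipeline is not reproduced here; a pipeline of
that general shape might look like the following sketch. consume_secret is
a hypothetical stand-in for whatever program reads the secret on stdin, and
the value of "secret" is a placeholder.

```shell
#!/usr/bin/env bash
secret='hunter2'    # placeholder; in practice read from a protected source

# consume_secret is a hypothetical stand-in for the real consumer.
# It reads all of stdin and reports how many characters it received.
consume_secret() {
    local input
    input=$(cat)
    printf 'received %d characters\n' "${#input}"
}

# printf is a builtin, so no external command line carries the secret,
# and the data travels through an anonymous pipe rather than a temp file.
printf '%s' "$secret" | consume_secret
```

Using printf '%s' rather than echo also avoids adding a trailing newline or
interpreting a leading "-n" or backslashes in the secret.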



Re: "here strings" and tmpfiles

2019-03-19 Thread Daniel Kahn Gillmor
Thanks for the feedback, Eduardo--

On Mon 2019-03-18 17:40:17 -0700, Eduardo A. Bustamante López wrote:
> I don't think the implementation details of herestrings are documented
> anywhere, and I'm not too sure if they should (i.e. IMO if you need that
> degree of control over the implementation details, then you should use
> something other than shell).

I hear you in general -- i also don't want the documentation to be as
detailed as the source code.  But casually sending ephemeral data to
disk is a risk that i think ought to be avoided or at least avoidable.
If bash was in the habit of writing the environment to disk, i think
users would rightly complain.

> Having said that, have you tried process substitution as an option?

sure, that's an option (as long as process substitution is enabled on
the platform -- apparently that's not universal either).  Also possible
(for stdin in particular) is sending data via a pipeline using bash
builtins.  Both of these require users of bash to rewrite their scripts
though.

It seems like it'd be preferable for the shell itself to avoid these
problems automatically, at least on platforms where it's possible to do
so.  Otherwise, we *require* users of the shell to know which things are
"safe" and which things aren't before they can use the shell safely.

  --dkg



Re: "here strings" and tmpfiles

2019-03-19 Thread Daniel Kahn Gillmor
On Mon 2019-03-18 17:18:10 -0400, Daniel Kahn Gillmor wrote:
> A few possible options for trying to improve the situation:
>
>  a) use socketpair(2) or pipe(2) instead of making a tmpfile.  this has
> the potential downside that the semantics of access to the remaining
> file descriptor would be subtly different from "regular file"
> semantics.
>
>  b) On systems that support O_TMPFILE, try something like
> open("/dev/shm", O_RDWR|O_CREAT|O_EXCL|O_TMPFILE).  /dev/shm tends
> to be a globally-writable tmpfs, so that avoids touching any disk,
> and O_TMPFILE avoids tmpfile-style race conditions.  This might need
> a fallback (to the current tmpdir selection mechanics?) in case
> /dev/shm isn't available.
>
>  c) Just use O_TMPFILE with the current tmpdir selection mechanics, if
> it's supported.  This isn't quite as clever as trying to use
> /dev/shm first, and it won't fix the herestrings hitting the disk,
> but it at least avoids tmpfile races.
>
>  d) If none of the above can be done, at the very least, bash(1)'s
> section on here docs and here strings should warn that the contents
> of these documents are likely to get written to the disk
> unprotected.

One more possibility for an implementation fix occurs to me (at least on
systems with Linux >= 3.17 and glibc >= 2.27):

 e) bash could use memfd_create(2) for heredocs and herestrings --
that should preserve "regular file" semantics, avoid tmpfile races,
and avoid hitting the disks.

--dkg




Re: "here strings" and tmpfiles

2019-03-19 Thread Greg Wooledge
On Mon, Mar 18, 2019 at 05:18:10PM -0400, Daniel Kahn Gillmor wrote:
> strace -o tmp/bash.herestring.strace -f bash -c 'cat <<<"hello there"'
> It turns out that this creates a temporary file, actually touching the
> underlying filesystem:

Yes, just like here documents do.  And have always done, in all shells.

> For example, sending a password or secret key material from the
> environment to stdin would be a typical way to use a herestring.

Note that the environment may also be visible to users on the system,
via "ps eww aux" or similar commands.  The availability of the BSD ps(1)
options and the details about who can see what are OS-specific.

> A few possible options for trying to improve the situation:

Don't put sensitive data in shell scripts?

In general:

1) Do not pass secrets as arguments to external commands.  The arguments
   of a command are generally visible in ps(1).

2) Do not pass secrets as environment variables.  The initial environment
   of a process is generally visible in ps(1).

3) Read the documentation of the thing you're trying to authenticate
   against.  Find out the various ways it can accept authentication
   secrets/tokens and choose the most appropriate.  This may mean using
   ssh keys stored in an ssh-agent, etc.

4) If something requires a password, let it prompt the user for the
   password by itself.  Just run it inside a terminal so that it can
   launch a dialog with the end user if required.  Do not try to "help"
   it by storing the password in your script and then trying to figure
   out how to circumvent its security in order to pass the password to it.

5) If you absolutely MUST store a password somewhere on disk, don't
   store it inside the shell script.  Shell scripts must have read
   permissions in order to be used.  Store the password in a separate
   file that doesn't have universal read permission, and let the
   appropriate process read that file.  The appropriate process may be
   your script in rare cases, but more often it'll be whatever program
   is actually going to use that password.

Thus, there is absolutely no reason you should ever have a secret password
inside a here document or here string.  That would mean the password
is hard-coded inside a shell script, which violates several of my points.

If the password has been read from a file and is now inside a shell
variable (NOT environment variable) in memory, and you want to pass it
on stdin to a process, you do that by running something like

printf %s\\n "$secret" | program

No here strings are wanted.  printf is a builtin, so you aren't violating
point 1.  $secret is not exported, so you aren't violating point 2.
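
A minimal sketch of that pattern, assuming the secret sits on the first
line of a file with restrictive permissions. The function name
feed_secret and its arguments are made up for illustration; they are not
part of bash or of any tool mentioned in this thread.

```shell
#!/usr/bin/env bash
# feed_secret FILE COMMAND [ARGS...]
# Hypothetical helper: read the first line of FILE into a local shell
# variable (never exported) and send it to COMMAND's stdin through a pipe.
feed_secret() {
    local file=$1
    shift
    local secret
    # read returns non-zero on a missing trailing newline, so also accept
    # a non-empty partial line; fail if nothing could be read at all.
    IFS= read -r secret < "$file" || [ -n "$secret" ] || return 1
    # printf is a builtin, so the secret never appears in an argv that
    # ps(1) could show, and no here string is involved.
    printf '%s\n' "$secret" | "$@"
}

# Usage (file name and program are placeholders):
#   feed_secret ~/.config/myapp/password program --password-from-stdin
```

The secret still lives in the shell's memory and in the pipe buffer, so
this only addresses the ps(1) and here-string concerns above, nothing
more.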

https://mywiki.wooledge.org/BashFAQ/069 and
https://mywiki.wooledge.org/BashFAQ/078 also touch on this, but they
could both use some expansion/rewriting, I see.



Re: "here strings" and tmpfiles

2019-03-18 Thread Eduardo A . Bustamante López
On Mon, Mar 18, 2019 at 05:18:10PM -0400, Daniel Kahn Gillmor wrote:
> hi bash developers--
(...)
>  a) use socketpair(2) or pipe(2) instead of making a tmpfile.  this has
> the potential downside that the semantics of access to the remaining
> file descriptor would be subtly different from "regular file"
> semantics.

Disclaimer: not a bash developer, just a user.

I don't think the implementation details of herestrings are documented anywhere,
and I'm not too sure if they should be (i.e. IMO if you need that degree of control
over the implementation details, then you should use something other than
shell).


Having said that, have you tried process substitution as an option?

You should be able to do something like:


  mycommand < <(printf %s 'super secret')


That will:

- not write the 'super secret' string to the file-system, nor
- show the mentioned string in the process tree (because printf is a bash
  built-in command, so the string never appears in the argument list of an
  exec'ed external program).
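
A rough way to check the first point is to ask bash what kind of object
stdin is when it comes from a process substitution. The sketch below
relies on bash's test builtin treating /dev/stdin as file descriptor 0;
the function name stdin_kind is hypothetical.

```shell
#!/usr/bin/env bash
# Report whether stdin (fd 0) is a pipe/FIFO, a regular file, or
# something else.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo pipe
    elif [ -f /dev/stdin ]; then
        echo file
    else
        echo other
    fi
}

# Process substitution hands the command a pipe (or a FIFO), not a
# temporary regular file.
stdin_kind < <(printf %s 'super secret')
```

Running the same function with a here string (stdin_kind <<<"x") shows
what that construct uses instead; the answer may differ between bash
versions, so try it on the one you actually deploy.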



"here strings" and tmpfiles

2019-03-18 Thread Daniel Kahn Gillmor
hi bash developers--

I ran the following command to get a sense of how bash deals with here
strings under the hood:

strace -o tmp/bash.herestring.strace -f bash -c 'cat <<<"hello there"'

(i'm testing with bash 5.0-2 on debian testing/unstable).

It turns out that this creates a temporary file, actually touching the
underlying filesystem:

[…]
18557 openat(AT_FDCWD, "/home/dkg/tmp/sh-thd.UCPAvB", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
18557 fchmod(3, 0600)   = 0
18557 fcntl(3, F_SETFD, FD_CLOEXEC) = 0
18557 write(3, "hello there", 11)   = 11
18557 write(3, "\n", 1) = 1
18557 openat(AT_FDCWD, "/home/dkg/tmp/sh-thd.UCPAvB", O_RDONLY) = 4
18557 close(3)  = 0
18557 unlink("/home/dkg/tmp/sh-thd.UCPAvB") = 0
18557 fchmod(4, 0400)   = 0
18557 dup2(4, 0)= 0
18557 close(4)  = 0
18557 execve("/bin/cat", ["cat"], 0x5577ec52b8e0 /* 49 vars */) = 0
[…]

I could find no mention in the bash(1) manpage of any risk of either
here documents or here strings touching the underlying filesystem.
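
For readers without strace, a Linux-only sketch that shows where a here
string ends up is to resolve the stdin descriptor through /proc. The
function name stdin_target is hypothetical, and the fallback message is
only there for systems that lack /proc/self/fd.

```shell
#!/usr/bin/env bash
# Print what file descriptor 0 points at, as reported by the kernel.
stdin_target() {
    if [ -L /proc/self/fd/0 ]; then
        # readlink inherits our stdin, so its own fd 0 is the same object.
        readlink /proc/self/fd/0
    else
        echo "unknown (no /proc/self/fd on this system)"
    fi
}

# With the behaviour traced above, this names an already-unlinked
# sh-thd.* file under the tmpdir; other versions or platforms may differ.
stdin_target <<<"hello there"
```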

I know that some systems use heredocs or herestrings explicitly to avoid
things like:

 * writing to the filesystem,
 * invoking extra processes, or
 * making sensitive data available to the process table.

For example, sending a password or secret key material from the
environment to stdin would be a typical way to use a herestring.

So writing this stuff to the filesystem, where it is likely to touch the
underlying disk, seems particularly problematic. And of course there is
the potential for weird race conditions around filename selection common
to all tmpfile-style shenanigans.

A few possible options for trying to improve the situation:

 a) use socketpair(2) or pipe(2) instead of making a tmpfile.  this has
the potential downside that the semantics of access to the remaining
file descriptor would be subtly different from "regular file"
semantics.

 b) On systems that support O_TMPFILE, try something like
open("/dev/shm", O_RDWR|O_CREAT|O_EXCL|O_TMPFILE).  /dev/shm tends
to be a globally-writable tmpfs, so that avoids touching any disk,
and O_TMPFILE avoids tmpfile-style race conditions.  This might need
a fallback (to the current tmpdir selection mechanics?) in case
/dev/shm isn't available.

 c) Just use O_TMPFILE with the current tmpdir selection mechanics, if
it's supported.  This isn't quite as clever as trying to use
/dev/shm first, and it won't fix the herestrings hitting the disk,
but it at least avoids tmpfile races.

 d) If none of the above can be done, at the very least, bash(1)'s
section on here docs and here strings should warn that the contents
of these documents are likely to get written to the disk
unprotected.

Does this make sense?  If this has been raised and discussed elsewhere,
please don't hesitate to point me to any archives of that discussion.

Please keep me in cc during any replies, i'm not subscribed to
bug-bash@gnu.org.

Regards,

--dkg

