Re: #!/bin/sh & execve

2003-03-03 Thread abc
> > the method used by FBSD 2.2.7 seems the most sane to me,
> > where execve's argv[] is loaded by each whitespace
> > separated element after the shebang,
> > then by command line options.
> >
> > 1.  it is flexible.
> > 2.  it functions intuitively.
> > 3.  i don't think it breaks less flexible methods.
> 
> It also suffers from problems with arguments that are meant to include
> spaces, like:
> 
>   #!/bin/sh "hello world" "foo bar"
> 
> Without a fully functional sh(1)-like parser, any solution that does
> magic with argv[] is incomplete :-(

Apologies for a delayed response.
This concerned how to load execve()'s argv[] array
when parsing the 'shebang' line of a script, i.e.
whether to pass everything after '#!/interpreter'

1.  as one string into execve()'s argv[] array, as some systems do, or
2.  as individual arguments, as exist after #!/interpreter, separated
by whitespace.

Bug report 16383 showed the variance in the various UNIX's
of how this is done.  Original SysV specs say to load '1 argument'
only after #!/interpreter, leaving it ambiguous as to whether
the '1 argument' is the 1st whitespace separated argument,
or whether it is everything after #!/interpreter as one string.
Posix and SUSv3 don't say anything about how to load execve()'s
argv[] array after #!/interpreter, and seem to leave it to
"whatever is the most intelligent way".

I suggested it made more sense as FBSD 2.2.7 used to do it,
where execve()'s argv[] array was loaded contiguously with
whitespace separated elements, so one could use constructs
such as "#!/bin/sh /script-preprocessor options", and to
allow "#!/interpreter opt1 opt2" and "#!/interpreter opt1 arg1"
type constructs, things that intuitively work as one would
expect on a command line, since there didn't appear to be
any penalty for allowing this flexibility.

A plausible argument was given in response:

>   #!/bin/sh "hello world" "foo bar"

I respond as follows:  that's something only a Windoze user would think of
doing! :)  Unix users don't put whitespace in filenames, nor would they create
options to programs that contain whitespace.  Also, to load:

'"hello world" "foo bar"'

as one string, breaks its purpose anyway.  it is a bizarre example that has
little practical value, and can be easily compensated for by getting rid of
whitespace in a filename, in the bizarre case where it may exist as such.
Finally, to be slightly extreme with a tiny performance penalty, a beginning
and ending quote in a string could be checked for and respected by execve()'s
code that fills argv[].

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-questions" in the body of the message


Re: #!/bin/sh & execve

2003-02-28 Thread Giorgos Keramidas
Oh boy!  Deja-vu...

On 2003-02-28 18:32, [EMAIL PROTECTED] wrote:
> This concerned how to load execve()'s argv[] array when parsing the
> 'shebang' line of a script, i.e. whether to pass everything after
> '#!/interpreter'
>
> 1.  as one string into execve()'s argv[] array, as some systems do, or
> 2.  as individual arguments, as exist after #!/interpreter, separated
> by whitespace.

I don't like 2 and I will definitely suggest that it's *not*
implemented for various reasons.  See below.

> Bug report 16383 showed the variance in the various UNIX's of how
> this is done.

I'm not sure if this is the right bug report.  The (closed now)
ports/16383 PR is about a broken port, that has been fixed.

> Original SysV specs say to load '1 argument' only after
> #!/interpreter, leaving it ambiguous as to whether the
> '1 argument' is the 1st whitespace separated argument,
> or whether it is everything after #!/interpreter as one
> string.

I don't mean to sound harsh here, but FreeBSD is BSD not SysV.
We don't have to copy SysV 'bugs', and splitting blindly at
whitespace is something I'll always consider a bug :(

> Posix and SUSv3 don't say anything about how to load execve()'s
> argv[] array after #!/interpreter, and seem to leave it to "whatever
> is the most intelligent way".

If they don't explicitly require a certain feature then it's not
mandatory to implement it in any particular way.

> I suggested it made more sense as FBSD 2.2.7 used to do it,
> where execve()'s argv[] array was loaded contiguously with
> whitespace separated elements, so one could use constructs
> such as "#!/bin/sh /script-preprocessor options", and to
> allow "#!/interpreter opt1 opt2" and "#!/interpreter opt1 arg1"
> type constructs, things that intuitively work as one would
> expect on a command line, since there didn't appear to be
> any penalty for allowing this flexibility.
>
> A plausible argument was given in response:
>
> > #!/bin/sh "hello world" "foo bar"

Thanks for 'plausible' :)

> I respond as follows:  that's something only a Windoze user would
> think of doing! :)  Unix users don't put whitespace in filenames,
> nor would they create options to programs that contain whitespace.

The only characters that are not allowed to be part of a filename are
the path separator and ASCII nul '\0'.  I don't like and I won't ever
accept limitations other than that.  Doing "special things" for any
other character is one of the technical reasons why I don't use DOS
and/or Windows.  Why do you think that we have to do this in a way
that is obviously broken for filenames that are perfectly valid?

> Also, to load:
>
> '"hello world" "foo bar"'
>
> as one string, breaks its purpose anyway.

It still is a valid filename, and I might choose to write a shell
script called "Find misfiled PRs.sh".  How do you then propose that
exec*() should handle filenames of that sort?

> it is a bizarre example that has little practical value, and can be
> easily compensated for by getting rid of whitespace in a filename,

There are various reasons why splitting blindly on whitespace is a bug
waiting in the background of programs to bite you in the future.  For
another known problem with this way of splitting parts of a command
line look at http://www.freebsd.org/cgi/query-pr.cgi?pr=docs/35678.

A very annoying 'feature' of make(1)...  Do we really have to copy it
to the way exec*() works?

Also, think about Samba shares.  One doesn't always have control of
what something includes in the name.  If I mount over Samba a Windows
filesystem that contains some of my shell scripts why does it seem
good to you to force limitations on me about the way I name my files?

FreeBSD used to have a motto along the lines of "We provide the tools,
and you make the policy".  Implementing and mandating limitations of
this sort is something that is far beyond the borders of "tools" and
very deep within the heart of "policy".  I don't like this... at all :(

> in the bizarre case where it may exist as such.  Finally, to be
> slightly extreme with a tiny performance penalty, a beginning and
> ending quote in a string could be checked for and respected by
> execve()'s code that fills argv[].

I'd have to see the code and several sample scripts to comment on
this.

- Giorgos




Re: #!/bin/sh & execve

2003-02-09 Thread abc
minor correction/addition to previous post:

instead of "infinitely recursive", i should've
said that it would break things if "script"
re-exec's the same file with a
different interpreter.

--

>   #!/bin/sh
>   . script

this won't work if "script" is going to do something
before exec'ing the file itself.  it will end up
being infinitely recursive.  and similarly for
the following:

> > #!/bin/sh -n script     this is currently not ok,
> >                         but why shouldn't it be?
>   #!/bin/sh
>   exec /bin/sh -n script
>
> > #!/bin/sh script 1 2    this is ok with FBSD and RH Linux,
> >                         but not ok in a few implementations,
> >                         but why shouldn't it be?
>
>   #!/bin/sh
>   exec /bin/sh script 1 2




Re: #!/bin/sh & execve

2003-02-09 Thread Giorgos Keramidas
Please don't remove me from the Cc: list when you reply to posts that
you want me to see.  Otherwise, I might miss one of your replies and
give you the false impression that I'm somehow ignoring your posts.

On 2003-02-10 01:18, [EMAIL PROTECTED] wrote:
> Giorgos Keramidas wrote:
> > [EMAIL PROTECTED] wrote:
> > > it seems more sane to allow arguments to a script given to an
> > > interpreter on the shebang line, passing everything after
> > > "#!/interpreter [arg]" off for "eval" or "sh -c" type parsing.
> >
> > This is something that can be both good and bad though.  As you have
> > pointed out, if a limited sort of parsing is allowed, then it would
> > most likely have to be sh(1)-like.  This means that the mechanism that
> > interprets '#!' would have to know all the intricacies of the sh(1)
> > parser to work correctly in all cases.  Incomplete sh(1)-like parsers
> > that would understand "most of the sh(1) shell syntax" would be
> > exactly that... incomplete.
>
> the method used by FBSD 2.2.7 seems the most sane to me,
> where execve's argv[] is loaded by each whitespace
> separated element after the shebang,
> then by command line options.
>
> 1.  it is flexible.
> 2.  it functions intuitively.
> 3.  i don't think it breaks less flexible methods.

It also suffers from problems with arguments that are meant to include
spaces, like:

#!/bin/sh "hello world" "foo bar"

Without a fully functional sh(1)-like parser, any solution that does
magic with argv[] is incomplete :-(
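
To make the problem concrete, here is a minimal sketch of what blind
whitespace splitting does to that line.  The function name blind_split
is hypothetical, and the shell's own field splitting merely stands in
for whatever splitting code the kernel would use.

```sh
#!/bin/sh
# blind_split is a hypothetical helper, not kernel code: it splits its
# single argument on whitespace only, ignoring quotes, and prints each
# resulting word in brackets, one per line.
blind_split() {
    set -f          # keep '*' and '?' literal while splitting
    set -- $1       # deliberately unquoted: split on IFS whitespace
    set +f
    for word in "$@"; do
        printf '[%s]\n' "$word"
    done
}

blind_split '"hello world" "foo bar"'
```

The line comes out as four words, each still carrying a stray quote
character, rather than the two arguments its author intended.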





Re: #!/bin/sh & execve

2003-02-09 Thread abc
> > it seems more sane to allow arguments to a script given to an
> > interpreter on the shebang line, passing everything after
> > "#!/interpreter [arg]" off for "eval" or "sh -c" type parsing.
>
> This is something that can be both good and bad though.  As you have
> pointed out, if a limited sort of parsing is allowed, then it would
> most likely have to be sh(1)-like.  This means that the mechanism that
> interprets '#!' would have to know all the intricacies of the sh(1)
> parser to work correctly in all cases.  Incomplete sh(1)-like parsers
> that would understand "most of the sh(1) shell syntax" would be
> exactly that... incomplete.

the method used by FBSD 2.2.7 seems the most sane to me,
where execve's argv[] is loaded by each whitespace
separated element after the shebang,
then by command line options.

1.  it is flexible.
2.  it functions intuitively.
3.  i don't think it breaks less flexible methods.

i thank you very much for explaining this to me.

> Another bad thing about this is that you would then need a lot more
> memory to handle things like:
>
>   #!/bin/sh -c \
>   'my-magic-script.sh arg1 arg2 \
>   arg3 ...' \
>   `backquoted command`
>
> I'm not objecting to something like this.  If you happen to roll
> patches for the kernel that can make it work, I'll probably try them
> too.  But are the benefits of writing something like this worth the
> time required to write and test it?

i agree.  my main problem, which doesn't exist with FBSD
(thankfully), was, for example,

scriptA
---
#!/bin/sh ./scriptB 1 2 3

where scriptB runs (with options), then
processes scriptA, then exec's scriptA
with a modified command line.

(it would've been a long chore to write scriptB in C code,
 and it would've been a "kludge" to run scriptB on the
command line with scriptA as an argument - forcing
one to always type 2 words to do one command).

many OS's do not allow this since they load
"./scriptB 1 2 3" into a single argv[] element,
which, of course, the interpreter cannot run.

which seemed very stupid to me.
i saw problems and limitations,
and no benefit to that solution.

> There is one portable way.  It's easy to remember too:
>
>   #!/bin/sh
>
> No spaces, no args.  It works so far on all the systems I've tried.

heh - yes - i agree.  i was afraid someone would pick
apart all this!  i didn't really take the time to study
the functionality of the sh(1) options.  i only meant
to show the unintuitive nature of the implementations
with regard to parsing.

> > #!/bin/sh script        this is obviously ok.
>
>   #!/bin/sh
>   . script

this won't work if "script" is going to do something
before exec'ing the file itself.  it will end up
being infinitely recursive.  and similarly for
the following:

> > #!/bin/sh -n script     this is currently not ok,
> >                         but why shouldn't it be?
>   #!/bin/sh
>   exec /bin/sh -n script
>
> > #!/bin/sh script 1 2    this is ok with FBSD and RH Linux,
> >                         but not ok in a few implementations,
> >                         but why shouldn't it be?
>
>   #!/bin/sh
>   exec /bin/sh script 1 2

> The only objection I have in making execve() behave as if the whole
> she-bang thing was a valid sh(1) command, is that "I don't want sh(1)
> being imported into the kernel tree, period."  Of course, what I want
> is irrelevant if someone comes up with a solution to the problem of
> having an sh(1)-like parser without having sh(1) in the kernel :-)

yes - i agree.  i think freebsd hackers are the best.
and have the best design/implementation philosophies.
i am always humbled in their presence.

> - Giorgos

thank you.

FreeBSD seems to be the only intelligent OS.
2.2.7, imho, seems to be "correct".
it may not "follow", but sometimes
intelligence has to lead ...

I would be interested if anyone could tell
me how/why any of the other solutions are
"more intelligent/practical".

It is my personal observation that the solutions
of most vendors are due to SysV's limiting
definition of execve(2).

But I did note that Posix/SUSv3 definitions
remove such arbitrary limitations (the single [arg]).

#!/tmp/interp -a -b -c #d e

 Solaris 8:      args: "/tmp/interp" "-a" "/tmp/x2"
 Tru64 4.0:      args: "interp"      "-a -b -c #d e" "/tmp/x2"
*FreeBSD 2.2.7:  args: "/tmp/interp" "-a" "-b" "-c" "#d" "e" "/tmp/x2"
 FreeBSD 4.0:    args: "/tmp/interp" "-a" "-b" "-c" "/tmp/x2"
 Linux 2.4.12:   args: "/tmp/interp" "-a -b -c #d e" "/tmp/x2"
 Linux 2.2.19:   args: "interp"      "-a -b -c #d e" "/tmp/x2"
 Irix 6.5:       args: "/tmp/interp" "-a -b -c #d e" "/tmp/x2"
 HPUX 11.00:     args: "/tmp/x2"     "-a -b -c #d e" "/tmp/x2"
 AIX 4.3:        args: "interp"      "-a -b -c #d e" "/tmp/x2"
 Mac OS X:       args: "interp"      "-a -b -c #d e" "/tmp/x2"
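
The rows above boil down to a handful of splitting policies.  The
sketch below imitates four of them in plain sh so they can be compared
side by side.  shebang_args and the policy names are made up for
illustration; the sketch ignores how each system sets argv[0], and the
"comment" policy is only a guess at what the FreeBSD 4.0 row shows
(everything from '#' onward appears to be dropped).

```sh
#!/bin/sh
# shebang_args POLICY TEXT
# Hypothetical helper: prints, one per line, the argument(s) a system
# would build from TEXT (what follows the interpreter on the '#!' line).
# The policy names are invented labels for rows of the table.
shebang_args() {
    policy=$1
    rest=$2
    set -f                               # no globbing while splitting
    case $policy in
        whole)   set -- "$rest" ;;       # one string (e.g. Linux 2.4.12)
        first)   set -- $rest
                 set -- "$1" ;;          # first word only (Solaris 8)
        split)   set -- $rest ;;         # every word (FreeBSD 2.2.7)
        comment) set -- ${rest%%#*} ;;   # words before '#' (FreeBSD 4.0)
        *)       set +f
                 echo "unknown policy: $policy" >&2
                 return 1 ;;
    esac
    set +f
    for arg in "$@"; do
        printf '"%s"\n' "$arg"
    done
}

for p in whole first split comment; do
    echo "$p:"
    shebang_args "$p" '-a -b -c #d e'
done
```

For the sample line this gives one, one, five and three arguments
respectively, matching the argument counts between the interpreter and
"/tmp/x2" in the corresponding rows.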


Re: #!/bin/sh & execve

2003-02-09 Thread Giorgos Keramidas
On 2003-02-08 21:58, [EMAIL PROTECTED] wrote:
> this does seem to be an ambiguous area.
>
> it seems more sane to allow arguments to a script given to an
> interpreter on the shebang line, passing everything after
> "#!/interpreter [arg]" off for "eval" or "sh -c" type parsing.

This is something that can be both good and bad though.  As you have
pointed out, if a limited sort of parsing is allowed, then it would
most likely have to be sh(1)-like.  This means that the mechanism that
interprets '#!' would have to know all the intricacies of the sh(1)
parser to work correctly in all cases.  Incomplete sh(1)-like parsers
that would understand "most of the sh(1) shell syntax" would be
exactly that... incomplete.

Another bad thing about this is that you would then need a lot more
memory to handle things like:

#!/bin/sh -c \
'my-magic-script.sh arg1 arg2 \
arg3 ...' \
`backquoted command`

I'm not objecting to something like this.  If you happen to roll
patches for the kernel that can make it work, I'll probably try them
too.  But are the benefits of writing something like this worth the
time required to write and test it?

> i don't know how it breaks anything to load execve's argv[] with
> everything after the shebang, followed by command line options/args.
> but it sure muddies the water if you don't.

There is one portable way.  It's easy to remember too:

#!/bin/sh

No spaces, no args.  It works so far on all the systems I've tried.

> #!/bin/sh -x            this is obviously ok.

#!/bin/sh
set -x
[...]

> #!/bin/sh -vx   this is obviously ok too.

Similarly.

> #!/bin/sh -c"string"    this is currently not ok, but why shouldn't it be?

#!/bin/sh
exec "string"
exit 1

> #!/bin/sh -c "string"   this is currently not ok, but why shouldn't it be?

Similarly.

> #!/bin/sh script        this is obviously ok.

#!/bin/sh
. script

> #!/bin/sh -n script     this is currently not ok, but why shouldn't it be?

#!/bin/sh
exec /bin/sh -n script

> #!/bin/sh script 1 2    this is ok with FBSD and RH Linux,
>                         but not ok in a few implementations,
>                         but why shouldn't it be?

#!/bin/sh
exec /bin/sh script 1 2

> it seems that only a minority of execve() man pages /
> implementations are preventing the sane solution ...

The only objection I have in making execve() behave as if the whole
she-bang thing was a valid sh(1) command, is that "I don't want sh(1)
being imported into the kernel tree, period."  Of course, what I want
is irrelevant if someone comes up with a solution to the problem of
having an sh(1)-like parser without having sh(1) in the kernel :-)
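
One detail the wrapper forms above leave implicit: when the kernel
runs an interpreter for a script, it also appends the script's own
path and the caller's arguments.  A wrapper that wants to imitate
"#!/bin/sh ./scriptB 1 2 3" therefore has to pass "$0" and "$@" along
itself.  A minimal sketch, assuming a scratch directory from mktemp(1)
and throwaway files named scriptA and scriptB:

```sh
#!/bin/sh
# Build two throwaway scripts in a scratch directory (the names are
# illustrative only).
dir=$(mktemp -d) || exit 1

# scriptB just reports the arguments it was started with.
cat > "$dir/scriptB" <<'EOF'
echo "1:$1 2:$2 3:$3 4:$4 5:$5"
EOF

# scriptA is the portable wrapper: a bare "#!/bin/sh" line, then an
# explicit exec that appends its own path and its caller's arguments.
cat > "$dir/scriptA" <<EOF
#!/bin/sh
exec /bin/sh "$dir/scriptB" 1 2 3 "\$0" "\$@"
EOF

# Run the wrapper with one extra argument and show what scriptB saw.
result=$(/bin/sh "$dir/scriptA" extra)
echo "$result"
rm -rf "$dir"
```

scriptB sees 1, 2 and 3 first, then scriptA's path as its fourth
argument and the caller's "extra" as its fifth.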

- Giorgos





Re: #!/bin/sh & execve

2003-02-08 Thread abc
this does seem to be an ambiguous area.

it seems more sane to allow arguments
to a script given to an interpreter on the
shebang line, passing everything after
"#!/interpreter [arg]" off for
"eval" or "sh -c" type parsing.

i don't know how it breaks anything to load execve's argv[]
with everything after the shebang, followed by command line
options/args.  but it sure muddies the water if you don't.

otherwise there is a can of worms unnecessarily:

#!/bin/sh -x            this is obviously ok.
#!/bin/sh -vx           this is obviously ok too.
#!/bin/sh -c"string"    this is currently not ok, but why shouldn't it be?
#!/bin/sh -c "string"   this is currently not ok, but why shouldn't it be?
#!/bin/sh script        this is obviously ok.
#!/bin/sh -n script     this is currently not ok, but why shouldn't it be?
#!/bin/sh script 1 2    this is ok with FBSD and RH Linux,
                        but not ok in a few implementations,
                        but why shouldn't it be?

it seems that only a minority of execve() man pages / implementations
are preventing the sane solution ...

> The only thing I can find in IEEE Std 1003.1-2001 (aka SUSv3) is
> 
>  "If the first line of a file of shell commands starts with the
>   characters "#!", the results are unspecified."
> 
> which would indicate that there is no "proper" way of doing this.  You
> may also want to have a look at bin/16393; at the bottom is a list of
> how some unices handle the situation.  Your best bet at trying to be
> portable is to use at most one argument, no whitespace and no "#".
> 
> The PR: 




Re: #!/bin/sh & execve

2003-02-08 Thread Mikko Työläjärvi
On Sat, 8 Feb 2003 [EMAIL PROTECTED] wrote:

> say i have 2 scripts, scriptA and scriptB.
>
> scriptA
> ---
> #!/bin/sh ./scriptB 1 2 3
>
> scriptB
> ---
> #!/bin/sh
>
> echo 0:$0
> echo 1:$1
> echo 2:$2
> echo 3:$3
>
> --
>
> $ ./scriptA
>
> 0:./scriptB
> 1:1
> 2:2
> 3:3
>
> --
>
> according to execve(2), only a single [arg] should be recognized:
>
> #! interpreter [arg]
>
>  When an interpreter file is execve'd, the system actually execve's the
>  specified interpreter.  If the optional arg is specified, it becomes the
>  first argument to the interpreter, and the name of the originally
>  execve'd file becomes the second argument; otherwise, the name of the
>  originally execve'd file becomes the first argument.  The original
>  arguments are shifted over to become the subsequent arguments.  The zeroth
>  argument is set to the specified interpreter.
>
> so the argv[] array in execve() should be loaded as:
>
> argv[0]=sh, argv[1]=scriptB, argv[2]=scriptA, and
> argv[3...]=command line args passed to scriptA.
>
> i read many many execve() man pages, and it seems like this
> is the way things should be.  but in practice, it appears on
> many unix's, argv[] gets loaded additionally with any options
> given to a script (which is given as the "[arg]" to the interpreter)
> on the 1st line of a script.
>
> can anyone tell me if this is "proper", and why or why not?
> there doesn't seem to be consistency across unix's.
> some ignore, or give an error if more than one
> "[arg]" exists on the 1st line of a script.
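
For reference, the layout that the quoted execve(2) text describes can
be sketched as follows.  interp_argv is a hypothetical helper that only
prints the resulting argument list; it does not exec anything, and an
empty second parameter stands for "no optional arg".

```sh
#!/bin/sh
# interp_argv INTERPRETER OPTIONAL_ARG SCRIPT [original args...]
# Hypothetical helper: prints the argv[] layout described by the quoted
# man page text.  Pass '' as OPTIONAL_ARG when the '#!' line has none.
interp_argv() {
    interp=$1
    optarg=$2
    script=$3
    shift 3
    if [ -n "$optarg" ]; then
        # interpreter, optional arg, script path, then original args
        set -- "$interp" "$optarg" "$script" "$@"
    else
        # interpreter, script path, then original args
        set -- "$interp" "$script" "$@"
    fi
    i=0
    for a in "$@"; do
        printf 'argv[%d]=%s\n' "$i" "$a"
        i=$((i + 1))
    done
}

interp_argv /bin/sh ./scriptB ./scriptA x y
```

With a single optional arg of ./scriptB, the script's own path lands in
argv[2] and the caller's arguments start at argv[3].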

The only thing I can find in IEEE Std 1003.1-2001 (aka SUSv3) is

 "If the first line of a file of shell commands starts with the
  characters "#!", the results are unspecified."

which would indicate that there is no "proper" way of doing this.  You
may also want to have a look at bin/16393; at the bottom is a list of
how some unices handle the situation.  Your best bet at trying to be
portable is to use at most one argument, no whitespace and no "#".

The PR: 

  $.02,
  /Mikko

