On 2/22/06, James Antill <[EMAIL PROTECTED]> wrote:
> "Axel Liljencrantz" <[EMAIL PROTECTED]> writes:
>
> >> > Fish _does_ have a \ parser. \n, \t, \\  and friends all work, as do
> >> > \$, \{, \% \(, and other characters used for argument expansion. Fish
> >> > even supports \xXX, \oOOO, \uXXXX, etc for arbitrary symbols, e.g.
> >> > 'echo \u2026' writes out a unicode ellipsis character to stdout.
> >>
> >>  Ahh, I had tried it with quotes Ie. "a\nb" and even "a\"b" (which is
> >> a syntax error) ... and tr does \ processing itself, so I just assumed
> >> the rest.
> >
> > Yes, it may seem a bit counter intuitive to allow \n but not '\n', but
> > once you start thinking about it, the role of quotes in shellscript
> > languages is to _remove_ various types of argument expansion, not to
> > add new ones. And from that perspective, I really think the fish
> > syntax makes perfect sense. And it sure beats the braindead $'\n'
> > syntax used in other shells.
>
>  Yeh, just from seeing C I expect ' " to allow them (and zsh does too).

You would expect it, yes. But as I said, if you think about it, it
really doesn't fit with how shells use quotes, i.e. to _remove_
expansion types. It makes no sense for quotes to drop wildcards,
command substitutions and some other expansion types, while at the
same time adding new ones. Making quotes remove certain
expansion types is the only thing that makes sense, in my opinion. So
this is a place where what you might naively expect is the exact
opposite of what is useful and logical.

As to _not_ removing backslash escapes in quoted strings, I think that
all expansion types should be dropped by default when quoting; that is
what quoting is _for_ in a shell. Only those with specifically
_strong_ reasons for not being removed should be kept.
Currently, that means the following:

* \' escapes a ' in single quoted strings
* \" escapes a " in double quoted strings
* Variable expansion is enabled in double quoted mode

_All_ other expansion types are disabled in quoted strings.
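
To make the three rules above concrete, here is a minimal Python sketch
of how the inside of a quoted string could be processed. It is only an
illustration, not fish's actual code: the function name expand_quoted,
the env dictionary and the variable-name pattern are assumptions made
for this example.

```python
import re

def expand_quoted(body, quote, env):
    """Process the text between quotes using only the three rules above.

    body  -- text between the quotes (the quotes themselves excluded)
    quote -- "'" or '"'
    env   -- dict standing in for shell variables (hypothetical)
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        # Rules 1 and 2: a backslash only escapes the active quote character.
        if ch == "\\" and i + 1 < len(body) and body[i + 1] == quote:
            out.append(quote)
            i += 2
            continue
        # Rule 3: variable expansion, in double quotes only.
        if ch == "$" and quote == '"':
            m = re.match(r"[A-Za-z_][A-Za-z0-9_]*", body[i + 1:])
            if m:
                out.append(env.get(m.group(0), ""))
                i += 1 + m.end()
                continue
        # Everything else (\n, wildcards, parentheses, ...) stays literal.
        out.append(ch)
        i += 1
    return "".join(out)
```

With this sketch, expand_quoted(r"a\nb", '"', {}) keeps the backslash
and the n as two literal characters, while "$HOME/*" in double quotes
expands the variable and leaves the * untouched.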

>
> >> > Actually, though, one of the above _is_ missing, namely \0. Fish
> >> > internally uses null-terminated strings, it would take a bit of work
> >> > to handle \0 correctly, though I suppose that this should be done.
> >>
> >>  http://www.and.org/vstr/security ;)
> >
> > You wrote that? Very nice page. In my defense I will say these things:
>
>  I did, hence the wink :).

Cool. I'm guessing wchar_t support for vstr is not on the roadmap, though?

>
> > * When I started writing fish, I couldn't find any good dynamic string
> > library using wchar_t. This is also why fish uses its own internal
> > tokenizer instead of something like flex.
>
>  Yeh, the general solution I've seen/used seems to be to avoid wchar_t
> and just declare char to be utf-8 ... but that's coming from someone
> who only speaks English, so...

Yeah, that is not really acceptable for me. It works in the US, since
you either use UTF-8 or ASCII, which is a subset of UTF-8. But here in
Europe, 8859-* still seems to be more common than UTF-8, so I don't
want to go down that path.

And using UTF-8 as the internal character set has its share of issues,
since you can't directly access a character at a specific index,
something which you end up wanting to do in a shell, since almost
everything you do is a string manipulation of some kind. That means
you'd have to independently keep track of string size in bytes, string
size in number of characters and string size in on-screen width, three
different values. Keeping track of two with wchar_t is difficult
enough...
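
The three values can be illustrated with a short Python sketch. Python
strings are indexed by code point, which is roughly the situation you
get with a 32-bit wchar_t. The helper name string_sizes is made up for
this example, and the width calculation is only a rough approximation
of what wcwidth() reports, not an exact reimplementation.

```python
import unicodedata

def string_sizes(text):
    """Return (bytes in UTF-8, number of characters, approximate columns)."""
    n_bytes = len(text.encode("utf-8"))
    n_chars = len(text)
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks take no column of their own
        # Wide and fullwidth East Asian characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return n_bytes, n_chars, width
```

For plain ASCII all three values agree, but for a string of two CJK
characters they are 6 bytes, 2 characters and 4 columns.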

>
> >>  Also note that \000 is just the normal octal encoding though,
> >> Ie. from the tr man page:
> >>
> >>        \NNN   character with octal value NNN (1 to 3 octal digits)
> >
> > I had no idea. Though I must say that 'normal' depends on where you
> > are coming from. tr (and probably many other commands/languages) use
> > \NNN, while Python, C and a host of other computer languages use
> > \oOOO.
>
>  That's not true (for C at least). See ISO 9899:1999 (C99) 6.4.4.4
> ... it has \N \NN and \NNN for how to escape octal values. The only
> matches for \o in the text are for references to \octal digits, which
> reference 6.4.4.4.
>  I don't have a copy of c89 handy, but I'd put money on \oNNN not
> being std.
>  I'd never seen the \oNNN syntax before you mentioned it, but I don't
> see any harm in keeping it.

What do you know... Wonder where I got it from, then... Oh well, I'll
drop it. I don't like having multiple implementations of the same
functionality, so keeping both is not desirable, from my point of
view.
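
For reference, the \NNN form (one to three octal digits, as in the tr
man page and C99 6.4.4.4) can be decoded with something like the Python
sketch below. The function name decode_octal_escapes is hypothetical,
and the sketch deliberately handles nothing but octal escapes, so \\
and the other backslash sequences are left untouched.

```python
def decode_octal_escapes(text):
    """Replace each backslash followed by 1 to 3 octal digits with that character."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\":
            j = i + 1
            # Consume at most three octal digits after the backslash.
            while j < len(text) and j < i + 4 and text[j] in "01234567":
                j += 1
            if j > i + 1:
                out.append(chr(int(text[i + 1:j], 8)))
                i = j
                continue
        # Not an octal escape: copy the character through unchanged.
        out.append(text[i])
        i += 1
    return "".join(out)
```

Note that a fourth digit is not part of the escape, so \1011 decodes to
"A" followed by a literal "1", and \o101 is passed through as-is.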

>
> > I'm curious as to how others see the interaction between ';' and '|',
> > if anyone can explain to me their view on how they should be
> > interpreted, I'm all ears.
>
>  My viewpoint is more of a syntax/protocol one, so I read "echo a;;"
> as "echo a" <END command> "" <END command> ... where the null
> statement is not the ; itself, but the lack of anything before it. So
> I see "foo|;" as "foo" <Wait for input next command> "" <END command>
> and the "" wouldn't qualify as an input command.
>  But, maybe I've written too many network protocol apps. :)

Ok, but that means that '|' temporarily changes the meaning of ';'. It
creates statefulness in an otherwise stateless stream of tokens.

>
> --
> James Antill -- [EMAIL PROTECTED]
> http://www.and.org/and-httpd

--
Axel


_______________________________________________
Fish-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fish-users
