On 2014/08/11 03:03, Vadim Zhukov wrote:
> 2014-08-11 2:06 GMT+04:00 patrick keshishian <[email protected]>:
> > On 8/10/14, Vadim Zhukov <[email protected]> wrote:
> >> This changes the way ifconfig(8) prints lines like 'crazy "nwid',
> >> i.e., containing double quotes inside the data being output.
> >> At present, such lines are printed in the following way:
> >>
> >> "crazy "nwid"
> >>
> >> And this makes everything that tries to parse such lines go crazy
> >> in turn. I propose to force unambiguous hexadecimal output
> >> in this case.
> >
> > Caution: Slippery slope ahead!
> >
> > Any other "weird" characters that may confuse parsers? I see a
> > bunch of networks with single-quotes in them in my area. What
> > about back-slashes, back-ticks, exclamation-marks, hash-marks,
> > ...?
> 
> No problem with those: they'll be a single word (here "word" means
> "string without white space characters") or get quoted anyway. See
> what happens with different strings fed into print_string() now:
> 
> 1. Simple ASCII word:
> foo
> foo
> 
> 2. A few ASCII words:
> foo bar
> "foo bar"
> 
> 3. ASCII word containing "safe" symbol:
> foo&bar
> foo&bar
> 
> 4. A few ASCII words plus a "safe" symbol:
> foo &bar
> "foo &bar"
> 
> 5. ASCII word with a double quote:
> foo"bar
> foo"bar
> 
> 6. A few ASCII words with a double quote:
> foo "bar
> "foo "bar"
> 
> 7. Non-ASCII word:
> fooАБВbar
> 0x666f6fd090d091d0926261720a
> 
> Cases 1-4 and 7 can be easily parsed, e.g., by regex. But 5 and 6
> can't be. My patch fixes this situation, changing them to 0x form,
> too.

ifconfig is itself a user interface, and this isn't great from a user's
point of view, e.g. take a network named like so:

freewifi password "blah"

this would become a string which is unreadable without consulting
ascii(7) or similar.

> There is another problem, with 0x.* strings being indistinguishable:
> is it an original value, or was it translated by ifconfig? But IMHO it
> should be discussed and dealt with separately. I could be wrong, though.
> :)

0x strings are always translated; check out the conditions of the if()
at line 1500. So it's not actually an ambiguous format, though that might
not be obvious to the user.

My personal preference for any directly printed strings would be to
always print surrounding " and escape any internal quotes:

        ieee80211: nwid "blah"
        ieee80211: nwid "some\"string\""
        ieee80211: nwid "freewifi password \"blah\""

and print hexdumped strings without quotes, so it's more obvious that
they have been dumped:

        ieee80211: nwid 0x3078313233343536
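
To make that concrete, here is a rough sketch of the formatting rule I
have in mind. This is not the existing print_string(); format_nwid() is
a made-up name, it writes into a caller-supplied buffer instead of
printing, and it only keeps the high-bit/isprint() test, ignoring
whatever else the existing if() checks. Escaping backslashes as well as
double quotes is my own assumption, needed so a parser can undo the
escaping unambiguously.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the actual ifconfig(8) print_string().
 * Formats buf[0..len) into out: printable ASCII is always wrapped in
 * double quotes, with '"' and '\' backslash-escaped; anything else is
 * hexdumped as 0x... without quotes.
 * Returns 0 on success, -1 if out is too small.
 */
int
format_nwid(const unsigned char *buf, size_t len, char *out, size_t outsz)
{
	static const char hex[] = "0123456789abcdef";
	size_t i, o = 0;
	int printable = 1;

	/* same test as the existing check: high bit set or not printable */
	for (i = 0; i < len; i++) {
		if ((buf[i] & 0x80) || !isprint(buf[i])) {
			printable = 0;
			break;
		}
	}

	if (!printable) {
		/* "0x" + two hex digits per byte + NUL */
		if (outsz < 2 + 2 * len + 1)
			return -1;
		out[o++] = '0';
		out[o++] = 'x';
		for (i = 0; i < len; i++) {
			out[o++] = hex[buf[i] >> 4];
			out[o++] = hex[buf[i] & 0x0f];
		}
		out[o] = '\0';
		return 0;
	}

	/* worst case: every byte escaped, plus two quotes and NUL */
	if (outsz < 2 * len + 3)
		return -1;
	out[o++] = '"';
	for (i = 0; i < len; i++) {
		if (buf[i] == '"' || buf[i] == '\\')
			out[o++] = '\\';
		out[o++] = (char)buf[i];
	}
	out[o++] = '"';
	out[o] = '\0';
	return 0;
}
```

With this rule a leading double quote always means "literal string,
undo the backslash escapes", and a bare 0x always means "hexdump", so
the two forms can't be confused.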

Another question is what to do with (increasingly common) Unicode
SSIDs; we could probably do better than the existing "if (buf[i] & 0x80
|| !isprint(buf[i]))" if we know that we're in a UTF-8 locale.
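
One possible shape for such a check, as an untested sketch only:
decode the bytes with mbrtowc(3) and test each character with
iswprint(3). The name is_printable_mb() is made up, and it assumes the
caller has already done setlocale(LC_CTYPE, "") and decided (e.g. via
nl_langinfo(CODESET)) that the locale's encoding is one it trusts.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: returns 1 if buf[0..len) decodes cleanly in the
 * current LC_CTYPE locale and every character is printable, else 0.
 * An empty buffer counts as printable.
 */
int
is_printable_mb(const unsigned char *buf, size_t len)
{
	mbstate_t st;
	wchar_t wc;
	size_t n;

	memset(&st, 0, sizeof(st));
	while (len > 0) {
		n = mbrtowc(&wc, (const char *)buf, len, &st);
		/* invalid or truncated multibyte sequence */
		if (n == (size_t)-1 || n == (size_t)-2)
			return 0;
		/* embedded NUL: not something to print directly */
		if (n == 0)
			return 0;
		if (!iswprint((wint_t)wc))
			return 0;
		buf += n;
		len -= n;
	}
	return 1;
}
```

Anything that fails this test would still fall back to the 0x form, so
the output stays parseable regardless of locale.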
