On Mon, Jan 03, 2005 at 11:16:34PM +0100, Mauro Tortonesi wrote:
> At 22:09, Sunday 2 January 2005, Jan Minar wrote:
> > On Sun, Jan 02, 2005 at 01:37:36AM +0100, Mauro Tortonesi wrote:
> especially after you've posted a bug report on bugtraq (which was more a
> personal attack than a professional bug report) saying that wget authors are
> all incompetent...
I basically agree with You on this point.
> > > as Fumitoshi UKAI suggested, the best choice would be to escape only the
> > > strings that need to be escaped. so, i think we should probably check
> > > together which strings passed to logprintf in the wget code need to be
> > > escaped. anyone willing to help?
> >
> > You don't want to check whether this or that string accidentally needs
> > or doesn't need to get escaped. The right way is to sanitize *all*
> > untrusted input before you even start thinking about using it.
>
> mmmh, i don't think so. why would you for example want or need to escape
> format strings (that are retrieved via gettext and are already in your local
> charset), the URLs to download or the configuration data read from wgetrc?
Indeed, there's no point in not trusting other parts of the program
(apart from robustness, sometimes). I think I've heard this one
somewhere, and I have to repeat: there's no difference between the .po
files and the .h or .c files: It's all just different ways of
programming. You would have to rewrite gettext to make some security
boundary between the C code and the translated strings.
I meant any input coming from an untrusted source such as a different
user on the same system, or anything fetched from a network (be it a
genuine server response, or some MITM-injected crap). -- But this is a
basic security concept.
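To make the "sanitize everything untrusted" idea concrete, here is a minimal sketch of byte-wise escaping applied before a string ever reaches the logging code. The name escape_untrusted and its interface are made up for illustration; this is not wget's escape_buffer and not the code from my patch. It assumes that only printable ASCII should pass through unchanged, and it takes an explicit length so that embedded NUL bytes get escaped instead of silently ending the string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not wget's escape_buffer): copy SRCLEN bytes of SRC
   into DST, replacing every byte outside printable ASCII, and the backslash
   itself, with a \ooo octal escape.  DST is always NUL-terminated on
   success.  Returns the number of bytes written, not counting the
   terminating NUL, or (size_t) -1 if DST is too small.  */
size_t
escape_untrusted (char *dst, size_t dstsize, const char *src, size_t srclen)
{
  size_t i, out = 0;

  assert (dst != NULL && src != NULL);
  if (dstsize == 0)
    return (size_t) -1;

  for (i = 0; i < srclen; i++)
    {
      unsigned char c = (unsigned char) src[i];

      if (c >= 0x20 && c < 0x7f && c != '\\')
        {
          /* Printable ASCII: one byte, plus room for the final NUL.  */
          if (out + 1 >= dstsize)
            return (size_t) -1;
          dst[out++] = (char) c;
        }
      else
        {
          /* Anything else: four bytes (\ooo), plus room for the NUL.  */
          char esc[5];

          if (out + 4 >= dstsize)
            return (size_t) -1;
          snprintf (esc, sizeof esc, "\\%03o", (unsigned int) c);
          memcpy (dst + out, esc, 4);
          out += 4;
        }
    }

  dst[out] = '\0';
  return out;
}
```

The caller would run every server-supplied string through something like this once, at the point where it enters the program, and pass only the escaped copy on to the printf-style functions.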
> anyway, simone piunno and i have been talking a lot about this problem and
> we've found that apart from a couple of minor problems (very easy to fix) the
> current implementation of escape_buffer works fine. the problem is when you
> pass escaped multibyte strings as arguments to printf. if these strings
> contain a 0x00 byte, it will be incorrectly interpreted by printf as a string
> termination character. simone says for example that UTF16 strings can contain
> null bytes.
AFAICT my patch doesn't introduce any problems that haven't been there
before:
$ grep printf wget-filter-controls.patch.v3--multibyte-aware
* do about this *from within logvprintf()*.
+ swprintf (buf, 2, L"%d", ((c & 0xff) >> 6));
+ swprintf (buf, 2, L"%d", ((c & 0x3f) >> 3));
+ swprintf (buf, 2, L"%d", (c & 7));
> i don't really have any clue on how to solve this problem. simone suggests to
> change the internal format of strings in wget to UTF8, but of course i would
> prefer a less invasive solution if possible... i don't even know if we could
> keep using gettext in that case.
What's wrong with mbrtowc(3) and friends? The mysterious solution is
probably to use wprintf(3) instead of printf(3). A couple of questions on #c
on freenode would give you that answer.
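As a rough illustration of the mbrtowc(3) route, the sketch below decodes the input in the current locale's multibyte encoding, copies printable characters through untouched, and octal-escapes everything else byte by byte. The function name escape_multibyte is hypothetical, and the handling of NUL and of invalid or truncated sequences (consume one byte, escape it, reset the conversion state) is an assumption of this sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: decode SRCLEN bytes of SRC with mbrtowc(3) in the
   current locale.  Printable characters (except the backslash) are copied
   unchanged; all other bytes, including NUL and invalid or truncated
   sequences, become \ooo octal escapes.  Returns the number of bytes
   written, not counting the terminating NUL, or (size_t) -1 if DST is too
   small.  */
size_t
escape_multibyte (char *dst, size_t dstsize, const char *src, size_t srclen)
{
  mbstate_t st;
  size_t i = 0, out = 0;

  assert (dst != NULL && src != NULL);
  if (dstsize == 0)
    return (size_t) -1;
  memset (&st, 0, sizeof st);

  while (i < srclen)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, src + i, srclen - i, &st);
      size_t k, need;
      int printable;

      if (n == (size_t) -1 || n == (size_t) -2 || n == 0)
        {
          /* Invalid sequence, truncated sequence, or NUL: treat a single
             byte as unprintable and start over with a fresh state.  */
          memset (&st, 0, sizeof st);
          n = 1;
          printable = 0;
        }
      else
        printable = iswprint ((wint_t) wc) && wc != L'\\';

      /* Bytes needed for this character, leaving room for the final NUL.  */
      need = printable ? n : 4 * n;
      if (need >= dstsize - out)
        return (size_t) -1;

      for (k = 0; k < n; k++)
        {
          if (printable)
            dst[out++] = src[i + k];
          else
            {
              snprintf (dst + out, 5, "\\%03o",
                        (unsigned int) (unsigned char) src[i + k]);
              out += 4;
            }
        }
      i += n;
    }

  dst[out] = '\0';
  return out;
}
```

A real program would call setlocale (LC_ALL, "") first so that mbrtowc(3) decodes in the user's encoding; without that call the "C" locale is in effect and only plain ASCII counts as printable.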
I really don't mean it as a personal attack, but since You've shown You
don't know much about basic security principles, or [the more intricate
parts of] C, I think You really should hand over the maintainership of
wget to someone more experienced.
Jan.
--
)^o-o^| jabber: [EMAIL PROTECTED]
| .v K e-mail: jjminar FastMail FM
` - .' phone: +44(0)7981 738 696
\ __/Jan icq: 345 355 493
__|o|__Minář irc: [EMAIL PROTECTED]
