Re: new string module

Jan Minar Tue, 04 Jan 2005 17:49:17 -0800

On Mon, Jan 03, 2005 at 11:16:34PM +0100, Mauro Tortonesi wrote:
> Alle 22:09, domenica 2 gennaio 2005, Jan Minar ha scritto:
> > On Sun, Jan 02, 2005 at 01:37:36AM +0100, Mauro Tortonesi wrote:
> especially after you've posted a bug report on bugtraq (which was more a 
> personal attack than a professional bug report) saying that wget authors are 
> all incompetent...


I basically agree with You on this point.

> > > as Fumitoshi UKAI suggested, the best choice would be to escape only the
> > > strings that need to be escaped. so, i think we should probably check
> > > together which strings passed to logprintf in the wget code need to be
> > > escaped. anyone willing to help?
> >
> > You don't want to check whether this or that string accidentally needs
> > or doesn't need to get escaped. The right way is to sanitize *all*
> > untrusted input before you even start thinking about using it.
> 
> mmmh, i don't think so. why would you for example want or need to escape 
> format strings (that are retrieved via gettext and are already in your local 
> charset), the URLs to download or the configuration data read from wgetrc?

Indeed, there's no point in not trusting other parts of the program
(apart from robustness, sometimes).  I think I've heard this one
somewhere, and I have to repeat: there's no difference between the .po
files and the .h or .c files:  It's all just different ways of
programming.  You would have to rewrite gettext to make some security
boundary between the C code and the translated strings.

I meant any input coming from an untrusted source such as a different
user on the same system, or anything fetched from a network (be it a
genuine server response, or some MITM-injected crap). -- But this is a
basic security concept.
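
To illustrate the point, here is a minimal sketch of sanitizing a buffer
before it gets anywhere near the log.  The helper name escape_untrusted()
and its interface are hypothetical; this is not wget's escape_buffer() and
not the code from the patch.  It only reuses the same octal-digit
arithmetic, and it takes an explicit length so that embedded NUL bytes are
escaped rather than treated as the end of the string:

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not wget code).
 * Copies len bytes of src into dst, replacing every byte outside
 * printable ASCII, and the backslash itself, with a \ooo octal escape.
 * Returns the number of bytes written, not counting the terminating
 * NUL, or (size_t) -1 if dst is too small. */
size_t
escape_untrusted (const unsigned char *src, size_t len,
                  char *dst, size_t dstsize)
{
  size_t i, o = 0;

  if (dstsize == 0)
    return (size_t) -1;

  for (i = 0; i < len; i++)
    {
      unsigned char c = src[i];

      if (c >= 0x20 && c < 0x7f && c != '\\')
        {
          /* Printable: copy as-is, keeping room for the final NUL. */
          if (o + 1 >= dstsize)
            return (size_t) -1;
          dst[o++] = (char) c;
        }
      else
        {
          /* Everything else becomes four bytes: backslash + 3 digits. */
          if (o + 4 >= dstsize)
            return (size_t) -1;
          dst[o++] = '\\';
          dst[o++] = (char) ('0' + (c >> 6));
          dst[o++] = (char) ('0' + ((c >> 3) & 7));
          dst[o++] = (char) ('0' + (c & 7));
        }
    }
  dst[o] = '\0';
  return o;
}
```

With this sketch, the seven bytes 'a', ESC, '[', '2', 'J', NUL, 'b' come
out as the text a\033[2J\000b, which is safe to hand to printf("%s").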

> anyway, simone piunno and i have been talking a lot about this problem and 
> we've found that apart from a couple of minor problems (very easy to fix) the 
> current implementation of escape_buffer works fine. the problem is when you 
> pass escaped multibyte strings as arguments to printf. if these strings 
> contain a 0x00 byte, it will be incorrectly interpreted by printf as a string 
> termination character. simone says for example that UTF-16 strings can contain 
> null bytes.

AFAICT my patch doesn't introduce any problems that haven't been there
before:

$ grep printf wget-filter-controls.patch.v3--multibyte-aware
                 * do about this *from within logvprintf()*.
+                               swprintf (buf, 2, L"%d", ((c & 0xff) >> 6));
+                               swprintf (buf, 2, L"%d", ((c & 0x3f) >> 3));
+                               swprintf (buf, 2, L"%d", (c & 7));

> i don't really have any clue on how to solve this problem. simone suggests to 
> change the internal format of strings in wget to UTF8, but of course i would 
> prefer a less invasive solution if possible... i don't even know if we could 
> keep using gettext in that case.

What's wrong with mbrtowc(3) and friends?  The mysterious solution is
probably to use wprintf(3) instead of printf(3).  A couple of questions
on #c on freenode would give you that answer.

I really don't mean it as a personal attack, but since You've shown You
don't know much about basic security principles, or [the more intricate
parts of] C, I think You really should hand the maintainership of
wget over to someone more experienced.

Jan.
-- 
 )^o-o^|    jabber: [EMAIL PROTECTED]
 | .v  K    e-mail: jjminar FastMail FM
 `  - .'     phone: +44(0)7981 738 696
  \ __/Jan     icq: 345 355 493
 __|o|__Minář  irc: [EMAIL PROTECTED]
