date:20020325

cvs commit: modperl/t/net/perl util.pl

2002-03-25 Thread dougm


dougm   02/03/25 10:45:23

  Modified:.Changes
   src/modules/perl Util.xs
   t/net/perl util.pl
  Log:
  backing out change: properly escape highbit chars in Apache::Utils::escape_html
  
  Revision  ChangesPath
  1.639 +0 -4  modperl/Changes
  
  Index: Changes
  ===
  RCS file: /home/cvs/modperl/Changes,v
  retrieving revision 1.638
  retrieving revision 1.639
  diff -u -r1.638 -r1.639
  --- Changes   25 Mar 2002 02:57:59 -  1.638
  +++ Changes   25 Mar 2002 18:45:23 -  1.639
  @@ -26,10 +26,6 @@
   properly deal with $r->status codes (e.g. redirect) in
   Apache::RegistryNG [Geoff Young <[EMAIL PROTECTED]>]
   
  -properly escape highbit chars in Apache::Utils::escape_html
  -[Geoff Young <[EMAIL PROTECTED]>,
  -Robin Berjon <[EMAIL PROTECTED]>]
  -
   allow $r->auth_name and $r->auth_type to be set on win32
   [John Kelly <[EMAIL PROTECTED]>]
   
  
  
  
  1.11  +1 -8  modperl/src/modules/perl/Util.xs
  
  Index: Util.xs
  ===
  RCS file: /home/cvs/modperl/src/modules/perl/Util.xs,v
  retrieving revision 1.10
  retrieving revision 1.11
  diff -u -r1.10 -r1.11
  --- Util.xs   24 Mar 2002 21:57:53 -  1.10
  +++ Util.xs   25 Mar 2002 18:45:23 -  1.11
  @@ -32,9 +32,6 @@
   return sv;
   }
   
  -#define IS_HIGHBIT_CHAR(b) \
  -   ( (((unsigned char)(b)) > 126) && (((unsigned char)(b)) <= 255) )
  -
   static SV *my_escape_html(char *s)
   {
   int i, j;
  @@ -46,7 +43,7 @@
j += 3;
else if (s[i] == '&')
j += 4;
  -else if (s[i] == '"' || IS_HIGHBIT_CHAR(s[i]))
  +else if (s[i] == '"')
j += 5;
   
   if (j == 0)
  @@ -70,10 +67,6 @@
memcpy(&SvPVX(x)[j], """, 6);
j += 5;
}
  -else if (IS_HIGHBIT_CHAR(s[i])) {
  -sprintf(&SvPVX(x)[j], "&#%d;", (unsigned char)s[i]);
  -j += 5;
  -}
else
SvPVX(x)[j] = s[i];
   
  
  
  
  1.13  +1 -5  modperl/t/net/perl/util.pl
  
  Index: util.pl
  ===
  RCS file: /home/cvs/modperl/t/net/perl/util.pl,v
  retrieving revision 1.12
  retrieving revision 1.13
  diff -u -r1.12 -r1.13
  --- util.pl   24 Mar 2002 21:57:53 -  1.12
  +++ util.pl   25 Mar 2002 18:45:23 -  1.13
  @@ -2,7 +2,7 @@
   use Apache::test;
   $|++;
   my $i = 0;
  -my $tests = 8;
  +my $tests = 7;
   
   my $r = shift;
   $r->send_http_header('text/plain');
  @@ -61,10 +61,6 @@
   
   
   EOF
  -
  -#XXX: this test could be more robust, but its better than nothing
  -my $c = Apache::Util::escape_html("\x8b");
  -test ++$i, $c =~ /^&\#\d{3,3}\;$/;
   
   my $txt = "No html tags in here at all";
   my $etxt = Apache::Util::escape_html($txt);

Re: cvs commit: modperl/t/net/perl util.pl

2002-03-25 Thread Eric Cholet

--On Monday, March 25, 2002 10:29:11 -0800 Doug MacEachern 
<[EMAIL PROTECTED]> wrote:

> i had a bad feeling about this.  we should not be implementing
> escape_html  to begin with, the functionality should all be in apache.
> i'm going to  back out the patch.  anybody care to make a doc patch to
> explain the  problems with escape_html before the patch went in?  thanks.

I believe the patch is useful though, in cases where the charset is not 
explicitely
specified and there's an odd character with the 8th bit set it will now do 
the
right thing. I guess a lot of US coders would fall in that situation... I 
suppose
it's faster than HTML::Entities (I haven't benchmarked it though).
So I suspect the patch will fix more situations than it breaks: if using
a single-byte non-ASCII encoding, it doesn't actually break anything, just 
adds
bloat. If using a multi-byte encoding escape_html was broken/inapplicable 
already.

--
Eric Cholet

Re: cvs commit: modperl/t/net/perl util.pl

2002-03-25 Thread Geoffrey Young

Doug MacEachern wrote:
> 
> i had a bad feeling about this.  we should not be implementing escape_html
> to begin with, the functionality should all be in apache.  i'm going to
> back out the patch. 

sounds wise, especially considering people like Eric will end up with larger pages as a
result, while the patch fixes a rather obscure vunerability, for which other solutions
(HTML::Entities) are available.

> anybody care to make a doc patch to explain the
> problems with escape_html before the patch went in?  

I nominate robin, since I forget how it came up in the first place :)

IIRC is was due to this post

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-03/msg00750.html

and specifically an exploit involving browsers incorrectly assuming 0x8b as a "<" and 
0x9b
as a ">", thus creating a way around escape_html().

Robin, does that accurately summarize it?  it's been far too long for me :)

--Geoff

Re: cvs commit: modperl/t/net/perl util.pl

2002-03-25 Thread Doug MacEachern


i had a bad feeling about this.  we should not be implementing escape_html 
to begin with, the functionality should all be in apache.  i'm going to 
back out the patch.  anybody care to make a doc patch to explain the 
problems with escape_html before the patch went in?  thanks.

Re: cvs commit: modperl/t/net/perl util.pl

2002-03-25 Thread Eric Cholet

--On Sunday, March 24, 2002 21:57:54 + [EMAIL PROTECTED] wrote:

> dougm   02/03/24 13:57:53
>
>   Modified:.Changes STATUS
>src/modules/perl Util.xs
>t/net/perl util.pl
>   Log:
>   Submitted by:   Geoff Young <[EMAIL PROTECTED]>
>   Reviewed by:dougm
>   properly escape highbit chars in Apache::Utils::escape_html

This is uncool for those of us using a non-ASCII encoding and sending
out lots of characters with the 8th bit set, e.g. in a French page
many accented characters will be replaced by 6-byte sequences.
If I'm sending out "Content-type: text/html; charset=ISO-8859-1",
and calling escape_html to escape '<', '>' and the like, I'm going
to be serving quite a lot more bytes than before this patch.

However escape_html () has no clue as to what the character set is,
and whether it has been correctly specified in the Content-Type.
It has also be mentionned here that escape_html is only valid for
single-byte encodings.

So this patch does the right thing to escape the odd 8 bit char in
a mostly ASCII output, but users of other charsets should be warned
not to use it. I use HTML::Entities::encode($_[0], '<>&"') myself.

Therefore I propose a doc patch to clear this up:

Index: Util.pm
===
RCS file: /home/cvs/modperl/Util/Util.pm,v
retrieving revision 1.8
diff -u -r1.8 Util.pm
--- Util.pm 4 Mar 2000 20:55:47 -   1.8
+++ Util.pm 25 Mar 2002 18:19:37 -
@@ -68,6 +68,13 @@

  my $esc = Apache::Util::escape_html($html);

+This function is unaware of its argument's character set and encoding.
+It assumes a single-byte encoding and escapes all characters with the
+8th bit set. Do not use it with multi-byte encodings such as utf8.
+When using a single byte non-ASCII encoding such as ISO-8859-1,
+consider specifying the character set in the Content-Type header,
+and using HTML::Entities to avoid unnecessary escaping.
+
 =item escape_uri

 This function replaces all unsafe characters in the $string with their

--
Eric Cholet

cvs commit: modperl/t/net/perl util.pl

Re: cvs commit: modperl/t/net/perl util.pl

Re: cvs commit: modperl/t/net/perl util.pl

Re: cvs commit: modperl/t/net/perl util.pl

Re: cvs commit: modperl/t/net/perl util.pl

5 matches

Site Navigation

Mail list logo

Footer information