Hi, 

A page constructed in the following way causes a somewhat nasty-looking
segfault in the current stable and dev versions of Lynx:

 $ perl -e 'print "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">\n&#", "1"x 2100000, ";\n"' > test.html
 $ lynx test.html
 Segmentation fault
 $

This happens every time, at least on 64-bit Linux (Debian 6.0.3).

The crash state looks like this:
   (gdb) bt
   #0  0x00007ffff765416f in _IO_vfscanf () from /lib/libc.so.6
   #1  0x00007ffff766b935 in vsscanf () from /lib/libc.so.6
   #2  0x00007ffff765a0b8 in sscanf () from /lib/libc.so.6
   #3  0x00000000004d84e9 in SGML_character (context=0x8d3fa0, c_in=59) at ../../../WWW/Library/Implementation/SGML.c:2646
   #4  0x00000000004dd963 in SGML_write (context=0x8d3fa0, str=0x7ad440 '6' <repeats 200 times>..., l=2920)
      at ../../../WWW/Library/Implementation/SGML.c:4381
   [...]
   (gdb) x/i $rip
   => 0x7ffff765416f <_IO_vfscanf+5807>:   callq  0x7ffff7685750 <memcpy>
   (gdb) frame 3
   #3  0x00000000004d84e9 in SGML_character (context=0x8d3fa0, c_in=59) at ../../../WWW/Library/Implementation/SGML.c:2646
   2646                     : sscanf(string->data, "%20lu", &lcode)) == 1) {
   (gdb) p string->data
   $2 = 0x7ffff7043010 '1' <repeats 200 times>...
   2641    #ifdef USE_PRETTYSRC
   2642                entity_string = string->data;
   2643    #endif
   2644                if ((context->isHex
   2645                     ? sscanf(string->data, "%lx", &lcode)
-> 2646                     : sscanf(string->data, "%lu", &lcode)) == 1) {
   2647                    UCode_t code = (UCode_t) lcode;
   2648
   2649    /* =============== work in ASCII below here ===============  S/390 -- gil -- 1092 */
   2650                    if (AssumeCP1252(context)) {


A simple way to avoid the crash would be, for example, to bound the number
of characters sscanf may read. The input here comes from outside, while
sscanf expects a representation of a number within the valid range. The
patch below limits the field width to 20 characters, which is enough for
the largest 64-bit unsigned long written in decimal.

--- lynx2-8-8/WWW/Library/Implementation/SGML.c 2011-06-13 03:18:54.000000000 +0300
+++ lynx2-8-8-scan/WWW/Library/Implementation/SGML.c    2011-11-30 11:10:11.000000000 +0200
@@ -2643,7 +2643,7 @@
 #endif
            if ((context->isHex
-                ? sscanf(string->data, "%lx", &lcode)
-                : sscanf(string->data, "%lu", &lcode)) == 1) {
+                ? sscanf(string->data, "%20lx", &lcode)
+                : sscanf(string->data, "%20lu", &lcode)) == 1) {
                UCode_t code = (UCode_t) lcode;
 
 /* =============== work in ASCII below here ===============  S/390 -- gil -- 1092 */
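
For comparison, here is a minimal sketch of the same bounding idea done
with strtoul instead of sscanf. It is not Lynx code: the helper name
parse_entity_code and the constant MAX_ENTITY_DIGITS are made up for
illustration. Unlike sscanf it rejects leading whitespace and signs, and
it reports out-of-range values as a failure instead of leaving the result
unspecified.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical limit, mirroring the field width used in the patch. */
#define MAX_ENTITY_DIGITS 20

/*
 * Hypothetical helper (not part of Lynx): parse the digits of a numeric
 * character reference, looking at no more than MAX_ENTITY_DIGITS
 * characters of the input.  Returns 1 and stores the value in *out on
 * success, 0 otherwise.
 */
int parse_entity_code(const char *s, int is_hex, unsigned long *out)
{
    char buf[MAX_ENTITY_DIGITS + 1];
    char *end;
    unsigned long value;
    size_t n = 0;

    /* Copy a bounded prefix so the parser never sees the full input. */
    while (n < MAX_ENTITY_DIGITS && s[n] != '\0') {
        buf[n] = s[n];
        n++;
    }
    buf[n] = '\0';

    /* strtoul would accept whitespace and a sign; require a digit first. */
    if (n == 0)
        return 0;
    if (is_hex ? !isxdigit((unsigned char) buf[0])
               : !isdigit((unsigned char) buf[0]))
        return 0;

    errno = 0;
    value = strtoul(buf, &end, is_hex ? 16 : 10);
    if (end == buf || errno == ERANGE)
        return 0;               /* nothing parsed, or value out of range */

    *out = value;
    return 1;
}
```

With this, parse_entity_code("65", 0, &lcode) stores 65, while a
reference made of millions of digits is cut off after the first 20
characters before any conversion takes place.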


-- 
Aki Helin / OUSPG
