I never claimed the software works, or is supposed to work, for
Chinese characters. I've been corresponding with the reporter
and told him the same thing.
I'm not the original author of the code and don't understand
the compression algorithm very well, so I'm not inclined to make
a hack fix without knowing the real problem.
- Paul
On Wed, 18 Jan 2006, Erik Schanze wrote:
> Dear Paul,
>
> [ Please keep Debian-BTS in CC line. ]
>
> A Debian user of txt2pdbdoc has reported a bug when converting Chinese
> documents.
> (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=346348)
>
> I can confirm the bug with his test file on my side. Please test it and
> give us your opinion.
>
> URL to test file:
> http://bugs.debian.org/cgi-bin/bugreport.cgi/gbk.txt?bug=346348;msg=20;att=1
>
> Thank you in advance.
>
> ---------- Forwarded Message ----------
>
> Subject: Bug#346348: txt2pdbdoc core dump when compress chinese document
> Date: Sunday, 8 January 2006 09:06
> From: Xie Yanbo <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
>
> On 1/8/06, Erik Schanze <[EMAIL PROTECTED]> wrote:
> > Please provide some sample data for testing. (No private material!)
> > A backtrace of gdb would also be nice, if you are able to do so.
>
> I got this backtrace running with the attached file `gbk.txt'.
>
> 0$ echo $LANG
> C
> 0$ export LANG=C
> 0$ ./txt2pdbdoc test gbk.txt gbk.pdb
> *** glibc detected *** double free or corruption (out): 0x0804da50 ***
> Aborted (core dumped)
> 134$ gdb ./txt2pdbdoc ./core
> GNU gdb 6.4-debian
> Copyright 2005 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are welcome to change it and/or distribute copies of it under
> certain conditions. Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details. This GDB was configured as "i486-linux-gnu"...Using host
> libthread_db library "/lib/tls/libthread_db.so.1".
>
> Core was generated by `./txt2pdbdoc test gbk.txt gbk.pdb'.
> Program terminated with signal 6, Aborted.
>
> warning: Can't read pathname for load map: Input/output error.
> Reading symbols from /lib/tls/libc.so.6...done.
> Loaded symbols for /lib/tls/libc.so.6
> Reading symbols from /lib/ld-linux.so.2...done.
> Loaded symbols for /lib/ld-linux.so.2
> #0 0xb7e3a7a7 in raise () from /lib/tls/libc.so.6
> (gdb) bt
> #0 0xb7e3a7a7 in raise () from /lib/tls/libc.so.6
> #1 0xb7e3c04b in abort () from /lib/tls/libc.so.6
> #2 0xb7e71015 in __fsetlocking () from /lib/tls/libc.so.6
> #3 0xb7e77667 in malloc_usable_size () from /lib/tls/libc.so.6
> #4 0xb7e77b02 in free () from /lib/tls/libc.so.6
> #5 0x08048cfb in compress (b=0xbfe6c11c) at txt2pdbdoc.c:306
> #6 0x0804950b in encode (document_name=0xbfe6d8f6 "test",
> src_file_name=0x0, dest_file_name=0x0) at txt2pdbdoc.c:561
> #7 0x08049b3f in main (argc=3, argv=0xbfe6c204) at txt2pdbdoc.c:217
> (gdb) frame
> #0 0xb7e3a7a7 in raise () from /lib/tls/libc.so.6
> (gdb) l txt2pdbdoc.c:306
> 301 /* when we get to the end of the buffer, don't inc past the */
> 302 /* end; this forces the residue chars out one at a time */
> 303 if ( tail != end )
> 304 ++tail;
> 305 }
> 306 free( buf_orig );
> 307
> 308 if ( space )
> 309 b->data[ b->len++ ] = ' '; /* add left-over space */
> 310
> (gdb)
> 0$
>
> Compressing a UTF-8 file works fine, though.
>
> 0$ cat gbk.txt | iconv -f gbk -t utf8 > utf8.txt
> 0$ ./txt2pdbdoc test utf8.txt utf8.pdb
>
> The following two gdb sessions show what happens:
>
> 0$ gdb ./txt2pdbdoc
> (gdb) b txt2pdbdoc.c:306
> Breakpoint 1 at 0x8048cf0: file txt2pdbdoc.c, line 306.
> (gdb) b txt2pdbdoc.c:561
> Breakpoint 2 at 0x8049500: file txt2pdbdoc.c, line 561.
> (gdb) r test gbk.txt gbk.pdb
> Starting program: /home/xyb/deb/txt2pdbdoc/txt2pdbdoc-1.4.4/txt2pdbdoc test gbk.txt gbk.pdb
>
> Breakpoint 2, encode (document_name=0xbfab38ca "test",
> src_file_name=0x1 <Address 0x1 out of bounds>,
> dest_file_name=0x1 <Address 0x1 out of bounds>) at txt2pdbdoc.c:561
> 561 compress( &buf );
> (gdb) p buf.len
> $1 = 4004
> (gdb) c
> Continuing.
>
> Breakpoint 1, compress (b=0xbfab1a0c) at txt2pdbdoc.c:306
> 306 free( buf_orig );
> (gdb) p b.len
> $2 = 6034
> (gdb) s
> *** glibc detected *** double free or corruption (out): 0x0804da50 ***
>
> Program received signal SIGABRT, Aborted.
> 0xb7e807a7 in raise () from /lib/tls/libc.so.6
> (gdb) c
> Continuing.
>
> Program terminated with signal SIGABRT, Aborted.
> The program no longer exists.
> (gdb) q
> 0$
> 0$ gdb ./txt2pdbdoc
> (gdb) b txt2pdbdoc.c:306
> Breakpoint 1 at 0x8048cf0: file txt2pdbdoc.c, line 306.
> (gdb) b txt2pdbdoc.c:561
> Breakpoint 2 at 0x8049500: file txt2pdbdoc.c, line 561.
> (gdb) r test utf8.txt utf8.pdb
> Starting program: /home/xyb/deb/txt2pdbdoc/txt2pdbdoc-1.4.4/txt2pdbdoc test utf8.txt utf8.pdb
>
> Breakpoint 2, encode (document_name=0xbfe518c8 "test",
> src_file_name=0x1 <Address 0x1 out of bounds>,
> dest_file_name=0x1 <Address 0x1 out of bounds>) at txt2pdbdoc.c:561
> 561 compress( &buf );
> (gdb) p buf.len
> $1 = 4096
> (gdb) c
> Continuing.
>
> Breakpoint 1, compress (b=0xbfe4ff7c) at txt2pdbdoc.c:306
> 306 free( buf_orig );
> (gdb) p b.len
> $2 = 4351
> (gdb) c
> Continuing.
>
> Breakpoint 2, encode (document_name=0xbfe518c8 "test",
> src_file_name=0x1 <Address 0x1 out of bounds>,
> dest_file_name=0x1 <Address 0x1 out of bounds>) at txt2pdbdoc.c:561
> 561 compress( &buf );
> (gdb) p buf.len
> $3 = 1872
> (gdb) c
> Continuing.
>
> Breakpoint 1, compress (b=0xbfe4ff7c) at txt2pdbdoc.c:306
> 306 free( buf_orig );
> (gdb) p b.len
> $4 = 2256
> (gdb) c
> Continuing.
>
> Program exited normally.
> (gdb) q
> 0$
>
> -------------------------------------------------------
>
>
> Kind regards,
>
> Erik
>
>
>
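In the quoted gdb session, the GBK run shows the length going from 4004
(buf.len in encode) to 6034 (b.len in compress) just before free() aborts,
while the UTF-8 run grows far less and exits normally. One possible
reading, which is only an assumption and is not confirmed anywhere in
this thread, is that the output grows past the space allocated for it and
corrupts the heap. The listing at line 309 shows an unchecked write of
the form b->data[ b->len++ ] = ' '. A minimal sketch of a bounds-checked
append follows; out_buf, its cap field, and out_put are hypothetical names
for illustration and are not part of txt2pdbdoc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output buffer: like the data/len pair seen in the
   listing, plus an explicit capacity (cap is an assumption). */
typedef struct {
    unsigned char *data;  /* storage owned by the caller */
    size_t len;           /* bytes written so far */
    size_t cap;           /* total bytes available in data */
} out_buf;

/* Append one byte. Returns 1 on success, or 0 when the buffer is full,
   instead of writing past the end of the allocation. */
int out_put(out_buf *b, unsigned char c)
{
    assert(b != NULL && b->data != NULL);
    if (b->len >= b->cap)
        return 0;             /* caller must grow the buffer or stop */
    b->data[b->len++] = c;
    return 1;
}
```

A caller would check the return value of out_put and either enlarge the
buffer or report an error. Whether this matches the real cause of the
crash would have to be verified against the txt2pdbdoc source.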