On Wed, Apr 10, 2013 at 1:57 PM, Roland Mainz <[email protected]> wrote:
> On Wed, Apr 10, 2013 at 1:32 PM, Roland Mainz <[email protected]> wrote:
>> [CC:'ing Werner since this is i18n related and was only observed on
>> SuSE 12.3 Linux for now...]
>>
>> Attached (as "astksh20130409_suse123_32bit_builtin_iconv_hang1.txt.gz")
>> is a (compressed) text file which causes the AST "iconv" builtin
>> utility from ast-ksh.2013-04-09 to "hang" in an endless loop in 32bit
>> i386 builds (AMD64 64bit builds are OK... *ONLY* the 32bit builds loop
>> forever...).
>>
>> Example:
>> -- snip --
>> $ gunzip astksh20130409_suse123_32bit_builtin_iconv_hang1.txt.gz
>> $ LC_ALL=en_US.UTF-8 ../build_i386_32bit_debug/arch/linux.i386/bin/ksh
>> -c 'builtin iconv ; iconv -f UTF-8
>> /tmp/astksh20130409_suse123_32bit_builtin_iconv_hang1.txt >/tmp/zzz2 ;
>> true'
>> <hangs forever>
>> -- snip --
>>
>> Neither 32bit nor 64bit builds trigger any valgrind hits and the gdb
>> stacktrace is not very useful either:
>> -- snip --
>> $ LC_ALL=en_US.UTF-8 gdb --args
>> ../build_i386_32bit_debug/arch/linux.i386/bin/ksh -c 'builtin iconv ;
>> iconv -f UTF-8 /tmp/astksh20130409_suse123_32bit_builtin_iconv_hang1.txt
>> >/tmp/zzz2 ; true'
>> GNU gdb (GDB) SUSE (7.5.1-2.1.1)
>> Copyright (C) 2012 Free Software Foundation, Inc.
>> [snip]
>> Reading symbols from
>> /home/test001/work/ast_ksh_20130409/build_i386_32bit_debug/arch/linux.i386/bin/ksh...done.
>> (gdb) run
>> Starting program:
>> /home/test001/work/ast_ksh_20130409/build_i386_32bit_debug/arch/linux.i386/bin/ksh
>> -c builtin\ iconv\ \;\ iconv\ -f\ UTF-8\
>> /tmp/astksh20130409_suse123_32bit_builtin_iconv_hang1.txt\
>> \>/tmp/zzz2\ \;\ true
>> Missing separate debuginfo for /lib/ld-linux.so.2
>> [snip]
>> ^C
>> Program received signal SIGINT, Interrupt.
>> 0xf7dd2447 in __gconv_transform_utf8_internal () from /lib/libc.so.6
>> (gdb) where
>> #0 0xf7dd2447 in __gconv_transform_utf8_internal () from /lib/libc.so.6
>> #1 0xf7dcd00a in __gconv () from /lib/libc.so.6
>> #2 0xf7dcc5b2 in iconv () from /lib/libc.so.6
>> #3 0x00000001 in ?? ()
>> #4 0x0821b400 in ?? ()
>> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>> -- snip --
>> (I don't know how to "fix" the "previous frame inner to this frame"
>> issue... ;-( )
>
> More data: if I force the ksh93 builtin "iconv" to read from a pipe I
> get a warning about an incomplete multibyte sequence...
> -- snip --
> $ LC_ALL=en_US.UTF-8 ../build_i386_32bit_debug/arch/linux.i386/bin/ksh
> -c 'builtin iconv ; cat
> /tmp/astksh20130409_suse123_32bit_builtin_iconv_hang1.txt | iconv -f
> UTF-8 >/tmp/zzz2 ; true'
> iconv: incomplete multibyte sequence at offset 32767 [Invalid argument]
> -- snip --
> ... it seems the issue is somehow related to the difference that
> "iconv" reading a plain file uses |mmap()| ... triggering a different
> codepath than reading from a pipe.
>
> Question is now... who is correct ? GNU "iconv" doesn't seem to print
> any warnings/errors for the input file while AST "iconv" prints a
> warning when reading from a pipe and hangs when reading via |mmap()|
> ...
> ... another issue is... why does this only happen for 32bit builds ?

The issue does happen for 64bit builds, too. It seems to happen (for
32bit builds) when a multibyte character straddles a 32k buffer
boundary: the first part of the character's bytes sits at the end of
the first buffer and the remaining bytes sit at the start of the 2nd
buffer.

Here is a reduced/standalone testcase:
-- snip --
$ ksh -c 'builtin iconv ; integer i ; typeset prefix="123" ; for ((i=0 ; i < 2**16 ; i++ )) ; do printf "%s\u[20ac]" "$prefix" ; done | iconv -f UTF-8 >xxx'
-- snip --
(The string length of "prefix" may have to be varied to catch sfio
buffers of a different size; I'll write a testcase for the builtin
iconv later.)

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) [email protected]
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
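
P.S.: To make the boundary arithmetic easier to check, here is a minimal
Python sketch (not the AST "iconv" code, just an illustration). It
rebuilds the byte stream from the testcase, checks whether a multibyte
character straddles a given buffer boundary, and compares a converter
that treats every buffer on its own with one that carries the incomplete
tail over into the next buffer. The 32k buffer size is an assumption
taken from the observation above, and all function names are made up for
this example.

```python
import codecs

# Same character as \u[20ac] in the ksh testcase: 3 bytes in UTF-8.
EURO = "\u20ac".encode("utf-8")
# Assumption: 32k buffers, as observed above; real sfio buffers may differ.
BUFSIZE = 32 * 1024


def make_input(prefix, count=2**16):
    """Build the byte stream of the ksh loop: (prefix + euro sign) * count."""
    return (prefix.encode("utf-8") + EURO) * count


def straddles(data, boundary):
    """True if data[boundary] is a UTF-8 continuation byte (10xxxxxx),
    i.e. a multibyte character starts before the boundary and ends after it."""
    return boundary < len(data) and (data[boundary] & 0xC0) == 0x80


def decode_chunks_naive(data, bufsize):
    """Decode every buffer independently; raises UnicodeDecodeError when a
    buffer ends in the middle of a multibyte character."""
    return "".join(
        data[i:i + bufsize].decode("utf-8")
        for i in range(0, len(data), bufsize)
    )


def decode_chunks_incremental(data, bufsize):
    """Decode buffer by buffer, but let the decoder keep the incomplete
    tail of one buffer and finish it with the head of the next one."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    out = []
    for i in range(0, len(data), bufsize):
        is_last = i + bufsize >= len(data)
        out.append(decoder.decode(data[i:i + bufsize], final=is_last))
    return "".join(out)


if __name__ == "__main__":
    for prefix in ("12", "123"):
        data = make_input(prefix)
        hits = [b for b in range(BUFSIZE, len(data), BUFSIZE)
                if straddles(data, b)]
        print("prefix=%r first split boundary: %s"
              % (prefix, hits[0] if hits else None))
```

With prefix="123" each repetition is 6 bytes, so the first 32k boundary
falls inside the ASCII prefix and the first split character only shows
up at the 64k boundary; with prefix="12" (5 bytes per repetition) the
character is already split at the first 32k boundary. That is one way
to see why the prefix length may need to be varied.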
