On Fri, Jun 29, 2012 at 7:57 PM, Roland Mainz <roland.ma...@nrubsig.org> wrote:
> On Fri, Jun 29, 2012 at 5:45 PM, Roland Mainz <roland.ma...@nrubsig.org> 
> wrote:
>> On Fri, Jun 29, 2012 at 5:16 PM, Lionel Cons
>> <lionelcons1...@googlemail.com> wrote:
>> [Removing ast-us...@research.att.com]
>>> On 29 June 2012 16:41, Irek Szczesniak <iszczesn...@gmail.com> wrote:
>>>> On Fri, Jun 29, 2012 at 4:16 PM, Irek Szczesniak <iszczesn...@gmail.com> 
>>>> wrote:
>>>>> On Fri, Jun 29, 2012 at 2:25 PM, Lionel Cons
>>>>> <lionelcons1...@googlemail.com> wrote:
>>>>>> On 29 June 2012 07:29, Glenn Fowler <g...@research.att.com> wrote:
>>>>>>>
>>>>>>> the AT&T Software Technology ast 2012-06-28 source release
>>>>>>> has been posted to the download site
>>>>>>>        http://www.research.att.com/sw/download/
>>>>>>> the notes and changes link has details on the release
>>>>>>>
>>>>>>> the git source repository will be updated later today
>>>>>>>        http://www.research.att.com/sw/gitweb/
>>>>>>
>>>>>> Can anyone confirm that this release is broken? We've been unable to
>>>>>> produce a ksh binary from this on Fedora and Suse which can be used to
>>>>>> process autoconf scripts or any other production script. This is
>>>>>> really bad, and if there are confirmations then please WITHDRAW that
>>>>>> release.
>>>>>
>>>>> With less emphasis on strong language, and only speaking for Solaris
>>>>> 9/10/11 on x86-64 (Intel Xeon) and SPARC64, both built with Sun
>>>>> Studio 12, I can say that something is wrong. We've upgraded from the
>>>>> last non-beta release to this beta and *lots* of things broke down
>>>>> immediately. Analysis is still in progress but it's Friday afternoon
>>>>> so a more detailed report may come Monday.
>>>>
>>>> David, I think ksh suffers from a data corruption problem in en_US.UTF-8.
>>>>
>>>> On Solaris 10/x86-64 (Intel Xeon) built with Sun Studio 12 we see this:
>>>>
>>>> #OK: input file
>>>> md5sum /usr/pub/UTF-8
>>>> 05f555672fd120af5b633f5bc89b3938  /usr/pub/UTF-8
>>>>
>>>> #OK: bash4.0 passes it through correctly:
>>>> LC_ALL=en_US.UTF-8 bash -c 'sl=$( /bin/cat /usr/pub/UTF-8 ) ; md5sum
>>>> <<<"$sl" ; true'
>>>> 05f555672fd120af5b633f5bc89b3938  -
>>>>
>>>> #OK: dash (with patch for <<< syntax) passes it through correctly:
>>>> LC_ALL=en_US.UTF-8 bash -c 'sl=$( /bin/cat /usr/pub/UTF-8 ) ; md5sum
>>>> <<<"$sl" ; true'
>>>> 05f555672fd120af5b633f5bc89b3938  -
>>>>
>>>> #FAIL: but astksh20120628 **CORRUPTS** the data in the en_US.UTF-8 locale:
>>>> LC_ALL=en_US.UTF-8 ~/bin/ksh -c 'sl=$( /bin/cat /usr/pub/UTF-8 ) ;
>>>> md5sum <<<"$sl" ; true'
>>>> 756e8851f95e59b7a0bed28e20b72d50  -
>>>>
>>>> #OK: same ksh93 binary with C locale passes data through OK:
>>>> LC_ALL=C ~/bin/ksh -c 'sl=$( /bin/cat /usr/pub/UTF-8 ) ; md5sum
>>>> <<<"$sl" ; true'
>>>> 05f555672fd120af5b633f5bc89b3938  -
>>>
>>> Hurrah, my Friday afternoon is ruined.
>>>
>>> So with ast-ksh 2012-06-28 I get a warning about a broken multibyte
>>> character (UTF-8 locale):
>>> kshbroken20120628 -c 'builtin wc ; wc -m -w -l <<< "$( cat
>>> /usr/pub/UTF-8 )" ; true'
>>> wc: line 2146: warning: invalid multibyte character
>>>    2889   43115  183566
>>>
>>> With ast-ksh 2011-02-08 (delivered with Illumos) this works as expected:
>>> /bin/ksh -c 'builtin wc ; wc -m -w -l <<< "$( cat /usr/pub/UTF-8 )" ;
>>> true'
>>>   24576  258437 1619916
>>>
>>> What was the last known working release?
>>
>> 1. Please stop bashing David and Glenn. Thank you.
>> (While I do understand the frustration I don't see the point to vent
>> it here... and to throw the stone back: If people would help with
>> testing more often (like grabbing a beta) instead of just picking the
>> official releases it would catch such issues much quicker. The issue
>> is there since ast-ksh.2011-06-30... see below...).
>>
>> 2. Looking at my mighty collection:
>> -- snip --
>> $ (find ast_ksh_201* -name ksh -a -type f | fgrep '/bin/ksh' | while
>> read ex ; do echo "## $ex" ; "$ex" -c 'sl=$( /bin/cat /usr/pub/UTF-8 )
>> ; md5sum <<<"$sl" ; true' ; done)
>> [snip]
>> ## ast_ksh_20110415/build_i386_64bit_opt_dgkfix001/arch/sol11.i386/bin/ksh
>> 05f555672fd120af5b633f5bc89b3938  -
>> ## ast_ksh_20110428/build_i386_64bit_opt_dgkfix002/arch/sol11.i386/bin/ksh
>> 05f555672fd120af5b633f5bc89b3938  -
>> ## ast_ksh_20110505/build_i386_64bit_opt_dgkfix001/arch/sol11.i386/bin/ksh
>> 05f555672fd120af5b633f5bc89b3938  -
>> ## ast_ksh_20110630/build_i386_64bit_opt/arch/sol11.i386/bin/ksh
>> 20fe858776d12b5ffa385f2527f2d291  -
>> ## ast_ksh_20120518/build_i386_64bit_opt/arch/sol11.i386/bin/ksh
>> 756e8851f95e59b7a0bed28e20b72d50  -
>> ## ast_ksh_20120518/build_i386_64bit_debug/arch/sol11.i386/bin/ksh
>> 756e8851f95e59b7a0bed28e20b72d50  -
>> ## ast_ksh_20120531/build_i386_64bit_opt_extrabuiltins/arch/sol11.i386/bin/ksh
>> 756e8851f95e59b7a0bed28e20b72d50  -
>> ## ast_ksh_20120531/build_i386_32bit_debug/arch/sol11.i386/bin/ksh
>> a10d4421ca420f47e21694d3c9841a9f  -
>> ## ast_ksh_20120606/build_i386_32bit_debug/arch/sol11.i386/bin/ksh
>> a10d4421ca420f47e21694d3c9841a9f  -
>> ## ast_ksh_20120606/build_i386_64bit_opt_extrabuiltins/arch/sol11.i386-64/bin/ksh
>> 756e8851f95e59b7a0bed28e20b72d50  -
>> ## ast_ksh_20120612/build_i386_64bit_extrabuiltins/arch/sol11.i386-64/bin/ksh
>> 756e8851f95e59b7a0bed28e20b72d50  -
>> ## ast_ksh_20120628/build_i386_64bit_extrabuiltins_opt/arch/sol11.i386-64/bin/ksh
>> 756e8851f95e59b7a0bed28e20b72d50  -
>> -- snip --
>> ... last known working release (with the correct md5 hash) was
>> ast-ksh.2011-05-05, the first failures occur in ast-ksh.2011-06-30 and
>> the current wrong md5 hash sum is there since (at least)
>> ast-ksh.2012-05-18.
>> NOTE: These release dates only reflect the betas for which I have
>> builds... not necessarily the dates when the faulty change was
>> committed to the code base.
>
> FYI I have a workaround... it fixes both x="$( <bigfile )" and the
> incarnation of the same bug reported in this email thread... testing
> is underway... ETA ~~2-3 hours...

Attached (as "astopen20120628_command_substitution_fix001.diff") is
the workaround (backing out "11-05-27  A bug with command substitution
with the shift jis locale has been fixed.").

Please test this...
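
For anyone who wants to script the check, here is a minimal sketch of
the round-trip test used earlier in this thread. It is only an
illustration: the helper names roundtrip_sum and check_shell are made
up for this example, it uses cksum and printf instead of md5sum and
<<< so that it also runs in shells without here-strings, and it
assumes the C locale result of the same binary is a trustworthy
reference (which matches what was reported above). If the UTF-8 locale
is not installed the shell may silently fall back to C and the check
will pass trivially.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of ast-ksh.

# roundtrip_sum SHELL LOCALE FILE
# Read FILE through a command substitution in SHELL under LOCALE and
# print the checksum of what comes back out.
roundtrip_sum() {
    LC_ALL=$2 "$1" -c 'sl=$(cat "$1"); printf "%s\n" "$sl" | cksum' roundtrip "$3"
}

# check_shell SHELL FILE [LOCALE]
# Compare the round trip in the C locale (reference) with the round
# trip in LOCALE (default: en_US.UTF-8) for the same shell binary.
check_shell() {
    ref=$(roundtrip_sum "$1" C "$2") || return 2
    got=$(roundtrip_sum "$1" "${3:-en_US.UTF-8}" "$2") || return 2
    if [ "$ref" = "$got" ]; then
        echo "OK: $1"
    else
        echo "FAIL: $1 (C: $ref, ${3:-en_US.UTF-8}: $got)"
        return 1
    fi
}
```

Example use, once the functions are loaded: check_shell ~/bin/ksh
/usr/pub/UTF-8 . An unpatched binary affected by this bug should
report FAIL with two different checksums; a patched one should
report OK.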

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
diff -r -u build_i386_64bit_plain/src/cmd/ksh93//sh/macro.c build_i386_64bit_extrabuiltins_opt/src/cmd/ksh93/sh/macro.c
--- src/cmd/ksh93/sh/macro.c	Thu Jun 28 16:30:46 2012
+++ src/cmd/ksh93/sh/macro.c	Fri Jun 29 18:16:51 2012
@@ -507,7 +507,7 @@
 					int		i;
 					unsigned char	mb[8];
 
-					n = mbconv((char*)mb, c);
+					n = wctomb((char*)mb, c);
 					for(i=0;i<n;i++)
 						sfputc(stkp,mb[i]);
 				}
@@ -2025,10 +2025,7 @@
 	struct slnod            *saveslp = mp->shp->st.staklist;
 	struct _mac_		savemac;
 	int			savtop = stktell(stkp);
-	char			lastc=0, *savptr = stkfreeze(stkp,0);
-#if SHOPT_MULTIBYTE
-	wchar_t			lastw=0;
-#endif /* SHOPT_MULTIBYTE */
+	char			lastc, *savptr = stkfreeze(stkp,0);
 	int			was_history = sh_isstate(SH_HISTORY);
 	int			was_verbose = sh_isstate(SH_VERBOSE);
 	int			was_interactive = sh_isstate(SH_INTERACTIVE);
@@ -2155,6 +2152,7 @@
 	mp->ifsp = nv_getval(np);
 	stkset(stkp,savptr,savtop);
 	newlines = 0;
+	lastc = 0;
 	sfsetbuf(sp,(void*)sp,0);
 	bufsize = sfvalue(sp);
 	/* read command substitution output and put on stack or here-doc */
@@ -2212,17 +2210,6 @@
 		}
 		else if(lastc)
 		{
-#if SHOPT_MULTIBYTE
-			if(lastw)
-			{
-				int	n;
-				char	mb[8];
-				n = mbconv(mb, lastw);
-				mac_copy(mp,mb,n);
-				lastw = 0;
-			}
-			else
-#endif /* SHOPT_MULTIBYTE */
 			mac_copy(mp,&lastc,1);
 			lastc = 0;
 		}
@@ -2231,22 +2218,8 @@
 			str[c] = 0;
 		else
 		{
-			ssize_t len = 1;
-
 			/* can't write past buffer so save last character */
-#if SHOPT_MULTIBYTE
-			if ((len = mbsize(str))>1)
-			{
-				len = mb2wc(lastw,str,len);
-				if (len < 0)
-				{
-					lastw = 0;
-					len = 1;
-				}
-			}
-#endif /* SHOPT_MULTIBYTE */
-			c -= len;
-			lastc = str[c];
+			lastc = str[--c];
 			str[c] = 0;
 		}
 		mac_copy(mp,str,c);
@@ -2264,21 +2237,7 @@
 			sfnputc(stkp,'\n',newlines);
 	}
 	if(lastc)
-	{
-#if SHOPT_MULTIBYTE
-		if(lastw)
-		{
-			int	n;
-			char	mb[8];
-			n = mbconv(mb, lastw);
-			mac_copy(mp,mb,n);
-			lastw = 0;
-		}
-		else
-#endif /* SHOPT_MULTIBYTE */
 		mac_copy(mp,&lastc,1);
-		lastc = 0;
-	}
 	sfclose(sp);
 	return;
 }
_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers
