thanks for the detailed followups

I have a different workaround that only adds extra computation for ja_JP.SJIS
I also believe this workaround handles shift states
the one you posted only checked for mb length == 1 and I think that can happen
in shift state != 0, no?

anyway, the <ast.h> mbchar() macro calls (*ast.mb_towc)() for mb locales
by default ast.mb_towc=mbtowc, but the ast specific LC_ALL=debug locale
sets ast.mb_towc=debug_mbtowc (which we use in some mb regression tests)
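
for anyone not familiar with that hook, here is a minimal sketch of the dispatch idea
it is not the <ast.h> source: struct demo_ast, demo_mbchar() and demo_debug_mbtowc()
are hypothetical stand-ins, and the toy converter just treats every byte as one
character, which is not necessarily what the real debug_mbtowc does

```c
#include <stdlib.h>
#include <wchar.h>

/* hypothetical stand-in for the ast state; only the converter hook is modeled */
struct demo_ast
{
        int (*mb_towc)(wchar_t*, const char*, size_t);
};

/* default hook is the C library mbtowc() */
static struct demo_ast demo = { mbtowc };

/* toy replacement converter (assumption): each byte is its own character */
static int
demo_debug_mbtowc(wchar_t* p, const char* s, size_t n)
{
        if (!s || !n)
                return 0;
        if (p)
                *p = (unsigned char)*s;
        return *s != 0;
}

/* decode one character at *sp through the hook and advance *sp past it */
static wchar_t
demo_mbchar(const char** sp, size_t n)
{
        wchar_t w = 0;
        int     len = (*demo.mb_towc)(&w, *sp, n);

        if (len < 1)
        {
                /* NUL or invalid sequence: fall back to a single byte */
                w = (unsigned char)**sp;
                len = 1;
        }
        *sp += len;
        return w;
}
```

swapping the converter is then one assignment, e.g. demo.mb_towc = demo_debug_mbtowc,
and every caller of demo_mbchar() picks it up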

I added ast.mb_towc=sjis_mbtowc to handle '\\' and '~' in ja_JP.SJIS

---
static int
sjis_mbtowc(register wchar_t* p, register const char* s, size_t n)
{
        if (n && p && s && (*s == '\\' || *s == '~') && !memcmp(mb_state, mb_state_zero, sizeof(mbstate_t)))
        {
                *p = *s;
                return 1;
        }
        return mbrtowc(p, s, n, mb_state);
}
---
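
the excerpt above depends on mb_state and mb_state_zero, which are defined elsewhere
a self-contained variant, as a sketch only: demo_state is a hypothetical stand-in for
mb_state, and the standard mbsinit() is swapped in for the memcmp() against a zeroed
state (assuming the two tests agree on the platform in question)

```c
#include <wchar.h>

/* hypothetical stand-in for mb_state; static storage starts in the initial state */
static mbstate_t demo_state;

static int
demo_sjis_mbtowc(wchar_t* p, const char* s, size_t n)
{
        /* in the initial shift state, hand back '\\' and '~' unchanged */
        if (n && p && s && (*s == '\\' || *s == '~') && mbsinit(&demo_state))
        {
                *p = (unsigned char)*s;
                return 1;
        }
        /* everything else goes through the locale's restartable converter */
        return (int)mbrtowc(p, s, n, &demo_state);
}
```

the short circuit never touches demo_state, so a partially converted character
still goes through mbrtowc()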

here is a ksh-ized regression test that also lists the specific errors
instead of just counting them

---
#
# Byte ranges for Shift-JIS encoding (hexadecimal):
# First byte:   81-9F, E0-EF
# Second byte:  40-7E, 80-FC
#
# Now test out some multi byte characters which
# include 7bit aka ASCII bytes with 0x81 0x{40-7E}
#

Command=${0##*/}
integer Errors=0

function err_exit
{
        print -u2 -n "\t"
        print -u2 -r ${Command}[$1]: "${@:2}"
        ((Errors++))
}
alias err_exit='err_exit $LINENO'
alias binprintf=/usr/bin/printf

typeset -i16 chr

for ((chr=0x40; chr<=0x7E; chr++))
do      c=${chr#16#}
        for s in \\x81\\x$c \\x$c
        do      b="$(printf "$s")"
                n="$(binprintf "$s")"
                [[ $b == "$n" ]] || err_exit "printf difference for \"$s\" -- builtin '$b' native '$n'"
                u=$(print -- $b)
                q=$(print -- "$b")
                [[ $u == "$q" ]] || err_exit "quoted print difference for \"$s\" -- $b => '$u' vs \"$b\" => '$q'"
        done
done
exit $Errors
---

it only has one difference
        tst-03[32]: quoted print difference for "\x81\x7c" -- | => '\|' vs "|" => '|'
I'll check with dgk on that in the (later) morning
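
for reference, 0x7c is both a valid Shift-JIS second byte and ASCII '|', so a
byte-at-a-time scan sees a '|' inside the two byte character 0x81 0x7c
whether that is what produces the '\|' above is only a guess on my part
the sketch below uses the byte ranges from the test's comment to show the difference
between counting bytes and counting characters; all demo_* names are hypothetical

```c
#include <stddef.h>

/* first byte ranges: 81-9F, E0-EF */
static int
demo_sjis_lead(unsigned char c)
{
        return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xEF);
}

/* second byte ranges: 40-7E, 80-FC */
static int
demo_sjis_trail(unsigned char c)
{
        return (c >= 0x40 && c <= 0x7E) || (c >= 0x80 && c <= 0xFC);
}

/* byte oriented: counts every byte equal to ch, including second bytes */
static size_t
demo_count_bytes(const char* s, size_t n, char ch)
{
        size_t  i;
        size_t  hits = 0;

        for (i = 0; i < n; i++)
                if (s[i] == ch)
                        hits++;
        return hits;
}

/* character oriented: skips each two byte character as a unit */
static size_t
demo_count_chars(const char* s, size_t n, char ch)
{
        size_t  i = 0;
        size_t  hits = 0;

        while (i < n)
        {
                if (demo_sjis_lead((unsigned char)s[i]) && i + 1 < n && demo_sjis_trail((unsigned char)s[i + 1]))
                {
                        i += 2;
                        continue;
                }
                if (s[i] == ch)
                        hits++;
                i++;
        }
        return hits;
}
```

for the three bytes 0x81 0x7c 0x7c the byte count of '|' is 2 and the character count is 1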

-- Glenn Fowler -- AT&T Research, Florham Park NJ --
