On Friday, April 19, 2002, at 05:01, Nick Ing-Simmons wrote:
> I am not sure when the change went in, but current Encode.xs
> has broken Tk804.

Ouch.

> With $encoding->decode($string,1)
>
> now croaks if character does not map. Croaking is fine as a default
> for checking but Tk would like a value of check which does not croak,
> but just returns leaving $string starting with the failing character.
> I could do a G_EVAL but that is a lot of overhead, and does not tell me
> which character position failed (unless $string is updated before
> the croak.)

Yikes.  I DID fix the behavior to match the documentation.  But it was
not just Encode::CN::HZ that was taking advantage of an UNDOCUMENTED
feature after all :).
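
To make the requested semantics concrete, here is a minimal sketch in Python (not Perl, and not Encode's API) of a non-croaking check that returns what was decoded and leaves the input starting at the failing position. The function name `decode_prefix` is hypothetical. It relies on Python's `UnicodeDecodeError.start`, which carries the failure offset; it does catch an exception internally, so it illustrates only the return contract, not the low-overhead path Nick is asking for.

```python
from typing import Tuple


def decode_prefix(data: bytes, encoding: str) -> Tuple[str, bytes]:
    """Decode as much of `data` as possible without raising.

    Returns (decoded_text, remaining_bytes); remaining_bytes starts
    at the first byte sequence that could not be decoded.
    """
    try:
        return data.decode(encoding), b""
    except UnicodeDecodeError as err:
        # err.start is the byte offset of the first undecodable sequence.
        return data[:err.start].decode(encoding), data[err.start:]


if __name__ == "__main__":
    text, rest = decode_prefix(b"abc\xffdef", "ascii")
    print(repr(text), repr(rest))
```

A caller can tell success from failure by checking whether the remainder is empty, and on failure knows exactly where to resume with another encoding.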

> (Tk does 10,000s of probes - found a character XXXX, have font
> with encoding YYYY, can YYYY encode XXXX ?  I hope to reduce that
> number by refining the code but it will still do a lot)
>
> With current Encode I don't get to try any interesting fonts
> because it croaks when Tk asks iso-8859-1 if it can do the interesting
> character :-(

~!@#$%^&*()_+  (My feeling expressed in octet stream :)

> Right now we have:
>   check == 0,  fallback char   (New and overdue - thanks!)
>   check == -1, perlqq \x{xxxx} style croak

Ah, it does not croak.  It FALLS BACK that way.

>   otherwise \N{U+XXXX} style croak
>
> (Did \N{U+XXXX} get (back) in ? - I seem to recall it got removed once.)

Didn't touch that part.

> You have established the principle of check values meaning something
> (which was always the plan).
>
> Can I suggest though that we make it a bit mask - a stab at an initial
> set of bits :
>   check == 0 - fallback
>   (check & 3) == 1 - croak
>   (check & 3) == 2 - warn
>   (check & 3) == 3 - silent return
>   (check & 4)      - \x{xxxx} vs \N{U+XXXX}
> If you like make $string adjustment optional
>   (check & 8)      - update $string (otherwise don't bother to update it)

Looks good to me.  Maybe I should add constants for that.  I may change
which bit means what, though.

> Thus
>   check == 0  - fallbacks
>   check == 1  - \N{U+XXXX} croak
>   check == 2  - \x{XXXX} croak
>   check == 3  - silent fail
>   check == 4  - Uninteresting
>   check == 5  - \N{U+XXXX} warn
>   check == 6  - \x{XXXX} warn
>   check == 11 - silent fail with $string updated (What Tk wants)
>
> Better schemes welcome.

What good timing.  I was about to release the next version.  I'll take
a shower, implement them, and possibly add test suites for them before
the release.
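
As a rough illustration of the proposed mask, the sketch below splits a `check` value into its three fields. It is written in Python rather than Perl or XS, and the names `ACTION_MASK`, `STYLE_BIT`, `UPDATE_BIT` and `describe_check` are hypothetical; the mail gives only numeric values. Two readings are assumptions: that bit 4 set selects the `\x{XXXX}` style (this matches the quoted entries for 1 and 6), and that bit 8 set means `$string` is updated (this matches the entry for 11). Dan also says above that he may reassign the bits, so this is not the final layout.

```python
# Hypothetical names for the fields of the proposed `check` bit mask.
ACTION_MASK = 3   # low two bits: what to do on an unmappable character
STYLE_BIT = 4     # assumed: set selects \x{XXXX}, clear selects \N{U+XXXX}
UPDATE_BIT = 8    # assumed: set means $string is updated to the failure point

ACTIONS = ("fallback", "croak", "warn", "silent return")


def describe_check(check: int) -> dict:
    """Split a proposed `check` value into its three fields."""
    return {
        "action": ACTIONS[check & ACTION_MASK],
        "style": r"\x{XXXX}" if check & STYLE_BIT else r"\N{U+XXXX}",
        "update_string": bool(check & UPDATE_BIT),
    }


if __name__ == "__main__":
    for value in (0, 1, 3, 6, 11):
        print(value, describe_check(value))
```

Under this reading, 11 = 8 + 3 decodes to a silent return with `$string` updated, which is the combination Tk asks for.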

> Another alternative hinted at in old pods was passing check as an SV.
> Then if SV was a scalar ref, then set $str to point at fail and return
> reason code in the scalar.

This one is very attractive, but too attractive with the code freeze so
near.  So let's go with bit masks for the time being.

> PS:
>
> To pick nits - Encode.xs's "layout" looks rather peculiar
> with perl source's default tab setting of 8 and expected indent of 4,
> and many of the files you have touched now have trailing whitespace
> on ends of lines.

I've noticed that.  The trailing spaces must be due to patch after patch
being applied (that happens when you paste directly).  That has already
been fixed in the upcoming version
(I applied "indent-buffer" in Emacs :).

Dan the Encode Maintainer
