I had forgotten about this until today, when I saw errors in some new code.

Is there anything below that needs clarification? IIRC, I got stuck
because I didn't really follow what the tests were attempting to test.
Does the test need to do anything more than make sure wide characters get
turned into octets when freezing and come back as character data when
thawed (i.e. the utf8 flag survives the round trip)?
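
For reference, here is a minimal sketch of the round trip I have in mind. It
assumes JSON::PP as a stand-in for the JSON backend (the real tests may go
through a different one), the hash key is only illustrative, and it does not
exercise the actual freeze/thaw code under discussion:

```perl
use strict;
use warnings;
use utf8;                        # string literals below are character data
use Encode qw(encode_utf8);
use JSON::PP;                    # assumption: stand-in for the real JSON backend
use Test::More tests => 3;

my $text = "ネットスーパー (Internet Shopping)";

# "Freeze": with ->utf8 enabled, encode() returns UTF-8 encoded octets.
my $json   = JSON::PP->new->utf8;
my $frozen = $json->encode({ utf8_string => $text });

# Octets never contain a code point above 0xFF.
ok($frozen !~ /[^\x00-\xFF]/, 'frozen JSON contains no wide characters');
is($frozen,
   encode_utf8('{"utf8_string":"' . $text . '"}'),
   'frozen JSON matches the utf8-encoded literal');

# "Thaw": decode() turns the octets back into character data.
my $thawed = $json->decode($frozen);
is($thawed->{utf8_string}, $text, 'characters survive the round trip');
```

If that is all the existing tests are meant to cover, the explicit flag
handling should not be needed.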

Thanks,



On Wed, Oct 3, 2012 at 7:27 AM, Bill Moseley <mose...@hank.org> wrote:

>
>
> On Tue, Oct 2, 2012 at 8:37 AM, Tomas Doran <bobtf...@bobtfish.net> wrote:
>
>> I entirely agree - there should never be explicit code like this needed.
>>
>> However, currently, if you remove it - then the tests fail.
>>
>
> I just looked at the tests.  It's easy to make it mostly pass, but there
> are parts of that test that I don't understand.
>
> Just so we agree, the serialized data should be utf8-encoded octets, and
> deserialized back into characters.  That's what JSON does automatically.
> And it's expected that character data is correctly flagged -- e.g. utf8
> octets brought in from, say, a database are correctly decoded
> (pg_enable_utf8 for PostgreSQL as an example).
>
> I also use JSON (or JSON::XS), so it's not clear to me whether JSON::Any
> behaves the same.
>
>
> The test has inline utf8 *character* strings but fails to set "use utf8"
> at the start of the test.  So, this attribute:
>
>     has 'utf8_string' => (
>         is      => 'rw',
>         isa     => 'Str',
>         default => sub { "ネットスーパー (Internet Shopping)" }
>     );
>
> means that the utf8_string is not flagged as character data.  That sets in
> motion the failure of the rest of the tests.
>
> So, I added "use utf8;" at the top of the test.
>
> When comparing the serialized data, we then need to encode the test
> character string to utf8 octets (because we are comparing to serialized
> octets).
>
>     is($json,
>        encode_utf8('{"__CLASS__":"Foo","utf8_string":"ネットスーパー (Internet Shopping)"}'),
>        '... got the right JSON');
>
>
> But, I'm confused by the last test set that starts like this:
>
>     my $test_string;
>     {
>         use utf8;
>         $test_string = "ネットスーパー (Internet Shopping)";
>         no utf8;
>     }
>
> Ok, so now we have a character string.
>
> But then the test forces the utf8 flag off:
>
>     Encode::_utf8_off($test_string);
>
> So, I'm just not clear what these tests are trying to do.  What's the
> point of testing that a character string with its utf8 flag forced off
> works correctly?
>
>
> --
> Bill Moseley
> mose...@hank.org
>



-- 
Bill Moseley
mose...@hank.org
