On Tue, Oct 2, 2012 at 8:37 AM, Tomas Doran <bobtf...@bobtfish.net> wrote:

> I entirely agree - there should never be explicit code like this needed.
>
> However, currently, if you remove it - then the tests fail.
>

I just looked at the tests.  It's easy to make it mostly pass, but there are
parts of that test that I don't understand.

Just so we agree, the serialized data should be utf8-encoded octets, and
deserializing should turn them back into characters.  That's what JSON does
automatically.  And it's expected that character data is correctly flagged --
e.g. utf8 octets brought in from, say, a database are correctly decoded
(pg_enable_utf8 for PostgreSQL as an example).

I also use JSON (or JSON::XS), so it's not clear to me whether JSON::Any
behaves the same.
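
As a rough sketch of the round trip described above, here JSON::PP stands in
for JSON / JSON::XS (an assumption; JSON::Any may behave differently).  The
hash key and variable names are only for illustration, and the string is
written with escapes so the file encoding doesn't matter:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use JSON::PP;

# Three characters, written as escapes so no "use utf8" is needed here.
my $chars = "\x{30CD}\x{30C3}\x{30C8}";

# The utf8 option makes encode() return octets and decode() expect octets.
my $codec = JSON::PP->new->utf8;

# Serializing produces utf8-encoded octets ...
my $octets = $codec->encode({ utf8_string => $chars });

# ... and deserializing turns them back into characters.
my $round_trip = $codec->decode($octets)->{utf8_string};

print "octets match encode_utf8: ",
    ($octets eq encode_utf8(qq({"utf8_string":"$chars"})) ? "yes" : "no"), "\n";
print "round trip preserved characters: ",
    ($round_trip eq $chars ? "yes" : "no"), "\n";
```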


The test has inline utf8 *character* strings but fails to set "use utf8" at
the start of the test.  So, this attribute:

    has 'utf8_string' => (
        is      => 'rw',
        isa     => 'Str',
        default => sub { "ネットスーパー (Internet Shopping)" }
    );

means that the utf8_string is not flagged as character data.  That sets in
motion the failure of the rest of the tests.

So, I added "use utf8;" at the top of the test.
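
A small sketch of the difference, assuming the file is saved as UTF-8 (the
variable names are made up for this example).  The same literal is read as
octets without the pragma and as characters with it:

```perl
use strict;
use warnings;

my ($without_pragma, $with_pragma);

{
    # No "use utf8" in scope: the literal is taken as the raw octets in the file.
    $without_pragma = "ネット";
}

{
    # With "use utf8" the same literal is decoded into characters.
    use utf8;
    $with_pragma = "ネット";
}

# Only lengths and flags are printed, to avoid "Wide character" warnings.
print "length without use utf8: ", length($without_pragma), "\n";
print "length with use utf8:    ", length($with_pragma), "\n";
print "flagged as characters:   ",
    (utf8::is_utf8($with_pragma) ? "yes" : "no"), "\n";
```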

When comparing the serialized data, we then need to encode the test character
string to utf8 octets (because we are comparing to serialized octets).

    is($json,
       encode_utf8('{"__CLASS__":"Foo","utf8_string":"ネットスーパー (Internet Shopping)"}'),
       '... got the right JSON');


But, I'm confused by the last test set that starts like this:

    my $test_string;
    {
        use utf8;
        $test_string = "ネットスーパー (Internet Shopping)";
        no utf8;
    }

Ok, so now we have a character string.

But then the test forces the utf8 bit off:

    Encode::_utf8_off($test_string);

So, I'm just not clear what these tests are trying to do.  What's the point
of testing that a character string with its utf8 flag forced off works
correctly?
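
For reference, a sketch of what that call does to a flagged string (this only
shows the mechanics; whether it matches the intent of the test is still a
guess).  Encode::_utf8_off flips the flag without touching the internal
buffer, so the result is the same octets encode_utf8 would give:

```perl
use strict;
use warnings;
use Encode ();

# Three characters; code points above 255 are always stored with the flag on.
my $chars = "\x{30CD}\x{30C3}\x{30C8}";

# Work on a copy so the original character string stays intact.
my $forced = $chars;
Encode::_utf8_off($forced);   # flips the flag, leaves the buffer alone

print "length before _utf8_off: ", length($chars), "\n";
print "length after _utf8_off:  ", length($forced), "\n";
print "same as encode_utf8:     ",
    ($forced eq Encode::encode_utf8($chars) ? "yes" : "no"), "\n";
```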


-- 
Bill Moseley
mose...@hank.org
