On Tue, Oct 2, 2012 at 8:37 AM, Tomas Doran <bobtf...@bobtfish.net> wrote:
> I entirely agree - there should never be explicit code like this needed.
>
> However, currently, if you remove it - then the tests fail.
>

I just looked at the tests. It's easy to make them mostly pass, but there
are parts of that test that I don't understand.

Just so we agree, the serialized data should be utf8-encoded octets, and
deserialized back into characters. That's what JSON does automatically.
And it's expected that character data is correctly flagged -- e.g. utf8
octets brought in from, say, a database are correctly decoded
(pg_enable_utf8 for PostgreSQL as an example).

I use JSON (or JSON::XS) myself, so I'm not sure whether JSON::Any behaves
the same.

The test has inline utf8 *character* strings but fails to set "use utf8"
at the start of the test. So, this attribute:

    has 'utf8_string' => (
        is      => 'rw',
        isa     => 'Str',
        default => sub { "ネットスーパー (Internet Shopping)" }
    );

means that utf8_string is not flagged as character data. That sets in
motion the failure of the rest of the tests.

So, I added "use utf8;" at the top of the test.

When comparing against the serialized data, the test then needs to encode
its character string to utf8 octets with encode_utf8() (because we are
comparing to serialized octets):

    is($json,
       encode_utf8('{"__CLASS__":"Foo","utf8_string":"ネットスーパー (Internet Shopping)"}'),
       '... got the right JSON');

But I'm confused by the last test set, which starts like this:

    my $test_string;
    {
        use utf8;
        $test_string = "ネットスーパー (Internet Shopping)";
        no utf8;
    }

Ok, so now we have a character string. But then the test forces the utf8
flag off:

    Encode::_utf8_off($test_string);

So, I'm just not clear what these tests are trying to do. What's the point
of testing that a character string with its utf8 flag forced off works
correctly?

-- 
Bill Moseley
mose...@hank.org
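
A minimal sketch of the round trip described above: characters in, utf8 octets when serialized, characters back out. It uses JSON::PP only because it ships with Perl; whether JSON::Any behaves the same way is an assumption that is not checked here. The variable names ($chars, $json, $octets, $back) are illustrative and do not come from the test file.

```perl
use strict;
use warnings;
use utf8;                       # literals in this file are character strings
use Encode qw(encode_utf8);
use JSON::PP;

my $chars = "ネットスーパー (Internet Shopping)";

# ->utf8 makes encode() return utf8 octets and decode() expect them.
my $json = JSON::PP->new->utf8;

my $octets = $json->encode({ utf8_string => $chars });   # serialized octets
my $back   = $json->decode($octets)->{utf8_string};      # characters again

# The character string survives the round trip unchanged.
print "round trip ok\n" if $back eq $chars;

# To compare against the serialized form, the expected string has to be
# encoded to octets first, as in the is() call above.
my $expected = encode_utf8('{"utf8_string":"' . $chars . '"}');
print "serialized form matches\n" if $octets eq $expected;

# Characters and octets differ in length for non-ASCII text.
printf "%d characters, %d octets\n",
    length($chars), length(encode_utf8($chars));
```

Without "use utf8;" the literal would already be a string of octets, and encoding it again would double-encode it, which is the kind of mismatch the failing comparisons show.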