[ 
https://issues.apache.org/jira/browse/AVRO-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17035340#comment-17035340
 ] 

Martin Spevak commented on AVRO-1521:
-------------------------------------

The main issue is with boolean default values. Suppose the schema contains:
{code:java}
{
  "type" : "record",
  "name" : "Some",
  "namespace" : "some.namespace",
  "fields" : [ {
    "name" : "fieldname",
    "type" : "boolean",
    "default" : false
  }, {
  ....
}

{code}
The schema is parsed from a string as JSON (in Avro::Schema), so all booleans 
are turned into JSON::PP::Boolean objects:
{code:java}
sub parse {
    my $schema      = shift;
    my $json_string = shift;
    my $names       = shift || {};
    my $namespace   = shift || "";

    my $struct = try {
        $json->decode($json_string);
    }
    catch {
        throw Avro::Schema::Error::Parse(
            "Cannot parse json string: $_"
        );
    };
    return $schema->parse_struct($struct, $names, $namespace);
}

 {code}
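The effect of the decode step can be reproduced outside Avro. The following is a minimal sketch, assuming a JSON::PP decoder (the comment does not say which JSON module sits behind $json); the variable names are only for illustration:

```perl
use strict;
use warnings;
use JSON::PP;

# Assumption: a plain JSON::PP decoder stands in for the module's $json object.
my $json = JSON::PP->new;

my $struct = $json->decode(
    '{"type":"record","name":"Some","fields":'
  . '[{"name":"fieldname","type":"boolean","default":false}]}'
);

# The decoded default is a blessed object, not a plain Perl scalar.
my $default = $struct->{fields}[0]{default};
my $class   = ref $default;

print "class: $class\n";
print "value: ", ($default ? "true" : "false"), "\n";
```

Because the default is a reference, any check that rejects references outright will reject it too.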
Finally, validation in the is_data_valid method fails for all booleans, 
because ref $data is JSON::PP::Boolean:
{code:java}
313     if ($type eq 'boolean') {
314         return 0 if ref $data; # sometimes risky
315         return 1 if $data =~ m{yes|no|y|n|t|f|true|false}i;
316         return 0;
317     } {code}
A quick fix is to add this test before line 314:
{code:java}
return 1 if ref $data eq 'JSON::PP::Boolean';{code}
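With that line in place, the boolean branch would behave roughly as in the sketch below. The function name is_boolean_valid is hypothetical (the real check lives inside is_data_valid), and JSON::PP is assumed only to produce sample boolean objects:

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical standalone copy of the boolean branch, with the quick fix added.
sub is_boolean_valid {
    my ($data) = @_;
    return 1 if ref($data) eq 'JSON::PP::Boolean';   # proposed quick fix
    return 0 if ref $data;                           # any other reference is rejected
    return 1 if $data =~ m{yes|no|y|n|t|f|true|false}i;
    return 0;
}

my $from_json  = is_boolean_valid(JSON::PP::false());  # decoded JSON boolean
my $from_array = is_boolean_valid([]);                 # some other reference
my $from_text  = is_boolean_valid('true');             # plain string
my $from_digit = is_boolean_valid('123');              # string with no match

print "json=$from_json array=$from_array text=$from_text digits=$from_digit\n";
```

The order matters: the JSON::PP::Boolean test has to come before the generic ref check, otherwise the object is rejected before it is ever recognised.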

> Inconsistent behavior of Perl API with 'boolean' type
> -----------------------------------------------------
>
>                 Key: AVRO-1521
>                 URL: https://issues.apache.org/jira/browse/AVRO-1521
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: perl
>            Reporter: John Karp
>            Assignee: John Karp
>            Priority: Major
>             Fix For: 1.10.0
>
>
> The perl boolean serialization code in BinaryEncoder.pm encodes anything 
> false to perl, such as 0, '0', '', () and undef, as false, and anything true 
> to perl, which is literally everything else, as true.
> Inconsistent with the above serialization, the code used in Schema.pm to 
> determine which union branch to use, is checking for boolean-ness with:
> {noformat}
> m{yes|no|y|n|t|f|true|false}i
> {noformat}
> meaning only those particular strings are considered booleans.
> So all those values, including 'no' 'n' 'f' and 'false', still get serialized 
> to true.
> We could just standardize on one of the two and use it consistently. But 
> neither works that well in unions, because unless you put the boolean type 
> last in the union definition, a wide variety of data will be downcast to 
> boolean type.
> Perl has no built-in or standardized boolean type, so there's no solution 
> like we have in the other language Avro APIs. But we could do as the perl 
> JSON module does, and define objects for true and false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)