[ 
https://issues.apache.org/jira/browse/PIG-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated PIG-4447:
---------------------------------
    Description: 
Here's an example of an Avro schema containing nullable values in a map:
{noformat}
{
    "name" : "nullableRecordInMap",
    "namespace" : "org.apache.pig.test.builtin",
    "type" : "record",
    "fields" : [
        {"name" : "key", "type" : "string"},
        {"name" : "value", "type" : "int"},
        {
            "name" : "parameters",
            "type": [
                "null",
                {
                    "type": "map",
                    "values": [
                        "null",
                        {
                            "type": "record",
                            "name": "nullable_record",
                            "fields": [
                                {
                                    "name": "id",
                                    "type": [
                                        "null",
                                        "string"
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
{noformat}

Here's the Pig resource schema currently produced by running it through 
org.apache.pig.impl.util.avro.AvroStorageSchemaConversionUtilities:
{noformat}
key:chararray,value:int,parameters:[nullable_record:(union:(id:chararray))]
{noformat}

The nullable union around the map value is not unwrapped, so an extra "union" 
level appears in the result. Pig should unpack the underlying schema from the 
nullable union, and the Pig schema should be:
{noformat}
key:chararray,value:int,parameters:[nullable_record:(id:chararray)]
{noformat}

There's similar behavior if the nullable map value is of type array.
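
To make the expected unwrapping concrete, here is a minimal sketch in plain Python (standard library only). It is not the code in AvroStorageSchemaConversionUtilities and not the attached patch; the helper names (unwrap_nullable, type_to_pig, to_pig_schema) are hypothetical, only a handful of Avro types are handled, and the string rendering only imitates the schema strings shown above.

```python
import json

# Assumed mapping for the few primitive types this sketch needs.
PRIMITIVES = {"string": "chararray", "int": "int", "long": "long",
              "float": "float", "double": "double", "boolean": "boolean"}

def unwrap_nullable(schema):
    """Return T for a two-branch union ["null", T]; otherwise return schema unchanged."""
    if isinstance(schema, list) and len(schema) == 2 and "null" in schema:
        non_null = [s for s in schema if s != "null"]
        if len(non_null) == 1:
            return non_null[0]
    return schema

def type_to_pig(schema):
    """Render a (simplified) Pig type string for an Avro schema node."""
    schema = unwrap_nullable(schema)
    if isinstance(schema, str):
        return PRIMITIVES[schema]
    kind = schema["type"]
    if kind == "record":
        inner = ",".join(f"{f['name']}:{type_to_pig(f['type'])}"
                         for f in schema["fields"])
        return "(" + inner + ")"
    if kind == "map":
        # The step this issue is about: unwrap the nullable union
        # around the map's value type before converting it.
        values = unwrap_nullable(schema["values"])
        label = values["name"] + ":" if isinstance(values, dict) and "name" in values else ""
        return "[" + label + type_to_pig(values) + "]"
    raise ValueError(f"type not covered by this sketch: {kind}")

def to_pig_schema(record):
    """Render the top-level record as a comma-separated list of fields."""
    return ",".join(f"{f['name']}:{type_to_pig(f['type'])}" for f in record["fields"])

AVRO_SCHEMA = json.loads("""
{
  "name": "nullableRecordInMap",
  "namespace": "org.apache.pig.test.builtin",
  "type": "record",
  "fields": [
    {"name": "key", "type": "string"},
    {"name": "value", "type": "int"},
    {"name": "parameters", "type": ["null", {
      "type": "map",
      "values": ["null", {
        "type": "record",
        "name": "nullable_record",
        "fields": [{"name": "id", "type": ["null", "string"]}]
      }]
    }]}
  ]
}
""")

print(to_pig_schema(AVRO_SCHEMA))
# prints: key:chararray,value:int,parameters:[nullable_record:(id:chararray)]
```

The same helper applies to the array case: unwrap_nullable(["null", {"type": "array", "items": "string"}]) returns the array schema itself, while a union such as ["string", "int"] is returned unchanged because it is not a nullable wrapper.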

I've created a patch that includes a few test cases.

  was:
Here's an example of an avro schema containing nullable values in a map
{noformat}
{
    "name" : "nullableRecordInMap",
    "namespace" : "org.apache.pig.test.builtin",
    "type" : "record",
    "fields" : [
        {"name" : "key", "type" : "string"},
        {"name" : "value", "type" : "int"},
        {
            "name" : "parameters",
            "type": [
                "null",
                {
                    "type": "map",
                    "values": [
                        "null",
                        {
                            "type": "record",
                            "name": "nullable_record",
                            "fields": [
                                {
                                    "name": "id",
                                    "type": [
                                        "null",
                                        "string"
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
{noformat}

Here's the corresponding Pig resource schema on running it through 
org.apache.pig.impl.util.avro.AvroStorageSchemaConversionUtilities
{noformat}
key:chararray,value:int,parameters:[nullable_record:(union:(id:chararray))]
{noformat}

Note that Pig should unpack the underlying schema from the nullable union and 
the Pig schema should be
{noformat}
key:chararray,value:int,parameters:[nullable_record:(id:chararray)]
{noformat}

There's similar behavior if the nullable map value is of type Avro

I've created a patch with a few testcases written.


> Pig Cannot handle nullable values (arrays and records) in avro records
> ----------------------------------------------------------------------
>
>                 Key: PIG-4447
>                 URL: https://issues.apache.org/jira/browse/PIG-4447
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Ratandeep Ratti
>            Assignee: Ratandeep Ratti
>             Fix For: 0.15.0
>
>         Attachments: pig.patch
>
>
> Here's an example of an avro schema containing nullable values in a map
> {noformat}
> {
>     "name" : "nullableRecordInMap",
>     "namespace" : "org.apache.pig.test.builtin",
>     "type" : "record",
>     "fields" : [
>         {"name" : "key", "type" : "string"},
>         {"name" : "value", "type" : "int"},
>         {
>             "name" : "parameters",
>             "type": [
>                 "null",
>                 {
>                     "type": "map",
>                     "values": [
>                         "null",
>                         {
>                             "type": "record",
>                             "name": "nullable_record",
>                             "fields": [
>                                 {
>                                     "name": "id",
>                                     "type": [
>                                         "null",
>                                         "string"
>                                     ]
>                                 }
>                             ]
>                         }
>                     ]
>                 }
>             ]
>         }
>     ]
> }
> {noformat}
> Here's the corresponding Pig resource schema on running it through 
> org.apache.pig.impl.util.avro.AvroStorageSchemaConversionUtilities
> {noformat}
> key:chararray,value:int,parameters:[nullable_record:(union:(id:chararray))]
> {noformat}
> Note that Pig should unpack the underlying schema from the nullable union and 
> the Pig schema should be
> {noformat}
> key:chararray,value:int,parameters:[nullable_record:(id:chararray)]
> {noformat}
> There's similar behavior if the nullable map value is of type array.
> I've created a patch that includes a few test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
