[
https://issues.apache.org/jira/browse/PIG-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ratandeep Ratti updated PIG-4447:
---------------------------------
Description:
Here's an example of an Avro schema containing nullable values in a map:
{noformat}
{
"name" : "nullableRecordInMap",
"namespace" : "org.apache.pig.test.builtin",
"type" : "record",
"fields" : [
{"name" : "key", "type" : "string"},
{"name" : "value", "type" : "int"},
{
"name" : "parameters",
"type": [
"null",
{
"type": "map",
"values": [
"null",
{
"type": "record",
"name": "nullable_record",
"fields": [
{
"name": "id",
"type": [
"null",
"string"
]
}
]
}
]
}
]
}
]
}
{noformat}
Running this schema through
org.apache.pig.impl.util.avro.AvroStorageSchemaConversionUtilities produces the
following Pig resource schema:
{noformat}
key:chararray,value:int,parameters:[nullable_record:(union:(id:chararray))]
{noformat}
Note the spurious "union" wrapper around the map value. Pig should unpack the
underlying schema from the nullable union, so the Pig schema should be:
{noformat}
key:chararray,value:int,parameters:[nullable_record:(id:chararray)]
{noformat}
There's similar behavior if the nullable map value is of type array.
I've created a patch with a few test cases.
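The expected unpacking can be illustrated with a small standalone sketch. This is not the code in the attached patch and does not use Pig's or Avro's APIs; the helper names (unwrap_nullable, to_pig, pig_schema) are hypothetical, and the bag rendering for arrays is only an approximation. It assumes that a union of "null" and exactly one other type should be replaced by that other type at every nesting level, including map values and array items.

```python
import json

# Avro primitive -> Pig type (only the subset needed for this illustration).
PRIMITIVES = {
    "string": "chararray", "int": "int", "long": "long",
    "float": "float", "double": "double", "boolean": "boolean",
    "bytes": "bytearray",
}

AVRO_SCHEMA = """
{"name": "nullableRecordInMap", "namespace": "org.apache.pig.test.builtin",
 "type": "record",
 "fields": [
   {"name": "key", "type": "string"},
   {"name": "value", "type": "int"},
   {"name": "parameters", "type": ["null", {"type": "map", "values": ["null",
     {"type": "record", "name": "nullable_record",
      "fields": [{"name": "id", "type": ["null", "string"]}]}]}]}
 ]}
"""

def unwrap_nullable(schema):
    """Return X for a union of "null" and a single type X; otherwise return schema."""
    if isinstance(schema, list):
        non_null = [s for s in schema if s != "null"]
        if len(non_null) == 1:
            return unwrap_nullable(non_null[0])
    return schema

def to_pig(schema, name=None):
    """Render one (possibly nullable) Avro type as a Pig-style schema string."""
    schema = unwrap_nullable(schema)
    prefix = name + ":" if name else ""
    if isinstance(schema, str):
        return prefix + PRIMITIVES[schema]
    if isinstance(schema, list):
        raise ValueError("only nullable unions are handled in this sketch")
    kind = schema["type"]
    if kind == "record":
        fields = ",".join(to_pig(f["type"], f["name"]) for f in schema["fields"])
        return prefix + "(" + fields + ")"
    if kind == "map":
        # Unwrap the map value first: this is the step the report says is missing.
        value = unwrap_nullable(schema["values"])
        value_name = value.get("name") if isinstance(value, dict) else None
        return prefix + "[" + to_pig(value, value_name) + "]"
    if kind == "array":
        # Approximate rendering of an array as a Pig bag.
        return prefix + "{" + to_pig(schema["items"]) + "}"
    return prefix + PRIMITIVES[kind]

def pig_schema(avro_record):
    """Render the top-level record's fields without enclosing parentheses."""
    return ",".join(to_pig(f["type"], f["name"]) for f in avro_record["fields"])

print(pig_schema(json.loads(AVRO_SCHEMA)))
```

With the schema from this report, the sketch prints the expected Pig schema shown above, with no "union" level between nullable_record and its id field.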
was:
Here's an example of an avro schema containing nullable values in a map
{noformat}
{
"name" : "nullableRecordInMap",
"namespace" : "org.apache.pig.test.builtin",
"type" : "record",
"fields" : [
{"name" : "key", "type" : "string"},
{"name" : "value", "type" : "int"},
{
"name" : "parameters",
"type": [
"null",
{
"type": "map",
"values": [
"null",
{
"type": "record",
"name": "nullable_record",
"fields": [
{
"name": "id",
"type": [
"null",
"string"
]
}
]
}
]
}
]
}
]
}
{noformat}
Here's the corresponding Pig resource schema on running it through
org.apache.pig.impl.util.avro.AvroStorageSchemaConversionUtilities
{noformat}
key:chararray,value:int,parameters:[nullable_record:(union:(id:chararray))]
{noformat}
Note that Pig should unpack the underlying schema from the nullable union and
the Pig schema should be
{noformat}
key:chararray,value:int,parameters:[nullable_record:(id:chararray)]
{noformat}
There's similar behavior if the nullable map value is of type Avro
I've created a patch with a few testcases written.
> Pig Cannot handle nullable values (arrays and records) in avro records
> ----------------------------------------------------------------------
>
> Key: PIG-4447
> URL: https://issues.apache.org/jira/browse/PIG-4447
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.13.0
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Fix For: 0.15.0
>
> Attachments: pig.patch
>