Here is the schema I'm working on:

{
  "type": "record",
  "name": "Test",
  "namespace": "com.cp.avro",
  "doc": "v0",
  "fields": [
    {
      "name": "reMatchedAttributes",
      "type": ["null", {
        "type": "array",
        "items": {
          "type": "record",
          "name": "MatchedAttrTest",
          "fields": [
            {"name": "id", "type": "long", "doc": "Attribute id"},
            {"name": "updateDate", "type": "long", "doc": "Update date of the attribute"},
            {"name": "value", "type": "string", "doc": "Value of the attribute"}
          ]
        }
      }],
      "doc": "Attributes matched to the user that were used to serve the message. In the form [id, updateDate, value]"
    }
  ]
}

I was able to get that working. I think the issue was that when I passed a
null variable, it was not being treated as an actual null.
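
For reference, a rough sketch of what works now (simplified, with made-up
values; schemaJson holds the schema above as a string):

import org.apache.avro.Schema
import org.apache.avro.generic.GenericData

val schema = new Schema.Parser().parse(schemaJson)
val rec = new GenericData.Record(schema)

// Only a real null matches the "null" branch of the union --
// not None and not the string "null".
rec.put("reMatchedAttributes", null)

// Non-null case: an array of MatchedAttrTest records.
val arraySchema = schema.getField("reMatchedAttributes").schema().getTypes.get(1)
val attr = new GenericData.Record(arraySchema.getElementType)
attr.put("id", 1L)
attr.put("updateDate", 1465200000000L)
attr.put("value", "test")
rec.put("reMatchedAttributes",
  new GenericData.Array[GenericData.Record](arraySchema, java.util.Collections.singletonList(attr)))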

On Mon, Jun 6, 2016 at 6:17 AM, Maulik Gandhi <[email protected]> wrote:

> Hi Giri,
>
> Can you share simplified implementation code and the model you are
> trying to emit?
>
> Thanks.
> - Maulik
>
> On Sat, Jun 4, 2016 at 3:04 PM, Giri P <[email protected]> wrote:
>
>> I need it the other way: either null or an array of records. I was able
>> to insert non-null records, but when I try inserting nulls it throws an
>> error.
>>
>> I'm using Spark to write Avro files.
>>
>> org.apache.spark.SparkException: Job aborted due to stage failure: Failed
>> to serialize task 16, not attempting to retry it. Exception during
>> serialization: java.io.NotSerializableException:
>> org.apache.avro.Schema$UnionSchema
>> Serialization stack:
>>         - object not serializable (class:
>> org.apache.avro.Schema$UnionSchema, value:
>> ["null",{"type":"array","items":{"type":"record","name":"MatchedAttrTest","namespace":"com.conversantmedia.data.cp.avro","fields":[{"name":"id","type":"long","doc":"Attribute
>> id"},{"name":"updateDate","type":"long","doc":"Update date of the
>> attribute"},{"name":"value","type":"string","doc":"Value of the
>> attribute"}]}}])
>>         - writeObject data (class: java.lang.Throwable)
>>         - object (class org.apache.avro.UnresolvedUnionException,
>> org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"array","items":{"type":"record","name":"MatchedAttrTest","namespace":"com.conversantmedia.data.cp.avro","fields":[{"name":"id","type":"long","doc":"Attribute
>> id"},{"name":"updateDate","type":"long","doc":"Update date of the
>> attribute"},{"name":"value","type":"string","doc":"Value of the
>> attribute"}]}}]: )
>>         - writeObject data (class: java.lang.Throwable)
>>         - object (class java.io.IOException, java.io.IOException:
>> org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"array","items":{"type":"record","name":"MatchedAttrTest","namespace":"com.conversantmedia.data.cp.avro","fields":[{"name":"id","type":"long","doc":"Attribute
>> id"},{"name":"updateDate","type":"long","doc":"Update date of the
>> attribute"},{"name":"value","type":"string","doc":"Value of the
>> attribute"}]}}]: )
>>         - writeObject data (class:
>> org.apache.spark.rdd.ParallelCollectionPartition)
>>         - object (class org.apache.spark.rdd.ParallelCollectionPartition,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b0)
>>         - element of array (index: 0)
>>         - array (class [Ljava.lang.Object;, size 10)
>>         - field (class: scala.collection.mutable.ArraySeq, name: array,
>> type: class [Ljava.lang.Object;)
>>         - object (class scala.collection.mutable.ArraySeq,
>> ArraySeq(org.apache.spark.rdd.ParallelCollectionPartition@7b0,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b1,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b2,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b3,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b4,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b5,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b6,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b7,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b8,
>> org.apache.spark.rdd.ParallelCollectionPartition@7b9))
>>         - writeObject data (class:
>> org.apache.spark.rdd.CoalescedRDDPartition)
>>         - object (class org.apache.spark.rdd.CoalescedRDDPartition,
>> CoalescedRDDPartition(0,MapPartitionsRDD[8] at map at
>> <console>:81,[I@19cc3c95,None))
>>         - field (class: org.apache.spark.scheduler.ResultTask, name:
>> partition, type: interface org.apache.spark.Partition)
>>         - object (class org.apache.spark.scheduler.ResultTask,
>> ResultTask(4, 0))
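>>
>> As an aside, org.apache.avro.Schema is not java.io.Serializable, which is
>> why the NotSerializableException above ends up hiding the real failure,
>> the nested UnresolvedUnionException. When a closure does need the schema
>> on the executors, a common workaround (a rough sketch with hypothetical
>> names, not my actual job) is to ship the JSON string and parse it per
>> partition:
>>
>> val schemaJson = schema.toString // a plain String serializes fine
>> rdd.mapPartitions { iter =>
>>   val schema = new Schema.Parser().parse(schemaJson)
>>   iter.map(row => toRecord(row, schema)) // toRecord is a hypothetical converter
>> }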
>>
>>
>> On Fri, Jun 3, 2016 at 10:53 PM, Doug Cutting <[email protected]> wrote:
>>
>>> Your schema permits null or an array of records. I suspect you want an
>>> array containing nulls or records, e.g.,
>>>
>>> {"type":"array","items":["null",{"type":"record"...
>>> On Jun 3, 2016 5:54 PM, "Giri P" <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm getting the error below when I try to insert a null into a union of
>>>> null and an array:
>>>>
>>>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
>>>> ["null",{"type":"array","items":{"type":"record","name":"DeMatchedAttr","namespace":"cp","fields":[{"name":"id","type":"long","doc":"Attribute
>>>> id"},{"name":"updateDate","type":"long","doc":"Update date of the
>>>> attribute"},{"name":"value","type":"string","doc":"Value of the
>>>> attribute"}]}}]: null
>>>>
>>>> Is there any issue with the schema?
>>>>
>>>> Thanks for the help
>>>>
>>>> -Giri
>>>>
>>>
>>
>
