Hi Elliot,

Thanks for that bit of info. It is helpful. Where do you draw the line between 
complex unions and simple unions? In other words, what criteria do you use to 
say that a union is too complex?

Thanks,

Scott
________________________________
From: Elliot West <[email protected]>
Sent: Saturday, May 26, 2018 1:58 AM
To: [email protected]
Subject: Re: Avro Schema Question

A word of caution on the union type. You may find support for unions very 
patchy if you are hoping to process records using well-known data processing 
engines. We’ve been unable to usefully read union types in either Apache Spark 
or Hive, for example. The simple null union construct, [null, typeA], is the 
exception, as it is usually represented by a nullable column of typeA. We’ve 
resorted to prohibiting schemas with complex unions so that our producers can’t 
create data that is not fully readable by our consumers.
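
As a rough illustration of such a prohibition, here is a minimal sketch of a 
schema check in Python using only the standard library. It is not the check 
described above; the function names, the path notation, and the rule itself 
(any union other than a two-branch union with exactly one null branch counts 
as "complex") are assumptions made for the example.

```python
import json


def is_simple_null_union(union):
    """True for a two-branch union in which exactly one branch is null."""
    nulls = sum(1 for b in union if b == "null" or b == {"type": "null"})
    return len(union) == 2 and nulls == 1


def find_complex_unions(schema, path="$"):
    """Walk a parsed Avro schema and return the paths of all unions that
    are not simple nullable unions (assumed rule, see is_simple_null_union)."""
    found = []
    if isinstance(schema, list):  # a JSON array denotes a union
        if not is_simple_null_union(schema):
            found.append(path)
        for i, branch in enumerate(schema):
            found.extend(find_complex_unions(branch, f"{path}[{i}]"))
    elif isinstance(schema, dict):
        t = schema.get("type")
        if t == "record":
            for field in schema.get("fields", []):
                found.extend(
                    find_complex_unions(field["type"], f"{path}.{field['name']}")
                )
        elif t == "array":
            found.extend(find_complex_unions(schema["items"], path + ".items"))
        elif t == "map":
            found.extend(find_complex_unions(schema["values"], path + ".values"))
        elif isinstance(t, (list, dict)):  # nested schema under "type"
            found.extend(find_complex_unions(t, path))
    return found


# Hypothetical schema: one nullable field and one int/string union.
schema = json.loads("""
{"type": "record", "name": "Event", "fields": [
  {"name": "id", "type": "long"},
  {"name": "note", "type": ["null", "string"]},
  {"name": "payload", "type": ["int", "string"]}
]}
""")

print(find_complex_unions(schema))  # ['$.payload']
```

A producer-side gate could reject any schema for which this list is non-empty. 
Where exactly to draw the line (for instance, whether [null, recordA] is 
acceptable) depends on what the consuming engines can read.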

Elliot.

On Fri, 25 May 2018 at 22:30, Motoko Kusanagi <[email protected]> wrote:
Hi Michael,

Thanks!! Yes, it does.

Scott
________________________________
From: Michael Smith <[email protected]>
Sent: Friday, May 25, 2018 2:21 PM
To: [email protected]
Subject: Re: Avro Schema Question

{"type": "int"}, {"type": "string"} is not valid JSON, so you definitely can't 
do that. But

[{"type": "int"}, {"type": "string"}] is a valid schema -- it can encode a 
single value that is either an int or a string. At the highest level, your 
schema can only be one type, but that type may be (and in fact probably will 
be) a complex type -- a union of records or a single record.
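
The JSON half of this point can be checked with a short sketch in Python using 
the standard json module. It tests JSON well-formedness only, not Avro schema 
validity; the helper name is_valid_json is made up for the example.

```python
import json


def is_valid_json(text):
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Two objects separated by a comma: not one JSON document.
separate = '{"type": "int"}, {"type": "string"}'
# The same two objects wrapped in an array: one JSON document,
# which Avro reads as a union of int and string.
union = '[{"type": "int"}, {"type": "string"}]'

print(is_valid_json(separate))  # False
print(is_valid_json(union))     # True
print(isinstance(json.loads(union), list))  # True: the top level is an array
```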

Does that answer your question?

On Fri, May 25, 2018 at 5:08 PM Motoko Kusanagi <[email protected]> wrote:

Hi,


I read the specification multiple times. In the Schema Declaration section, it 
says "A Schema is represented in JSON <http://www.json.org/> by one of:". The 
"one" confuses me, as I am interpreting it as exactly one of the three forms 
that it lists.


In short, can I do this as a single schema?

{type : int},
{type : string},
{type : int},


Or do the following as a single schema?

{type : int},
{type : record ....},
{type : record ....}, // Not the same as the previous.
{type : string},


Or do I have to "embed" the above under a complex type like a record if I want 
a complex schema? Or does "one of" mean I have to choose one and exactly one 
for the top-most level of the schema?


Thanks!!



--


Michael A. Smith — Senior Systems Engineer

