Hi Elliot,

Thanks for that bit of info. It is helpful. Where do you draw the line between complex unions and simple unions? In other words, what criteria do you use to say that a union is too complex?

Thanks,
Scott

________________________________
From: Elliot West <[email protected]>
Sent: Saturday, May 26, 2018 1:58 AM
To: [email protected]
Subject: Re: Avro Schema Question

A word of caution on the union type. You may find support for unions very patchy if you are hoping to process records using well-known data processing engines. We’ve been unable to usefully read union types in both Apache Spark and Hive, for example. The simple null union construct is the exception: [null, typeA], as it is usually represented by a nullable column of typeA. We’ve resorted to prohibiting schemas with complex unions so that our producers can’t create data that is not fully readable by our consumers.

Elliot.

On Fri, 25 May 2018 at 22:30, Motoko Kusanagi <[email protected]> wrote:

Hi Michael,

Thanks!! Yes, it does.

Scott

________________________________
From: Michael Smith <[email protected]>
Sent: Friday, May 25, 2018 2:21 PM
To: [email protected]
Subject: Re: Avro Schema Question

{"type": "int"}, {"type": "string"} is not valid JSON, so you definitely can't do that. But [{"type": "int"}, {"type": "string"}] is a valid schema -- it can encode a single value that is either an int or a string.

At the highest level, your schema can only be one type, but that type may be (and in fact probably will be) a complex type -- a union of records or a single record.

Does that answer your question?

On Fri, May 25, 2018 at 5:08 PM Motoko Kusanagi <[email protected]> wrote:

Hi,

I read the specification multiple times. In the specification, it says "A Schema is represented in JSON<http://www.json.org/> by one of:" in the Schema Declaration section. The "one" confuses me as I am interpreting it as exactly one of the 3 that it listed. In short, can I do this as a single schema?

{type : int}, {type : string}, {type : int},

Or do the following as a single schema?
{type : int}, {type : record ....}, {type : record ....}, // Not the same as the previous.
{type : string},

Or do I have to "embed" the above under a complex type like a record if I want a complex schema? Or does "one of" mean I have to choose one and exactly one for the top-most level of the schema?

Thanks!!

--
Michael A. Smith, Senior Systems Engineer
[email protected]
syapse.com <http://www.syapse.com/>
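
For reference, the two points made in this thread can be sketched in a few lines of Python using only the standard json module. This is a rough illustration, not anyone's actual tooling: the helper names (is_valid_json, is_simple_nullable_union, complex_unions) are made up for this note, and the rule that "simple" means a two-branch union with exactly one null branch is an assumption based on Elliot's [null, typeA] example. It does not validate schemas against the Avro specification; it only inspects the JSON structure.

```python
import json


def is_valid_json(text):
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def branch_type(branch):
    """A union branch is a bare type name ("int") or an object with a "type" key."""
    return branch.get("type") if isinstance(branch, dict) else branch


def is_simple_nullable_union(schema):
    """Assumed rule: exactly two branches, exactly one of which is null."""
    if not isinstance(schema, list) or len(schema) != 2:
        return False
    return [branch_type(b) for b in schema].count("null") == 1


def complex_unions(schema, path="$"):
    """Yield the path of every union that is not of the simple [null, T] form."""
    if isinstance(schema, list):
        if not is_simple_nullable_union(schema):
            yield path
        for i, branch in enumerate(schema):
            yield from complex_unions(branch, f"{path}[{i}]")
    elif isinstance(schema, dict):
        t = schema.get("type")
        if t == "record":
            for field in schema.get("fields", []):
                yield from complex_unions(field["type"], f"{path}.{field['name']}")
        elif t == "array":
            yield from complex_unions(schema["items"], path + ".items")
        elif t == "map":
            yield from complex_unions(schema["values"], path + ".values")
        elif isinstance(t, (list, dict)):
            yield from complex_unions(t, path)


# Hypothetical record: field "a" is a nullable string, field "b" is an int-or-string union.
example_schema = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "a", "type": ["null", "string"]},
        {"name": "b", "type": ["int", "string"]},
    ],
}

print(is_valid_json('{"type": "int"}, {"type": "string"}'))    # False
print(is_valid_json('[{"type": "int"}, {"type": "string"}]'))  # True
print(list(complex_unions(example_schema)))                    # ['$.b']
```

A producer-side check along these lines could reject a schema whenever complex_unions yields anything, which is one possible way to enforce the kind of restriction Elliot describes.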
