Hi Peter,
I recently implemented this using the logicalType concept, which is a fairly
new addition to Avro.
(I use my own fork (https://github.com/zolyfarkas/avro) until I find some time
to merge Ryan's implementation; the fork also has other improvements that I
rely on, such as IDL forward declarations and improved JSON encoding.)
Here is how I implemented the any type:
/** an unknown serialized java object */
@logicalType("unknown")
record Unknown {
/** maven schema ID (optional for future extension, with different ID
types) */
union {null, MavenSchemaId} mavenSchemaId = null;
/** the avro serialized object */
union {null, string, bytes} serObj;
}
The Maven schema ID contains enough information to retrieve the schema that the
payload (the serObj field) was serialized with.
In my case I store all schemas in a maven repo, and my MavenSchemaId looks like:
/** A maven artifact ID */
record MavenArtifactId {
/** The maven group id */
string groupId;
/** The maven artifactId */
string artifactId;
/** The schema version */
string version;
}
/** A maven schema ID*/
record MavenSchemaId {
/** The maven artifact */
MavenArtifactId artifactId;
/** The record name (namespace + name) */
string recordName;
}
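As a rough illustration only (this is not code from the fork mentioned above),
here is a Java sketch of how such an ID could be turned into a location in a
repository that follows the standard Maven layout. The class name
MavenSchemaIdSketch and both method names are hypothetical, and the idea that
each schema sits in the jar as a "<namespace path>/<Name>.avsc" entry is an
assumption made for the example, not something stated in this thread.

```java
import java.util.Objects;

/** Hypothetical hand-written mirror of the MavenSchemaId IDL record (not generated code). */
final class MavenSchemaIdSketch {
    final String groupId;
    final String artifactId;
    final String version;
    final String recordName;

    MavenSchemaIdSketch(String groupId, String artifactId, String version, String recordName) {
        this.groupId = Objects.requireNonNull(groupId);
        this.artifactId = Objects.requireNonNull(artifactId);
        this.version = Objects.requireNonNull(version);
        this.recordName = Objects.requireNonNull(recordName);
    }

    /**
     * Path of the artifact jar relative to the repository root, using the
     * standard Maven repository layout. Only release versions are handled;
     * SNAPSHOT artifacts use timestamped file names and are out of scope here.
     */
    String jarPath() {
        return groupId.replace('.', '/') + '/' + artifactId + '/' + version
                + '/' + artifactId + '-' + version + ".jar";
    }

    /**
     * ASSUMPTION: the schema for a record is packaged inside the jar as
     * "<namespace as path>/<Name>.avsc". Adjust to however your build packages schemas.
     */
    String schemaEntryPath() {
        return recordName.replace('.', '/') + ".avsc";
    }
}
```

With groupId "org.example", artifactId "schemas", version "1.0" and recordName
"org.example.MyRecord", jarPath() gives the jar to download and
schemaEntryPath() gives the entry to read from it.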
But a schema ID can really be anything (a number, a string...), as long as you
have a system/service to resolve it. You can even put the schema itself in the
Unknown record if that works for you...
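To make the "anything you can resolve" point concrete, here is a minimal Java
sketch of a resolver that is generic over the ID type. The names
SchemaResolverSketch and InMemorySchemaResolver are hypothetical, and the
in-memory map is only a stand-in for whatever system/service (Maven repo,
database, registry) actually holds your schemas.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical abstraction: maps a schema ID of any type to the schema's JSON text. */
interface SchemaResolverSketch<I> {
    /** Returns the schema JSON for the given ID, or empty if the ID is unknown. */
    Optional<String> resolve(I schemaId);
}

/** In-memory stand-in for a real lookup service; for illustration only. */
final class InMemorySchemaResolver<I> implements SchemaResolverSketch<I> {
    private final Map<I, String> schemas = new ConcurrentHashMap<>();

    /** Registers (or replaces) the schema JSON stored under the given ID. */
    void register(I schemaId, String schemaJson) {
        schemas.put(schemaId, schemaJson);
    }

    @Override
    public Optional<String> resolve(I schemaId) {
        return Optional.ofNullable(schemas.get(schemaId));
    }
}
```

The ID type can be a Long, a String, or a structured ID like the Maven one
above (as long as it implements equals/hashCode); the reader of an Unknown
record only needs resolve() to find the writer schema before decoding serObj.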
So every time I need an "Any" (Unknown) field I use it like:
import idl "common.avdl";
record MyRecord {
...
Unknown any;
...
}
The generated DTOs set and get an Object (just like unions); when you
deserialize you will get either a SpecificRecord (if you have a generated DTO)
or a GenericRecord.
Let me know if you have any questions...
(would be interested to know if you encounter any issues implementing this with
the official avro logical type implementation...)
cheers
--Z
-----Original Message-----
From: Peter Amstutz [mailto:[email protected]]
Sent: Friday, August 28, 2015 6:26 AM
To: [email protected]
Subject: handling fields with "any" structure
Hello everyone,
I am using Avro to load and validate JSON documents. Mostly this works very
well and it is straightforward to express the structure of my document using
Avro schema. However, I have a few fields which can have "any" content. It is
impossible to declare all possible structures in advance, and I can't use a
union type of primitives because the fields may also contain complex types
(nested lists/maps) and Avro doesn't allow named unions.
So far as I have been able to determine, this is impossible with standard Avro
schema, so I am curious if anyone else has dealt with this problem and can
suggest any workarounds. Currently my best (least bad) idea is to preprocess
the JSON to pull out the "any" fields and store them on the side before handing
the document to Avro for loading. This is awkward so I would love to hear if
anyone has any other ideas.
Thanks,
Peter