I think the general idea is that when you store a schema in some kind of
schema registry, you are not supposed to get back exactly what you entered,
but something that is equivalent to it. The doc field is certainly something
that is not supposed to go into a normalized schema.
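
To make that concrete, here is a minimal sketch of such a normalization. It
is a toy model and not the Avro C++ API: ToyField, ToyRecord and normalized()
are names I made up for illustration, and there is no JSON escaping. It is
loosely modelled on the "Parsing Canonical Form" in the Avro spec, which as
far as I know keeps only the attributes needed for parsing (name, type,
fields, ...) and strips doc and aliases.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for one record field (hypothetical, not an Avro class).
struct ToyField {
    std::string name;
    std::string type;  // primitive type name, e.g. "long"
    std::string doc;   // optional documentation, may be empty
};

// Toy stand-in for a record schema (hypothetical, not an Avro class).
struct ToyRecord {
    std::string name;
    std::string doc;
    std::vector<ToyField> fields;
};

// Emit a normalized form: fixed attribute order, no whitespace, and no doc.
std::string normalized(const ToyRecord& r) {
    assert(!r.name.empty());  // a record must be named
    std::ostringstream os;
    os << "{\"name\":\"" << r.name << "\",\"type\":\"record\",\"fields\":[";
    for (std::size_t i = 0; i < r.fields.size(); ++i) {
        if (i) os << ',';
        os << "{\"name\":\"" << r.fields[i].name
           << "\",\"type\":\"" << r.fields[i].type << "\"}";
    }
    os << "]}";
    return os.str();
}
```

Two ToyRecord values that differ only in their doc strings give the same
normalized() output, which is why a registry working on a normalized form
cannot hand the doc back to you.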

http://blogs.impetus.com/big_data/big_data_technologies/AVRO.do

http://grokbase.com/p/avro/user/133kfp4c6n/parsing-canonical-form-of-protocol-definitions





2015-05-20 15:10 GMT+02:00 Pierre de Frém <[email protected]>:

> Hello,
>
> I posted the patch against the trunk branch of the git repository here
> (for review):
> https://issues.apache.org/jira/browse/AVRO-1256
>
> Pierre
>
> ------------------------------
> From: [email protected]
> To: [email protected]
> Subject: RE: Not able to load avro schema fully with all its contents
> Date: Wed, 20 May 2015 10:08:22 +0000
>
>
> Hello,
>
> Sam is right in his previous answer.
> More precisely, the field "doc" is read by the Compiler, but not stored
> at the moment in the Node object. The reason might be that the field "doc"
> is optional in the Avro specification (see:
> https://avro.apache.org/docs/1.7.7/spec.html, Complex types).
>
> If you want to store the field doc, you'll have to modify the source code
> yourself to:
> - create a new member "doc" in the Node API (Node.hh),
> - store the doc field in Node as it is read by the Compiler (Compiler.cc),
> - serialize the field doc in NodeImpl.cc
>
> I did a patch for my own use where I store and read the "doc" fields for a
> NodeRecord, and I serialize the doc field for the root Node of a NodeRecord.
>
> You can find it at:
> the corresponding branch (created for the patch):
> https://github.com/pidefrem/avro/tree/branch-1.7-specificrecord
>
> the corresponding commit for the field doc:
>
> https://github.com/pidefrem/avro/commit/795a0805b8ea8d3228bd92a483c9cbb405e11a62
>
> Note: if you want to serialize all the doc fields of a NodeRecord, just
> change line 195 of NodeImpl.cc from
> if (depth == 1 && getDoc().size()) {
>
> to
>
> if (getDoc().size()) {
>
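
The three steps Pierre lists, and the effect of dropping the depth check, can
be sketched roughly like this. It is a self-contained toy and not the code
from the patch: ToyNode, makeNode() and toJson() are hypothetical names, the
output is simplified JSON without escaping, and the rootOnly flag is my own
way of showing both variants of the condition side by side.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a schema node; the real class lives in Node.hh.
class ToyNode {
public:
    explicit ToyNode(std::string name) : name_(std::move(name)) {}

    // Step 1: a new "doc" member with accessors.
    void setDoc(const std::string& doc) { doc_ = doc; }
    const std::string& getDoc() const { return doc_; }

    void addChild(std::shared_ptr<ToyNode> child) {
        assert(child);  // null children are not allowed
        children_.push_back(std::move(child));
    }

    // Step 3: serialize doc. rootOnly == true mirrors the condition
    // "depth == 1 && getDoc().size()"; rootOnly == false mirrors the relaxed
    // "getDoc().size()" and writes the doc of every node.
    void printJson(std::ostream& os, bool rootOnly = true, int depth = 1) const {
        os << "{\"name\":\"" << name_ << "\"";
        if ((!rootOnly || depth == 1) && getDoc().size()) {
            os << ",\"doc\":\"" << getDoc() << "\"";
        }
        os << ",\"children\":[";
        for (std::size_t i = 0; i < children_.size(); ++i) {
            if (i) os << ',';
            children_[i]->printJson(os, rootOnly, depth + 1);
        }
        os << "]}";
    }

private:
    std::string name_;
    std::string doc_;
    std::vector<std::shared_ptr<ToyNode>> children_;
};

// Step 2: what a compiler would do while reading: keep doc when it is present.
std::shared_ptr<ToyNode> makeNode(const std::string& name, const std::string& doc) {
    auto node = std::make_shared<ToyNode>(name);
    if (!doc.empty()) node->setDoc(doc);
    return node;
}

// Convenience wrapper that returns the serialized form as a string.
std::string toJson(const ToyNode& node, bool rootOnly) {
    std::ostringstream os;
    node.printJson(os, rootOnly);
    return os.str();
}
```

With a root node that has one nested child, toJson(*root, true) contains only
the root's doc, while toJson(*root, false) contains the doc of both nodes.
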
> (Maybe my patch could be added in the trunk of the source code if it is
> useful?)
>
> Hope this helps.
>
> Pierre
>
> ------------------------------
> Date: Tue, 19 May 2015 18:37:56 +0000
> From: [email protected]
> To: [email protected]
> Subject: Re: Not able to load avro schema fully with all its contents
>
> Just a guess, but I would assume that the schema object only stores fields
> that it cares about. This would exclude your docs. If you want to know for
> sure, the source code is here:
> https://github.com/apache/avro/tree/trunk/lang/c%2B%2B
>
>
> Sam
>
>
>
>   On Tuesday, May 19, 2015 1:13 PM, Check Peck <[email protected]>
> wrote:
>
>
> Can anyone help me with this?
>
> On Mon, May 18, 2015 at 2:04 PM, Check Peck <[email protected]>
> wrote:
>
> Does anyone have any idea why it is behaving like this?
>
> On Mon, May 18, 2015 at 1:03 PM, Check Peck <[email protected]>
> wrote:
>
> And this is my to_string method I forgot to provide.
>
> std::string DataSchema::to_string() const
> {
>     std::ostringstream os;  // requires <sstream>
>     if (valid())
>     {
>         os << "JSON data: ";
>         m_schema.toJson(os);  // write the loaded schema back out as JSON
>     }
>     return os.str();
> }
>
>
> On Mon, May 18, 2015 at 12:54 PM, Check Peck <[email protected]>
> wrote:
>
> I am working with Apache Avro in C++ and I am trying to load an Avro schema
> using the Avro C++ library. Everything works fine; the only problem is that
> I have a few "doc" fields in my Avro schema which do not show up at all in
> my AvroSchema when I load it and print it out.
>
>     DataSchema_ptr schema_data(new DataSchema());
>     schema_data->m_schema = load(avro_schema_file_name.c_str());
>     const avro::NodePtr node_data_ptr = schema_data->m_schema.root();
>     if (node_data_ptr && node_data_ptr->hasName())
>     {
>         // is there any problem with this node_data_ptr usage here?
>         schema_data->m_name = node_data_ptr->name().fullname().c_str();
>
>         // this line prints out the whole AVRO but it doesn't have the doc
>         // which is there in my AVRO
>         cout<<"File String : " << schema_data->to_string() << endl;
>     }
>
> Here "m_schema" is "avro::ValidSchema m_schema;"
>
> Can anyone help me with this? In general, I don't see the doc fields that I
> have in my Avro schema when I print it out.
>
>
>
>
>
>
>
