Understood. So to keep the schema stable, the table should hold an external 
reference to an avsc URL (e.g. in a registry), which can evolve. Checking new 
Avro files against the registry is easy because the avsc is embedded in each 
file. And if the schema has changed, you can easily create a new version.
Is that the idea?

Br,
Dennis
________________________________
From: David <dam6...@gmail.com>
Sent: Saturday, 31 October 2020 14:52:04
To: user@hive.apache.org
Subject: Re: Hive Avro: Directly use of embedded Avro Scheme

What would your expectation be?  That Hive reads the first file it finds and 
uses that schema in the table definition?

What if the table is empty and a user attempts an INSERT?  What should be the 
behavior?

The real power of Avro is not so much that the schema can exist (optionally) in 
the file itself but that the schema can mutate over time.  In such cases the 
table can be ALTERED, for example to add a new column, and the existing schema 
will still work.
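
To illustrate the point about schemas mutating over time, here is a minimal sketch of the idea, assuming a simplified stand-in for Avro schema resolution rather than the Avro library itself. The field names (id, name, country) and the resolve helper are hypothetical. A record written before a column was added is read with the newer schema, and the added field is filled from its default.

```python
# Simplified illustration of Avro-style schema evolution.
# This is NOT the Avro library; field names and records are hypothetical.

OLD_SCHEMA = {
    "type": "record",
    "name": "blublub",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
}

# The evolved schema adds one nullable column with a default value.
NEW_SCHEMA = {
    "type": "record",
    "name": "blublub",
    "fields": OLD_SCHEMA["fields"] + [
        {"name": "country", "type": ["null", "string"], "default": None},
    ],
}


def resolve(record, reader_schema):
    """Project a decoded record onto the reader schema, filling defaults."""
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            # Field is missing in the older data: use the reader's default.
            out[name] = field["default"]
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return out


old_row = {"id": 1, "name": "a"}                    # written with OLD_SCHEMA
new_row = {"id": 2, "name": "b", "country": "DE"}   # written with NEW_SCHEMA
rows = [resolve(r, NEW_SCHEMA) for r in (old_row, new_row)]
```

The sketch only works because the added field carries a default; a new field without one would make the older records unreadable under the new schema.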

Thanks.

On Sat, Oct 31, 2020, 6:57 AM Dennis Suhari <dennis.suh...@ilab.nordlb.de> wrote:
Hello Support,

currently I have created the following Avro Hive table, which works fine:

CREATE EXTERNAL TABLE blahblah.blublub
STORED AS AVRO LOCATION "/***/in" TBLPROPERTIES 
('avro.schema.url'='/.../schema/blublub.avsc')

As you can see, I need to use the 'avro.schema.url' property, which points to 
the Avro schema blublub.avsc. I simply extract this blublub.avsc from the Avro 
files.

How is it possible to work without 'avro.schema.url' and directly use the Avro 
schema that is already delivered within the Avro files themselves (that is the 
strength of Avro)? I want to have all columns that are included in the Avro 
schema, but without listing them in the CREATE statement.

Br,
Dennis
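
The extraction step mentioned above (pulling the schema out of the Avro files) can be sketched as follows. Per the Avro specification, an object container file starts with the magic bytes "Obj" plus the byte 1, followed by a metadata map whose "avro.schema" entry holds the writer schema as JSON. This is a minimal standard-library sketch, not a replacement for the Avro tooling; the function names are hypothetical.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag encoded)."""
    shift = 0
    acc = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("truncated Avro header")
        acc |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)


def read_bytes(stream):
    """Decode one Avro bytes/string value: a long length, then the data."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("truncated Avro header")
    return data


def read_embedded_schema(stream):
    """Return the writer schema stored in the file header as a dict."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:          # a zero count ends the metadata map
            break
        if count < 0:           # negative count: a block byte size follows
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return json.loads(meta["avro.schema"])
```

Usage would be along the lines of opening one data file in binary mode, calling read_embedded_schema on the file object, and writing the result with json.dump to the .avsc file that 'avro.schema.url' points to.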

