[ https://issues.apache.org/jira/browse/AVRO-659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907288#action_12907288 ]
Doug Cutting commented on AVRO-659:
-----------------------------------
Jeff, I'm still trying to understand the use case you have in mind.
Most folks writing data to files should use an Avro data file, which includes
the schema. If folks are doing RPC, then the protocol they use to write data
is typically a file in their source code tree, and the protocol they use to
read data is determined through the handshake. If folks are writing
individual records to a database then a best practice is to maintain a registry
of schemas used in the database as a separate table, and have each instance
refer to its schema in the registry via its MD5 hash. The application would
still probably store or create the schemas it uses for new database records
with the source code. The registry is updated when writing records and
accessed when reading them.
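
As a rough illustration of the database case above, here is a minimal sketch in Python using only the standard library. The table names, the function names, and the way the schema text is normalized before hashing are all assumptions made for this example; in particular, hashing sorted-key JSON is a stand-in and not necessarily how an Avro implementation fingerprints schemas, and the payload is treated as opaque bytes rather than real Avro-encoded data.

```python
import hashlib
import json
import sqlite3


def schema_fingerprint(schema: dict) -> str:
    """Return the MD5 hex digest of a schema's JSON text.

    Assumption: sorted keys and compact separators are used so that
    equal schemas hash alike. This is an illustrative normalization only.
    """
    text = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a schema registry table
    and a records table that refers to it by MD5 hash."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE schemas (md5 TEXT PRIMARY KEY, schema_json TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE records ("
        "id INTEGER PRIMARY KEY, "
        "schema_md5 TEXT NOT NULL REFERENCES schemas(md5), "
        "payload BLOB NOT NULL)"
    )
    return conn


def write_record(conn: sqlite3.Connection, schema: dict, payload: bytes) -> int:
    """Update the registry if needed, then store the record with its schema hash."""
    fp = schema_fingerprint(schema)
    conn.execute(
        "INSERT OR IGNORE INTO schemas (md5, schema_json) VALUES (?, ?)",
        (fp, json.dumps(schema, sort_keys=True)),
    )
    cur = conn.execute(
        "INSERT INTO records (schema_md5, payload) VALUES (?, ?)", (fp, payload)
    )
    return cur.lastrowid


def read_record(conn: sqlite3.Connection, record_id: int):
    """Fetch a record together with the schema it was written with."""
    row = conn.execute(
        "SELECT s.schema_json, r.payload FROM records r "
        "JOIN schemas s ON s.md5 = r.schema_md5 WHERE r.id = ?",
        (record_id,),
    ).fetchone()
    if row is None:
        raise KeyError(record_id)
    return json.loads(row[0]), row[1]
```

Writing many records with the same schema adds only one row to the registry table, and a reader never needs to know where a schema file lives on disk: it looks the schema up by the hash stored with the record.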
We do not want to encourage folks to write data without also storing the schema
used to write that data in the same repository as the data. I don't feel a
path-based schema registry is a good idea. Keeping a copy of the schema with
source code that writes data is a good practice: the schema is part of the
writing code and should be versioned with it. Generating schemas on the fly
when writing data is a fine practice too. But whenever data is persisted, its
schema should be stored with it.
> Portable specification of the location of schema and protocol files
> -------------------------------------------------------------------
>
> Key: AVRO-659
> URL: https://issues.apache.org/jira/browse/AVRO-659
> Project: Avro
> Issue Type: New Feature
> Reporter: Jeff Hammerbacher
>
> Avro doesn't require code generation, which is great. However, if you want to
> use a protocol or a schema, your code needs to know where to find it. When
> your code is ported to new systems, the protocol or schema file must be
> placed in the same place as on the previous system for things to work
> correctly.
> For importing modules in a portable fashion, Python provides a default set of
> places it will look for modules and an environment variable called PYTHONPATH
> that programs can use to override these defaults. It may be useful to explore
> similar constructs for Avro implementations that don't do code generation.