[jira] Commented: (AVRO-258) Higher-level language for authoring schemata

Todd Lipcon (JIRA) Fri, 18 Dec 2009 12:23:42 -0800

    [ 
https://issues.apache.org/jira/browse/AVRO-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792626#action_12792626
 ]


Todd Lipcon commented on AVRO-258:
----------------------------------

bq. First, I don't think we want to make such a tool a part of the spec

Fair enough - I'm ambivalent there.

bq. Perl or Python might thus be preferable to Java. 

I looked at some Python-based parsers, but the issue is that many of them rely 
on libraries rather than code generation. Many of those libraries are GPL or 
LGPL licensed, and also aren't available on CentOS/RHEL 5, which means that in 
a lot of ways Python is _less_ deployable than Java. Pyparsing, which I like a 
lot and have used before, has a friendly license but still has the library 
requirement, and would still have to be bundled with the script. Having 
recently worked on some Python software that bundles a lot of library 
dependencies, I can say it's a huge huge _huge_ pain. :)

I actually almost did this in C/C++ with straight lex/yacc, but went towards 
Java since it was easier for a quick first pass. Moving to C in the long run 
would be fine by me for the reasons you outlined.

bq. Another approach, rather than trying to make the syntax more Java-like, 
implementing a full parser, is to just remove the most annoying things from 
JSON... more complex JSON transformations ... etc

So, maybe I'm misunderstanding you, but it seems like you're proposing either 
(a) writing a custom JSON parser that has some extensions to make the syntax 
more palatable, or (b) writing a text-based preprocessor that outputs JSON 
which is then fed into the parser. Solution (a) seems to me like it has all the 
same difficulties as writing our own language, but with a less familiar syntax. 
Solution (b) seems hackish, and has the downside that it inherits the syntactic 
strangeness of using JSON while not getting the benefits of using a standard 
language (editor support, preexisting familiarity, etc).

bq. Beyond that, it starts to become lisp-versus-algol, unresolvable and a 
tremendous time sink.

I'm not convinced that implementing our own language is really that tough. In 
about 3 hours of work I got the above stuff done, and I'd never used JavaCC 
before. As for the religious lisp-versus-algol question, I think it's already 
been resolved in the sense that most existing protocol/data description 
languages are more algol-like than JSON-like (e.g. XDR, CORBA IDL, protobufs, 
Apache Thrift, Apache Etch). The counterexamples are things like WSDL which no 
one seems to really like.
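
As a rough illustration of how small the core translation can be, here is a 
sketch in Python that turns an algol-like record declaration into an Avro 
JSON record schema. The input syntax, the regular expressions, and the name 
records_to_avro are all made up for this example; this is not the JavaCC 
grammar from the patch, and it handles only flat records with simple named 
types.

```python
import json
import re

# Hypothetical syntax for illustration only: `record Name { type field; ... }`.
# This is NOT the grammar from the JavaCC patch discussed in this issue.
RECORD_RE = re.compile(r"record\s+(\w+)\s*\{(.*?)\}", re.DOTALL)
FIELD_RE = re.compile(r"(\w+)\s+(\w+)\s*;")


def records_to_avro(source):
    """Translate each record block in `source` into an Avro record schema dict."""
    schemas = []
    for name, body in RECORD_RE.findall(source):
        # Each field is written as `type name;`, so the type comes first.
        fields = [
            {"name": field_name, "type": field_type}
            for field_type, field_name in FIELD_RE.findall(body)
        ]
        schemas.append({"type": "record", "name": name, "fields": fields})
    return schemas


EXAMPLE = """
record User {
  string name;
  int age;
}
"""

if __name__ == "__main__":
    # Emit the JSON form that would remain the definitive schema format.
    print(json.dumps(records_to_avro(EXAMPLE), indent=2))
```

A real tool would need a proper grammar for unions, arrays, maps, nested 
records and protocols; the point is only that the output stays plain Avro 
JSON.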


To reiterate, I'm definitely _not_ suggesting that JSON be supplanted as the 
definitive schema definition language for Avro. It's great in that there are 
existing parsers in most languages and it is readily machine-readable.

> Higher-level language for authoring schemata
> --------------------------------------------
>
>                 Key: AVRO-258
>                 URL: https://issues.apache.org/jira/browse/AVRO-258
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>
> Early users of Avro have noted that authoring schemas and especially 
> protocols in JSON feels unnatural. This JIRA is to work on a higher-level 
> language that feels more like defining interfaces and classes in Java/C/etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
