Re: why...

Bryan Duxbury Mon, 25 Aug 2008 18:12:22 -0700

To the idea of using multiple language generators instead of C++, I've been thinking that if the compiler itself generated to some common intermediate language like JSON, it would be really easy to write a generator. JSON (or XML or YAML or something like it) probably already has a parser in most languages, so you'd just treat it like an AST and generate code however you want. It could be hooked up via stdin/stdout. Then, I could generate my Ruby classes with a Ruby script :).
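A rough sketch of what such a generator could look like, written in Python only for illustration. The JSON layout (`structs`, `fields`, `id`, `type`, `name`) is a made-up assumption; the Thrift compiler does not emit this format. The script reads an AST from a stream and writes Ruby class source to another stream, which is the stdin/stdout hookup described above.

```python
import io
import json
import sys

# Hypothetical AST for one struct; this shape is an assumption made for
# illustration, not something the Thrift compiler produces.
SAMPLE_AST = """
{"structs": [{"name": "Test",
              "fields": [{"id": 1, "type": "string", "name": "a"},
                         {"id": 2, "type": "string", "name": "b"}]}]}
"""


def generate_ruby(ast):
    """Render every struct in the AST as a plain Ruby class."""
    lines = []
    for struct in ast["structs"]:
        names = [field["name"] for field in struct["fields"]]
        lines.append("class %s" % struct["name"])
        if names:
            lines.append("  attr_accessor " + ", ".join(":" + n for n in names))
        lines.append("end")
        lines.append("")
    return "\n".join(lines)


def main(infile=sys.stdin, outfile=sys.stdout):
    """Read a JSON AST from infile and write generated Ruby to outfile."""
    outfile.write(generate_ruby(json.load(infile)))


if __name__ == "__main__":
    # Demo run on the sample AST. Calling main() with no arguments would
    # read the AST from a pipe instead.
    main(io.StringIO(SAMPLE_AST))
```

The generator itself needs nothing beyond a JSON parser, so the same idea would carry over to a Ruby script with little change.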


On Aug 25, 2008, at 5:11 PM, Chad Walters wrote:

Some quick thoughts:
1. Somewhat historical - Facebook's language of choice for backend stuff was C++ and they were not using Java very much (although their usage seems to have expanded somewhat, what with their use of Hadoop and Zookeeper and their development of Cassandra).
2. That would be great. However, the current belief is that there is a lot of special-casing for the specifics of each target language and that it's not clear how much commonality could be found to help here.
3. The current seqid mechanism guarantees uniqueness and also allows the seqid's to be small, which is better for the DenseProtocol and other compact protocols.
4. Yep, sounds like a PITA. Does it buy that much? Can it be supported across all the languages we are trying to support?
5. They are available for use by protocols if desired but the seqid is really the important piece of data -- the names are not actually used in the binary protocol or other compact protocols.
6 and 7. I'll let someone else speak to these issues.
WRT 1 and 2, I would actually love to see some mechanism to allow for the compiler to be abstracted to the point where we could implement it in a broad choice of languages (C++, Java, Ruby, etc.) and still produce the same target language bindings. This would free non-C++ shops from needing the C++ tool chain. Sounds like a pretty interesting and extensive project in and of itself -- if you can figure out how to make this happen, more power to you.
Chad
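To illustrate point 3, here is a small sketch comparing hand-assigned ids with ids derived from a hash of the field name. Both the hash (`zlib.crc32`) and the variable-length integer encoding are stand-ins chosen for the example; neither is claimed to match what Thrift or the DenseProtocol actually does. It only shows that hash-derived ids tend to be large numbers, and a hash cannot guarantee two names never collide.

```python
import zlib


def varint_len(n):
    """Bytes needed to store a non-negative int at 7 payload bits per byte."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length


field_names = ["a", "b"]

# Hand-assigned ids: small and unique by construction.
sequential_ids = {name: i + 1 for i, name in enumerate(field_names)}

# Hash-derived ids: crc32 is only an illustrative choice of hash.
hashed_ids = {name: zlib.crc32(name.encode("utf-8")) for name in field_names}

sequential_bytes = sum(varint_len(i) for i in sequential_ids.values())
hashed_bytes = sum(varint_len(i) for i in hashed_ids.values())

print("sequential ids:", sequential_ids, "bytes:", sequential_bytes)
print("hashed ids:", hashed_ids, "bytes:", hashed_bytes)
```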


On 8/25/08 4:36 PM, "Torsten Curdt" <[EMAIL PROTECTED]> wrote:

Hey guys,

I've looked into Thrift recently and a few questions came up:

1. Why a native compiler? Wouldn't it be a little bit simpler to have the
compiler/code generator written in Java? No language debate - just a
curious question for the reason :)

2. Wouldn't it make sense to have a bit of better separation than
having all code mixed up in the t_*_generator.cc files? Maybe more a
template approach so adjusting the code that gets generated becomes a
little bit easier?
3. Why not use the hash code of the attribute names as the sequenceid?
4. Why only composition? Even a flattening model of multiple
inheritance should be quite easy to implement (if overloading is
forbidden). While in OOP I am a big fan of composition over inheritance
it makes the generated API kind of ugly. Maybe an include mechanism
would be another way of simplifying composed structures. (Although I
do realize that with the current model of sequence ids that might be a
PITA to maintain)

5. If I noticed correctly the names of the attributes are included
when serialized. Why is that? Shouldn't knowing the sequence id be
good enough?

6. How do you guys suggest dealing with deterministic semantic
changes. Let's say you have

struct test {
   required string a;
   required string b;
}

and then you want to combine those values into one attribute

struct test {
   required string ab; // = a + b
}

There are a couple of problems I see here. For one ab will have to
have a different sequence id. And I guess then the 'required' will
become a problem for sequence of a and b(?). And finally the
conversion of ab = a+b needs to be handled on the application level
while the rule is very straightforward and deterministic and *could* be
expressed in more generic manner.

7. Wouldn't it make sense to separate out the service and exception
stuff from the actual message versioning/serialization code?

cheers
--
Torsten
