Re: why...

Chad Walters Mon, 25 Aug 2008 17:12:30 -0700

Some quick thoughts:

1. Somewhat historical - Facebook's language of choice for backend stuff was 
C++ and they were not using Java very much (although their usage seems to have 
expanded somewhat, what with their use of Hadoop and Zookeeper and their 
development of Cassandra).


2. That would be great. However, the current belief is that there is a lot of 
special-casing for the specifics of each target language and that it's not 
clear how much commonality could be found to help here.

3. The current seqid mechanism guarantees uniqueness and also allows the 
seqids to be small, which is better for the DenseProtocol and other compact 
protocols.

4. Yep, sounds like a PITA. Does it buy that much? Can it be supported across 
all the languages we are trying to support?

5. They are available for use by protocols if desired but the seqid is really 
the important piece of data -- the names are not actually used in the binary 
protocol or other compact protocols.

6 and 7. I'll let someone else speak to these issues.
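
To make point 3 a bit more concrete, here is a rough Python sketch (not Thrift 
code). It assumes the hash in question would be something like Java's 
String.hashCode, and that a compact protocol writes ids as 7-bit varints; both 
are assumptions for illustration only, and the function names are made up. It 
shows two names that collide under that hash, and how many bytes a small id 
needs compared with a full 32-bit hash value.

```python
def java_style_hash(name: str) -> int:
    """32-bit hash in the style of Java's String.hashCode.

    Assumption: this is the kind of hash meant by "hash code of the
    attribute names". Only exercised with ASCII names here.
    """
    h = 0
    for ch in name:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def varint_len(n: int) -> int:
    """Bytes needed to write unsigned n with 7 payload bits per byte."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length


if __name__ == "__main__":
    # Two different attribute names, one hash value: uniqueness is not guaranteed.
    print(java_style_hash("Aa"), java_style_hash("BB"))

    # Hand-assigned small ids stay at one byte; a hashed id can need up to five.
    print(varint_len(1), varint_len(0xFFFFFFFF))
```

With hand-assigned ids the author controls both uniqueness and size; with 
hashed names neither is guaranteed.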

WRT 1 and 2, I would actually love to see some mechanism to allow for the 
compiler to be abstracted to the point where we could implement it in a broad 
choice of languages (C++, Java, Ruby, etc.) and still produce the same target 
language bindings. This would free non-C++ shops from needing the C++ tool 
chain. Sounds like a pretty interesting and extensive project in and of itself 
-- if you can figure out how to make this happen, more power to you.

Chad


On 8/25/08 4:36 PM, "Torsten Curdt" <[EMAIL PROTECTED]> wrote:

Hey guys,

I've looked into Thrift recently and a few questions came up:

1. Why a native compiler? Would it be a little bit simpler to have the
compiler/code generator written in Java? No language debate - just
curious about the reason :)

2. Wouldn't it make sense to have a bit of better separation than
having all code mixed up in the t_*_generator.cc files? Maybe more a
template approach so adjusting the code that gets generated becomes a
little bit easier?

3. Why not use the hash code of the attribute names as the sequence id?

4. Why only composition? Even a flattening model of multiple
inheritance should be quite easy to implement (if overloading is
forbidden). While in OOP I am a big fan of composition over inheritance,
it makes the generated API kind of ugly. Maybe an include mechanism
would be another way of simplifying composed structures. (Although I
do realize that with the current model of sequence ids that might be a
PITA to maintain)

5. If I noticed correctly the names of the attributes are included
when serialized. Why is that? Shouldn't knowing the sequence id be
good enough?

6. How do you guys suggest dealing with deterministic semantic
changes? Let's say you have

struct test {
   required string a;
   required string b;
}

and then you want to combine those values into one attribute

struct test {
   required string ab; // = a + b
}

There are a couple of problems I see here. For one, ab will have to
have a different sequence id. And I guess then the 'required' will
become a problem for the sequence of a and b(?). And finally, the
conversion ab = a + b needs to be handled on the application level,
while the rule is very straightforward and deterministic and *could* be
expressed in a more generic manner.

7. Wouldn't it make sense to separate out the service and exception
stuff from the actual message versioning/serialization code?
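
For reference, the application-level conversion described in question 6 could 
look roughly like the Python sketch below. It is only an illustration, not 
anything Thrift generates: records are modelled as plain dicts keyed by field 
id, and the ids (1 for a, 2 for b, 3 for ab) and the function name are 
hypothetical.

```python
# Hypothetical field ids: 1 -> a, 2 -> b (old layout), 3 -> ab (new layout).
OLD_A, OLD_B, NEW_AB = 1, 2, 3


def upgrade(record: dict) -> dict:
    """Return a new-layout record; old-layout records get ab = a + b."""
    if NEW_AB in record:
        # Already in the new layout: keep only the combined field.
        return {NEW_AB: record[NEW_AB]}
    if OLD_A in record and OLD_B in record:
        # Old layout: apply the deterministic rule ab = a + b.
        return {NEW_AB: record[OLD_A] + record[OLD_B]}
    raise ValueError("record matches neither the old nor the new layout")


if __name__ == "__main__":
    print(upgrade({OLD_A: "foo", OLD_B: "bar"}))
```

The rule itself is a single line; the open question is whether it has to live 
in every application or could be expressed once, in a more generic manner.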

cheers
--
Torsten
