On Aug 26, 2008, at 12:30, Ian O'Connell wrote:
On Tue, Aug 26, 2008 at 10:50 AM, Torsten Curdt <[EMAIL PROTECTED]>
wrote:
On Aug 26, 2008, at 10:33, David Reiss wrote:
At least if you expect Java to be installed on the system it can't be
much easier than that ;)
It's been my experience that it is easier to get a stable lex/yacc/g++
working on a Linux system than Java.
Ahem ...no offense - but sounds like you are a C guy that just
hates java :)
I doubt he is, given that Facebook uses Java too. But there are those of
us (like me) who don't use Java at all, just C/C++ and some scripting
languages.
C/C++ compilers are on pretty much every dev platform; I don't have
Java on a single one of mine.
Not that I am one of those but ... do you think Windows users agree? :-p
Anyway discussing this point will not lead to anything useful. I just
would not have picked C/C++ and wondered why it has been used. Done.
Regarding hash codes, I think they would complicate the data model
and make it more difficult to visually parse binary dumps of
structures.
Why? The data model should be the exact same thing. Just that the
sequence id would be 343422352 instead of 2 for example.
Also, check out the output of thrift_dump in contrib/. That would
also be less readable.
Again - how is that supposed to be different? 343422352 vs 2.
What you really want is to pass in the mapping somehow and get back
the attribute name for the sequence id.
Finally, the codes would be larger and would
not lend themselves as well to the variable-length encoding used in
TDenseProtocol.
Indeed variable length encoding might not work as well. You would
probably have to store the full integer most of the time. On the
other hand it could be left as a choice to the user:
I fail to see the big benefit here. Using the sequence ids simplifies
the implementation, which makes it less bug-prone across languages.
Frankly speaking I haven't looked into the implementation in detail.
But I am not sure the actual cross language implementation really
differs?!
What happens when today I write
struct Test {
required string something [2324532];
}
Unless the sequence id refers to a dedicated slot (and uses a
list/array underneath) there should be no difference at all.
And, as has been mentioned for the dense protocol, given that most
people's sequence ids remain very low, the cost of a hash-based method
would be pretty high.
See above. Especially if you consider an include mechanism there is
also the maintenance cost.
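The size argument in this exchange can be illustrated with a generic base-128 variable-length integer encoding. This is only a sketch: the function name `encode_varint` is made up for illustration, and the byte layout is a common varint scheme, not necessarily the exact format TDenseProtocol uses.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer with 7 payload bits per byte.

    Assumption: a generic base-128 varint, used here only to compare sizes.
    """
    if value < 0:
        raise ValueError("this sketch only handles non-negative ids")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


small_id = 2            # a typical hand-maintained sequence id
hashed_id = 343422352   # the hash-like id used as an example in this thread

print(len(encode_varint(small_id)))   # bytes needed for the small id
print(len(encode_varint(hashed_id)))  # bytes needed for the hash-like id
```

Under this scheme the small id fits in one byte, while the hash-like id needs five, which is more than a fixed four-byte integer would take.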
Adding it in as a choice to the user just complicates the
interfaces and means more code/bugs.
But that's something I would like to get some more details on.
string something [#1] // sequence number 1 (for those who want to maintain it)
string something // sequence number "something".hashCode (for those who don't)
string something [somename] // sequence number "somename".hashCode
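A rough sketch of how the three forms above could be resolved to a numeric id. Everything here is hypothetical: `resolve_field_id` and the annotation strings are invented for illustration, and `java_string_hash` assumes that `hashCode` means Java's `String.hashCode` (31-based, over UTF-16 code units, wrapped to a signed 32-bit value).

```python
from typing import Optional


def java_string_hash(text: str) -> int:
    """Mimic Java's String.hashCode (assumed to be what the proposal means)."""
    data = text.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "big")  # one UTF-16 code unit
        h = (31 * h + unit) & 0xFFFFFFFF             # keep 32 bits
    return h - 2**32 if h >= 2**31 else h            # reinterpret as signed


def resolve_field_id(name: str, annotation: Optional[str] = None) -> int:
    """Hypothetical resolution of the three declaration forms."""
    if annotation is None:
        return java_string_hash(name)        # string something
    if annotation.startswith("#"):
        return int(annotation[1:])           # string something [#1]
    return java_string_hash(annotation)      # string something [somename]


print(resolve_field_id("something", "#1"))
print(resolve_field_id("something"))
print(resolve_field_id("something", "somename"))
```

With the third form, renaming the field does not change its id as long as the bracketed name stays the same.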
I think inheritance complicates the data model without adding much
value.
Why? What is the problem? For me this would give tremendous value.
If you blend in the attributes it's merely (/no) more than an
include.
If we were going to do it, I'd want to do it like Protocol Buffers do,
which is basically a #include. (We'd have to have better checking for
things like duplicate field ids.) This could be done by the parser and
instantly work in all languages.
Not sure that really makes a big difference in terms of checking.
But either way - something like that would be nice.
If what you want to achieve is doing the inheritance/include for
structures and then flattening them, surely one could just generate the
sequence numbers in a deterministic way from the includes rather than
jumping to hashes?
Suggestions? If you start using the order you are asking for trouble.
What other than the hash would you use?
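The concern about order-based numbering can be sketched as follows. The helper names are made up, and `zlib.crc32` stands in for whatever hash would actually be chosen; it is used here only because it is deterministic across runs.

```python
import zlib


def ids_by_order(field_names):
    """Number fields 1, 2, 3, ... in the order they appear after flattening."""
    return {name: index for index, name in enumerate(field_names, start=1)}


def ids_by_name_hash(field_names):
    """Derive each id from the field name alone (crc32 is a stand-in hash)."""
    return {name: zlib.crc32(name.encode("utf-8")) for name in field_names}


before = ["a", "b"]             # flattened fields in the old definition
after = ["a", "inserted", "b"]  # an included struct later gains a field

# Order-based ids shift for every field after the insertion point.
print(ids_by_order(before)["b"], ids_by_order(after)["b"])

# Name-based ids are unaffected by where the field ends up.
print(ids_by_name_hash(before)["b"] == ids_by_name_hash(after)["b"])
```

A name-derived id is stable under reordering, but two names can hash to the same value, so a duplicate-id check in the parser would still be needed.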
Now you focused more on the optional/required and so on. Indeed all
correct. But my focus was more on the fact that ab can be derived from
a and b. That means that even old structs implicitly have ab. So when
you make the switch you either have to support this logic in (every)
client or you switch over to only rely on ab and can no longer read
older structs.
See what I mean now?
My understanding is that you want Thrift to generate code for combining
values? This would basically make it a programming language, and we are
not going to be doing that.
I am talking about a problem here. And I wanted to discuss how to solve
this best.
While support for renaming fields is great, renaming really is only
part of the problem.
What is the problem here? If you want to swap to the alternate method
of representing your data you should use a language-specific wrapper
that will take a and b and pass ab on to your function which processes
ab. It doesn't sound like a Thrift problem.
See my other mail.
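The wrapper approach suggested above might look roughly like this in one client language. The `Record` type, the `combine` rule (plain concatenation) and `read_ab` are all hypothetical; the thread does not say how ab is actually derived from a and b.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Stand-in for a generated struct; old data has a and b, new data has ab."""
    a: Optional[str] = None
    b: Optional[str] = None
    ab: Optional[str] = None


def combine(a: str, b: str) -> str:
    """Hypothetical derivation rule; the real one is application-specific."""
    return a + b


def read_ab(record: Record) -> str:
    """Return ab, deriving it from a and b when reading an old struct."""
    if record.ab is not None:
        return record.ab
    if record.a is None or record.b is None:
        raise ValueError("record has neither ab nor both a and b")
    return combine(record.a, record.b)


print(read_ab(Record(a="foo", b="bar")))  # old struct: ab is derived
print(read_ab(Record(ab="foobar")))       # new struct: ab is stored
```

The cost raised in the thread is visible here: this logic has to be repeated in every client language that still needs to read old structs.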
Well, if you only use Thrift for serialization and versioning you might
not always have a need for the service stub generation. While this
isn't really a big problem I am wondering if these aren't two separate
things.
If you don't declare any services, no stubs will be generated.
That was not my point. This is about the code base and the focus of the
project.
What is your point exactly in this regard? I'm not sure how it's a
problem given it doesn't generate the stubs if you don't need them.
Most of the code in the platform libraries is well separated so one
could rebuild without the network aspects if one wished.
The issue of feature bloat has been raised. Do one thing and do it
well.
Currently I see two things in Thrift - that's my point.
David, with all due respect. If you guys want to grow a community
around this project you might want to consider becoming a little more
open to discussions.
This has come up time and again recently, and I honestly don't really
see where people are coming from all that much. Thrift was designed as
a lightweight protocol, and of late people have suggested a lot of
things which would turn it into a much slower, clunky library which
would be useless for a lot of us.
I don't see my suggestions matching that description.
Sure, people have suggested making their X suggestion optional, given
it will slow things down by a huge margin. But give it a few revisions
and layers of new features on these 'slow' aspects, and eventually
we'll end up with some overly featured, slow RPC library like most of
those out there.
Not sure what else has been proposed - but maybe you are exaggerating a
little here?
Will check the archives...
I am not criticizing your baby here. I am trying to understand where it
came from and try to make suggestions that might make it work better
for others (like me).
And given the number of people doing this of late I think it's pretty
fair to cut them some slack in this regard. There have been lots of
suggestions of late, and honestly most of them bad... with the
original developers taking time to answer all of them in full rather
than shutting them straight off. Maybe abrupt at times, but no one on
here is a kid who needs to be coddled over their bad idea?
I am not sure what has been proposed before but if Thrift wants to
have a healthy community around this:
Big deal - you will have to learn to live with that. Every open source
project has to. And "shutting off" people is not the Apache way. Sorry.
cheers
--
Torsten