On Thu, Jul 22, 2010 at 6:10 PM, Jeffrey DeCew <jeff...@decew.org> wrote:
> There's something to be said for simplifying the generated code and moving > responsibilities to the library. It would certainly make bug fixes easier > to incorporate if they only affected library code. > I agree in principle, but it remains to be seen if there this could be done without negative impact in other areas. > > Thrift has done a great job of adapting to the standards of various > languages, but I think that it has been harder to reach the bar set by > Java, > with it's much more pervasive and defined conventions. One such > 'convention' of Java (or perhaps it's a backlash against EJB and other > codegen-heavy Java frameworks that have lost favor) is the lack of > generated > source code in favor of greater use of interfaces as well as annotations > and > reflection. I think Thrift could gain a lot more Java support (certainly > in > my company) if it embraced reflection (where it can remain high > performance) > to reduce the massive amounts of (often highly redundant) generated code. > I don't mean to sound dismissive, but if you were able to make changes to start using reflection in non-trivial parts of Thrift (like the serialization and deserialization code), and it *didn't* have a negative performance impact, I'd be astounded. I've spent a lot of time staring at a profiler, and the first thing I usually do is get rid of reflection. Maybe annotations aren't as much of an issue, but where would we use them in particular? Is the lack of reflection use seriously hampering Thrift adoption at your company? Or do you just mean that the bloated code turns people off, and reflection is a solution? > > Regardless, it would be good to see some benchmark data comparing this > reflection-based system with the current Thrift baseline, just to see the > real performance cost we are starting with. > I'd also love to see this benchmark. Is this something you could put together? I may be able to assist somewhat. > > I work in a group that is primarily Java-based, but we occasionally use > Thrift because we have some C++ apps (current and future) that desperately > want to be better connected to our environment. The main technology that > we > pitted Thrift against is Spring remoting, which is pretty much Java-only, > and uses dynamic proxies on the client side to implement a plain Java > interface at runtime and has library classes which use reflection to > process > messages on the server side. These may be strange concepts to Thrift (and > certainly to C++), but they are more and more common in Java, and we should > encourage Java developers to help make Thrift more comfortable for > themselves to work in, assuming it doesn't sacrifice performance too much. > I don't personally have experience with Spring remoting, and it's not in the benchmarks I've seen, but maybe you could find something similar here: http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2 However, if you were so inclined, you could absolutely use reflection to serialize arbitrary objects with Thrift protocols, and you've have interoperability with a bunch of languages. I think this would be a really interesting contribution to Thrift. > > -- > Jeff DeCew > > > On Thu, Jul 22, 2010 at 1:35 PM, Bryan Duxbury <br...@rapleaf.com> wrote: > > > This is pretty interesting. I think that you're right in your assessment > > that these changes would adversely affect performance, but I don't agree > > that the performance changes are unimportant on the client. In my > > experience, using Java Reflection for anything that is vaguely > > performance-critical gets you in a lot of trouble. All the generated > > metadata stuff is designed with this in mind. > > > > I'm a bit surprised that you got that degree of size reduction from your > > changes. I would be open to incorporating some of them - the toString > > helper > > method, for instance - and perhaps adding switches for things like equals > > and compareTo, if it's reasonable to remove them at times. I don't think > it > > makes sense to branch the serialization code, though. > > > > One question I have in general, though, is why does the size of the jar > > matter at all? It's not like it's going to take up tons of memory or slow > > you down particularly when being transferred around. > > > > -Bryan > > > > On Thu, Jul 22, 2010 at 11:34 AM, Emmanuel Bourg <ebo...@apache.org> > > wrote: > > > > > Hi all, > > > > > > I'm new to Thrift and I started playing lately with a generated Java > > > client. I was somewhat struck by the size of the resulting jar, about > 3MB > > > for an interface consisting of about 30 service methods and 150 structs > > of > > > variable size. > > > > > > The generated classes contain code that isn't always useful, like the > > > equals() and compareTo() methods. The implementation of toString() > could > > > easily be delegated to an utility class. Annotations could be used to > > > represent the metadata. The service implementation contains the client > > and > > > the server, but I rarely need both at the same time. > > > > > > From these observations I tweaked the compiler to generate minimal > > classes. > > > For example a simple struct like this one: > > > > > > struct Person { > > > 1:i32 id, > > > 2:string name, > > > 3:string email > > > } > > > > > > will translate into the following class: > > > > > > @ThriftStruct("Person") > > > public class Person implements Serializable { > > > > > > @ThriftField(name="id", type=TType.I32, id=1) > > > public int id; > > > > > > @ThriftField(name="name", type=TType.STRING, id=2) > > > public String name; > > > > > > @ThriftField(name="email", type=TType.STRING, id=3) > > > public String email; > > > > > > public int getId() { > > > return this.id; > > > } > > > > > > public void setId(int id) { > > > this.id = id; > > > } > > > > > > public String getName() { > > > return this.name; > > > } > > > > > > public void setName(String name) { > > > this.name = name; > > > } > > > > > > public String getEmail() { > > > return this.email; > > > } > > > > > > public void setEmail(String email) { > > > this.email = email; > > > } > > > > > > public String toString() { > > > return ToStringBuilder.toString(this); > > > } > > > } > > > > > > The read() and write() methods are now implemented in a utility class > > using > > > reflection to find out the fields being serialized. That's probably > > slower > > > than the static code generated by the original compiler but that's less > > > critical on the client side. > > > > > > With these modifications the resulting jar was about 10 times smaller > in > > my > > > case. > > > > > > I uploaded the code on Github if someone is interested to look into it. > > > It's derived from the 0.2 branch. > > > > > > http://github.com/ebourg/thrift > > > http://wiki.github.com/ebourg/thrift > > > > > > > > > Emmanuel Bourg > > > > > > > > >