To wrap things up - I'm considering this proposal as having failed; so I've marked the 3 associated RBs as discarded.
I'll be working on https://issues.apache.org/jira/browse/THRIFT-3583 to introduce an immutable builder-style of java codegen that carries over thrift annotations to Apache Thrift. Once that is complete and released, I'll circle back and re-do an RB like 3/3 ( https://reviews.apache.org/r/42756/) that flips over the codebase to the new apache thrift gen. On Thu, Jan 28, 2016 at 10:08 AM, John Sirois <jsir...@apache.org> wrote: > > > On Thu, Jan 28, 2016 at 8:26 AM, John Sirois <j...@conductant.com> wrote: > >> On Thu, Jan 28, 2016 at 6:45 AM, Jake Farrell <jfarr...@apache.org> >> wrote: >> >> > +1 to making this apart of Thrift, i'm happy to help shepard this on the >> > Thrift side and get it in as soon as its ready >> > >> >> I've filed https://issues.apache.org/jira/browse/THRIFT-3583 to use as >> the >> basis for discussion of this feature over in the Apache Thrift project. >> I looked at the problem a bit and noted some challenges. >> > > And started a d...@thrift.apache.org thread here if anyone wants to follow > along or participate: http://markmail.org/message/mlxyyauyvlvuxjsf > > >> >> >> > >> > -Jake >> > >> > On Wed, Jan 27, 2016 at 8:08 PM, Maxim Khutornenko <ma...@apache.org> >> > wrote: >> > >> >> I am +1 to making immutable thrift objects solely based on perf >> numbers. >> >> >> >> My biggest concern though is maintenance of a pretty intricate >> codebase, >> >> especially when it comes to upgrading any of the frameworks involved. >> >> Bill's suggestion to explore paths to make this a part of Apache Thrift >> >> sounds great to me. >> >> >> >> On Wed, Jan 27, 2016 at 5:00 PM, Bill Farner <wfar...@apache.org> >> wrote: >> >> >> >> > Tony - this would not be a technical fork so much as a spiritual >> fork. >> >> > While we would have our own bugs to work out, the only upstream >> exposure >> >> > would be IDL or wire format changes. >> >> > >> >> > On Wed, Jan 27, 2016 at 4:58 PM, Tony Dong <td...@twitter.com.invalid >> > >> >> > wrote: >> >> > >> >> > > Awesome performance numbers! I don't particularly know the >> logistics >> >> of >> >> > > upstreaming a change like this, but optimistically I would suggest >> >> > > upstreaming it to Apache Thrift if possible. >> >> > > >> >> > > As someone that maintains a fork of a thrift compiler(fork of >> >> scrooge), I >> >> > > have to say that it's not that fun. There's a lot of custom code >> that >> >> > needs >> >> > > to be maintained and a bunch of work to rebase the code >> periodically. >> >> > > >> >> > > >> >> > > >> >> > > On Wed, Jan 27, 2016 at 4:51 PM, Bill Farner <wfar...@apache.org> >> >> wrote: >> >> > > >> >> > > > Firstly - thanks for the clean organization and delineation of >> >> steps in >> >> > > > this change. Top notch work! >> >> > > > >> >> > > > Some of the performance improvements are very nice; and in a >> >> > particularly >> >> > > > hot code path. I will wager a guess that the majority of the >> >> savings >> >> > is >> >> > > in >> >> > > > avoiding what amounts to copy constructors between mutable and >> >> > immutable >> >> > > > types. I further wager there are alternative approaches we could >> >> weigh >> >> > > to >> >> > > > achieve those performance improvements. As an example - you note >> >> above >> >> > > > that we could provide a patch to Apache Thrift. Depending how >> much >> >> > > > performance inspires our decision here, it will be prudent to >> >> evaluate >> >> > > > alternatives. >> >> > > > >> >> > > > I think there are (at least) two major issues worth discussing - >> >> code >> >> > > > volume (which you note) and an increase in logical complexity. >> This >> >> > will >> >> > > > leave us with a bifurcation in code generation tooling >> (custom+swift >> >> > for >> >> > > > Java, Apache Thrift for python and js). It's difficult to >> quantify >> >> the >> >> > > > downside of that, but it seems like an unfortunate state with >> >> potential >> >> > > for >> >> > > > compatibility risks. >> >> > > > >> >> > > > >> >> > > > On Wed, Jan 27, 2016 at 2:45 PM, Zameer Manji <zma...@apache.org >> > >> >> > wrote: >> >> > > > >> >> > > > > Some high level comments without looking at the code. >> >> > > > > >> >> > > > > I'm in favor from abandoning the thrift generated java code in >> >> favor >> >> > of >> >> > > > > immutable objects. I think it is easier to reason about and >> will >> >> > ensure >> >> > > > we >> >> > > > > have less errors in our code. If I understand correctly, the >> >> ProtoBuf >> >> > > > > format does this by default, so there some precedent for this >> >> style >> >> > of >> >> > > > code >> >> > > > > generation already. >> >> > > > > >> >> > > > > I think using Facebook's swift is the best approach here. I >> would >> >> be >> >> > > > > hesitant to accept any custom code generation that involved us >> >> > parsing >> >> > > > > thrift IDL files or thrift formats over the wire because I >> poses a >> >> > very >> >> > > > > high maintenance burden. >> >> > > > > >> >> > > > > I also think generating the MyBatis mutable classes is >> superior to >> >> > our >> >> > > > > current strategy of manually creating them. >> >> > > > > >> >> > > > > Finally, the performance improvements look fantastic. As an >> >> operator >> >> > > of a >> >> > > > > large cluster I am excited to see wholesale performance >> >> improvements >> >> > > as I >> >> > > > > am always concerned that my cluster is approaching the limits >> of >> >> what >> >> > > > > Aurora can handle safely. >> >> > > > > >> >> > > > > Overall, I think this change merits a serious discussion from >> all >> >> > > > > contributors. >> >> > > > > >> >> > > > > On Tue, Jan 26, 2016 at 9:19 PM, John Sirois < >> jsir...@apache.org> >> >> > > wrote: >> >> > > > > >> >> > > > > > On Tue, Jan 26, 2016 at 8:47 PM, John Sirois < >> >> jsir...@apache.org> >> >> > > > wrote: >> >> > > > > > >> >> > > > > > > Context: Aurora uses the official Apache Thrift compiler >> today >> >> > > plus a >> >> > > > > > > home-grown python code generator [1] for immutable "entity" >> >> (I*) >> >> > > > > > wrappers. >> >> > > > > > > >> >> > > > > > > The proposal is to switch from using the Apache Thrift code >> >> > > generator >> >> > > > > to >> >> > > > > > a >> >> > > > > > > home grown generator. The proposal comes with a concrete >> >> example >> >> > > in >> >> > > > > the >> >> > > > > > > form of the actual RBs to effect this change: >> >> > > > > > > 1. A custom java thrift code generator: >> >> > > > > > > https://reviews.apache.org/r/42748/ >> >> > > > > > > 2. A custom MyBatis binding code generator powered by 1 >> above: >> >> > > > > > > https://reviews.apache.org/r/42749/ >> >> > > > > > > 3. Integration of 1 & 2 above into the Aurora codebase: >> >> > > > > > > https://reviews.apache.org/r/42756/ >> >> > > > > > > >> >> > > > > > > Since the RBs are large, I wanted to provide some extra >> >> context >> >> > on >> >> > > > the >> >> > > > > > > idea at a higher level. I provide rationale, pros and cons >> >> below >> >> > > for >> >> > > > > > those >> >> > > > > > > interested in the idea but wary of diving in on code review >> >> until >> >> > > the >> >> > > > > > idea >> >> > > > > > > itself passes a sniff test. >> >> > > > > > > Thanks in advance for your feedback - and if we get there - >> >> for >> >> > > your >> >> > > > > > > review effort. >> >> > > > > > > >> >> > > > > > >> >> > > > > > I just added wfarner and zmanji as reviewers for the 3 RBs >> above >> >> > > since >> >> > > > > > they've expressed direct interest. Happy to add others, just >> >> speak >> >> > > up >> >> > > > or >> >> > > > > > else just comment on the reviews as you see fit. >> >> > > > > > I'll formally only submit if 1st this email thread reaches >> >> > consensus, >> >> > > > and >> >> > > > > > second, reviews are approved. >> >> > > > > > >> >> > > > > > == >> >> > > > > > > >> >> > > > > > > In the course of an initial run at creating a first-class >> >> > REST-like >> >> > > > > > > scheduler interface [2] I came to the conclusion generating >> >> the >> >> > > json >> >> > > > > API >> >> > > > > > > from the thrift one might be a good path. That idea has >> been >> >> > > > scrapped >> >> > > > > > with >> >> > > > > > > community feedback, but an initial experiment in custom >> thrift >> >> > > > code-gen >> >> > > > > > for >> >> > > > > > > java that accompanied that idea seemed worth pursuing for >> its >> >> own >> >> > > > > > > independent benefits, chief among these being 1st class >> >> immutable >> >> > > > > thrift >> >> > > > > > > structs and the ability to leverage thrift annotations. >> >> > > > > > > >> >> > > > > > > Immutability: >> >> > > > > > > The benefits of having an immutable by default data model >> are >> >> the >> >> > > > > > standard >> >> > > > > > > ones; namely, its trivial to reason about safety of >> concurrent >> >> > > > > operations >> >> > > > > > > on the data model, stability of collections containing data >> >> model >> >> > > > > > entities >> >> > > > > > > and it opens up straight-forward optimizations that are >> easy >> >> to >> >> > > > reason >> >> > > > > > > about. >> >> > > > > > > An example optimization is caching hashCodes for the >> immutable >> >> > > thrift >> >> > > > > > > structs. This was done after comparing jmh benchmarks run >> >> > against >> >> > > > > master >> >> > > > > > > and then again against the proposal branch. Perf was >> >> comparable >> >> > - >> >> > > > > within >> >> > > > > > > 10% plus and minus depending on the benchmark, but with the >> >> > > > > optimization >> >> > > > > > > added many benchmarks showed pronounced improvement in the >> >> > proposal >> >> > > > > > branch >> >> > > > > > > [3]. The optimization is clearly safe and was quick and >> easy >> >> to >> >> > > > > > > implement. Further optimizations can be experimented with >> in >> >> a >> >> > > > > > > straightforward way. >> >> > > > > > > >> >> > > > > > > Thrift Annotations: >> >> > > > > > > The thrift IDL grammar has supported these for quite some >> >> time, >> >> > but >> >> > > > > they >> >> > > > > > > are not plumbed to the generated java code. Uses are many >> and >> >> > > > > varied. I >> >> > > > > > > initially had my eye on annotation of thrift services with >> >> REST >> >> > > > verbs, >> >> > > > > > > routes, etc - but immediately we can leverage these >> >> annotations >> >> > to >> >> > > > kill >> >> > > > > > > AnnotatedAuroraAdmin and reduce the amount of MyBatis >> binding >> >> > code >> >> > > > that >> >> > > > > > > needs to be maintained. >> >> > > > > > > >> >> > > > > > > There are a few downsides to switching to our own java >> thrift >> >> > code >> >> > > > gen: >> >> > > > > > > 1. We own more code to maintain: Even though we have the >> >> custom >> >> > > > python >> >> > > > > > > "immutable" wrapper generator [1] today, this new >> generator - >> >> > even >> >> > > > with >> >> > > > > > the >> >> > > > > > > python generator removed - represents a 5-6x increase in >> line >> >> > count >> >> > > > of >> >> > > > > > > custom code (~4.1k lines of code and tests in the new >> custom >> >> gen, >> >> > > > ~700 >> >> > > > > > > lines in the existing python custom gen) >> >> > > > > > > 2. We conceptually fork from a sibling Apache project. >> >> > > > > > > >> >> > > > > > > The fork could be mitigated by turning our real experience >> >> > > iterating >> >> > > > > the >> >> > > > > > > custom code generator into a well-founded patch back into >> the >> >> > > Apache >> >> > > > > > Thrift >> >> > > > > > > project, but saying we'll do that is easier than following >> >> > through >> >> > > > and >> >> > > > > > > actually doing it. >> >> > > > > > > >> >> > > > > > > == >> >> > > > > > > Review guide / details: >> >> > > > > > > >> >> > > > > > > The technology stack: >> >> > > > > > > The thrift IDL parsing and thrift wire parsing are both >> >> handled >> >> > by >> >> > > > the >> >> > > > > > > Facebook swift project [4]. We only implement the middle >> bit >> >> > that >> >> > > > > > > generates java code stubs. This gives higher confidence >> that >> >> the >> >> > > > > tricky >> >> > > > > > > bits out at either edge are done right. >> >> > > > > > > The thrift struct code generation is done using Square's >> >> javapoet >> >> > > [5] >> >> > > > > in >> >> > > > > > > favor of templating for the purpose of easier to read >> >> generator >> >> > > code. >> >> > > > > > This >> >> > > > > > > characterization is debatable though and template are >> >> certainly >> >> > > more >> >> > > > > > > flexible the minute you need to gen a second language (say >> we >> >> > like >> >> > > > this >> >> > > > > > and >> >> > > > > > > want to do javascript codegen this way too for example). >> >> > > > > > > The MyBatis codegen is forced by the thrift codegen for >> >> technical >> >> > > > > > > reasons. In short, there is no simple way to teach >> MyBatis to >> >> > read >> >> > > > and >> >> > > > > > > write immutable objects with builders. So the MyBatis >> code is >> >> > > > > generated >> >> > > > > > > via an annotation processor that runs after thrift code >> gen, >> >> but >> >> > > > > reading >> >> > > > > > > thrift annotations that survive that codegen process. >> >> > > > > > > The codegen unit testing is done with the help of Google's >> >> > > > > compile-tester >> >> > > > > > > [6]. NB that this has an expected output comparison that >> >> checks >> >> > > the >> >> > > > > > > generated AST and not the text, so its fairly lenient. >> >> > Whitepsace >> >> > > > and >> >> > > > > > > comments certainly don't matter. >> >> > > > > > > >> >> > > > > > > Review strategy: >> >> > > > > > > The code generator RBs (#1 & #2 in the 3 part series) are >> >> > probably >> >> > > > > easier >> >> > > > > > > to review looking at samples of the generated code. Both >> the >> >> > > thrift >> >> > > > > > > codegen and MyBatis codegen samples are conveniently >> >> contained in >> >> > > the >> >> > > > > > > MyBatis codegen RB (#2: >> https://reviews.apache.org/r/42749/). >> >> > The >> >> > > > > unit >> >> > > > > > > test uses resource files that contain both the thrift >> codegen >> >> > > inputs >> >> > > > > the >> >> > > > > > > annotation processor runs over and the annotation processor >> >> > > expected >> >> > > > > > > outputs - the MyBatis peer classes. So have a look there >> if >> >> you >> >> > > > need >> >> > > > > > > something concrete and don't want to patch the RBs in and >> >> > actually >> >> > > > run >> >> > > > > > the >> >> > > > > > > codegen (`./gradlew api:compileJava`). >> >> > > > > > > The conversion RB (#3) is large but the changes are mainly >> >> > > mechanical >> >> > > > > > > conversions from the current mutable thrift + I* wrappers >> to >> >> pure >> >> > > > > > immutable >> >> > > > > > > thrift mutated via `.toBuilder` and `.with`'er methods. >> The >> >> main >> >> > > > > changes >> >> > > > > > > of note are to the portions of the codebase tightly tied to >> >> > thrift >> >> > > > as a >> >> > > > > > > technology: >> >> > > > > > > + Gson/thrift converters >> >> > > > > > > + Shiro annotated auth param interception >> >> > > > > > > + Thrift/Servlet binding >> >> > > > > > > >> >> > > > > > > [1] >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> >> https://github.com/apache/aurora/blob/master/src/main/python/apache/aurora/tools/java/thrift_wrapper_codegen.py >> >> > > > > > > [2] https://issues.apache.org/jira/browse/AURORA-987 >> >> > > > > > > [3] >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> >> https://docs.google.com/spreadsheets/d/1-CYMnEjzknAsY5_r_NVX8r85wxtrEByZ5YRiAbgMhP0/edit#gid=840229346 >> >> > > > > > > [4] https://github.com/facebook/swift >> >> > > > > > > [5] https://github.com/square/javapoet >> >> > > > > > > [6] https://github.com/google/compile-testing >> >> > > > > > > >> >> > > > > > >> >> > > > > > -- >> >> > > > > > Zameer Manji >> >> > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> >> > >> > >> >> >> -- >> John Sirois >> 303-512-3301 >> > >