Hi Eric, Thanks for your detailed information!
> Here is the corraberating benchmark I saw recently: > http://microbenchmarks.appspot.com/run/[email protected]/com.publicobject.blog.TreeListBenchmark > described here: > http://blog.publicobject.com/2010/07/caliper-confirms-reality-linkedlist-vs.html > However looking at the code, I'm not totally thrilled with this as an > apples-to-apples comparison. If you are interested, I can find and > attach my own code which for Arrayist, LinkedList, and ArrayDeque did > exactly the samething...fill it up and then simply iterated through it > from beginning to end in the same manner. I think I timed the > iteration seperately from the filling. >From what I understand, these benchmarks seem to involve removing the first objects from the Collection types. Removing is what's expensive in ArrayList, and what's cheap in the other Collection types. Adding and iterating is probably not as expensive, so I'm not sure if these benchmarks are relevant for jOOQ...? However, what I probably should do, is initialise ArrayLists to their expected size, where this size is known: https://sourceforge.net/apps/trac/jooq/ticket/1344 > [...] The first improvement was to page the > results in the UI. Limit and offset proved not to be as useful as one > would think. The only thing they save you is the network traffic time > of the db sending you all the results; That probably depends heavily on the database. With Oracle, I would never do any such operation outside of the database. With proper indexing for sorting, you can hardly outperform Oracle's window functions. I'm talking about paging on 2M+ records from a 1G-row table, joined several times, including the calculation of a running total, for instance. What database did you base your experience on? > as with a count query, the db > still has to do all the work of figuring out the results that it has > to do if you ask for the full result set. Count for paging is best done using window functions in a single query. Some insight: http://stackoverflow.com/questions/6033080/speed-of-paged-queries-in-oracle http://jeffkemponoracle.com/2011/11/25/3-reasons-to-hate-hibernate/ The latter being a quite refreshing rant :-) Again, this may be database-specific. Of course, if you don't calculate the max page, you can always safe a lot on performance. > To minimze the that transfer time I asked only for ids and performed > another query later to get other information based on those ids for > filling out a single page of results at a time. The set of ids I > persisted in memory across pages (disposed after a certain amount of > time) and since many users could potentially be using this at once I > needed it to take as little memory as possible. I ended up using trove > instead of the java collections so I could store a very compact list > of primitives, but the testing I did converted me from using ArrayList > as my "default" collection (as it is for most people, I think) to > ArrayDeque, unless I really need random access. > > Sorry to be long-winded, but I thought that use-case might be of > interest in thinking about scalability with jooq. This seems to be quite a particular use-case, to me. jOOQ embraces fine-tuning of SQL in a way that you can optimise a query for your database. Modern databases ship with very good cost-based optimisers. If you need that sort of client-side fine-tuning, I suggest you use an org.jooq.Cursor with a slim fetchSize, and collect the IDs in your trove IntList or ArrayDeque for now. Anyway, I think you have a good point there, in general. I haven't given this subject much thought in the beginning of jOOQ, as I didn't foresee result sets as large as yours. I doubt that the API should be changed incompatibly for these remote cases, as most users probably don't have these problems. But I'm open for API enhancements or corrections to jOOQ's internals, which may not be optimal. Note, adding support for primitives (without a round-trip to the wrapper), internally and externally, is on the roadmap: https://sourceforge.net/apps/trac/jooq/ticket/402 Any implementation hints welcome. Thanks again for your detailed explanations! Cheers Lukas
