Cutting over to the -dev list...

On Mon, Dec 12, 2011 at 5:44 PM, Dan Scott <[email protected]> wrote:
> On Mon, Dec 12, 2011 at 04:46:24PM -0500, Bill Erickson wrote:
>> On Mon, Dec 12, 2011 at 02:22:06PM -0500, Dan Scott wrote:

<snip>

>> I admit, passing around piles of XML (inside of JSON (inside of XML))
>> makes me queasy. And the lack of fine-grained fleshing/sorting/paging
>> control is something we have to overcome. My hope is the rewards will
>> outweigh the cost in the long run, particularly as more Evergreen
>> components use unapi. (This is some potent reverse psychology ;) As
>> always, though, if it turns out to be a poor tool for the job, we
>> should reconsider.
>
> I gave the same sort of data-elements->XML(JSON(XML))->data-elements
> description with a bit of a nervous laugh when describing in-db unapi
> last week at the Conifer TPAC dev session. That said, I had assumed that
> it was the anointed future; if that's up for debate, then I'm willing to
> withdraw the "cut record details over to unapi" branch and focus on the
> "one configurable json_query to rule them all" approach - or possibly a
> better approach.

<snip>

So, perhaps spurred by this discussion, Mike Rylander went ahead and created a branch to rewrite unapi to provide better paging controls: collab/miker/unapi2_subobject_improvement in the working repo. I've consequently pushed a few commits onto the branch to address a regression from LP 893315, to fix TPAC to use the new HSTORE-based limit support, and to merge some of the functionality for the bug that began this discussion (LP 903015).

One process problem is that I don't really know how to handle this situation of conflicting branches. There isn't a bug open in Launchpad to track Mike's branch - I'm only aware of it from IRC - yet I have bugs/branches with the "pullrequest" tag set that, if they get merged, will likely end up causing conflicts and essentially being rewritten if/when Mike's branch gets merged.
Mike was good enough to merge one of my outstanding branches (LP 901976: adding circ due dates to copy information) into his branch, but as neither his branch nor my branch has been committed to master, master currently has less functionality than it probably should (assuming, of course, that my circ due date branch was actually okay). My current approach (of trying to make both my standalone branches still apply to master while also trying to ensure that the same functionality will be available & functional in Mike's branch) won't really scale well if more developers start pushing out more collab branches touching on the same areas; it already feels like I'm wasting effort chasing compatibility with just master and Mike's branch.

I don't have a solution to put forward, unfortunately, at least not one that is practical given our current levels of automated testing and numbers of testers/developers pushing signed-off branches that would make a rapid iterative process in master feasible; maybe others have an idea of how this can best be handled in our current environment. (I'm also well aware that my worries about outstanding merge requests, particularly after Bill's merge efforts last week, pale in comparison to Thomas' outstanding branches...).

And one practical problem: I'm not sure the improved paging support in Mike's new branch fully addresses the limit problem that we're trying to solve, at least not in search results. I _think_ what we're looking for is to be able to set a limit for the number of copies that should be shown across the visible range of libraries that you care about, and being able to control the number of call numbers and copies that are returned certainly helps towards that goal - but what we have is the ability to set a maximum number of copies per call number, and a maximum number of call numbers, not a total number of copies across all call numbers.
For example, you can set a limit of "acp=>5" for a limit of five copies, but that's going to be a limit of 5 copies per call number. You can also set a limit of, say, "acn=>5,acp=>1" to limit the results to 5 call numbers with a maximum of 1 copy per call number, but if there's only 1 visible call number that has 5 attached copies, we'll only end up displaying 1 copy. We could set a limit of "acn=>5,acp=>5" to cover our bases and then filter the results as we like in O:WWW::EGCatLoader::Util, but grabbing a possible total of 25 copies when we just want the best 5 seems to defeat the purpose of trying to keep the work that the database has to do, and the amount of network traffic between the database and the Web server, to a minimum. Still, that might be the best approach for the short term.
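
To make the arithmetic above concrete, here is a rough Python sketch of the two cases. It is only an in-memory model of the per-level limit semantics as I described them, not the unapi code or its API; the function names (apply_per_level_limits, take_total, count_copies) and the sample call numbers are made up for illustration.

```python
from typing import Dict, List, Tuple

Holdings = Dict[str, List[str]]  # call number label -> list of copy labels


def apply_per_level_limits(holdings: Holdings, acn_limit: int, acp_limit: int) -> Holdings:
    """Model of per-level limits: at most acn_limit call numbers,
    and at most acp_limit copies under each call number."""
    limited: Holdings = {}
    for call_number in list(holdings)[:acn_limit]:
        limited[call_number] = holdings[call_number][:acp_limit]
    return limited


def count_copies(holdings: Holdings) -> int:
    """Total number of copies across all call numbers."""
    return sum(len(copies) for copies in holdings.values())


def take_total(holdings: Holdings, total: int) -> List[Tuple[str, str]]:
    """Hypothetical post-filter: flatten and keep the first `total` copies."""
    flat = [(cn, copy) for cn, copies in holdings.items() for copy in copies]
    return flat[:total]


# Case 1: a single visible call number with 5 attached copies.
one_call_number: Holdings = {"CN1": [f"CN1-copy{j}" for j in range(1, 6)]}

# Case 2: 5 visible call numbers, each with 5 attached copies.
many_call_numbers: Holdings = {
    f"CN{i}": [f"CN{i}-copy{j}" for j in range(1, 6)] for i in range(1, 6)
}

# "acn=>5,acp=>1" on case 1: only 1 copy survives, although 5 exist.
shown_case_1 = count_copies(apply_per_level_limits(one_call_number, 5, 1))

# "acn=>5,acp=>5" on case 2: up to 25 copies are fetched,
# then trimmed to the 5 we actually want to display.
fetched = apply_per_level_limits(many_call_numbers, 5, 5)
fetched_case_2 = count_copies(fetched)
shown_case_2 = take_total(fetched, 5)

print(shown_case_1, fetched_case_2, len(shown_case_2))
```

The gap between what is fetched and what is shown in the second case is the overhead I'm worried about; a true total-copies limit would need to be applied on the database side rather than in a post-filter like take_total.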
