On 5/1/07, David Karger <[EMAIL PROTECTED]> wrote:
> Johan Sundström wrote:
> > I've crafted a CVS checkins browser in Exhibit, which shows checkins
> > to the Pike programming language:
> >
> > http://exhibit.ecmanaut.googlepages.com/cvsview.html
> >
> > [...]
> >
> > In making this exhibit, a concat() exhibit expression function was
> > added. I also noted that the add() (and, conversely, multiply())
> > functions available are so far somewhat broken by design; the data you
> > operate on will be turned into a set prior to performing the
> > operations, meaning that equal operands will only be visited once.
> >
> > In practice, that means that add(.files.lines_added,
> > .files.lines_removed) will actually perform the operation
> > add(uniq(.files.lines_added), uniq(.files.lines_removed)) which is
> > often substantially lower than the total number of changed lines, when
> > a checkin touched multiple files. Ergo: no Changed lines facet at this
> > time.
>
> Nice. Looking at it, I could see value to faceting according to the
> "number of lines" property (distinguish large and small checkins).

Exactly my motive for wanting to add it. The miscalculated facet Exhibit
does support actually works for that, but it feels wrong to list invalid
information (or come up with a facet label like "touched at least this
many lines").

> This would call for something we've talked about before---a numeric
> range facet, rather than an enumerated values facet.

You might not have noticed it, but the "Changed files" facet is one such
facet. It groups values in intervals of five. In most of my numeric data
sets it makes more sense to group logarithmically than by fixed
intervals. Changed line counts would make sense with ranges like 1-1,
2-9, 10-99, 100-999, for instance.

> Would also be nice to facet on which module (string after
> /modules/ in the file path) and filename.

The Pike tree isn't particularly well organized; I quite intentionally
stopped at extracting the Pike versions touched as a facet.

> This connects to another research interest of mine. "Structured
> commenting" involves using properties and values, instead of english, on
> software comments. We already see some of this in javadoc. For
> example, we might use tags to annotate a particular checkin as a bugfix.

This is another feature that must be added before this exhibit can
replace the present Pike CVS browser, which links to ticket numbers in
the Pike issue tracker. Trac has some particularly neat prior art in
that department, with ticket links decorated (overstrike) for closed
tickets, and much more. (There are syntax conventions for checkin
comments, like "Closes #4711", for instance.)

> Or, there could be a "enhancement" property whose matching value on
> a number of distinct checkins indicates that all those checkins are
> directed towards the same code enhancement. Once these comments
> existed, they would enhance faceted browsing.

This sounds better suited for post-commit tagging of change sets, in the
web interface.
It's one of many features I'm somewhat likely to eventually implement,
once there is a handy free JSONP or MQL (Metaweb Query Language, used in
Freebase; ingeniously well-engineered) database available. I've been
dragging my feet for too long writing up the specs for it. :-}

-- 
 / Johan Sundström, http://ecmanaut.blogspot.com/
_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
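
Two points from the message above are easier to see with numbers: why summing set-collapsed operands undercounts changed lines, and what the suggested logarithmic ranges (1-1, 2-9, 10-99, 100-999) look like as a bucketing rule. The following is only a rough Python sketch, not Exhibit's implementation. The per-file counts and the names `add_as_sets`, `add_as_lists` and `log_range` are made up for illustration.

```python
# Hypothetical per-file line counts for one checkin touching three files.
lines_added = [10, 10, 3]
lines_removed = [2, 2, 2]


def add_as_sets(*operands):
    """Sum after collapsing each operand to a set, so equal values count once.

    This mimics the add(uniq(...), uniq(...)) behaviour described above.
    """
    return sum(sum(set(values)) for values in operands)


def add_as_lists(*operands):
    """Sum every value in every operand, duplicates included."""
    return sum(sum(values) for values in operands)


def log_range(count, base=10):
    """Return the (low, high) range for a positive count.

    With base 10 the ranges are 1-1, 2-9, 10-99, 100-999, and so on.
    Integer arithmetic is used so exact powers of the base land in the
    right range without floating-point rounding problems.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return (1, 1)
    if count < base:
        return (2, base - 1)
    low = base
    while low * base <= count:
        low *= base
    return (low, low * base - 1)


if __name__ == "__main__":
    print(add_as_sets(lines_added, lines_removed))   # set-collapsed total
    print(add_as_lists(lines_added, lines_removed))  # true total
    print(log_range(add_as_lists(lines_added, lines_removed)))
```

With these made-up counts the set-collapsed sum is 15 while the true total is 29; both happen to fall in the 10-99 range, which fits the remark that the miscalculated facet still roughly separates large and small checkins.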
