On 5/1/07, David Karger <[EMAIL PROTECTED]> wrote:
> Johan Sundström wrote:
> > I've crafted a CVS checkins browser in Exhibit, which shows checkins
> > to the Pike programming language:
> >
> >   http://exhibit.ecmanaut.googlepages.com/cvsview.html
> >
> > [...]
> >
> > In making this exhibit, a concat() exhibit expression function was
> > added. I also noted that the add() (and, likewise, multiply())
> > functions available are so far somewhat broken by design; the data you
> > operate on will be turned into a set prior to performing the
> > operations, meaning that equal operands will only be visited once.
> >
> > In practice, that means that add(.files.lines_added,
> > .files.lines_removed) will actually perform the operation
> > add(uniq(.files.lines_added), uniq(.files.lines_removed)) which is
> > often substantially lower than the total number of changed lines, when
> > a checkin touched multiple files. Ergo: no Changed lines facet at this
> > time.
>
> Nice.  Looking at it, I could see value to faceting according to the
> "number of lines" property (distinguish large and small checkins).

Exactly my motive for wanting to add it. The miscalculated facet that
Exhibit does support today would actually work for that purpose, but it
feels wrong to list invalid information (or to come up with a facet
label like "touched at least this many lines").
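
To make the discrepancy concrete, here is a minimal Python sketch (not
Exhibit code; the checkin data and the helper names set_sum and full_sum
are made up for illustration). It compares a sum over operands that were
first collapsed into sets, as add() is described as doing above, with a
sum over every value:

```python
# Hypothetical checkin touching three files; the field names mirror the
# .files.lines_added / .files.lines_removed paths used in the thread.
checkin = {
    "files": [
        {"lines_added": 10, "lines_removed": 2},
        {"lines_added": 10, "lines_removed": 0},
        {"lines_added": 3, "lines_removed": 2},
    ]
}


def set_sum(*operands):
    """Sum each operand after collapsing it into a set of unique values."""
    return sum(sum(set(values)) for values in operands)


def full_sum(*operands):
    """Sum every value of every operand, duplicates included."""
    return sum(sum(values) for values in operands)


added = [f["lines_added"] for f in checkin["files"]]
removed = [f["lines_removed"] for f in checkin["files"]]

deduplicated = set_sum(added, removed)  # duplicates of 10 and 2 are dropped
actual = full_sum(added, removed)       # every file contributes

print(deduplicated, actual)
```

Because line counts are never negative, the de-duplicated sum can only
be lower than or equal to the real total, which is why the only honest
label for it would be "at least this many lines".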

> This would call for something we've talked about before---a numeric
> range facet, rather than an enumerated values facet.

You might not have noticed it, but the "Changed files" facet is one
such facet. It groups values in intervals of five. In most of my
numeric data sets it makes more sense to group logarithmically than by
fixed intervals. Changed line counts would make sense with ranges like
1-1, 2-9, 10-99, 100-999, for instance.
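
As a rough illustration of that grouping, the following Python sketch
maps a count to one of the ranges listed above. The function name
log_bucket is hypothetical and this is not how Exhibit implements its
numeric facets; it only shows the bucketing rule, using the number of
decimal digits so that no floating-point logarithm is needed:

```python
def log_bucket(count):
    """Return a range label such as "2-9" or "10-99" for a positive count."""
    if count < 1:
        raise ValueError("count must be a positive integer")
    if count == 1:
        return "1-1"
    digits = len(str(count))
    # Single-digit counts start at 2 because 1 has its own bucket.
    low = 2 if digits == 1 else 10 ** (digits - 1)
    high = 10 ** digits - 1
    return f"{low}-{high}"


for lines_changed in (1, 7, 42, 512):
    print(lines_changed, log_bucket(lines_changed))
```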

> Would also be nice to facet on which module (string after
> /modules/ in the file path) and filename.

The Pike tree isn't particularly well organized; I quite intentionally
stopped at extracting the pike versions touched as a facet.

> This connects to another research interest of mine.  "Structured
> commenting" involves using properties and values, instead of English, on
> software comments.  We already see some of this in javadoc.  For
> example, we might use tags to annotate a particular checkin as a bugfix.

This is another feature that must be added before this exhibit can
replace the present Pike CVS browser, which links to ticket numbers in
the Pike issue tracker. Trac has some particularly neat prior art in
that department, with ticket links decorated (struck through) for
closed tickets, and much more. (There are syntax conventions for
checkin comments, like "Closes #4711", for instance.)
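
A minimal sketch of picking such references out of a checkin comment,
in Python. Only "Closes #4711" comes from the example above; the extra
keywords "Fixes" and "Refs", the pattern itself and the function name
ticket_refs are assumptions for illustration, not the syntax of Trac
or of the Pike issue tracker:

```python
import re

# Assumed convention: a keyword followed by "#<ticket number>".
TICKET_REF = re.compile(r"\b(closes|fixes|refs)\s+#(\d+)", re.IGNORECASE)


def ticket_refs(message):
    """Return (keyword, ticket number) pairs found in a checkin comment."""
    return [
        (match.group(1).lower(), int(match.group(2)))
        for match in TICKET_REF.finditer(message)
    ]


print(ticket_refs("Fixed the parser crash. Closes #4711"))
```

The extracted pairs could then be fed into the exhibit data as a
property of each checkin, which would make them available as a facet.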

> Or, there could be an "enhancement" property whose matching value on
> a number of distinct checkins indicates that all those checkins are
> directed towards the same code enhancement.  Once these comments
> existed, they would enhance faceted browsing.

This sounds better suited for post-commit tagging of change sets, in
the web interface. It's one of many features I'm somewhat likely to
eventually implement, once there is a handy free JSONP or MQL (Metaweb
Query Language, used in Freebase; ingeniously well-engineered)
database available. I've been dragging my feet for too long writing up
the specs for it. :-}

-- 
 / Johan Sundström, http://ecmanaut.blogspot.com/

_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
