Stefano Mazzocchi wrote:
> Tim Churches wrote:
>> Tim Churches wrote:
>>> Courtesy of the Infosthetics blog, a very nice facetted browser using
>>> "elastic lists", in which metadata about the frequency of each category
>>> in a facet is conveyed by the size and shading of the item's background.
>>> Also, be sure to enable the sparklines to see a temporal dimension as
>>> well. All done in Flash, not Javascript, but there are some nice ideas
>>> there: http://well-formed-data.net/experiments/elastic_lists/
>> Thinking more about this, the idea of making the area of the bar for
>> each category in a facet proportional to its conditional univariate
>> frequency, as illustrated in the "elastic lists" demo above, is pretty
>> cute. 
> 
> hmmm, linearly proportional seems like a really bad idea, since those
> facet values are going to be pretty much all power-law distributed.
> Logarithmically proportional, maybe.

Yes. That's why mosaic plots work reasonably well - the area of each
square is a linear function of the frequency, so each side is proportional
to the square root of the frequency.
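
To make the scaling question concrete, here is a rough Python sketch (not
anything from Exhibit or the elastic lists demo) comparing bar lengths under
linear, square-root and logarithmic scaling. The function name bar_lengths,
the max_len parameter and the counts are all made up for illustration.

```python
import math

def bar_lengths(counts, max_len=100.0):
    """Return bar lengths under three scalings, each normalised so that
    the most frequent category gets max_len."""
    top = max(counts.values())
    scalings = {
        "linear": lambda c: c / top,
        "sqrt": lambda c: math.sqrt(c / top),
        # log1p keeps a count of 1 visible instead of mapping it to zero
        "log": lambda c: math.log1p(c) / math.log1p(top),
    }
    return {
        name: {cat: round(max_len * scale(c), 1) for cat, c in counts.items()}
        for name, scale in scalings.items()
    }

# Hypothetical, roughly power-law distributed facet counts.
counts = {"a": 1000, "b": 100, "c": 10, "d": 1}
lengths = bar_lengths(counts)
for name, bars in lengths.items():
    print(name, bars)
```

With these invented counts the rarest category gets a bar of 0.1 under
linear scaling, 3.2 under square-root scaling and 10.0 under log scaling,
so the long tail only stays visible with the latter two.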

> In laying out a faceted browser, reducing the amount of space facets
> take away from the items is already a big challenge... giving more space
> to the items just to indicate their size (which is already indicated by
> the number count) could be just redundant. That said, some people relate
> to visual semantics (like shape color size) better than to typographic
> value, so it might be worth exploring.

Yes, agreed, and I think that these sorts of displays and the mosaic and
related displays I mentioned work better as a top-level data exploration
system rather than as a set of selection menus for individual item
browsing. It depends on the use-case. In the health domain, as a public
health epidemiologist, I tend to think of data as sets defined by
categories, but it is most often the count of items in each set (or the
mean etc of some numerical attribute of the items in each set) which is
of greatest interest, rather than the details of each individual item
(although drilling down to and browsing individual records is also
important). By contrast, my clinical colleagues who focus on individual
patient care are more interested in the individual characteristics of
each patient in a subset. Exhibit and Timeline currently cater for the
latter Weltanschauung, which is fine, but with the cross-tabulation view
and the mooted time-series frequency views it is starting to move into
displays of aggregate information as well. Both views are valid and useful.
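
For the aggregate view, the shading used in mosaic displays (deviation of
each cell from its expected count) can be computed from a cross-tabulation
as Pearson residuals under independence. A minimal sketch, assuming a plain
list-of-rows table with no empty rows or columns; pearson_residuals and
the example table are hypothetical names and data, not part of Exhibit.

```python
import math

def pearson_residuals(table):
    """table: rows of observed counts for two cross-tabulated facets.
    Returns (observed - expected) / sqrt(expected) for every cell, where
    expected = row_total * column_total / grand_total (independence).
    Assumes every row and column total is positive."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for row, rt in zip(table, row_totals):
        out = []
        for obs, ct in zip(row, col_totals):
            expected = rt * ct / n
            out.append((obs - expected) / math.sqrt(expected))
        residuals.append(out)
    return residuals

# Invented 2x2 cross-tabulation; every cell has an expected count of 15.
observed = [[10, 20],
            [20, 10]]
res = pearson_residuals(observed)
```

Cells with large positive or negative residuals would get the strongest
shading; cells close to zero are about as frequent as independence of the
two facets would predict.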

> My problem with size-based listings (say, tag clouds) is that my eye gets
> attracted by the two/three big players and the rest becomes much harder
> to parse out.

Agreed, even simple legibility soon becomes problematic.

> Also there is already a discussion (at least between some of us) of
> whether or not it's useful to show all items in a facet (as the facet
> long tail is rarely selected when slicing and dicing at least until the
> facet values are less than ~10 total).

Automatic grouping into an "Others" category if the number of categories
is too large? But should that grouping into "Others" be based on the
unconditional univariate frequency of each category in a facet, or on
the conditional frequency based on what other facet restrictions or
selections are currently active? For example, various "prog rock" bands
are likely to be included in the "Others" category in a "Popular Bands"
facet, unless "1970s" is selected in the Decade facet, in which case
Yes, Emerson, Lake and Palmer, Genesis, Jethro Tull and so on would
appear as individual entries in the popular bands facet.

I suspect all this involves too much processing for the current set of
browser Javascript implementations to handle?
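
A rough Python sketch of the conditional variant, just to pin down what I
mean (this is not Exhibit code): counts are recomputed over the items that
survive the current restrictions, and only then is the tail folded into
"Others". The function names, the max_entries parameter and the records
(including the placeholder "Band A" to "Band C") are all invented.

```python
from collections import Counter

def facet_counts(items, facet, restrictions=None):
    """Count values of `facet` among items matching every restriction.
    restrictions maps a facet name to the set of selected values."""
    restrictions = restrictions or {}
    selected = [
        item for item in items
        if all(item.get(f) in allowed for f, allowed in restrictions.items())
    ]
    return Counter(item[facet] for item in selected if facet in item)

def with_others(counts, max_entries=3):
    """Keep the max_entries most frequent values, pool the rest."""
    ranked = counts.most_common()
    head = ranked[:max_entries]
    rest = sum(n for _, n in ranked[max_entries:])
    if rest:
        head.append(("Others", rest))
    return head

# Invented records: one dict per item, one value per facet.
items = (
    [{"band": "Band A", "decade": "1960s"}] * 5
    + [{"band": "Band B", "decade": "1980s"}] * 4
    + [{"band": "Band C", "decade": "1980s"}] * 3
    + [{"band": "Yes", "decade": "1970s"}] * 2
    + [{"band": "Genesis", "decade": "1970s"}] * 1
)

unconditional = with_others(facet_counts(items, "band"))
conditional = with_others(facet_counts(items, "band", {"decade": {"1970s"}}))
```

Unconditionally, Yes and Genesis disappear into "Others"; once "1970s" is
selected they appear as individual entries. The cost is one pass over the
items per facet on every selection change.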

> Personally, I'm a big fan of size-ordered facet lists with a cutoff at
> the half size (you show the head of the tail until you cover at least
> half the items).
> 
> David likes alphabetical-ordered facet lists better.
> 
> I wish we could do a usability study to prove which one is better (if at
> all!) ;-)

Yes, empiricism rules.
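
For what it's worth, the half-size cutoff as I understand it could be
sketched like this in Python. This is my reading of the rule, not an
existing implementation; head_covering_half, the coverage parameter and the
counts are hypothetical, and it assumes each item has exactly one value in
the facet so that the counts sum to the number of items.

```python
def head_covering_half(counts, coverage=0.5):
    """Return the most frequent facet values, largest first, stopping as
    soon as the values shown cover at least `coverage` of all items."""
    # Sort by descending count, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(counts.values())
    shown, covered = [], 0
    for value, n in ranked:
        if covered >= coverage * total:
            break
        shown.append((value, n))
        covered += n
    return shown

# Invented facet counts, 90 items in total.
counts = {"a": 40, "b": 25, "c": 12, "d": 6, "e": 4, "f": 3}
head = head_covering_half(counts)
```

Here the first value covers 40 of 90 items, which is short of half, so the
second is shown as well and the remaining four values are cut off.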

>> But I wonder if that idea can be extended to multivariate
>> frequencies, with shading based on the degree of variation from
>> expectation, as in mosaic displays (see
>> http://www.math.yorku.ca/SCS/Papers/asa92.html for a good introduction)
>>  - could that ever be a useful alternative interface for a facetted data
>> browser?
>>
>> Sparklines are also a very neat idea, and might be a useful addition to,
>> say, the thumbnails view in Exhibit, for time series of quantities
>> (although the x-axis of a sparkline does not have to be time) - see
>> http://en.wikipedia.org/wiki/Sparkline and particularly
>> http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001OR&topic_id=1
> 
> We have used sparklines before with great success and sexy appeal (see
> http://simile.mit.edu/gadget/), but I'm not sure that 'condensed
> timeline' overlapped with the facet value is that useful. For example,
> not all data has a time dimension to it and even if so, we have highly
> multidimensional data: which axis should you choose to show there?
> configurable? different one per facet? would that increase user
> confusion? how would that relate to the timeline view? (and an eventual
> future time series view?)

I agree that the sparkline superimposed on the facet categories is not
always successful, which is presumably why the "elastic lists" example
allowed them to be toggled off.

What I had in mind was as a complement to the proposed time series view
in which sparklines could be shown in a grid, a bit like the current
thumbnail view, but with little graphs in each cell, not a picture. The
quantity for the y-axis (and perhaps the x-axis) of the sparklines could
be selectable.
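
As a very rough illustration of the grid idea, here is a text-only Python
sketch that draws each series as a sparkline of Unicode block characters
and lays the cells out in columns. Everything in it (sparkline,
sparkline_grid, the columns parameter, the series) is made up for the
example; a real view would draw proper graphics.

```python
BLOCKS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Map each value to one of eight block heights, scaled to the
    minimum and maximum of this series."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BLOCKS[0] * len(values)  # flat series
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[int((v - lo) * top / span)] for v in values)

def sparkline_grid(series_by_label, columns=2):
    """Lay out one labelled sparkline per cell, `columns` cells per row."""
    cells = [f"{label}: {sparkline(vals)}"
             for label, vals in series_by_label.items()]
    width = max(len(cell) for cell in cells)
    rows = [cells[i:i + columns] for i in range(0, len(cells), columns)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell in row).rstrip() for row in rows
    )

# Invented series; the x-axis need not be time.
series = {
    "north": [0, 1, 2, 3, 4, 5, 6, 7],
    "south": [7, 5, 3, 1, 1, 3, 5, 7],
    "west": [2, 2, 2, 2, 2, 2, 2, 2],
}
grid = sparkline_grid(series, columns=2)
print(grid)
```

Note that each sparkline here is scaled to its own range; whether the cells
should instead share a common y-axis is exactly the kind of thing that would
need to be selectable.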

This picks up on the idea of "multiples" and panelling proposed by
Cleveland and Becker for their Trellis statistical graphs in the S/Plus
stats suite - see
http://cm.bell-labs.com/cm/ms/departments/sia/project/trellis/ - (which
is also implemented as "lattice" graphs in the open source R environment
- see http://www.r-project.org or
http://addictedtor.free.fr/graphiques/search.php?q=lattice&engine=RGG
for some examples (and many other interesting graphs on that site)).

> Don't get me wrong, it's great to have such stimulating discussions and
> we very much encourage people to take Exhibit and make derivatives to show
> potential innovations that we will then factor back in if the community
> finds them useful.

Yup. All of the foregoing are just ideas, not requests for implementation
in Exhibit.

Tim C
