On 23 Feb 2005 at 15:30, Brad Beyenhof wrote:

> On Wed, 23 Feb 2005 18:07:32 -0500, Dennis Bathory-Kitsz
> <[EMAIL PROTECTED]> wrote:
> > At 04:16 PM 2/23/05 -0600, Noel Stoutenburg wrote:
> > >Jef did not originally, and you do not in your supporting post,
> > >address the issue of merging the lists that Jef seems to propose.
> > 
> > I'm just being supplementary. But think of a Palm and how it
> > organizes addresses by categories. It's a droplist of choices. You
> > can choose any category you name, or "all", and you see those
> > elements. So you could have the lists sorted the "old way" or any
> > way, and it would all be available. Opera's mail client does that,
> > and I think Gmail as well.
> 
> Even better than merely categories, Gmail has "Labels." You can attach
> any number of labels to a particular message, rather than having to
> pigeonhole it into one specific category.

There's nothing about "categories" that requires there be only one 
per object. Microsoft's Outlook uses categories, and you can assign 
as many as you like to any kind of object you like. And then you can 
adjust your view of the objects to be sorted by categories or any 
other way you like.

> This kind of "flat hierarchy" is gaining ground for Internet-based
> information storage; an online social bookmarking site I use, called
> del.icio.us ( http://del.icio.us ) does a similar thing but calls the
> groups "tags" instead.

This is highly problematic, in my opinion, for two reasons:

1. it's *too* simple (non-hierarchical where the natural organization 
actually is hierarchical).

2. there is no "authority" anywhere to enforce consistency in use of 
the tags.

> I really like that method of organization. What if an email message or
> bookmark fits equally under the headings "notation," "music," and
> "finale"? With the flat hierarchy you can separate all three of these
> concepts but still place single items into all three groups if it's
> logical to do so.

This is what I wrote about the current Internet darling, "tags," in 
another forum:

I am amused by this whole discussion of tagging on websites, because 
it seems to me that my 15 years of experience in designing database 
applications and looking at other people's implementations of 
categorization systems has taught me that this kind of thing is *not* 
simple.  

There are several problems:

1. proper categorization is not flat -- it is hierarchical. With a 
flat tagging system, you need to double up the categories. If your 
item is about "Museums--Musical Instruments" you need to have a tag 
for "Museums," a tag for "Museums--Musical Instruments" and a tag for 
"Musical Instruments." In a hierarchically organized categorization 
system, that isn't required -- you need only the one tag, "Museums-- 
Musical Instruments" (which is a combination of category and 
subcategory). As Joe Celko has shown us in many SQL articles, it's 
easy enough to create n-level hierarchical trees and retrieve data 
from them without needing to have a complex data storage structure. 
This allows both the construction of the full complex tree, as well 
as easy searching of all levels of the tree.  
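
As a rough illustration of that point, here is a minimal sketch of one 
way to store an n-level tree in a single flat table, using the 
"nested sets" encoding that Celko's articles describe. The table 
names, column names, sample items and the items_under() helper are all 
made up for the example, and SQLite stands in for whatever database 
you actually use.

```python
import sqlite3

# Nested-set encoding: every category stores an (lft, rgt) interval, and a
# subcategory's interval sits inside its parent's interval.
# All names and sample rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER);
CREATE TABLE item (title TEXT, category_id INTEGER REFERENCES category(id));
INSERT INTO category VALUES
    (1, 'Museums', 1, 6),
    (2, 'Museums--Musical Instruments', 2, 3),
    (3, 'Museums--Art', 4, 5),
    (4, 'Concert Halls', 7, 8);
INSERT INTO item VALUES
    ('Instrument collection guide', 2),
    ('Gallery floor plan', 3),
    ('General museum directory', 1),
    ('Hall seating chart', 4);
""")


def items_under(conn, category_name):
    """Return titles filed under the named category or any of its descendants."""
    rows = conn.execute(
        """
        SELECT item.title
        FROM category AS parent
        JOIN category AS child
          ON child.lft BETWEEN parent.lft AND parent.rgt
        JOIN item ON item.category_id = child.id
        WHERE parent.name = ?
        ORDER BY item.title
        """,
        (category_name,),
    ).fetchall()
    return [title for (title,) in rows]


print(items_under(conn, "Museums"))
print(items_under(conn, "Museums--Musical Instruments"))
```

Each item carries exactly one category, yet a search on "Museums" 
still finds everything filed under its subcategories, with no 
doubled-up tags.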

2. for categorization to truly work, there needs to be some 
enforcement of rules for the encoding of the categories. Otherwise, 
you need to build a lot of smarts into your search engine. For 
example, if you want to search for "Music" a simple dumb search will 
not return items tagged only with "Symphony" unless the person doing 
the tagging has ensured that the item is also tagged with "Music" 
(its super-category). Now, a smart search engine could use a 
hierarchical category list as a lookup to find related subjects, but 
the result will only be as good as the quality of that lookup list. A 
perfect example of a failed lookup list is TiVo's recommendations. 
The way that works is that the TiVo looks at the categories of the 
programs you've chosen to record (and have given thumbs up to, i.e., 
told the TiVo that it's a program that you really like), and then 
finds other programs in the same categories and suggests them to you. 
When I first got the TiVo, the whole reason I bought it was so as to 
not miss Babylon 5. Well, TiVo said "Aha! B5 is science fiction, so 
I'll suggest more SciFi!" That's certainly quite correct, but the 
SciFi category also included things like Third Rock from the Sun 
(which I despise) and Tales from the Crypt (which doesn't interest me 
at all). In the former case, Third Rock, the problem is that the 
category is not fine enough. In the latter, it's that non-SciFi has 
been mis-categorized (in my opinion, and from my point of view). 
There's also the relatively trivial issue of spelling and formatting 
and synonyms. Google is good at this (I'll ignore for the course of 
the discussion below that Google is a full-text search, not a 
category search, only because it's a familiar tool I can use to 
illustrate the problems), but it only works well when you mis-type a 
word in your search string, relative to the body of web pages out 
there. That is, if you type a word that is mis-spelled, Google knows 
that there's a variation that has far more matches, and gives you the 
option with the "Did you mean..." link. But if you type the *correct* 
one, it doesn't point you to the mis-spelled variant unless large 
numbers of pages use it. This means that you might miss important 
sources that were simply prepared by someone who failed to adequately 
spellcheck before posting. Second, Google is wrong in the way it 
returns certain kinds of results. If you search for "Goedel's 
Theorem" you get one set of results, whereas if you search for 
"Gödel's Theorem" you get a completely different set, whereas what 
you've actually requested is identical (Goedel is simply a proper, 
accepted and common alternate spelling of Gödel; as with all umlauts 
in German, vowel + e is always equivalent to the single letter -- 
ä=ae, ü=ue, ö=oe). On the other hand, Google is just a little bit 
smart and forgiving of mis-spellings. If you search for "Godel's 
Theorem" (which is a mis-spelling, as it omits the umlaut), you get 
all the results from people who've mis-spelled "Gödel", but also all 
the pages with the correct spelling. No single search gets you the 
full result set of articles on Goedel's Theorem except if you tell 
Google to search for all the choices. So, there are three choices 
(though they can also be combined):  

1. the search engine corrects for human error (Google partly does 
this, as exemplified by Godel).

2. the user has to have all the smarts and know all the variations 
that need to be searched for.

3. the categorization system has to enforce the use of uniform 
categories (to ensure uniform categorization) and then have synonym 
lookup to ensure that typos do not prevent the fallible human user 
from getting correct results.
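
To make the third choice a little more concrete, here is a minimal 
sketch of folding spelling variants to one canonical key before 
storing or searching a category. It is not a description of how 
Google or any tagging site works; the canonical_key() function, the 
umlaut table and the one-entry SYNONYMS table are assumptions made up 
for the example, and a real synonym list would have to be maintained 
by somebody.

```python
import re
import unicodedata

# German umlauts and their accepted vowel + e transliterations.
UMLAUT_MAP = str.maketrans(
    {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}
)

# Hypothetical synonym table mapping known mis-spellings to the canonical form.
SYNONYMS = {"godel": "goedel"}


def canonical_key(term):
    """Fold a search term or category name to a single lookup key."""
    # Compose first, so "o" + combining diaeresis becomes a single "ö".
    text = unicodedata.normalize("NFC", term).translate(UMLAUT_MAP)
    # Drop any remaining diacritics, then ignore case.
    decomposed = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in decomposed if not unicodedata.combining(c))
    text = text.casefold().replace("\u2019", "'")
    words = []
    for token in re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text):
        # Look up the word itself, keeping a possessive suffix such as 's.
        base, sep, suffix = token.partition("'")
        words.append(SYNONYMS.get(base, base) + sep + suffix)
    return " ".join(words)


for spelling in ("Gödel's Theorem", "Goedel's Theorem", "Godel's Theorem"):
    print(spelling, "->", canonical_key(spelling))
```

If both the stored categories and the search string pass through the 
same function, all three spellings land on the same key, so a single 
search returns the full set.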

The current tagging mechanism that is all the rage has all the 
problems I've illustrated above.

And I haven't even gotten into the issue of specialized 
categorization systems -- Amazon is terribly difficult to search for 
classical music (do a search for "Schumann chamber music" under 
classical music and see if you consider the results usable) because 
the categorization system does not reflect the actual organization 
structure that people seeking classical music use to categorize the 
music.  

Are the tags useful?

Yes, because they can help you find things you might otherwise miss.

Can they ever be comprehensive?

Not in their current structureless implementation.

Will they go the way of META tags, polluted by pornographers?

I don't know, but I can't see how it can be prevented.

-- 
David W. Fenton                        http://www.bway.net/~dfenton
David Fenton Associates                http://www.bway.net/~dfassoc


_______________________________________________
Finale mailing list
[email protected]
http://lists.shsu.edu/mailman/listinfo/finale
