John Hudson wrote:

Michael Everson wrote:

Classification is an arbitrary process in which one produces useful categories into which to arrange an otherwise unwieldy body of knowledge.


I dispute this. It is not arbitrary. Sometimes the cuts are difficult to make, because there is messiness in the data, but classification puts like with like and separates like from unlike. If it were arbitrary, we would not be able to distinguish abugidas from syllabaries, or trace the relationships between scripts and name the nodes on the tree.


*All* classification is arbitrary. This is a basic philosophical proposition. Note that arbitrary does not mean baseless or capricious, it just means that systems of classification are determined by the classifier, not by the thing classified.

It's something Machine Learning researchers deal with from the get-go: learning bias (often called inductive bias). (Come to think of it, I once worked with a psychology professor whose main field was categorization, from both a philosophical and a psychological standpoint.) You can't learn *anything* without applying some "bias", some prejudice, to the process. Prejudice is usually considered a Bad Thing, but here it isn't. According to Watanabe's Ugly Duckling Theorem (note that it's a theorem, which means there's a proof for it), if you weigh all possible properties equally, *everything* is exactly as similar to, and exactly as dissimilar from, well, everything else. The only way you can make meaningful categories is to stack the deck somehow: consider only some aspects and decide what's important.
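For anyone who wants to see the counting argument behind the theorem, here is a rough Python sketch, not Watanabe's own formulation. It assumes a "predicate" can be modeled as any subset of the objects (the set of things it is true of); the object names and the two "chosen" predicates are made up for illustration.

```python
from itertools import combinations


def shared_predicate_counts(objects):
    """Count, for each pair of distinct objects, the predicates both satisfy.

    Assumption: a predicate is any subset of the objects, so with n objects
    there are 2**n predicates, all weighted equally.
    """
    n = len(objects)
    counts = {}
    for i, j in combinations(range(n), 2):
        # Each integer mask in [0, 2**n) encodes one subset (one predicate).
        both = sum(
            1 for mask in range(2 ** n)
            if (mask >> i) & 1 and (mask >> j) & 1
        )
        counts[(objects[i], objects[j])] = both
    return counts


def shared_under_bias(objects, chosen):
    """Same count, but only over a hand-picked list of predicates (the bias)."""
    return {
        (a, b): sum(1 for pred in chosen if a in pred and b in pred)
        for a, b in combinations(objects, 2)
    }


# Hypothetical objects, purely for illustration.
objects = ["swan_a", "swan_b", "duckling", "stone"]

unbiased = shared_predicate_counts(objects)

# Hypothetical bias: we decide only "is a swan" and "is a bird" matter.
chosen = [{"swan_a", "swan_b"}, {"swan_a", "swan_b", "duckling"}]
biased = shared_under_bias(objects, chosen)

for pair in unbiased:
    print(pair, "all predicates:", unbiased[pair], "chosen only:", biased[pair])
```

With every predicate counted, each pair shares the same number of predicates (2**(n-2)), so no pair is "more alike" than another. Only once the list is cut down to the predicates someone has decided matter do the two swans come out closer to each other than to the stone.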


~mark



