On Fri, 1 Feb 2002, Kossmann, Bill wrote:

>The article below may be of interest to members of this list.

[An article on categorizing textual strings by appending them to reference documents and measuring aggregate compressibility snipped.]
This shouldn't be a big surprise, considering how close current compression algorithms get to the estimated entropy of various sources. In essence, compressors are statistical learners, and classification problems can be formulated as partitionings based on statistical similarity. I just wonder whether the overhead of doing a significant number of compression runs against known sources isn't a bit expensive compared to current methods of identification.

Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
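[A minimal sketch of the append-and-compress classification scheme discussed above, assuming Python's zlib as the compressor; the class labels and toy corpora are purely illustrative placeholders, not anything from the snipped article.]

import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed representation of `data`."""
    return len(zlib.compress(data, 9))


def classify(candidate: str, references: dict) -> str:
    """Assign `candidate` to the reference class whose compressed output
    grows the least when the candidate is appended to that reference."""
    cand = candidate.encode("utf-8")
    best_label, best_cost = None, None
    for label, reference in references.items():
        ref = reference.encode("utf-8")
        # Approximate "extra bits needed to describe the candidate, given
        # the reference's statistics" as the difference in compressed size.
        cost = compressed_size(ref + cand) - compressed_size(ref)
        if best_cost is None or cost < best_cost:
            best_label, best_cost = label, cost
    return best_label


# Toy references; real use needs sizeable corpora per class.
references = {
    "english": "the quick brown fox jumps over the lazy dog " * 200,
    "finnish": "nopea ruskea kettu hyppaa laiskan koiran yli " * 200,
}
print(classify("a lazy dog sleeps under the brown fox", references))

[Note the per-class overhead visible here: two compression passes per reference for every string to be classified, which is the cost questioned above. Also, zlib/gzip-style DEFLATE compressors use a 32 KB sliding window, so only the tail of a large reference actually conditions the appended candidate.]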
