On Wednesday 15 February 2006 05:21, David Kovar wrote:
>
> 2) Ability to develop a "finger print" of a particular writing style and
> search for it. This sort of thing has been done to find other works by
> authors, or to search for copyright violations.

David,

In his presentation at What The Hack[1], Rudi Cilibrasi[2] described 
techniques that could be used to group things (music, animals, literature) 
using clustering based on compression. 

In his paper [3], he gives some examples where Russian literature was grouped 
by the original author when the Russian texts were tested, but by the 
translator when the English translations were tested.

You might want to take a look at his CompLearn software[4] - it would probably 
make a good starting point if you're looking to develop your own tool to look 
at IRC or chat rooms.
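
To give a rough idea of the approach, compression-based clustering rests on 
the normalized compression distance (NCD): two texts are "close" if 
compressing them together costs little more than compressing the larger one 
alone. The Python below is only a minimal sketch of that idea and is not 
CompLearn's code or API. The choice of bz2 as the compressor, the function 
names and the corpus layout (author name -> sample text) are all my own 
assumptions for illustration.

```python
import bz2
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Length in bytes of the bz2-compressed data (compressor is an assumption)."""
    return len(bz2.compress(data))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    where C() is the compressed size. Smaller means more similar."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def closest_author(sample: bytes, corpus: Mapping[str, bytes]) -> str:
    """Hypothetical helper: name of the corpus entry nearest to the sample."""
    return min(corpus, key=lambda name: ncd(sample, corpus[name]))
```

You would call something like closest_author(log_text, known_samples), with 
known_samples mapping each nickname to a chunk of text known to be theirs. 
Short samples compress poorly, so expect noisy results unless each text is 
at least a few kilobytes. A real tool would build a full distance matrix and 
cluster it, rather than just picking the nearest neighbour as this does.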

Cheers,

Steve.

[1] http://program.whatthehack.org/event/101.de.html
[2] http://cilibrar.com/
[3] http://www.cwi.nl/~paulv/papers/cluster.pdf
[4] http://www.complearn.org/

-- 
--------------------------------------------------------------
Steve Wilson
Senior Security Consultant
QinetiQ, St Andrews Road
Malvern,  WR14 3PS
Tel: (01684 89) 4153
Fax: (01684 89) 7417
---------------------------------------------------------------
'The views expressed herein are entirely those of the writer and do not
represent the views, policy or understanding of any other person or
official body.'
---------------------------------------------------------------
'The information contained in this e-mail and any subsequent
correspondence is private and is intended solely for the intended
recipient(s).  For those other than the intended recipient any
disclosure, copying, distribution, or any action taken or omitted to be
taken in reliance on such information is prohibited and may be
unlawful.'
---------------------------------------------------------------
