Tue, 23 Oct 2001 18:40:27 +0100 (BST), Markus Kuhn <[EMAIL PROTECTED]> wrote:
> - You can do a bit more with character and tuple frequency
>   analysis. You need for various languages (English, German,
>   French, C, Lisp) and their transliterations a library of
>   frequency tables for the various UCS characters/pairs,
>   and then you try all Something->UCS conversions
>   until you find the best match of the resulting histogram
>   with one in the library (read up on "index of coincidence"
>   [Friedman, ~1920] in introductory cryptanalysis textbooks
>   such as Stinson).

I've done this (using frequencies of single letters only). It has
always worked in practice when I needed it.

The program at
<http://qrczak.ids.net.pl/programy/linux/konwert/konwert-1.8.tar.gz>
contains it (it's really old and rusty; I haven't had time to polish it).

Usage: e.g.
    konwert any/pl-iso2

Currently supported languages are cs de el eo es fr he it pl pt ru sv,
each in a couple of encodings. For Latin-based scripts it of course uses
the frequencies of non-English letters only.

-- 
 __("<  Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
 \__/
  ^^                      SUBSTITUTE SIGNATURE
QRCZAK

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
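
The single-letter approach described in this message (decode the input with
each candidate encoding, then compare the non-English letters against a
per-language frequency table) can be sketched roughly as below. This is only
an illustration, not konwert's actual code: the function names, the candidate
list, and the tiny reference sample are all assumptions made for the example,
and a real table would be built from a large corpus.

```python
from collections import Counter

# Tiny stand-in reference sample (two well-known Polish pangrams).
# Assumption for illustration only; a real table needs a large corpus.
REFERENCE_PL = "Pchnąć w tę łódź jeża lub ośm skrzyń fig. Zażółć gęślą jaźń."

# Hypothetical candidate list, written as Python codec names.
CANDIDATES = ["utf_8", "iso8859_2", "cp1250", "cp852"]


def non_ascii_frequencies(text):
    """Relative frequency of each non-ASCII character, lower-cased."""
    counts = Counter(ch for ch in text.lower() if ord(ch) > 127)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}


def score(data, encoding, table):
    """Mean table frequency of the non-ASCII characters after decoding.

    Returns None when the bytes are not valid in this encoding.
    """
    try:
        text = data.decode(encoding).lower()
    except UnicodeDecodeError:
        return None
    extras = [ch for ch in text if ord(ch) > 127]
    if not extras:
        return 0.0
    # Characters absent from the table (box drawing, symbols) contribute 0.
    return sum(table.get(ch, 0.0) for ch in extras) / len(extras)


def guess_encoding(data, table, candidates=CANDIDATES):
    """Return the candidate whose decoding best matches the table, or None."""
    scored = [(score(data, enc, table), enc) for enc in candidates]
    scored = [(s, enc) for s, enc in scored if s is not None]
    return max(scored, key=lambda pair: pair[0])[1] if scored else None


TABLE_PL = non_ascii_frequencies(REFERENCE_PL)
```

Calling guess_encoding(raw_bytes, TABLE_PL) returns one of the candidate codec
names. ISO-8859-2 and CP1250 place most Polish letters at the same byte values
and differ for ą, ś and ź (and their capitals), so this sketch can only tell
the two apart when the input contains at least one of those letters. Scoring
only non-ASCII characters mirrors the "non-English letters only" remark above;
matching pairs of characters, as the quoted text suggests, would need a larger
table.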
