- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: franck
Subject: Re: Problem with iso-8859-1 accent ?


OK, I ran this:
# ./indexer -qamv4 -u "news://yxc/[EMAIL PROTECTED]"
indexer[3928]: {00} indexer from dpsearch-4.37-08022006-mysql started with '/usr/local/dpsearch/etc/indexer.conf'
indexer[3928]: {00} Chinese dictionary with 0 entries
indexer[3928]: {00} Korean dictionary with 0 entries
indexer[3928]: {00} Thai dictionary with 0 entries
indexer[3928]: {01} URL: news://yxc/[EMAIL PROTECTED]
indexer[3928]: {01} Status: 200 OK
indexer[3928]: {01} Guesser: Lang: fr, Charset: UTF-8
indexer[3928]: {01} Done (2 seconds, 1 documents, 1119 bytes,  0.55 Kbytes/sec.)
indexer[3928]: {00} Total 3 seconds, 1 documents, 1119 bytes,  0.36 Kbytes/sec, 3.00 sec/doc, 1119 bytes/doc.

Then I searched again, but got the same result: "..D?l?je l'ai mis..."

Strangely, dpsearch detects UTF-8, but this news article uses ISO-8859-1:
--------------------------------------------
Subject: Re: Matlab
From: toto <[EMAIL PROTECTED]>
Date: Mon, 02 Oct 2006 14:58:38 +0200
Message-ID: <[EMAIL PROTECTED]>
..
Newsgroups: epfl.humour
User-Agent: Thunderbird 1.5.0.7 (Windows/20060909)
MIME-Version: 1.0
..
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Xref: ...

Désolé, je l'ai mis par erreur.....
----------------------------------------------------------------
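
For what it's worth, the "D?l?je" result looks consistent with the ISO-8859-1 bytes being read as UTF-8. Here is a small Python sketch of that idea. The naive_utf8_decode function is hypothetical, written only for this illustration; I am assuming a lenient decoder that trusts the lead byte and skips the following bytes without checking them. I have not verified that dpsearch actually decodes this way.

```python
def naive_utf8_decode(data: bytes) -> str:
    """Hypothetical lenient decoder (an assumption, not dpsearch code).

    It takes the sequence length from the lead byte and never checks
    that the following bytes are valid continuation bytes.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte: copy it through unchanged.
            out.append(chr(b))
            i += 1
            continue
        if 0xC0 <= b < 0xE0:
            length = 2
        elif 0xE0 <= b < 0xF0:
            length = 3  # 0xE9 ("é" in ISO-8859-1) falls in this range
        elif 0xF0 <= b < 0xF8:
            length = 4
        else:
            length = 1
        # The sequence is not a real character: emit "?" and skip it whole.
        out.append("?")
        i += length
    return "".join(out)


# The body line of the article, as ISO-8859-1 bytes on the wire.
raw = "Désolé, je l'ai mis par erreur.....".encode("iso-8859-1")

naive = naive_utf8_decode(raw)                   # lenient UTF-8 reading
strict = raw.decode("utf-8", errors="replace")   # Python's strict reading
correct = raw.decode("iso-8859-1")               # the declared charset

print(naive)
print(strict)
print(correct)
```

With the lenient reading, each 0xE9 swallows the two bytes after it ("so" and then ", "), which gives "D?l?je l'ai mis..." as in my search result. Decoding with the charset declared in the Content-Type header gives the text back intact.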

# cat  langmap.conf
LangMapFile langmap/en.ascii.lm
LangMapFile langmap/fr.latin1.lm
LangMapFile langmap/fr.latin1.bible.lm
LangMapFile langmap/fr.utf-8.lit.lm


Thx,
franck


- - - - - - - - - - - - - - - - - - - - - - - - - - - -

Read the full topic here:
http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=03;topic_id=1158767168;reply=1162380187