Problem between r2netcmd and htdig.

To index RTF files, I have replaced rtf2html with the r2net software in
doc2html.pl. I am working under Linux (SuSE) and converting RTF documents
written in French.
To use r2netcmd with htdig, I copy the input into a .rtf file in /tmp for
the external parser:

****** Part of doc2html.pl *************************
# RTF documents
  if ((defined $RTF2HTML) and (length $RTF2HTML)) {
    $mime_type = "application/msword|application/rtf|text/rtf";
        open (Tampon, "> $TMP/tampon.rtf");
        open (F,"< $Input");
        while ($ligne = <F>) {
                print Tampon ($ligne);
        };
        close (Tampon);
        close (F);
    $cmd = $RTF2HTML;
    # Rtf2html uses filename as title, change this:
    $cmdl = "$cmd '$TMP/tampon.rtf' | $ED 's#^<TITLE>$Input</TITLE>#<TITLE>[$Name]</TITLE>#'";
    $magic = '^{\134rtf';
    &store_html_method('RTF',$cmd,$cmdl,$mime_type,$magic);
  }
*****************************************************
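
For comparison, the copy loop in the excerpt could also be sketched with the core File::Copy module, which makes a failed copy visible instead of silently producing an empty tampon.rtf. This is only a sketch: the helper name copy_to_buffer and its arguments are hypothetical and are not part of doc2html.pl.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;

# Hypothetical helper (not part of doc2html.pl): copy the input document
# to "tampon.rtf" inside $tmp_dir and return the path of the copy.
sub copy_to_buffer {
    my ($input, $tmp_dir) = @_;
    my $buffer = File::Spec->catfile($tmp_dir, 'tampon.rtf');

    # copy() returns false on failure and sets $!.
    copy($input, $buffer)
        or die "Cannot copy $input to $buffer: $!\n";

    return $buffer;
}

# Possible use inside the RTF branch (variable names taken from the excerpt):
#   my $buffer = copy_to_buffer($Input, $TMP);
```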

The tampon.rtf file contains, for example:

****************************************************
bb et d\'e9ziper ce fichier dans ce nouveau r\'e9pertoire.
**************************************************** 

The tampon.html file shows the accents correctly, but my database
db.wordlist contains only e9pertoire.
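
For reference, \'e9 is the RTF hex escape for the byte 0xE9, which is "é" in ISO-8859-1 and Windows-1252. A word such as e9pertoire would be consistent with the escape reaching the indexer undecoded, although that is only a guess. The sketch below shows what decoding such escapes looks like in Perl; the function name decode_rtf_hex_escapes is hypothetical, and it assumes the document uses a single-byte Latin-1 compatible code page.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every RTF hex escape \'xx with the
# character whose code is xx. Assumes a Latin-1 compatible code page,
# so that 0xE9 maps to "é".
sub decode_rtf_hex_escapes {
    my ($text) = @_;
    $text =~ s/\\'([0-9a-fA-F]{2})/chr(hex($1))/ge;
    return $text;
}

# Sample line taken from tampon.rtf above.
my $sample  = q{bb et d\'e9ziper ce fichier dans ce nouveau r\'e9pertoire.};
my $decoded = decode_rtf_hex_escapes($sample);

binmode(STDOUT, ':encoding(UTF-8)');
print "$decoded\n";
```

Run on the sample line, this prints "bb et déziper ce fichier dans ce nouveau répertoire."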

Could someone help me?

Thanks 
