On Tue, 12 Jun 2001 16:04:03 +0200
[EMAIL PROTECTED] wrote:
> Problem between r2netcmd and htdig.
>
> To index RTF files, I have replaced rtf2html with the r2net software in
> doc2html.pl. I am working under Linux (SuSE), converting RTF documents
> written in French.
> To use r2netcmd with htdig, I copy the input into a .rtf file in /tmp
> for the external parser:
>
> ****** Part of doc2html.pl ***************************
> # RTF documents
> if ((defined $RTF2HTML) and (length $RTF2HTML)) {
> $mime_type = "application/msword|application/rtf|text/rtf";
> open (Tampon, "> $TMP/tampon.rtf");
> open (F,"< $Input");
> while ($ligne = <F>) {
> print Tampon ($ligne);
> };
> close (Tampon);
> close (F);
> $cmd = $RTF2HTML;
> # Rtf2html uses filename as title, change this:
> $cmdl = "$cmd '$TMP/tampon.rtf' | $ED 's#^<TITLE>$Input</TITLE>#<TITLE>[$Name]</TITLE>#'";
> $magic = '^{\134rtf';
> &store_html_method('RTF',$cmd,$cmdl,$mime_type,$magic);
> }
> *****************************************************
>
> The tampon.rtf file contains, for example:
>
> ****************************************************
> bb et d\'e9ziper ce fichier dans ce nouveau r\'e9pertoire.
> ****************************************************
>
> The tampon.html file has the accents, but my db.wordlist database
> contains only "e9pertoire".
>
> Could someone help me?
I think this is a locale problem, and not one I am able to
answer.
As regards your modification of doc2html.pl, this does seem
unnecessarily involved, simply to provide a *.rtf file for
r2netcmd. How about simply changing the command line to:
# Escape both paths so spaces and shell metacharacters survive.
my $in  = quotemeta($Input);
my $rtf = quotemeta("$TMP/$Name.rtf");
$cmdl = "(cp $in $rtf; $RTF2HTML $rtf; $RM $rtf)";
I haven't tried this, but it should work.
Even better, you might consider creating a symbolic link to
$Input rather than making a copy.
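For what it is worth, the symbolic-link variant might look something
like the sketch below. It is untested against r2netcmd. The sub name
rtf_link_cmd and its parameters are made up for illustration (they
stand in for $RTF2HTML, $Input, $TMP and $Name in doc2html.pl), and it
assumes that ln and rm are on the PATH.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: build a shell command line that runs an RTF
# converter on a temporary symbolic link instead of on a copy.
sub rtf_link_cmd {
    my ($converter, $input, $tmp_dir, $name) = @_;

    # A link created in $tmp_dir must point at an absolute path,
    # otherwise a relative $input would leave the link dangling.
    my $src  = quotemeta(File::Spec->rel2abs($input));
    my $link = quotemeta("$tmp_dir/$name.rtf");

    # Run the converter only if the link was created, and remove the
    # link afterwards in either case.
    return "(ln -sf $src $link && $converter $link; rm -f $link)";
}
```

In doc2html.pl this would be used along the lines of
$cmdl = rtf_link_cmd($RTF2HTML, $Input, $TMP, $Name);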
----------------------
David Adams
[EMAIL PROTECTED]