On Thu, 6 Jun 2002, Bley, Josef wrote:

> Hi,
> 
> We are running htdig 3.1.5 on a Solaris 8 workstation.
> We want to integrate an external parser for indexing TIFF raster files with
> the OCR software OCRshop from Vividata.
> We have made the following entry in htdig.conf:
> external_parsers: image/tiff->text/plain /ya/cadim/htdig/bin/tif2text
> 
> tif2text is a shell script that starts the OCR software. The output of the
> OCR software is a plain-text file with the extension .txt in the htdig
> temporary directory.
> 
> The following happens:
> When rundig starts indexing a TIFF file, it starts the OCR software, which
> creates the text file, but after that htdig tries to index only the
> TIFF file and not the text file.
> As a result, no words end up in the htdig database.
> Our goal is for htdig to index the .txt file instead of the TIFF file.
> How can we achieve this?

Did you read http://www.htdig.org/attrs.html#external_parsers ?
There are two ways to use this attribute: converting and parsing.
A converter turns one content-type into another that htdig can parse
itself (such as text/plain). A parser outputs the parsed document in a
special format described in the documentation for that attribute.
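
For the converting case, my understanding of that documentation is that
htdig calls the program with four arguments (input file, content-type,
URL, configuration file) and reads the converted document from the
program's standard output. A text file left behind in the temporary
directory would therefore never be picked up. Below is a minimal sketch of
what tif2text could look like under that assumption. OCR_CMD and its
"input output" argument order are hypothetical stand-ins, since I do not
know the real OCRshop command line; the temporary file name is likewise
only an illustration.

```shell
#!/bin/sh
# Sketch of an external converter for image/tiff->text/plain.
# Assumed call: tif2text <infile> <content-type> <url> <config-file>
# OCR_CMD is a HYPOTHETICAL placeholder for the real OCR command; it is
# assumed to take an input image path and an output text path.
OCR_CMD="${OCR_CMD:-run_ocr}"

tiff_to_stdout() {
    infile="$1"
    # Hypothetical scratch file; only the first argument is used here.
    outfile="${TMPDIR:-/tmp}/tif2text.$$.txt"

    # Run the OCR step. Its own messages go to stderr so that stdout
    # carries nothing but the converted text.
    "$OCR_CMD" "$infile" "$outfile" >&2 || { rm -f "$outfile"; return 1; }

    # The key step: write the recognised text to standard output,
    # then clean up the scratch file.
    cat "$outfile"
    status=$?
    rm -f "$outfile"
    return $status
}

# Only run when invoked with arguments, as htdig would do.
if [ "$#" -ge 1 ]; then
    tiff_to_stdout "$@"
fi
```

A quick check outside of htdig is to run the script by hand on one TIFF
file and confirm that the recognised text appears on the terminal.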

--jesse
--------------------------------------------------------------------
J. op den Brouw                           Johanna Westerdijkplein 75
Haagse Hogeschool                                  2521 EN  DEN HAAG
Faculty of Engeneering                                   Netherlands
Electrical Engeneering                                +31 70 4458936
-------------------- [EMAIL PROTECTED] --------------------

Linux - because reboots are for hardware changes

