After installing the dictionary, I go into the database and remove all
words that are fewer than three characters long.
The text extraction is still lacking, but I find it much more useful with
this modification.
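
A minimal sketch of that cleanup, in case it helps. It runs against an
in-memory SQLite database so it is self-contained; the table name
"dictionary" and column name "word" are hypothetical placeholders, not the
actual Matterhorn schema, so check your own database for the real names
before adapting it.

```python
import sqlite3


def remove_short_words(conn, table="dictionary", column="word", min_length=3):
    """Delete rows whose word is shorter than min_length characters.

    The table and column names are hypothetical defaults. They cannot be
    bound as SQL parameters, so only pass trusted identifiers here.
    Returns the number of rows deleted.
    """
    cur = conn.execute(
        f"DELETE FROM {table} WHERE LENGTH({column}) < ?", (min_length,)
    )
    conn.commit()
    return cur.rowcount


def demo():
    # Build a throwaway dictionary table with a few sample entries.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dictionary (word TEXT)")
    conn.executemany(
        "INSERT INTO dictionary (word) VALUES (?)",
        [("a",), ("of",), ("the",), ("lecture",), ("x1",)],
    )
    removed = remove_short_words(conn)
    remaining = [
        row[0] for row in conn.execute("SELECT word FROM dictionary ORDER BY word")
    ]
    conn.close()
    return removed, remaining


if __name__ == "__main__":
    removed, remaining = demo()
    print(removed, remaining)
```

Against a real installation the same DELETE statement would be issued
through whatever client your database uses; taking a backup of the table
first is a good idea, since the deletion is not reversible.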
[email protected] wrote on 05/19/2011 05:14:53 PM:
>
> I'm trying out the OCR-->text function and I'm getting about 25%
> recognizable words and 75% gibberish.
>
> My dictionaries were registered into the database and I see the tables
> have about 20K entries.
>
> Any hints on debugging this? I'm using the default workflow.
>
> There are a lot of words missing, too. Does the 3rd party tools package
> produce reasonable quality text extraction?
>
> Hank
_______________________________________________
Matterhorn-users mailing list
[email protected]
http://lists.opencastproject.org/mailman/listinfo/matterhorn-users