After installing the dictionary, I go into the database and remove all
words shorter than three characters.
The text extraction is still lacking, but I find it much more useful with
this modification.
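
A minimal sketch of that cleanup, for anyone who wants to script it. It runs
against an in-memory SQLite database so it is self-contained; the table name
`dictionary` and the column name `word` are hypothetical placeholders, not the
actual Matterhorn schema, so check your own database for the real names before
adapting it.

```python
import sqlite3


def remove_short_words(conn, table="dictionary", column="word", min_len=3):
    """Delete every row whose word is shorter than min_len characters.

    The table and column names are assumptions for illustration only.
    Identifiers cannot be bound as SQL parameters, so they are interpolated
    here; only pass trusted names.
    """
    cur = conn.execute(
        f"DELETE FROM {table} WHERE LENGTH({column}) < ?", (min_len,)
    )
    conn.commit()
    return cur.rowcount  # number of rows deleted


# Demo on a throwaway in-memory database with a stand-in dictionary table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dictionary (word TEXT)")
conn.executemany(
    "INSERT INTO dictionary (word) VALUES (?)",
    [("a",), ("of",), ("the",), ("lecture",)],
)

removed = remove_short_words(conn)
remaining = [row[0] for row in conn.execute(
    "SELECT word FROM dictionary ORDER BY word"
)]
```

Back up the table before running a delete like this against a real
installation, since the removed entries cannot be recovered afterwards.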

[email protected] wrote on 05/19/2011 05:14:53 PM:
> 
> I'm trying out the OCR-->text function and I'm getting about 25%
> recognizable words and 75% gibberish.
> 
> My dictionaries were registered into the database and I see the tables
> have about 20K entries.
> 
> Any hints on debugging this? I'm using the default workflow.
> 
> There are a lot of words missing, too. Does the 3rd party tools
> package produce reasonable quality text extraction?
> 
> Hank
_______________________________________________
Matterhorn-users mailing list
[email protected]
http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
