I have found that the handling of curly quotes depends on the
scan resolution.
If I create 300 dpi versions of my scans, Tess 3.01 works
much better.  That is a huge relief and makes Tess usable.
I wish I could use the 600 dpi scans, which I already have.
This looks like a small bug.
There may be other small bugs that
only go away when the resolution is around 300 dpi.
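
For anyone wanting to try the same workaround, the arithmetic for going
from 600 dpi to 300 dpi is just a scale factor of dst/src applied to the
pixel dimensions. A minimal sketch follows; the function name
scaled_size and its defaults are my own illustration, not part of
Tesseract, and the actual resampling would still have to be done with
whatever image tool you normally use.

```python
def scaled_size(width_px, height_px, src_dpi=600, dst_dpi=300):
    """Return the pixel dimensions after resampling from src_dpi to dst_dpi.

    Hypothetical helper for illustration; it only does the arithmetic
    and does not touch any image data.
    """
    if src_dpi <= 0 or dst_dpi <= 0:
        raise ValueError("dpi values must be positive")
    scale = dst_dpi / src_dpi
    # Never return a zero-sized dimension for very small inputs.
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


# A US Letter page (8.5 x 11 in) scanned at 600 dpi is 5100 x 6600 pixels.
print(scaled_size(5100, 6600))  # (2550, 3300)
```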

One annoyance: the box maker never joins a double curly quote
into a single box.  I always have to merge the boxes by hand, and
it takes a long time.
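
The manual merge could probably be scripted. The sketch below assumes
the box file uses one "glyph left bottom right top page" entry per line,
and that the box maker emits the two halves as two consecutive boxes
labelled with the same single curly quote; both are assumptions about
my setup, and the names merge_quote_halves and max_gap are made up for
this example. It replaces each such pair with one box covering both
halves.

```python
# Assumed labels for the split halves and the glyph they should become.
QUOTE_HALVES = {"\u2018": "\u201c", "\u2019": "\u201d"}


def parse_box(line):
    """Split 'glyph left bottom right top page' into (glyph, [ints])."""
    glyph, *nums = line.split()
    return glyph, [int(n) for n in nums]


def merge_pair(first, second, glyph):
    """Return one box whose rectangle is the union of the two given boxes."""
    l1, b1, r1, t1, page = first[1]
    l2, b2, r2, t2, _ = second[1]
    return glyph, [min(l1, l2), min(b1, b2), max(r1, r2), max(t1, t2), page]


def merge_quote_halves(lines, max_gap=10):
    """Merge adjacent identical single-quote boxes into one double-quote box.

    max_gap is a hypothetical pixel threshold for how far apart the
    right edge of the first half and the left edge of the second may be.
    """
    boxes = [parse_box(line) for line in lines if line.strip()]
    merged = []
    i = 0
    while i < len(boxes):
        cur = boxes[i]
        nxt = boxes[i + 1] if i + 1 < len(boxes) else None
        if (
            nxt is not None
            and cur[0] in QUOTE_HALVES
            and nxt[0] == cur[0]
            and cur[1][4] == nxt[1][4]                 # same page
            and abs(nxt[1][0] - cur[1][2]) <= max_gap  # horizontally adjacent
        ):
            merged.append(merge_pair(cur, nxt, QUOTE_HALVES[cur[0]]))
            i += 2
        else:
            merged.append(cur)
            i += 1
    return [f"{glyph} {' '.join(map(str, nums))}" for glyph, nums in merged]


sample = [
    "\u2019 100 500 110 520 0",
    "\u2019 112 500 122 520 0",
    "a 130 480 150 500 0",
]
print(merge_quote_halves(sample))
```

A real apostrophe followed closely by another one would also be merged,
so the result still needs a quick check by eye.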

Finally, even though Tess is usable at 300 dpi for the curly quotes,
there is another problem: it occasionally attaches
a right double curly quote to the line above it.
Luckily this happens rarely, and the output is fairly easy to fix
by hand.  But it would be better not to need this.

