Hi, and forgive my presumption; I'm a n00b on this forum.

A couple of months ago I wrote a patch for tesseract 2.04 so I could
pipe TIFF files in on stdin as part of a Python app that extracts text
from PDF files (some of the images in the PDFs contained text I wanted
to index).
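
The Python side of that kind of pipe can be sketched roughly as
follows. This is only an illustration, not the actual patch or app
code: run_with_stdin and ECHO_CMD are made-up names, and because the
command line of the patched tesseract is not shown in this post, a
small child process that echoes its stdin stands in for it.

```python
import subprocess
import sys


def run_with_stdin(cmd, data):
    """Send `data` (bytes) to the stdin of `cmd` and return its stdout as bytes."""
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


# Stand-in for the OCR command (an assumption): a child Python process that
# copies stdin to stdout. The real app would put the patched tesseract
# command line here instead.
ECHO_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

if __name__ == "__main__":
    # Placeholder bytes that merely start with a TIFF byte-order mark;
    # this is not a valid image.
    fake_tiff = b"II*\x00" + b"\x00" * 16
    echoed = run_with_stdin(ECHO_CMD, fake_tiff)
    print(len(echoed), "bytes came back from the child process")
```

The image bytes never touch the disk on the Python side; they go
straight from memory into the child's stdin.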

This works very well, but I'd like to take it a step further, so I'd
appreciate your advice...

The step further I have in mind is to extend the API with Python
bindings so that I can pass a TIFF image directly to the API in memory.
This would leverage the Python Imaging Library (PIL), because it does
nice format conversion from any other standard format to TIFF.

However, I'm no longer working on the Python app mentioned above, so
this would be more a labour of love (& respect for Tesseract of
course), and therefore I would like your advice to make sure I do
something that doesn't conflict with your plans and is generally
useful.


