I want to create an online application that takes a large PDF file and extracts the information that is valuable to the user. The key to the application is going to be speed: I basically want to provide a minimal service for free that builds up an e-mail address list. When I OCRed one of these files in Foxit, it took about 20 minutes. Here is my question: most of the information I need is in the bookmarks, but not all of it. One piece of info I need is an address, which I could get either by querying an API such as Google Maps or by doing a partial OCR. I can see OCRing 10-12 pages to get my info. I am wondering about speed. Does anyone have ideas about which approach would be the fastest?
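To make the partial-OCR idea concrete, here is a rough Python sketch of reading the bookmarks first and then OCRing only the few pages they point to. It assumes the third-party packages pypdf, pdf2image (which needs Poppler installed), and pytesseract are available. The helper names, the "address" keyword, and the 200 dpi setting are placeholders for illustration, not something taken from the actual files.

```python
def flatten_outline(outline, reader):
    """Yield (title, zero-based page index) for every bookmark, nested ones included.

    `reader` is a pypdf.PdfReader and `outline` is reader.outline, where
    child bookmarks appear as nested lists.
    """
    for item in outline:
        if isinstance(item, list):
            yield from flatten_outline(item, reader)
        else:
            yield item.title, reader.get_destination_page_number(item)


def select_pages(bookmarks, keyword, span=1, page_count=None):
    """Pick the pages to OCR: `span` pages starting at each matching bookmark.

    Pure function, so it can be checked without a PDF at hand.
    """
    wanted = set()
    for title, page in bookmarks:
        # Skip bookmarks with no resolvable page or a non-matching title.
        if page is None or keyword.lower() not in title.lower():
            continue
        for p in range(page, page + span):
            if page_count is None or p < page_count:
                wanted.add(p)
    return sorted(wanted)


def ocr_pages(pdf_path, page_indexes, dpi=200):
    """Rasterize and OCR only the requested pages; returns {page index: text}."""
    # Imported lazily so the helpers above work without these packages.
    from pdf2image import convert_from_path
    import pytesseract

    results = {}
    for idx in page_indexes:
        # pdf2image numbers pages from 1, pypdf from 0.
        images = convert_from_path(
            pdf_path, dpi=dpi, first_page=idx + 1, last_page=idx + 1
        )
        results[idx] = pytesseract.image_to_string(images[0])
    return results


def extract(pdf_path, keyword="address", span=2):
    """Read the bookmarks, then OCR only the pages they point to."""
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    bookmarks = list(flatten_outline(reader.outline, reader))
    pages = select_pages(bookmarks, keyword, span, len(reader.pages))
    return bookmarks, ocr_pages(pdf_path, pages)
```

Reading the outline does not require rendering any pages, so nearly all of the running time should be in `ocr_pages`, and that grows with the number of pages selected rather than with the size of the whole file.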

