Install Tesseract and re-run your test; it should work straight away,
because Tika detects that Tesseract is available. I've used Tika with
Tesseract recently with success, so I know 1.13 works.
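
If you want to script that check, here is a minimal Python sketch. It assumes
Tesseract is found through your PATH and that you run the tika-app jar with its
--text option; the function names and the file names in the usage note are
made up for illustration, so adjust them to your setup.

```python
import shutil
import subprocess
from typing import List, Optional


def tesseract_path() -> Optional[str]:
    """Return the full path of the tesseract executable on PATH, or None."""
    return shutil.which("tesseract")


def tika_text_command(tika_jar: str, image: str) -> List[str]:
    """Build the tika-app command line that prints extracted plain text."""
    return ["java", "-jar", tika_jar, "--text", image]


def extract_text(tika_jar: str, image: str) -> str:
    """Run tika-app on the image and return whatever text it prints."""
    # Assumption: if tesseract is not on PATH, OCR will not run, so fail early.
    if tesseract_path() is None:
        raise RuntimeError("tesseract not found on PATH; install it first")
    result = subprocess.run(
        tika_text_command(tika_jar, image),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, extract_text("tika-app-1.13.jar", "invoice.tiff") should return
the OCR text if both Java and Tesseract are installed; both file names are
placeholders.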

John


On 18 July 2016 at 21:43, Gordon Schneider <[email protected]>
wrote:

> Timothy
>
>
>
> That looks promising. It will be ugly to work with but extracting text
> from a PDF can be no fun either.
>
>
>
> I will download Tesseract and see if I can get it working. I will let
> you know how it works.
>
>
>
> Thanks
>
>
>
>
>
> *From:* Allison, Timothy B. [mailto:[email protected]]
> *Sent:* July 18, 2016 2:25 PM
> *To:* [email protected]
> *Subject:* RE: Extract Text from a TIFF image
>
>
>
> You’ll need to set up tesseract to run Optical Character Recognition.
> While we have an integration with OCR, it is not bundled within the app.
>
>
>
> See https://wiki.apache.org/tika/TikaOCR
>
>
>
> For kicks, I ran this through Tika+Tesseract; this is the output you get
> once you’ve set up Tesseract:
>
>
>
> SUPPLIER: 3177  Invoice Date Description Amount Discount Net Amount
> 015-28339 06/08/2015 21,318.54 0.00 21,318.54 C15-28837 06/04/2015 1,529.75
> 0.00 1,529.75 01528978 06/04/2015 1,238.18 0.00 1,238.18 015-28978-01
> 06/04/2015 1,182.85 0.00 1,182.85 015-28439 06/01/2015 1,113.86 0.00
> 1,113.86 C15-29707 06/11/2015 886.84 0.00 886.64 C15-28978-02 06/04/2015
> 526.91 0.00 526.91 01529385 06/09/2015 199.29 0.00 199.29 C15~28439~01
> 06/03/2015 157.34 0.00 157.34 C15-28670 06/03/2015 136.52 0.00 136.52
> C15—28314-01 06/03/2015 132.81 0.00 132.81 015-28576 06/02/2015 61.26 0.00
> 61.26 015-29413 06/11/2015 22.37 0.00 22.37 Cheque #: 83077 Cheque Date
> 7/14/2015 28,506.32 0.00 28,506.32  SUPPLIER: 3177  Invoice Date
> Description Amount Discount Net Amount C15-28339 06/08/2015 21,318.54 0.00
> 21,318.54 015-28837 06/04/2015 1,529.75 0.00 1,529.75 015-28978 06/04/2015
> 1,238.18 0.00 1,238.18 015-28978-01 06I04/2015 1 ,18285 0.00 1,182.85
> C15-28439 06/01/2015 1,113.86 0.00 1,113.86 015-29707 06l11/2015 886.64
> 0.00 886.64 C15-28978~02 06/04/2015 526.91 0.00 526.91 015-29385 06/09/2015
> 199.29 0.00 199.29 C15-28439-01 06/03/2015 157.34 0.00 157.34 015-28670
> 06/03/2015 136.52 0.00 136.52 015-28314—01 06/03/2015 132.81 0.00 132.81
> C15-28576 06/02/2015 61.26 0.00 61.26 015-29413 06/11/2015 22.37 0.00 22.37
> Cheque #1 83077 Check Daie: 7/14/2015 28,506.32 0.00 28,506.32  07142015
> MMDDYYYY  TWENTY-EIGHT THOUSAND FIVE HUNDRED SIX CAD AND 32/ 100 $
> "******28,506.32  Trans Am Piping Canada
>
>
>
> *From:* Gordon Schneider [mailto:[email protected]
> <[email protected]>]
> *Sent:* Monday, July 18, 2016 4:05 PM
> *To:* '[email protected]' <[email protected]>
> *Subject:* Extract Text from a TIFF image
>
>
>
> I have tried using the GUI for tika-app-1.13 but it shows nothing. I can
> see the metadata but that does not give me the information I need. I have
> attached the file.
>
>
>
> Maybe it is not possible to extract the text. If so, what should I be
> looking for to tell me that it cannot extract the text?
>
>
>
> Thanks
>
>
>
>
>
> Gordon Schneider
>
> 403-236-0601
>
> Trans Am Piping Products Ltd.
>
>
>
