After rebooting the server, tesseract complains as follows:

Error opening data file /usr/local/tesseract-ocr/tessdata/deu.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the 
parent directory of your "tessdata" directory.
Failed loading language 'deu'
Tesseract couldn't load any languages!
Could not initialize tesseract.

I manually copied deu.traineddata to that folder and chmod'ed it to 777, 
but that only works until the next reboot.
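Going by the error message above, tesseract looks for the language file under $TESSDATA_PREFIX/tessdata. Below is a minimal shell sketch for checking that location after a reboot. The helper name check_tessdata is hypothetical (it is not part of tesseract), and the prefix /usr/local/tesseract-ocr in the usage note is only inferred from the path in the error message:

```shell
# Hypothetical helper: report whether <prefix>/tessdata/<lang>.traineddata
# exists and is readable. The directory layout is taken from the error message.
check_tessdata() {
    prefix="${1%/}"   # drop one trailing slash, if present
    lang="$2"
    file="$prefix/tessdata/$lang.traineddata"
    if [ -r "$file" ]; then
        echo "ok: $file"
    else
        echo "missing: $file"
        return 1
    fi
}
```

It could be called as `check_tessdata "$TESSDATA_PREFIX" deu`, or with the inferred prefix as `check_tessdata /usr/local/tesseract-ocr deu`. If the variable turns out to be empty after a reboot, it was probably only exported in an interactive shell. Setting it somewhere persistent (for example /etc/environment) might help, but whether that is the cause here is a guess.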

I think I'll give up on Tesseract soon and stay with OCR in Acrobat 
Pro...

On Friday, 9 January 2015 at 18:34:25 UTC+1, C. wrote:
>
> I did not succeed in a complete reinstall, so I reinstalled the server 
> again and installed just the latest version of tesseract from source.
>
> Now "tesseracting" works fine again: all lines are shown in the 
> resulting PDF file. So it has to be a bug in tesseract 3.03.
>
> I hope the latest version gets into the Ubuntu repos soon (because I had 
> some problems with the TESSDATA_PREFIX thing after compiling).
>
> On Friday, 9 January 2015 at 13:16:03 UTC+1, shree wrote:
>>
>> please see https://code.google.com/p/tesseract-ocr/issues/detail?id=1278
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Fri, Jan 9, 2015 at 5:44 PM, ShreeDevi Kumar <[email protected]> 
>> wrote:
>>
>>> You should *uninstall the old version fully* and then build the version 
>>> from git. The build is possibly picking up some older libraries.
>>>
>>> Also, this needs leptonica 1.71. Not sure if the documentation mentions 
>>> it or not.
>>>
>>> ShreeDevi
>>>
>>> On Fri, Jan 9, 2015 at 5:37 PM, C. <[email protected]> wrote:
>>>
>>>> I tried to compile the version you mentioned (after installing the 
>>>> dependencies listed in the readme), but make stops with the following error:
>>>>
>>>> ./.libs/libtesseract.so: undefined reference to `l_generateCIDataForPdf'
>>>> ./.libs/libtesseract.so: undefined reference to `l_CIDataDestroy'
>>>> collect2: error: ld returned 1 exit status
>>>> make[2]: *** [tesseract] Fehler 1
>>>>
>>>>
>>>> On Friday, 9 January 2015 at 09:28:53 UTC+1, shree wrote:
>>>>>
>>>>> As far as I know, pdf creation is a new addition and the issues were 
>>>>> ironed out only recently. There have been over 100 commits to the code 
>>>>> since 3.03 rc. 
>>>>>
>>>>> If you want the new functionality, you can try compiling the code from 
>>>>> https://code.google.com/p/tesseract-ocr/source/checkout
>>>>>
>>>>> Instructions are at 
>>>>> https://code.google.com/p/tesseract-ocr/wiki/Compiling
>>>>>
>>>>> ShreeDevi
>>>>>
>>>>> On Fri, Jan 9, 2015 at 1:53 PM, C. <[email protected]> wrote:
>>>>>
>>>>>> First of all: thanks for your help.
>>>>>>
>>>>>> Concerning my problem: I did a complete reinstall of the Ubuntu 
>>>>>> 14.04 server, installed tesseract 3.03 from the repos again, and the 
>>>>>> failure still exists! As 3.03 does not seem to be that old, I did not 
>>>>>> and, to be honest, do not want to install a newer version from GitHub.
>>>>>>
>>>>>> Is this a known bug?
>>>>>>
>>>>>> On Friday, 9 January 2015 at 06:33:01 UTC+1, shree wrote:
>>>>>>>
>>>>>>> I am using the git version; output and messages are attached. The PDF 
>>>>>>> seems to have all the lines.
>>>>>>>
>>>>>>> User@HP ~/tesseract-ocr/testing
>>>>>>> $ tesseract 5.tif 5 pdf
>>>>>>> Tesseract Open Source OCR Engine v3.04.00 with Leptonica
>>>>>>> Page 1
>>>>>>> OSD: Weak margin (5.78), horiz textlines, not CJK: Don't rotate.
>>>>>>> Page 2
>>>>>>> Too few characters. Skipping this page
>>>>>>> OSD: Weak margin (0.00) for 0 blob text block, but using orientation anyway: 0
>>>>>>> Empty page!!
>>>>>>> Too few characters. Skipping this page
>>>>>>> OSD: Weak margin (0.00) for 0 blob text block, but using orientation anyway: 0
>>>>>>> Empty page!!
>>>>>>> Warning in pixReadMemTiff: tiff page 2 not found
>>>>>>>
>>>>>>> User@HP ~/tesseract-ocr/testing
>>>>>>> $ tesseract -v
>>>>>>> tesseract 3.04.00
>>>>>>>  leptonica-1.71
>>>>>>>   libgif 5.1.0 : libjpeg 8d : libpng 1.6.14 : libtiff 4.0.3 : zlib 1.2.8 : libwebp 0.4.2
>>>>>>>
>>>>>>>
>>>>>>> ShreeDevi
>>>>>>>
>>>>>>> On Thu, Jan 8, 2015 at 9:24 PM, C. <[email protected]> wrote:
>>>>>>>
>>>>>>>> Sorry, I meant: 5.pdf is the resulting file.
>>>>>>>>
>>>>>>>> On Thursday, 8 January 2015 at 16:53:31 UTC+1, C. wrote:
>>>>>>>>
>>>>>>>>> tesseract 3.03, example is attached (5.tif is the original, 5.tig 
>>>>>>>>> the result).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thursday, 8 January 2015 at 16:02:31 UTC+1, shree wrote:
>>>>>>>>>>
>>>>>>>>>> I don't think that's the intended behavior. What version of 
>>>>>>>>>> tesseract are you using? Please post a sample image for testing.
>>>>>>>>>>
>>>>>>>>>> ShreeDevi
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 8, 2015 at 8:00 PM, C. <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> If I run a simple "tesseract 1.tif 2 pdf", all vertical and 
>>>>>>>>>>> horizontal lines (and graphics with thin lines) in the source file 
>>>>>>>>>>> disappear in the resulting PDF file (Ubuntu server 12.04, tesseract 
>>>>>>>>>>> 3.03).
>>>>>>>>>>>
>>>>>>>>>>> Is that the intended behavior?
>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>> Google Groups "tesseract-ocr" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from 
>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/dcbb0e46-b29b-447a-a5f4-d634b4371725%40googlegroups.com.
>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>>
>>
