[tesseract-ocr] How to calculate the tesseract OCR time?

2020-11-03 Thread Kirankumar Chincholi
Hello everyone,

I need to measure how long Tesseract OCR takes for different languages.

Thanks & Regards 
Kirankumar
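
One possible way to measure this, as a minimal sketch rather than an official method: time the tesseract command line once per language from a small script. The helper names (time_command, tesseract_cmd, report) and the file name page.png are assumptions made up for illustration. The measured time includes process start-up and loading the language data, not only recognition.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times; return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; a non-zero exit status raises CalledProcessError.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def tesseract_cmd(image, lang):
    """Build a command that OCRs `image` with language `lang` to stdout."""
    return ["tesseract", image, "-", "-l", lang]


def report(image, langs):
    """Print one timing line per language code (hypothetical helper)."""
    for lang in langs:
        seconds = time_command(tesseract_cmd(image, lang))
        print(f"{lang}: {seconds:.2f} s")
```

Calling report("page.png", ["eng", "fra", "deu"]) would print one line per language, assuming tesseract is on the PATH and the corresponding traineddata files are installed.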

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/bb6f7226-3e2a-4c2d-86c3-c73ac0730e75n%40googlegroups.com.


[tesseract-ocr] Tesseract poor output

2020-11-03 Thread Sina Azm
Hello everyone, I'm getting nonsense output from Tesseract. Does anybody 
know why? Here is the code:
  let data = await myCamera.takePictureAsync({});

  // Strip the file:// scheme to get a plain filesystem path
  let path = data.uri.replace('file://', '');
  setPicture(data.uri);
  RNTesseractOcr.recognize(path, 'LANG_ENGLISH', tessOptions)
    .then((result) => {
      console.log("OCR Result: ", result);
    })
    .catch((err) => {
      console.log("OCR Error: ", err);
    });

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/5b58e0f9-768b-452c-97f3-2ec7dadc8d3fn%40googlegroups.com.


[tesseract-ocr] Re: Tesseract use cmake & Visual studio2019 build show something error.

2020-11-03 Thread 吳明恩
Got it. But if I use the "tesseract installer from Mannheim University" 
in order to build a C++ program that reads images, I can't find the 
library, header (.h) files, or .c files that the tesseract-ocr installer 
should provide.

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/108efb96-ec42-4c75-9933-00798c676d20o%40googlegroups.com.


[tesseract-ocr] Re: Tesseract use cmake & Visual studio2019 build show something error.

2020-11-03 Thread 吳明恩

Got it. But if I use the "tesseract installer from Mannheim University" 
in order to build a C++ program that reads images, I can't find the 
library or DLL files that the tesseract-ocr installer should provide.


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/c5531b0b-8b5d-44f8-9e25-a6b763f5fc04o%40googlegroups.com.


Re: [tesseract-ocr] URGENT DEADLINE: NEED HELP WITH NEW LANGUAGE, PLEASE RESPOND

2020-11-03 Thread Cailey McVay
Thank you so much! I got it working. Didn't think about inverting the 
images.
Best,
Cailey
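
The thread does not say which tool was used to invert the images, so the following is only a sketch under stated assumptions: it inverts 8-bit binary PGM (P5) files using the Python standard library alone. The function names are made up for illustration, PGM header comment lines are not handled, and other formats (such as the JPG files in this thread) would need an image library or a conversion step first.

```python
import re
from pathlib import Path

# "P5", width, height, maxval, then exactly one whitespace byte before pixels.
_PGM_HEADER = re.compile(rb"P5\s+(\d+)\s+(\d+)\s+(\d+)\s")


def invert_pgm(data):
    """Return a copy of an 8-bit binary PGM image with every pixel inverted."""
    match = _PGM_HEADER.match(data)
    if match is None:
        raise ValueError("not a binary PGM (P5) image")
    width, height, maxval = (int(group) for group in match.groups())
    if maxval > 255:
        raise ValueError("only 8-bit images are supported in this sketch")
    start = match.end()
    pixels = data[start:start + width * height]
    # Dark becomes light and light becomes dark; the header is kept unchanged.
    return data[:start] + bytes(maxval - p for p in pixels)


def invert_folder(src_dir, dst_dir):
    """Invert every *.pgm file in src_dir and write the result into dst_dir."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.pgm")):
        (out / path.name).write_bytes(invert_pgm(path.read_bytes()))
```

invert_folder("samples", "samples_inverted") would process a whole folder of sample images in one go; both directory names are hypothetical.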

On Sunday, November 1, 2020 at 11:59:00 AM UTC-5 Cailey McVay wrote:

> How did you invert the image? And is there a code I can use to invert the 
> rest of my images to try with more sample data?
>
> On Sunday, November 1, 2020 at 10:55:00 AM UTC-5 shree wrote:
>
>> Invert the image. Results using tessdata_best/eng - LSTM engine
>>
>> $ tesseract legacy-invert.jpg - --psm 6
>> 063.433
>> $ tesseract legacy-300.jpg - --psm 6
>> 063.433
>> $ tesseract legacy-144.jpg - --psm 6
>> 063.433
>>
>>
>>
>> On Sun, Nov 1, 2020 at 8:37 PM Cailey McVay  
>> wrote:
>>
>>> Here is an example of the sample image. I believe we are using the 
>>> legacy engine. Does this help?
>>>
>>> On Saturday, October 31, 2020 at 11:15:46 PM UTC-4 shree wrote:
>>>
>>>> >When we use tesseract on the images without the trained language we 
>>>> receive outputs that are accurate about 50% of the time. 
>>>>
>>>> You haven't shared a sample image. Sometimes preprocessing the images, 
>>>> using a whitelist in case of limited character set can be the solution 
>>>> rather than training.
>>>>
>>>> On Sun, Nov 1, 2020, 03:29 Cailey McVay  
>>>> wrote:

> Hello!
> I am working on a project that is trying to read borehole video 
> depths. We trained a new language to read these numbers called NTS. When 
> we 
> use tesseract on the images without the trained language we receive 
> outputs 
> that are accurate about 50% of the time. However when we use the new 
> language, we receive no output at all. Is it possible that we overtrained 
> tesseract to not recognize any of the images? I will attach below our box 
> file, unicharset file, box trained file, pffmtable file, and normproto 
> file. Our shapetable file processes but then returns an empty file. Could 
> something be wrong with our shapetable? And if so, how could we fix that?
>
> Box File for the first five images:
> 0 3 1 14 19 0
> 9 18 0 29 20 0
> 3 33 1 46 19 0
> . 50 1 56 19 0
> 2 64 1 75 19 0
> 5 76 1 93 19 0
> 2 92 1 111 19 0
> 0 4 1 15 19 1
> 8 19 1 30 19 1
> 3 34 1 46 19 1
> . 54 1 57 5 1
> 4 65 1 77 19 1
> 1 82 1 91 19 1
> 4 96 1 107 19 1
> 0 3 1 15 19 2
> 8 19 1 30 19 2
> 6 34 1 46 19 2
> . 53 1 57 5 2
> 8 65 1 77 19 2
> 3 80 1 91 19 2
> 9 95 1 107 19 2
> 0 4 1 15 19 3
> 8 17 1 31 19 3
> 8 32 1 46 19 3
> . 52 2 58 8 3
> 1 64 0 77 20 3
> 8 80 1 91 19 3
> 5 96 1 107 19 3
> 0 3 1 15 19 4
> 8 19 1 30 19 4
> 7 34 1 47 19 4
> . 53 1 58 9 4
> 5 65 1 77 19 4
> 6 80 1 92 19 4
> 4 95 0 109 20 4
> 0 4 1 15 19 5
> 7 19 1 30 19 5
> 5 34 1 46 19 5
> . 53 1 57 5 5
> 3 65 1 76 19 5
> 1 82 1 90 19 5
> 3 96 1 107 19 5
>
>
> Unicharset:
> 14
> NULL 0 Common 0
> Joined 7 0,255,0,255,0,0,0,0,0,0 Latin 1 0 1 Joined # Joined [4a 6f 69 
> 6e 65 64 ]a
> |Broken|0|1 21 0,255,0,255,0,0,0,0,0,0 Common 2 10 2 |Broken|0|1 # 
> Broken
> 0 8 0,255,0,255,0,0,0,0,0,0 Common 3 2 3 0 # 0 [30 ]0
> 9 8 0,255,0,255,0,0,0,0,0,0 Common 4 2 4 9 # 9 [39 ]0
> 3 8 0,255,0,255,0,0,0,0,0,0 Common 5 2 5 3 # 3 [33 ]0
> . 22 0,255,0,255,0,0,0,0,0,0 Common 6 6 6 . # . [2e ]p
> 2 8 0,255,0,255,0,0,0,0,0,0 Common 7 2 7 2 # 2 [32 ]0
> 5 8 0,255,0,255,0,0,0,0,0,0 Common 8 2 8 5 # 5 [35 ]0
> 8 8 0,255,0,255,0,0,0,0,0,0 Common 9 2 9 8 # 8 [38 ]0
> 4 8 0,255,0,255,0,0,0,0,0,0 Common 10 2 10 4 # 4 [34 ]0
> 1 8 0,255,0,255,0,0,0,0,0,0 Common 11 2 11 1 # 1 [31 ]0
> 6 8 0,255,0,255,0,0,0,0,0,0 Common 12 2 12 6 # 6 [36 ]0
> 7 8 0,255,0,255,0,0,0,0,0,0 Common 13 2 13 7 # 7 [37 ]0
>
>
> NTS.font.exp0.tr file:
> font 0 3 1 14 19 0
>  4
> mf 16
>  -0.085041896 0.30783021 0.27617577 0 0 0
>  -0.25234067 0.27376649 0.089746617 0.13718249 0 0
>  -0.28155157 0.0045010448 0.47040343 0.25 0 0
>  -0.25234067 -0.26476437 0.08974655 0.36281759 0 0
>  -0.085041896 -0.29882804 0.27617577 0.5 0 0
>  -0.031931162 -0.21447986 0.1730229 0.96998096 0 0
>  -0.11690831 0.020721853 0.43796182 0.75 0 0
>  -0.031931162 0.23970276 0.1699543 0.5 0 0
>  0.24424461 0.072628468 0.47339222 0.76789355 0 0
>  0.1353676 0.30783021 0.16464323 0 0 0
>  0.10615671 0.18941826 0.14627755 0.37934926 0 0
>  0.15926743 -0.011719763 0.30170703 0.25 0 0
>  0.10615671 -0.19663697 0.12619166 0.090763755 0 0
>  0.1353676 -0.29882804 0.16464323 0.5 0 0
>  0.27079996 -0.26476437 0.12619169 0.59076369 0 0
>  0.29735535 -0.19663697 0.086383387 0.85538673 0 0
> cn 1
>  0.36328125 0.35781249 0.2421875 0.1484375
> if 73
>  133 69 248
>  119 72 248
>  104 75 248
>  97 82 192
>  97 95 192
>  97 107 192
>  97 120 192
>  97 132 192
>  97 145 192
>  97 157 192

Re: [tesseract-ocr] Tesseract remove space when I use LTSM mode

2020-11-03 Thread Enzo Merotto
We found the problem: with the previous version of Tesseract we set a 
character whitelist through SetVariable that did not include the space 
character, and we forgot to add it. We no longer use SetVariable. It works 
now, thank you.

Enzo Merotto
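
To illustrate the pitfall described above, here is a small sketch that builds a tesseract command line whose character whitelist always contains the space character. The helper names are assumptions made up for this example; tessedit_char_whitelist and the -c option are existing Tesseract settings, but whether the whitelist is honoured may depend on the Tesseract version and engine in use.

```python
def whitelist_with_space(chars):
    """Return `chars`, adding a space character if it is missing."""
    return chars if " " in chars else chars + " "


def tesseract_cmd(image, lang, whitelist):
    """Build a tesseract command that restricts output to `whitelist`."""
    return [
        "tesseract", image, "-", "-l", lang,
        # Passed as a single argv element, so the embedded space is preserved.
        "-c", "tessedit_char_whitelist=" + whitelist_with_space(whitelist),
    ]
```

Passing the list to subprocess.run (rather than a shell string) avoids having to quote the space in the whitelist.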

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/e55ac896-d704-4514-8331-8f87d7857769n%40googlegroups.com.


Re: [tesseract-ocr] Tesseract remove space when I use LTSM mode

2020-11-03 Thread Zdenko Podobny
The tesseract "executable" (which is also an example of how to use the
tesseract library) handles it correctly, for both the LSTM and the legacy
engine. So check its source code.

Zdenko


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8xQZYG3XAwfx6M6iJVtWdjCW8WyEWHX7pOJ9X6PjR2Wrw%40mail.gmail.com.


Re: [tesseract-ocr] Tesseract remove space when I use LTSM mode

2020-11-03 Thread Enzo Merotto
I'm not sure, because in TESSERACT_ONLY mode the spaces are there, so it 
works. That is not the case in LSTM mode.

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/9cad96b3-3d62-4f5b-b45a-70c50e539a90n%40googlegroups.com.


Re: [tesseract-ocr] Tesseract remove space when I use LTSM mode

2020-11-03 Thread Zdenko Podobny
IMO that is a problem in your code. Have a look at the tesseract source
code to see how it handles spaces.
Here are the results for your image with the different OEM settings:

> tesseract test_2020-11-03_122112048.png - --oem 0 -l fra
En votre aimable règlement,
Cordialement,

> tesseract test_2020-11-03_122112048.png - --oem 1 -l fra
En votre aimable règlement,
Cordialement,

> tesseract test_2020-11-03_122112048.png - --oem 2 -l fra
En votre aimable règlement,
Cordialement,





Zdenko


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8xSuJqfw_HC9%2B22zpSRVNq0HOhTvhrc%2BG0fCveubsDQ8A%40mail.gmail.com.


Re: [tesseract-ocr] Re: Tesseract use cmake & Visual studio2019 build show something error.

2020-11-03 Thread Zdenko Podobny
You can ignore that message. Some internal functions use zlib, png and tiff,
but this has no effect on OCR.

Your question (regarding Leptonica and image types) indicates that you are
not familiar with building software from source. In that case, use the
tesseract installer from Mannheim University.

Zdenko


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8yRiqE_Tb7sk7hXnPuptwzxygxiQYvPC-bZA3ctSrjhWw%40mail.gmail.com.


[tesseract-ocr] Re: Tesseract use cmake & Visual studio2019 build show something error.

2020-11-03 Thread 吳明恩
OK, I tried a BMP image instead. Some errors are still shown, but I do get 
the text from the image.
Error message & command :
 .\tesseract.exe .\test.bmp eng
Tesseract Open Source OCR Engine v4.1.1 with Leptonica
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 284
Error in pixWriteMemPng: function not present 
[image: error.png] 



I also have a question: how can I build Leptonica with external image 
libraries so that it can read JPG images?


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/e05a2eaf-6964-4e00-8410-09320c89f209o%40googlegroups.com.


Re: [tesseract-ocr] Tesseract remove space when I use LTSM mode

2020-11-03 Thread Enzo Merotto
We recently upgraded Tesseract from 3.02 to 4.0 to improve performance and 
speed. When we use the LSTM mode, we first get a warning about the DPI: 
"Invalid resolution 0 dpi. Using 70 instead". We know why this warning 
appears, but I don't know whether the missing spaces are related to it. 
Here is an example with French text:
[image: CaptureText.PNG]
The terminal shows the warning and the transcribed text without spaces. We 
expected:
"En votre aimable règlement,
Cordialement,"

This is how we use tesseract:  
[image: CaptureCode1.PNG]
[image: CaptureCode3.PNG][image: CaptureCode2.PNG]
The image is a cv::Mat with 1 channel (8UC1).

Enzo Merotto

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/8e1189b6-929c-4ed3-8400-92a841c12fafn%40googlegroups.com.


Re: [tesseract-ocr] Tesseract remove space when I use LTSM mode

2020-11-03 Thread Zdenko Podobny
Please provide a reproducible example: what you are doing, how you are doing
it, what the result is, and what the desired result is.

Zdenko


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8zLPJgjC9Or89kQR357Qaj%3D4KBsN3fyGg0o9ayEPcXc5A%40mail.gmail.com.


Re: [tesseract-ocr] Tesseract use cmake & Visual studio2019 build show something error.

2020-11-03 Thread Zdenko Podobny
As you can see, you built Leptonica without any external image libraries
(such as png, jpg, tiff), so tesseract can read only simple image formats
such as bmp, pgm and ppm.

Zdenko
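
As an illustration of the point above, the following sketch checks the first bytes of a file to guess its real format and reports whether reading it would need an external image library. The function names and the grouping of formats are assumptions for this example, not part of Tesseract or Leptonica. The reported errors mention PNG although the file is named Google.jpg, which would be consistent with the file actually containing PNG data; that is only a guess from the log.

```python
# Well-known file signatures (magic bytes) and the format they indicate.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
    (b"BM", "bmp"),
]
PNM_MAGICS = (b"P1", b"P2", b"P3", b"P4", b"P5", b"P6")

# Formats that, per the reply above, need an external library in Leptonica.
NEEDS_EXTERNAL_LIBRARY = {"png", "jpeg", "tiff"}


def sniff_format(header):
    """Guess the image format from the first bytes of a file."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    if header[:2] in PNM_MAGICS:
        return "pnm"  # covers pbm, pgm and ppm
    return "unknown"


def needs_external_library(path):
    """Return True if the file at `path` is PNG, JPEG or TIFF by content."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(8)) in NEEDS_EXTERNAL_LIBRARY
```

Running needs_external_library on an input before calling tesseract gives an early hint that a build without those libraries will fail to read it, whatever the file extension says.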


To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8x3VuY2XTS8%2BfwwB%3DcwGyAN6Db72VyCNH3sbn8%3DVG6mPA%40mail.gmail.com.


[tesseract-ocr] Tesseract use cmake & Visual studio2019 build show something error.

2020-11-03 Thread 吳明恩


Environment
Tesseract Version:
1.tesseract 4.1.1
2.leptonica-1.76.0 (Nov 3 2020, 10:24:30) [MSC v.1927 LIB Release x64]
3.libtiff 4.1.0
Found AVX2
Found AVX
Found FMA
Found SSE
Platform: Windows 10 64-bit

When I use tesseract.exe to OCR a JPG file, it shows the errors below. What 
should I do?

Error message & command:
.\tesseract.exe .\Google.jpg eng
Tesseract Open Source OCR Engine v4.1.1 with Leptonica
Error in pixReadStreamPng: function not present
Error in pixReadStream: png: no pix returned
Error in pixRead: pix not read
Error during processing.

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/1baced08-737c-42f6-bb7c-db44901a0f29o%40googlegroups.com.


[tesseract-ocr] Tesseract remove space when I use LTSM mode

2020-11-03 Thread Enzo Merotto
Hello,
I have a problem with the LSTM mode: it does not detect spaces and merges 
all the words into one.
Do you have any idea why it does not detect spaces?

To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/41cb6003-55ad-43d3-b8da-699fae606625n%40googlegroups.com.