[tesseract-ocr] tesseract 4.00.00alpha with psm mode 0

2017-03-24 Thread Youcef
Hi,

I'm currently trying to use tesseract in the page segmentation mode for 
orientation and script detection only (-psm 0).
Using tesseract 4.00.00alpha, I ran this mode on the eurotext.tif example 
as follows:

api/tesseract testing/eurotext.tif eurotext -l eng -psm 0

The eurotext.osd file I obtained looks like this:

Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 24.31
Script: Latin
Script confidence: 35.19

The orientation in degrees is abnormally 0. I also tried with a text image I 
rotated artificially, with the same result.
Any idea how to explain this mismatch?
The only information I found about this mode relates to previous versions 
of tesseract, which give different output.
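
For comparing runs on rotated and unrotated images, the .osd report can be read programmatically. The following is a minimal sketch, assuming the file keeps the "Key: value" layout shown above; parse_osd is a hypothetical helper, not part of tesseract.

```python
def parse_osd(text):
    """Parse 'Key: value' lines from an .osd report into a dict.

    Hypothetical helper; assumes the field layout shown in the message above.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'Key: value'
        value = value.strip()
        try:
            # Numeric fields become int or float, anything else stays a string
            fields[key.strip()] = float(value) if "." in value else int(value)
        except ValueError:
            fields[key.strip()] = value
    return fields


# Sample text copied from the eurotext.osd output quoted above
sample = """Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 24.31
Script: Latin
Script confidence: 35.19
"""

osd = parse_osd(sample)
print(osd["Orientation in degrees"], osd["Orientation confidence"])
```

Running the same parser on the report of an artificially rotated image makes it easy to see whether the orientation field and its confidence change at all between the two runs.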

Regards


-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/b78c92ab-6673-48fc-b0ec-e2952b69401f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[tesseract-ocr] Do I Need to Train Tesseract with Binarized Images?

2017-03-24 Thread Pedro Correia
Hello there!
I intend to train Tesseract with some book pages, but I'm not sure whether 
there's any difference between training it with the original images and 
training it with binarized pages. Does anyone know?
Thanks in advance.



[tesseract-ocr] Keeping Less Number of Minimum Characters

2017-03-24 Thread Shahrukh Satti
Hi,

I'm working on vehicle number plate text extraction. The value of the 
constant kMinCharactersToTry, declared on line 36 of 
https://github.com/tesseract-ocr/tesseract/blob/master/ccmain/osdetect.cpp, 
needs to be lower, since a number plate has very few characters. 
Currently I get a "Too few characters" error. Please advise how I can 
change that value and use it with the R package ropensci/Tesseract. 

Any assistance is appreciated. 

Thanks.



[tesseract-ocr] Re: How to download the Tesseract trained data for Digital display numbers ( Seven Segments Data trained data )

2017-03-24 Thread komalagawade

Hello, 
I basically work in the electronics field and am new to C#. I am currently 
working on an image processing project in C#, where in one part I have to 
detect the text or digits in an image of a 7-segment display. Searching on 
Google, I found Tesseract as a solution.

As an experiment, I first tried converting a normal text image into a text 
file. It works fine for some basic images, but it does not work with a 
7-segment display, so I learned that I need a trained data file for 
7-segment digits.

To train the 7-segment data I followed the steps shown in the video at the 
link below: https://www.youtube.com/watch?v=i_1-hGsXxy8
But the output.txt file shown in that video is not generated in my case. As 
a result, when I use the trained 7-segment data file I get garbage values 
in the text file. To check whether I was getting a proper trained file, I 
followed the procedure shown in the video, but it gives an error like 
"output.txt file not found". Does this happen because of the missing 
output.txt file, or am I missing something else? I followed all the steps 
shown in that video for training 7-segment data.

I have also installed jTessBoxEditorFX.jar, Serak trainer and Tesseract-ocr 
v3.02. So in the end I am stuck at a point where I don't know where I am 
going wrong: is my procedure wrong, or is the software installation not 
proper? After installing tesseract there is a red cross mark against 
tesseract.

Could somebody please help me figure it out? If possible, please provide a 
7-segment trained data file and the exact steps to train 7-segment data, 
as I have to train some more files for various display icons and some 
specific messages. It is very urgent, as my project is stuck: I arrived at 
the Tesseract solution only after trying many other image processing 
approaches to 7-segment display detection in C#, such as pixel counting and 
image comparison.
If anything in my query is unclear, please let me know.

Please do the needful.





[tesseract-ocr] Does anyone has tessdata for base58 or base64?

2017-03-24 Thread Private Z
In my project I want to OCR base58 strings, but the eng tessdata is very 
big. Does anyone have tessdata for base58?
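
One possible workaround, sketched below, is to keep the eng data and restrict recognition to the base58 alphabet with the tessedit_char_whitelist variable, passed through the command-line -c option. This does not make the tessdata any smaller, and it is an assumption that the whitelist is honored by the engine in use (the LSTM engine in 4.00 alpha may ignore it). The file names and the is_base58 helper are hypothetical, and the alphabet is the Bitcoin-style base58 set.

```python
import string

# Bitcoin-style base58 alphabet: digits and letters without 0, O, I and l
BASE58 = "".join(
    c
    for c in string.digits + string.ascii_uppercase + string.ascii_lowercase
    if c not in "0OIl"
)


def whitelist_cmd(image, outbase, lang="eng"):
    """Build a tesseract command line that whitelists base58 characters."""
    return [
        "tesseract", image, outbase,
        "-l", lang,
        "-c", "tessedit_char_whitelist=" + BASE58,
    ]


def is_base58(text):
    """Return True if text is non-empty and uses only base58 characters."""
    return bool(text) and all(c in BASE58 for c in text)


# Hypothetical file names, for illustration only
cmd = whitelist_cmd("address.png", "address")
print(" ".join(cmd))
```

Checking the recognized string with is_base58 afterwards catches results that still contain characters outside the alphabet, whichever engine produced them.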



Re: [tesseract-ocr] Having issue with Italic characters

2017-03-24 Thread ShreeDevi Kumar
Use Tesseract 4.0.0alpha with --oem 1 (LSTM). It works OK with that.
--oem 0 (legacy engine) gives / instead of i.

You could test whether a higher-resolution image (300 dpi) works with the
legacy engine.
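
To compare the two engines on the same image, the two command lines can be built side by side. This is a minimal sketch: the image and output names are hypothetical, tesseract_cmd is an illustrative helper, and the commands are only printed here, not executed.

```python
def tesseract_cmd(image, outbase, oem, lang="eng"):
    """Build a tesseract command line for the given OCR engine mode."""
    return ["tesseract", image, outbase, "-l", lang, "--oem", str(oem)]


# Hypothetical file names, for illustration only
legacy_cmd = tesseract_cmd("italic_sample.png", "out_legacy", oem=0)
lstm_cmd = tesseract_cmd("italic_sample.png", "out_lstm", oem=1)

for cmd in (legacy_cmd, lstm_cmd):
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run on a machine where Tesseract 4.0.0alpha is installed, after which the two output text files can be compared for the "/" substitutions.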

ShreeDevi

भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Fri, Mar 24, 2017 at 8:01 AM, Muhammad Shamim 
wrote:

> Hi,
>
> I am using tesseract-ocr-setup-3.05.00dev.exe to do OCR, and it works
> fine for me with the default training data files.
> The only issue I face is with italic characters, e.g.
>  Italic "l"   => "/"
>  Italic "i"   => "/"
> Does anybody have an idea how to deal with this issue?
> Is any extra step needed?
>
> Thank you
>



[tesseract-ocr] Having issue with Italic characters

2017-03-24 Thread Muhammad Shamim
Hi,

I am using tesseract-ocr-setup-3.05.00dev.exe to do OCR, and it works 
fine for me with the default training data files.
The only issue I face is with italic characters, e.g.
 Italic "l"   => "/"
 Italic "i"   => "/"
Does anybody have an idea how to deal with this issue?
Is any extra step needed?

Thank you

Clue:

Like Jil/ette