onts then?
>>> Or is this purely to be able to train using these fonts?
>>>
>>> Might there be another way to use the training for such a large amount
>>> of fonts?
>>> Can training the fonts into multiple language files then be the solution?
>>>
>>>
>>>
Try psm 6, also 11, 12
https://github.com/tesseract-ocr/tesseract/issues/434
On 13 Oct 2016 1:13 p.m., "fuzzy7k" wrote:
> I tried psm 0-3
>
> On Thursday, October 13, 2016 at 1:46:45 AM UTC-4, shree wrote:
>>
>> Which page segmentation mode (psm) did you try?
>>
>> On 12 Oct
Which page segmentation mode (psm) did you try?
On 12 Oct 2016 11:21 p.m., "fuzzy7k" wrote:
> I have scanned some index pages that I would like to ocr for rapid
> searching. I am using tesseract from the command line. The problem is that
> tesseract ignores the whitespace
+ Ray Smith
On 16-Dec-2016 10:58 PM, "Kay-Michael Würzner" wrote:
> Yes, I did and in principle everything works like a charm which is great.
> What I want to accomplish now is some understanding: Why do I have to set a
> documented parameter in some undocumented way or to
Did you try out the commands as per the LSTM training tutorial?
On 16-Dec-2016 8:31 PM, "Kay-Michael Würzner" wrote:
> Dear @,
>
> I played around with training the new LSTM mode. According to the
> documentation of the network specification (https://github.com/tesseract-
>
Please see https://github.com/tesseract-ocr/tesseract/issues/83 and other
PDF related issues in GitHub repo with similar discussion.
- excuse the brevity, sent from mobile
On 13-Jan-2017 10:15 PM, "James R Barlow" wrote:
> Tesseract cannot rasterize PDFs. It is fairly
Try without the following line.
--eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
ShreeDevi
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Sat, Jan 14, 2017 at 3:47 AM, wrote:
> I tried to
On Sat, Jan 14, 2017 at 6:14 PM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:
> Try without the following line.
>
> --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
Does anyone know of any utilities to convert a box file to ground truth
text file?
I am using tesstrain.sh which uses text2image for trying out LSTM training.
However, because unrenderable words are not included in the tifs, it is not
possible to use the training_text as ground truth.
Thanks!
the traineddata for norlayer0.853_1615.lstm i.e. 0.853
% character error rate at iteration number 1615.
ShreeDevi
On Fri, Jan 6, 2017 at 5:59 PM, ShreeDevi Kumar <shreesh...@gmail.com>
Ray is planning to retrain the languages for the new 4.0.0 version sometime
in January. So it would be helpful if you could open an issue on
https://github.com/tesseract-ocr/langdata/issues with this information.
Also, if you can provide a sample representative Norwegian text including Æ,
I will
I will give it a try and let you know.
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send
at 9:36 PM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:
> Peter,
>
> Please see https://github.com/tesseract-ocr/langdata/blob/master/swe/swe.training_text
>
> You can provide additional training text if some needed characters are
> missing in the abov
On Thu, Dec 29, 2016 at 12:26 PM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:
> Please rebuild leptonica with the latest source from github (
> https://github.com/DanBloomberg/leptonica)
> and then re
Please rebuild leptonica with the latest source from github (
https://github.com/DanBloomberg/leptonica)
and then rebuild tesseract with the latest source from github (
https://github.com/tesseract-ocr/tesseract) and try.
ShreeDevi
What about osd.traineddata and config files? Are they in your tessdata
directory?
- excuse the brevity, sent from mobile
On 01-Jan-2017 9:22 PM, wrote:
> Hi all,
>
> I'm in a time critical situation. I want to deliver a new software for our
> customer on 5th
Is the TESSDATA_PREFIX variable set in the environment? If so, which
directory does it point to?
- excuse the brevity, sent from mobile
On 01-Jan-2017 9:35 PM, "ShreeDevi Kumar" <shreesh...@gmail.com> wrote:
> What about osd.traineddata and config files? Are th
PC froze so
>> I rebooted and created the traineddata for norlayer0.853_1615.lstm i.e.
>> 0.853 % character error rate at iteration number 1615.
>>
>>
>> ShreeDevi
ta
>>>>
>>>> See attached log and info file for commands used in training. It took
>>>> about 9 hours on my pc - about 1700 iterations only and then my PC froze so
>>>> I rebooted and created the traineddata for norlayer0.853_1615.lstm i.e.
>>>
Tried 'Finetune' - that does not help in addition of a character.
Trying 'Add a layer' now.
ShreeDevi
On Thu, Jan 5, 2017 at 8:59 PM, Ludvig F Aarstad wrote:
>
Peter,
Please see
https://github.com/tesseract-ocr/langdata/blob/master/swe/swe.training_text
You can provide additional training text if some needed characters are
missing in the above. I can do a test training with it.
- excuse the brevity, sent from mobile
On 06-Jan-2017 5:01 PM, "Peter"
combine_tessdata -u ara.traineddata ara.
On 19-Dec-2016 1:57 PM, "universal reseller" wrote:
> this is not a zip file..
>
You also need to add the location of tesseract binaries to PATH.
- sent from mobile phone
On 22-Dec-2016 9:50 AM, "Junmock Lee" wrote:
> How To Add/Edit Environment Variables in Windows 7
> https://www.nextofwindows.com/how-to-addedit-environment-
>
There might be some problem with your input file - all the following work
for me.
Please note that whitelist has no effect in 4.0
$ tesseract input.tif input
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
Page 1
$ tesseract input.tif input --psm 7
Tesseract Open Source OCR Engine
The initial 4.0alpha tag from November has cube in it. It was deleted later
and is no longer in master.
In fact, the OEM code for LSTM was originally 4 and now is 2.
Shouldn't semantic versioning require tagging at major updates?
- excuse the brevity, sent from mobile
On 22-Mar-2017 8:58 PM,
See
https://github.com/tesseract-ocr/tesseract/wiki/4.0-Accuracy-and-Performance
- excuse the brevity, sent from mobile
On 22-Mar-2017 8:58 PM, "universal reseller" wrote:
> How did you use the cube engine in Tesseract 4?
>
Sorry, mentioned incorrect code for LSTM
OCR Engine modes:
0 Original Tesseract only.
1 Neural nets LSTM only.
2 Tesseract + LSTM.
3 Default, based on what is available.
- excuse the brevity, sent from mobile
On 22-Mar-2017 9:02 PM, "ShreeDevi Kumar" <shreesh
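For quick reference, the corrected engine-mode table above can be wrapped in a small shell helper. This is only a sketch: the function name oem_description is made up for illustration, and the descriptions are copied from the list in the message.

```shell
#!/bin/bash
# Hypothetical helper: map an --oem code to the description quoted in the thread.
oem_description() {
    case "$1" in
        0) echo "Original Tesseract only" ;;
        1) echo "Neural nets LSTM only" ;;
        2) echo "Tesseract + LSTM" ;;
        3) echo "Default, based on what is available" ;;
        *) echo "unknown OEM code: $1" >&2; return 1 ;;
    esac
}

oem=1
echo "--oem $oem selects: $(oem_description "$oem")"
# With tesseract installed and an input.tif present, the matching call would be:
# tesseract input.tif input --oem "$oem"
```

The tesseract call is left commented out so the snippet runs even where tesseract is not installed.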
Use Tesseract 4.0.0alpha and --oem 1 for LSTM. It works ok with that.
--oem 0 with legacy engine gives / instead of i.
You could test whether a higher-resolution image (300 dpi) works with the
legacy engine.
ShreeDevi
https://github.com/tesseract-ocr/tesseract/wiki/AddOns
has link to traineddata for digital seven fonts.
https://github.com/arturaugusto/display_ocr
You can download various digital seven fonts, create training data images
and train, all in jTessBoxEditor. Use a 3.0x version.
ShreeDevi
Try latest version of tesseract - build from master. Use --psm 7 --oem 1
I get correct result for both.
tesseract unnamed1.png unnamed1 --psm 7 --oem 1
Tesseract Open Source OCR Engine v4.00.00alpha-347-g60c8b12 with Leptonica
Warning. Invalid resolution 0 dpi. Using 70 instead.
ShreeDevi
What version of tesseract are you running? If you built it yourself, which
source commit did you use?
ShreeDevi
On Thu, Mar 23, 2017 at 4:28 PM, Jenkar Smithy
wrote:
Ok. I am using an older version ...
git log -1
commit 0ff26ee3de166659970d80e50aef4000ff2557b2
Author: zdenop
Date: Fri Feb 3 08:15:15 2017 +0100
Merge pull request #698 from stweil/configure
configure: Run AVX test only with 64 bit compiler
Please try with that.
See https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage
Also check that you have pdf.ttf in your tessdata folder:
https://github.com/tesseract-ocr/tesseract/tree/master/tessdata
tesseract --tessdata-dir ./ ./testing/eurotext.png ./testing/eurotext-eng -l eng pdf
ShreeDevi
in https://github.com/tesseract-ocr/tesseract/tree/master/tessdata
ShreeDevi
On Thu, Mar 23, 2017 at 7:04 PM, Saliaj Adrian wrote:
> No I don't have pdf.ttf in my
March 22, 2017 at 12:04:24 PM UTC-4, shree wrote:
>>
>> Sorry, mentioned incorrect code for LSTM
>>
>> OCR Engine modes:
>> 0 Original Tesseract only.
>> 1 Neural nets LSTM only.
>> 2 Tesseract + LSTM.
>> 3Default, base
FYI - this was trained using eng.traineddata and finetuned with 7segment
fonts.
ShreeDevi
On Wed, Mar 29, 2017 at 9:09 PM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:
>
Hi,
I have built a 4.0 traineddata using some seven segment display fonts.
Trained mostly on numbers 0-9, capital letters A-Z, : etc.
It is uploaded as a zip file at
https://github.com/Shreeshrii/tessdata4alpha/raw/master/ssd1.zip
Unzip it to get ssd1.traineddata.
I have not tested it much.
The problem is with the input image. It does not have correct information
about dpi.
Please preprocess image to 300 dpi for better output.
ShreeDevi
On Wed, Mar 29, 2017 at 8:40 AM,
Added link in wiki -
https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM
@THintz, please fix your readme file,
>cd \petri mkdir Win64 cd Win64 git clone
https://github.com/tesseract-ocr/tesseract tesseract cd tesseract cppan (I
assume this wasn't necessary, but I'm trying to avoid
Egor (cc:ed) can provide guidance regarding cppan and cmake.
ShreeDevi
On Thu, Mar 16, 2017 at 6:30 PM, THintz wrote:
> I spoke too soon. Apparently I touched
GUI front-ends for tesseract such as VietOCR and gImageReader also
allow batch processing of multiple files.
- excuse the brevity, sent from mobile
On 16-Mar-2017 9:13 PM, "Lako" wrote:
> Hi,
>
> Apologies for the beginner question, unfortunately I am fairly
Please tell us which environment you are running in: Linux, Windows, etc.
Basically, you need to set up a loop which will process all .PNG files
and concatenate the OCR results.
- excuse the brevity, sent from mobile
On 16-Mar-2017 9:13 PM, "Lako" wrote:
> Hi,
>
>
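A loop of the kind suggested above could look roughly like this on Linux (bash). It is a sketch under assumptions: the function name ocr_all_pngs, the ./scans directory and combined.txt are invented for illustration, and it relies on "stdout" being accepted as the outputbase, as listed on the Command-Line-Usage wiki page.

```shell
#!/bin/bash
# Hypothetical helper: OCR every .png in a directory and append the text
# of each page to one combined output file.
ocr_all_pngs() {
    local dir="$1" out="$2" img
    : > "$out"                      # start with an empty combined file
    for img in "$dir"/*.png; do
        [ -e "$img" ] || continue   # no matches: skip the literal pattern
        # "stdout" as outputbase prints the recognised text instead of writing a file
        tesseract "$img" stdout >> "$out"
    done
}

# Run only when tesseract is installed and the (assumed) input directory exists.
if command -v tesseract >/dev/null 2>&1 && [ -d ./scans ]; then
    ocr_all_pngs ./scans combined.txt
fi
```

Files are processed in the shell's sorted glob order, so page order in the combined file follows the file names.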
https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage
On Windows:
tesseract.exe loc.tif loc
Make sure the tesseract.exe binary is in PATH and that the TESSDATA_PREFIX
variable points to where you have the traineddata files.
- excuse the brevity, sent from mobile
On 20-Mar-2017 11:22 AM,
Make sure your input file phototest.tiff is in C:\Program
Files\Tesseract-OCR
Otherwise give full path to file.
Main error is
image file not found
ShreeDevi
On Tue, Mar 21, 2017
Thanks for sharing how you made the x64 solution for Visual Studio.
ShreeDevi
On Wed, Mar 15, 2017 at 9:44 PM, THintz wrote:
> I follow the github instructions
You did not mention where you installed leptonica and tesseract from.
What info do you see when you type
tesseract -v
ShreeDevi
On Thu, Mar 16, 2017 at 2:21 PM, Kazi Moinul Hossain
Please see https://github.com/tesseract-ocr/tesseract/issues/233
ShreeDevi
On Thu, Mar 16, 2017 at 2:41 PM, Kazi Moinul Hossain
wrote:
> Tesseract
>>
>> On Fri, Mar 17, 2017 at 7:09 PM, ShreeDevi Kumar <shree...@gmail.com>
>> wrote:
>>
>>> try
>>>
>>> sudo apt-get remove libleptonica-dev
>>>
>>> ShreeDe
Please see
https://github.com/tesseract-ocr/tesseract/issues/654#issuecomment-274574951
for more details about LSTM training.
ShreeDevi
On Mon, Mar 13, 2017 at 8:35 PM, Martin
>Is there anything more you did in the "src" and "prog" directory under
leptonica folder like "make allheaders", "make xtractprotos"?
No.
I use the following batch files in the folders where I have cloned
tesseract and leptonica.
1. leptonica
#!/bin/bash
git pull origin
./autobuild
#./configure --disable-dependency-tracking
./configure
make
sudo make install
sudo ldconfig
cd prog
make
cd ..
2. tesseract
#!/bin/bash
./autogen.sh
PM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:
> I use the following batch files in the folders where I have cloned
> tesseract and leptonica.
>
> 1. leptonica
>
> #!/bin/bash
> git pull origin
> ./autobuild
> #./configure --disable-dependency-tracking
> ./
Also you have not responded to zdenko's suggestion to provide output of
ldd tesseract
or
ldd /usr/local/bin/tesseract
(use the location of tesseract, which you can find by
which tesseract)
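The two steps above (locate the binary, then run ldd on it) can be combined. A minimal sketch, assuming a Linux system with ldd; the function name show_tesseract_libs is made up, and filtering the output for "lept" and "tesseract" is my own choice rather than something requested in the thread.

```shell
#!/bin/bash
# Hypothetical helper: print the resolved binary and the Leptonica/Tesseract
# shared libraries it links against. Returns 1 if the binary is not on PATH.
show_tesseract_libs() {
    local name="${1:-tesseract}" bin
    bin="$(command -v "$name")" || return 1
    echo "binary: $bin"
    # ldd lists shared-library dependencies; keep only the relevant lines
    ldd "$bin" | grep -i -E 'lept|tesseract' || echo "no matching libraries listed"
}

show_tesseract_libs tesseract || echo "tesseract not found on PATH"
```

If the listed Leptonica library comes from an unexpected location, that usually points to an older copy still being installed alongside the newly built one.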
try
sudo apt-get remove libleptonica-dev
ShreeDevi
On Fri, Mar 17, 2017 at 6:06 PM, Kazi Moinul Hossain
wrote:
> how can i uninstall old leptonica fully? I
sudo apt-get remove libleptonica-dev libleptonica
ShreeDevi
On Fri, Mar 17, 2017 at 7:09 PM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:
> try
>
> sudo apt-get remo
You need to get vietocr 5.0 alpha for tesseract 4.0 alpha
https://sourceforge.net/projects/vietocr/files/vietocr.net/5.0alpha/
https://sourceforge.net/projects/vietocr/files/vietocr/5.0alpha/
ShreeDevi
Read
https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00---Finetune
4.0 is alpha software. Please use an older released version.
- excuse the brevity, sent from mobile
On 05-Apr-2017 1:55 PM, wrote:
> After u have said,
>
> I tried in two ways and i am stuck at lstm step:
>
> Training
>
> command used:
>
>
Have you tried just using the eng.traineddata directly with tess 3.04/ 3.05
/ 4.0?
You don't need to train unless it is a very special case. You can try
changing the dictionary dawg files with tess 3.0x.
ShreeDevi
You do not have the LSTM.train config file.
- excuse the brevity, sent from mobile
On 05-Apr-2017 1:55 PM, wrote:
> After u have said,
>
> I tried in two ways and i am stuck at lstm step:
>
> Training
>
> command used:
>
>
See
https://github.com/tesseract-ocr/tesseract/blob/master/training/tesstrain.sh
https://github.com/tesseract-ocr/tesseract/blob/master/training/tesstrain_utils.sh
https://github.com/tesseract-ocr/tesseract/blob/master/training/language-specific.sh
tesstrain.sh generates a file called eng.training_files.txt
Your command uses the name without the .txt extension.
Check the name of the generated file and use that.
I have found that editing that file also gives errors.
- excuse the brevity, sent from mobile
On 04-Apr-2017 7:01 PM,
Saurabh,
It depends on what you want to do with the bash script.
Here is a sample of a script I used to compare results using different
tessdata files by looping through a set of image files. Google the bash
commands to figure out what they do!
#!/bin/bash
set -vx
export
jpn.config in langdata/jpn is loading jpn_vert as a sublanguage
tessedit_load_sublangs jpn_vert
You can try without that
Also look at the settings for jpn in training/language-specific.sh
You may need to change the following also ..
# The following fonts will be rendered vertically in phase
Did you build it with debug option?
That number refers to the git revision of the code, so it is easy to know
what version of source commit it refers to.
Look in github for commit that begins with that number.
ShreeDevi
You can use jTessBoxEditor to edit the box files. Make sure to mark EOL if
you are trying to train using scanned images.
Also note that this part of the code is untested: training 4.0 using
pre-existing images and box files.
Ray has only explained the method for using images created by text2image.
--linedata_only means that it will only try to create lstmf files and not
the files for 3.0x training.
- excuse the brevity, sent from mobile
On 12-Apr-2017 10:39 AM, "Ahmad Moawad" wrote:
> Hello All,
>
> I want help in trainingTesseract 4.00 Finetune
>
Lstm training is not like legacy training. Please read the wiki pages
regarding 4.0 training. I have given all sample commands there. There are 3
different ways of training.
Read the bash scripts regarding training to know more.
tesstrain.sh with --linedata_only creates the box tiff pairs but
Read the bash scripts
tesstrain.sh
tesstrain_utils.sh
language-specific.sh
in the training directory to understand LSTM training in more detail.
- excuse the brevity, sent from mobile
On 12-Apr-2017 10:47 AM, "Ahmad Moawad" wrote:
> this is the part from
see
https://github.com/tesseract-ocr/tesseract/blob/master/training/tesstrain.sh
if ((LINEDATA)); then
phase_E_extract_features "lstm.train" 8 "lstmf"
make__lstmdata
else
phase_E_extract_features "box.train" 8 "tr"
phase_C_cluster_prototypes "${TRAINING_DIR}/${LANG_CODE}.normproto"
if
Arabic was never trained with the legacy tesseract engine and I doubt you
will get any improvement over existing traineddata using cube or lstm.
You are free to experiment and see what you come up with.
I have pointed to the bash scripts for training. Please refer to them for
the correct
See https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage
Follow the correct order of arguments:
tesseract imagename|stdin outputbase|stdout [options...] [configfile...]
ShreeDevi
I have added this at https://github.com/tesseract-ocr/langdata/issues/67
Please add more information there:
Which language code: arm or hye
Modern Armenian or Classical Armenian
Sources of primary texts in Unicode in the Armenian language to use for
training
Freely available unicode fonts to
You can ignore it. I get it too when using sudo a second time.
The host name must be the id of your computer under Windows 10.
Have you tried running tesseract after that?
- excuse the brevity, sent from mobile
On 11-Apr-2017 4:10 PM, "Ibr" wrote:
Hi,
I'm trying to install the
Also, if you want training tools, you need to build them separately - see
https://github.com/tesseract-ocr/tesseract/wiki/Compiling
make training
sudo make training-install
ShreeDevi
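Before running the make steps above, it can help to check which training tools are already on PATH. A rough sketch: the function name missing_tools is invented here, text2image and combine_tessdata are tools mentioned in this thread, and lstmtraining is included on the assumption that you want the 4.0 LSTM trainer as well.

```shell
#!/bin/bash
# Hypothetical helper: print the names of the given tools that are NOT on PATH.
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    echo "${missing# }"             # drop the leading space
}

# Tool list is an assumption; adjust it to what your training run needs.
need="$(missing_tools text2image combine_tessdata lstmtraining)"
if [ -n "$need" ]; then
    echo "Missing training tools: $need"
    echo "Build them with: make training && sudo make training-install"
else
    echo "All listed training tools are installed."
fi
```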
Please open an issue, as the problem is related to --psm 0.
- excuse the brevity, sent from mobile
On 13-Apr-2017 9:29 AM, "Pritam Dodeja" wrote:
> Find below - I can also ship my docker container to you if you want so you
> can see my exact setup, it's about 1.15GB
>
>
If you want to OCR an invoice like the sample you posted, just use the
eng.traineddata and OCR the page. You do not need to do any training.
Here is the output I get
8633 0410 NO RP 11 07122015 NYNN 01 01 0001 Page 2 Of 3
Did you know?
Your Comcast Business Internet
service gives
I haven't built 3.05 so cannot help. I would suggest that you try with
older commits of tesseract 3.05 branch to see which one works.
Hope that those who have built 3.05 on mac will help.
You can check that these are installed by entering the following
which text2image
The above will show you where it is installed.
If you don't have training tools, you will need to build them separately -
see https://github.com/tesseract-ocr/tesseract/wiki/Compiling
make training
sudo make
362b68e)
ShreeDevi
On Sun, Apr 23, 2017 at 9:25 AM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:
> Try training using more samples of 8, 9, B etc.
>
> What res
Try training using more samples of 8, 9, B etc.
What results do you get with the provided eng.traineddata? Are they better
or worse?
Have you tried changing DPI of image to 300?
- excuse the brevity, sent from mobile
On 22-Apr-2017 10:29 PM, "James Abney" wrote:
> Oh yes
I have added it as an issue. Please see
https://github.com/tesseract-ocr/tesseract/issues/754
You may want to create a pull request, if you have a solution.
ShreeDevi
On Sun, Mar 5,
The only public information regarding LSTM that has been shared by
Google/Ray is linked from the following pages:
https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM
https://github.com/tesseract-ocr/docs/tree/master/das_tutorial2016
Also see
https://github.com/tesseract-ocr/tesseract/wiki/4.0-Accuracy-and-Performance
ShreeDevi
On Thu, Mar 2, 2017 at 8:46 PM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:
see
https://github.com/tesseract-ocr/tesseract/blob/master/ChangeLog
https://github.com/tesseract-ocr/tesseract/releases
https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes
ShreeDevi
The warning in your screenshot means that your image does not have resolution info.
Your OCR output file should have been created.
Training 4.0 is not easy. Please see
https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM
ShreeDevi
Arabic traineddata for 3.0x uses cube engine. Training process for that was
never shared. Now the cube engine has been removed for lstm 4.0, which is
still in alpha stage.
There is 4.0alpha traineddata for Arabic and you can train for it, but
accuracy is not great. Ray is doing another training
Have you tried --psm 6?
- excuse the brevity, sent from mobile
On 06-Apr-2017 11:06 PM, "Mike Hall" wrote:
> We have a C# .Net app that is using Tesseract to do Optical Character
> Recognition (OCR) on .tiff files. I've attached a sample tiff file.
>
> We are then
You must be using an old version of traineddata which does not have LSTM.
- excuse the brevity, sent from mobile
On 07-Apr-2017 2:13 AM, wrote:
> I am following this link https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00---Finetune
>
> For genaerating
Normally, for text output, the other config files should have no impact.
- excuse the brevity, sent from mobile
On 07-Apr-2017 2:18 AM, "Mike Hall" wrote:
> Yes, we are using the -psm 6 command line argument. And it was not
> working.
>
> But I figured out the issue.
>
>
Use latest version of leptonica - 1.74.1
https://github.com/DanBloomberg/leptonica
ShreeDevi
On Mon, Apr 17, 2017 at 8:18 PM, Peter Reid wrote:
> I've done
Please see https://github.com/tesseract-ocr/tesseract/wiki/Compiling
If you are building tesseract 4.0, you need Leptonica 1.74.
ShreeDevi
On Tue, Apr 18, 2017 at 2:25 PM, Peter Reid
Add a line similar to the following to your training command, pointing to
where you have your training text:
--training_text ../langdata/eng/eng.training_text \
ShreeDevi
On Mon, Jul
You need to mv (rename) the files so that they have the por. prefix.
Then, when you run the combine_tessdata command, it will use all por. files
to create the traineddata.
see
https://github.com/tesseract-ocr/tesseract/blob/master/training/tesstrain_utils.sh
mv ${TRAINING_DIR}/inttemp
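As a rough illustration of the renaming step described above: the function name add_lang_prefix is made up, and the list of file names (inttemp, normproto, pffmtable, shapetable, unicharset) is my assumption about what a legacy training run leaves behind, so check it against your own training directory and tesstrain_utils.sh.

```shell
#!/bin/bash
# Hypothetical helper: give the legacy training output files a language prefix
# so that combine_tessdata can pick them up. The file list is an assumption.
add_lang_prefix() {
    local lang="$1" dir="$2" f
    for f in inttemp normproto pffmtable shapetable unicharset; do
        if [ -e "$dir/$f" ]; then
            mv "$dir/$f" "$dir/$lang.$f"
        fi
    done
}

# Example usage (commented out; paths are placeholders):
# add_lang_prefix por ./training_output
# combine_tessdata ./training_output/por.
```

Files that do not exist are skipped, so running the helper twice does no harm.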
Ray has uploaded new traineddata files in
https://github.com/tesseract-ocr/tessdata/tree/master/best
Why don't you first try recognition with that?
ShreeDevi
On Tue, Aug 1, 2017 at
Seems to work fine for me.
Are you sure that you have relevant files in the directories listed in
that command?
Check the tessdata and langdata locations.
Use tessdata/best/*.traineddata as the existing models.
ShreeDevi
With English you should probably get close to 99% accuracy.
Is your png at 300 dpi?
Which version of tesseract did you use?
Which traineddata?
ShreeDevi
On Sat, Aug 12, 2017 at
For 3.05, don't you need to check out the 3.05 branch?
master is for 4.0 development.
ShreeDevi
On Fri, Jul 7, 2017 at 9:22 PM, akhil katpally
wrote:
>
Forwarding update by Ray.
-- Forwarded message --
From: theraysmith
Date: Wed, Jul 12, 2017 at 5:55 AM
Subject: Re: [tesseract-ocr/tesseract] Tag a new version for LSTM 4.0 (#995)
To: tesseract-ocr/tesseract
I'm about
If using 3.05 branch
try configs such as
digits
whitelist
ShreeDevi
On Sun, Jul 9, 2017 at 7:36 PM, Prav wrote:
> Any suggestions for any configuration which i