will work even
> if the MICR lines are mixed with the signature, stamps, annotations...
> There is an online webapp to check the accuracy at
> https://www.doubango.org/webapps/micr/
>
> On Saturday, April 4, 2020 at 11:59:34 AM UTC+2, Essam Zaky wrote:
>>
>> Hi @mamadou
Hi @mamadou
How did you collect the 17,000 images? Are they real images?
Also, which type of TensorFlow model did you use: an LSTM line model or a
single-character model?
Best Regards
Essam
On Thursday, April 2, 2020 at 8:22:44 PM UTC+2, Ghada Aruri wrote:
>
> Hi team,
>
> For CMC-7, I want to train it by
Hi Dears
Is there a tool to view .lstmf files? I would like to see the input image passed to
the model and the corresponding input text.
Best Regards
Essam
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from
that means having 10^15 digits, which is a huge amount of data
On Sunday, March 29, 2020 at 11:45:01 AM UTC+2, shree wrote:
>
>
> https://github.com/tesseract-ocr/tessdoc/blob/master/TrainingTesseract-4.00.md#introduction
>
>
> On Sun, Mar 29, 2020 at 12:53 PM Essam Zaky
set of the best traineddata
> file, but then you can't add any characters to it.
>
> On Sun, Mar 29, 2020, 11:08 Essam Zaky >
> wrote:
>
>> Hi @shreeshrii
>> Attached is the bash script, as described on the following page:
>>
>> https://github.com/tesseract-ocr/
ar > wrote:
>
>> Please check that you have used the correct path for the traineddata file.
>>
>> Please share the lstmtraining command that you used before this for
>> training.
>>
>> On Sat, Mar 28, 2020, 22:56 Essam Zaky >
>> wrote:
>>
What do you mean by "scan a PDF"?
If you mean recognizing a PDF file, Tesseract cannot do that directly,
because PDF is not an input format supported by Leptonica.
See the following README:
https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc
The workaround is to find a tool which
Saturday, March 28, 2020 at 18:32:26 UTC+1, Essam Zaky wrote:
>>
>> Please attach the original image so I can check it on my machine.
>>
>> On Saturday, March 28, 2020 at 7:24:07 PM UTC+2, Teo wrote:
>>>
>>> Thanks for the reply.
>>> I just opened an issue on github/Te
pho.png pho-eng -l eng pdf
> but this is the result...
>
>
> On Friday, March 27, 2020 at 03:13:40 UTC+1, Essam Zaky wrote:
>>
>> So I guess the error is in the PDF generation module.
>> You have one of the following options:
>> - Try to fix the bug yourself
Dear @Shreeshrii
I followed your bash script to add the Andalus font to the Arabic language.
Here is the script URL:
https://github.com/tesseract-ocr/tesseract/issues/2695#issuecomment-539412948
All steps work except the last one, which generates the traineddata.
Here is the error:
Before Tesseract added the PDF generation feature, I used a library called
iTextSharp to generate the PDF, and the results were very good for me.
On Thursday, March 26, 2020 at 10:54:50 PM UTC+2, Teo wrote:
>
> Ok coordinates seem correct.
>
> On Thursday, March 26, 2020 at 19:13:52 UTC+1, Essam Zak
Ok I think that it's a pdf generation module, because the txt is almost
>>> the same with the exception of some "the" which tesseract sees as "thè".
>>>
>>> On Wednesday, March 25, 2020 at 07:25:11 UTC+1, Essam Zaky wrote:
>>>>
>>
essing and how punctuation and
> digits are handled. If your training text does not have them, you will have
> greater success.
>
> On Wed, Mar 25, 2020, 15:32 Essam Zaky >
> wrote:
>
>> Thanks @Loranzo and @Shree
>> I will try fine-tuning, and if the res
Thanks @Loranzo and @Shree
I will try fine-tuning, and if the result is still not satisfactory I will
switch back to building from scratch.
On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote:
>
> Hi Dears ,
>
> I would like to build *.traineddata from scratch, especially
es as "thè".
>
> On Wednesday, March 25, 2020 at 07:25:11 UTC+1, Essam Zaky wrote:
>>
>> You need to know which to improve: the Tesseract engine or the PDF generation.
>>
>> So compare the text files from ABBYY and Tesseract.
>> If the result is highl
is done in the English model and take it as a reference
to make a new Arabic model
On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote:
>
> Hi all,
>
> I would like to build *.traineddata from scratch, especially for English and
> Arabic.
>
> So let's talk about English as
You need to know which to improve: the Tesseract engine or the PDF generation.
So compare the text files from ABBYY and Tesseract.
If the results are highly different, you need to improve the image quality or
improve the LSTM model.
If the Tesseract result is good, you need to enhance the PDF
generation module.
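The comparison described above could be sketched roughly as follows. This is only an illustration: the function names and the 0.9 threshold are hypothetical choices made for the example, and the two strings are assumed to be the plain-text outputs already produced by the two engines.

```python
import difflib


def text_similarity(reference: str, candidate: str) -> float:
    """Return a 0..1 similarity ratio between two OCR outputs, compared word by word."""
    ref_words = reference.split()
    cand_words = candidate.split()
    return difflib.SequenceMatcher(None, ref_words, cand_words).ratio()


def diagnose(reference: str, candidate: str, threshold: float = 0.9) -> str:
    """Guess where to look next; the threshold is an arbitrary, hypothetical cut-off."""
    if text_similarity(reference, candidate) >= threshold:
        # The recognized text is close to the reference, so the text layer is
        # probably fine and the PDF generation step is the likelier suspect.
        return "pdf-generation"
    # The texts differ a lot: look at image quality or the recognition model.
    return "recognition"
```

The word-level comparison is deliberately coarse; a character-level ratio would be more sensitive to small differences such as "the" versus "thè".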
On
Thanks @shreeshrii
Would you answer the questions based on your experience?
Also, is it possible to get help from Ray?
On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote:
>
> Hi all,
>
> I would like to build *.traineddata from scratch, especially for English and
>
Hi all,
I would like to build *.traineddata from scratch, especially for English and
Arabic.
So let's talk about English as an example.
My question: how do I prepare the fonts folder?
I read the
https://github.com/tesseract-ocr/tesseract/blob/master/src/training/language-specific.sh
file.
I found the
Thanks @Shreeshrii
So the following commands recognize Arabic/English text
tesseract AE.jpg AE1 -l ara+eng
tesseract AE.jpg AE2 -l script/Arabic
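As a small sketch, the two invocations above can be built programmatically. The helper name below is hypothetical; the only thing assumed about the CLI is what the commands above already show, namely that several languages are joined with "+" after -l.

```python
def tesseract_command(image, output_base, languages):
    """Build the argument list for a tesseract run over one or more languages."""
    # Multiple languages are combined into one -l value, e.g. "ara+eng".
    return ["tesseract", image, output_base, "-l", "+".join(languages)]


# The two commands quoted above, as argument lists.
combined = tesseract_command("AE.jpg", "AE1", ["ara", "eng"])
script_model = tesseract_command("AE.jpg", "AE2", ["script/Arabic"])
```

Either list can be passed to subprocess.run on a machine where tesseract and the corresponding traineddata files are installed.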
On Thursday, March 19, 2020 at 6:42:19 PM UTC+2, Essam Zaky wrote:
>
> Hi Dears
>
> What is the difference between script *.traineddat
Hi all,
What is the difference between script *.traineddata and normal
*.traineddata models?
For example, there are script/Arabic.traineddata and ara.traineddata.
When should each be used?
Best Regards
Essam
Would you download the article you described and attach it here? The
Medium site requires paid registration.
On Tuesday, February 25, 2020 at 6:24 PM, hux _0 <
hucker.mar...@gmail.com> wrote:
> Check this out, it's a guide on how to do it
>
I cannot build from source.
I downloaded the SW client, saved it at "D:\Essam\Software\SW", and then added it to
Path.
I can run SW on the command line and see the SW information as follows:
D:\Tutorial\Git\tesseract\build>sw --version
sw.client.sw version 1.0.0
git revision
Sorry, the code for my controls was released in a commercial product and I am not
allowed to share it,
but you can find an open-source control and enhance it to display the image and
text.
Here is a C# open-source control to display the image:
https://github.com/cyotek/Cyotek.Windows.Forms.ImageBox
and you can use
I did something similar: I created two user controls using
COM; you can use C# custom controls.
I used the first control to display the text returned from OCR and the
other control to display the image.
When I select text in the text control, I highlight the corresponding text
This is not a Tesseract issue; it's a programming issue. If you are on Windows,
you can build a custom user control and perform any action you need.
On Friday, November 15, 2019 at 12:56:22 PM UTC+2, varadharajan
venkatesan wrote:
>
> In my C# application, user will select any particular text in tiff
>
Thanks Shree
Is there a debugger with a GUI? I have installed an Ubuntu 19 VM.
On Wednesday, November 20, 2019 at 9:16:49 AM UTC+2, Essam Zaky wrote:
>
> Dears, sorry for this basic question.
> I'm new to the Linux world.
> Now I need to build, debug, and trace the Tesseract code and see how it'
Thanks Shree
The link describes the build process,
but which IDE should be used to debug and trace the code? On Windows I
use Visual Studio; what about Linux?
On Wednesday, November 20, 2019 at 9:16:49 AM UTC+2, Essam Zaky wrote:
>
> Dears sorry for this basic question
> I'm new in Li
Hi, try ignoring the multiple-definition errors as follows:
use the /FORCE:MULTIPLE linker option in your project.
Here are the steps:
To set this linker option in the Visual Studio development environment
Dears, sorry for this basic question.
I'm new to the Linux world.
Now I need to build, debug, and trace the Tesseract code and see how it
works step by step on Linux.
What tools are required to build Tesseract from source on Linux?
Also, what tools are required to debug and trace the Tesseract code in
For the sample images I used,
the accuracy for English is good,
but for Arabic the Cube engine is still better than the current LSTM.
On Thursday, April 6, 2017 at 9:25:53 PM UTC+2, peiman F. wrote:
>
> what is accuracy of result for you!?
>
Hi @.peiman
Thanks for the reply.
I found the problem.
I had installed an old v4 build from the DanBolomBerg site,
and TESSDATA_PREFIX was referring to the old version with Cube.
Now that I have updated TESSDATA_PREFIX in the system environment to point to the newly
downloaded data, it's working.
Thanks again
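A quick way to check where TESSDATA_PREFIX points, as a minimal sketch. The helper name is hypothetical, and looking both in the prefix itself and in a tessdata subfolder is an assumption made here because the expected layout has differed between Tesseract versions. The check only confirms that a file exists; it cannot tell whether the data matches the installed engine version.

```python
import os
from pathlib import Path


def find_traineddata(lang, env=None):
    """Return the path of <lang>.traineddata under TESSDATA_PREFIX, or None."""
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        return None  # variable not set, nothing to check
    base = Path(prefix)
    # Assumption: accept either <prefix>/<lang>.traineddata
    # or <prefix>/tessdata/<lang>.traineddata.
    candidates = (
        base / f"{lang}.traineddata",
        base / "tessdata" / f"{lang}.traineddata",
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Calling find_traineddata("ara") and printing the result shows which file, if any, the current environment would resolve to.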
On Thursday, 6
Hi all,
I built Tesseract and the training tools from source on Windows with VS2015.
Recognizing an English page succeeds,
but recognizing an Arabic page fails.
C:\Users\emz\tesseract\build\bin\Debug>tesseract eurotext.tif eurotext -l
eng
Tesseract Open Source OCR Engine
2017 at 7:07:03 PM UTC+3, Essam Zaky wrote:
>>
>> Hi Egor
>> Should I remove the storage and tesseract folders?
>> my last failed trial was to build leptonica 1.7.4 and tesseract 3.05 for
>> Visual studio 2010
>>
>> On Thursday, February 23, 2017 at 3:04:40 PM
Hi Egor
Should I remove the storage and tesseract folders?
My last failed attempt was to build leptonica 1.7.4 and tesseract 3.05 for
Visual Studio 2010.
On Thursday, February 23, 2017 at 3:04:40 PM UTC+2, Egor Pugin wrote:
>
> Please, check again.
> I did some fixes, in general it should work now.
>
--
these binaries
> https://www.dropbox.com/s/obiqvrt4m53pmoz/tesseract-4.0.0-alpha.zip?dl=1
>
> On Saturday, February 11, 2017 at 10:15:07 PM UTC+3, Essam Zaky wrote:
>>
>>
>> I had downloaded the following version
>>
>> http://digi.bib.uni-mannheim.de/tesserac
umped
> over the lazy fox. The quick brown dog
> jumped over the lazy fox. The quick
> brown dog jumped over the lazy fox.
>
> Please, write:
> 1. your processor
> 2. your windows version
> 3. visual studio version
>
>
>
> On Saturday, February 11, 2017 at 7:44
680 (0x9330)
encountered.
Page 1
DotProductSSE can't be used on Android
On Saturday, February 11, 2017 at 5:42:04 PM UTC+2, Egor Pugin wrote:
>
> That's ok.
> What about your android error? Does it still exist?
>
> On Saturday, February 11, 2017 at 6:31:58 PM UTC+3, Essam Zaky wrote:
y the way, what projects do you see in the solution?
> Probably you have some build errors, so some projects left unbuilt.
> Did you see any errors during the build?
>
> On Saturday, February 11, 2017 at 2:13:28 PM UTC+3, Essam Zaky wrote:
>>
>> Dear Egor
>>
>> In
C+2, Essam Zaky wrote:
>
> Dear All
> I have Windows and Visual Studio 2010 and 2015.
> Is there a tutorial for building Tesseract 4.00 from source?
> Also, is there a tutorial for the training process on Windows?
>
> Any suggestions are welcome.
>
> Thanks
>
> about leptonica version, ask Zdenko or Ray.
>
> On Sunday, January 29, 2017 at 9:50:28 PM UTC+3, Essam Zaky wrote:
>>
>> thanks Egor , Shree
>>
>> Egor, what do you think about Shree's opinion, which says "There are recent
>> changes in leptonica which cater t
mail.com >
> wrote:
>
>> Tess uses stable 1.74 leptonica, not the master branch. You don't need to
>> touch anything in cppan storage.
>>
>> On Sunday, January 29, 2017 at 6:08:13 PM UTC+3, Essam Zaky wrote:
>>>
>>> Thanks Egor
>>> Sorry
e found near those binaries
> (e.g tesseract-9fa26eb4.sln.lnk). You can open it, switch to debug and
> build.
> 3. For other questions, please, read
> https://github.com/tesseract-ocr/tesseract/wiki/Compiling#windows
>
> On Sunday, January 29, 2017 at 5:38:04 PM UTC+3, Essam Za
I see some bin files here
C:\Users\emz\.cppan\storage\bin\33e598b5\Release
and some bin files here
C:\Users\emz
Also, where can I find the main *.sln?
I would like to build the debug version of Tesseract.
On Sunday, January 29, 2017 at 4:27:18 PM UTC+2, Egor Pugin wrote:
>
> What process?
>
>
Hi Egor
It has now completed without an error or crash:
0 Errors
140141 Warning
How can I check that the process worked correctly?
On Sunday, January 29, 2017 at 1:32:08 PM UTC+2, Egor Pugin wrote:
>
> No, try it without removing storage.
>
> On Sunday, January 29, 2017 at 2:31:10 PM UTC+3, Essa
>>>>> I'm trying to track down that issue (crash), but still need more info.
>>>>> Could you please clear the storage, re-run 'cppan --build
>>>>> pvt.cppan.demo.google.tesseract-master' and attach log files
>>>>> from c:\Users\u\.cppan\
>>>>> cppan.
er'. Post the output here.
>
> On Wednesday, January 18, 2017 at 10:17:46 PM UTC+3, Essam Zaky wrote:
>>
>> Thanks Egor
>>
>> i removed
>> c:\users\emz\.cppan\storage
>> and ran cppan as follow
>> Run-->cmd
>> cppan --build pvt.cppan.demo.g
s produce the mentioned error in red
dependency 'pvt.cppan.demo.unicode.icu.data' not found
On Wednesday, January 18, 2017 at 3:32:36 PM UTC+2, Essam Zaky wrote:
>
> Dear All
> I have Windows and Visual Studio 2010 and 2015.
> Is there a tutorial for building Tesseract 4.00 from source?
> Also are
pan or
run cppan without cleaning the storage?
On Wednesday, January 18, 2017 at 8:03:53 PM UTC+2, Egor Pugin wrote:
>
> Hi,
>
> Try to remove directory c:\users\emz\.cppan\storage and re-run cppan again.
>
> On Wednesday, January 18, 2017 at 7:23:42 PM UTC+3, Essam Zaky wrote:
>>
ows
>
>
>
> ShreeDevi
>
> Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com
>
> On Wed, Jan 18, 2017 at 7:02 PM, Essam Zaky <essa...@gmail.com
> > wrote:
>
>> Dear All
>> I have Windows and Visual Studio 2010 and 2015.
>> Is there a
Dear All
I have Windows and Visual Studio 2010 and 2015.
Is there a tutorial for building Tesseract 4.00 from source?
Also, is there a tutorial for the training process on Windows?
Any suggestions are welcome.
Thanks
Hi
It's an image processing problem.
You can use OpenCV to find the text. Here are some ideas:
- Use SWT to find text.
- Use a color histogram: quantize the histogram and treat the most frequent color as the
background, then convert all other colors to black. Now all the text will be black
and you can pass it to Tesseract.
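The color-histogram idea could look roughly like the sketch below. It uses plain Python lists instead of OpenCV to stay self-contained; the function names, the choice of 4 quantization levels per channel, and painting the background white are all assumptions made for the example. The image is assumed to be a non-empty list of rows of (R, G, B) tuples.

```python
from collections import Counter


def quantize(pixel, levels=4):
    """Map an (R, G, B) pixel to a coarse histogram bin."""
    step = 256 // levels
    return tuple(channel // step for channel in pixel)


def binarize_by_background(image, levels=4):
    """Paint the most frequent color bin white and everything else black."""
    # Build the quantized color histogram over all pixels.
    histogram = Counter(quantize(p, levels) for row in image for p in row)
    # The most frequent bin is taken to be the background.
    background_bin, _ = histogram.most_common(1)[0]
    white, black = (255, 255, 255), (0, 0, 0)
    return [
        [white if quantize(p, levels) == background_bin else black for p in row]
        for row in image
    ]
```

With real images, the same steps map onto array operations in OpenCV or NumPy, and the number of levels would need tuning to how noisy the background is.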