[tesseract-ocr] Re: cmc7.traineddata

2020-04-04 Thread Essam Zaky
will work even > if the MICR lines are mixed with the signature, stamps, annotations... > There is an online webapp to check the accuracy at > https://www.doubango.org/webapps/micr/ > > On Saturday, April 4, 2020 at 11:59:34 AM UTC+2, Essam Zaky wrote: >> >> Hi @mamadou &

[tesseract-ocr] Re: cmc7.traineddata

2020-04-04 Thread Essam Zaky
Hi @mamadou, how did you collect the 17000 images? Are they real images? Also, which type of TensorFlow model did you use: an LSTM line model or a single-character model? Best Regards, Essam On Thursday, April 2, 2020 at 8:22:44 PM UTC+2, Ghada Aruri wrote: > > Hi team, > > For CMC-7, I want to train it by

[tesseract-ocr] How to view lstmf file

2020-04-03 Thread Essam Zaky
Hi Dears, is there a tool to view an lstmf file? I would like to see the input image passed to the model and the corresponding input text. Best Regards, Essam -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from

Re: [tesseract-ocr] Generate Arabic PLUS traineddat gives error

2020-03-29 Thread Essam Zaky
that it means having 10^15 digits, which is a very large amount of data. On Sunday, March 29, 2020 at 11:45:01 AM UTC+2, shree wrote: > > > https://github.com/tesseract-ocr/tessdoc/blob/master/TrainingTesseract-4.00.md#introduction > > > On Sun, Mar 29, 2020 at 12:53 PM Essam Zaky

Re: [tesseract-ocr] Generate Arabic PLUS traineddat gives error

2020-03-29 Thread Essam Zaky
set of the best traineddata > file, but then you can't add any characters to it. > > On Sun, Mar 29, 2020, 11:08 Essam Zaky > > wrote: > >> Hi @shreeshrii >> attached is the bash script as described in the following page >> >> https://github.com/tesseract-ocr/

Re: [tesseract-ocr] Generate Arabic PLUS traineddat gives error

2020-03-28 Thread Essam Zaky
ar > wrote: > >> Please check that you have used the correct path for the traineddata file. >> >> Please share the lstmtraining command that you used before this for >> training. >> >> On Sat, Mar 28, 2020, 22:56 Essam Zaky > >> wrote: >> >>

[tesseract-ocr] Re: Scan pdf file instead png

2020-03-28 Thread Essam Zaky
What do you mean by "scan a pdf"? If you mean recognizing a PDF file, you cannot do that directly, because PDF is not an input format supported by Leptonica; see the following readme: https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc The workaround is to find a tool which

[tesseract-ocr] Re: How to improve ocr reader?

2020-03-28 Thread Essam Zaky
ato 28 March 2020 18:32:26 UTC+1, Essam Zaky wrote: >> >> Please attach the original image so I can check on my machine >> >> On Saturday, March 28, 2020 at 7:24:07 PM UTC+2, Teo wrote: >>> >>> Thanks for the reply. >>> I just opened an issue on github/Te

[tesseract-ocr] Re: How to improve ocr reader?

2020-03-28 Thread Essam Zaky
pho.png pho-eng -l eng pdf > but this is the result... > > > On Friday, March 27, 2020 at 03:13:40 UTC+1, Essam Zaky wrote: >> >> So I guess the error is in the PDF generation module. >> You have one of the following options: >> - try to fix the bug yourself

[tesseract-ocr] Generate Arabic PLUS traineddat gives error

2020-03-28 Thread Essam Zaky
Dear @Shreeshrii, I followed your bash script to add the Andalus font to the Arabic language. Here is the script URL: https://github.com/tesseract-ocr/tesseract/issues/2695#issuecomment-539412948 All steps work except the last one, which generates the traineddata. Here is the error

[tesseract-ocr] Re: How to improve ocr reader?

2020-03-26 Thread Essam Zaky
Before Tesseract added the PDF generation feature, I used a library called iTextSharp to generate the PDF, and the result was very good for me. On Thursday, March 26, 2020 at 10:54:50 PM UTC+2, Teo wrote: > > Ok coordinates seem correct. > > On Thursday, March 26, 2020 at 19:13:52 UTC+1, Essam Zak

[tesseract-ocr] Re: How to improve ocr reader?

2020-03-26 Thread Essam Zaky
Ok I think that it's a pdf generation module, because the txt is almost >>> the same with the exception of some "the" which tesseract sees as "thè". >>> >>> On Wednesday, March 25, 2020 at 07:25:11 UTC+1, Essam Zaky wrote: >>>> >>

Re: [tesseract-ocr] Re: How to prepare fonts folder to train from scratch

2020-03-25 Thread Essam Zaky
essing and how punctuation and > digits are handled. If your training text does not have them, you will have > greater success. > > On Wed, Mar 25, 2020, 15:32 Essam Zaky > > wrote: > >> Thanx @Loranzo and @Shree >> i will give try to fine tune , and if the res

[tesseract-ocr] Re: How to prepare fonts folder to train from scratch

2020-03-25 Thread Essam Zaky
Thanks @Loranzo and @Shree. I will try fine-tuning, and if the result is still not satisfactory I will switch back to building from scratch. On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote: > > Hi Dears , > > I would like to build *.traindata from scratch specially

[tesseract-ocr] Re: How to improve ocr reader?

2020-03-25 Thread Essam Zaky
es as "thè". > > On Wednesday, March 25, 2020 at 07:25:11 UTC+1, Essam Zaky wrote: >> >> You need to know which to improve tesserct engine or PDF generation >> >> so compare text file from abby and tesserct >> if the result is highl

[tesseract-ocr] Re: How to prepare fonts folder to train from scratch

2020-03-25 Thread Essam Zaky
is done in the English model and take it as a reference to make a new Arabic model. On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote: > > Hi Dears , > > I would like to build *.traindata from scratch specially for English and > Arabic > > So lets talk about English as

[tesseract-ocr] Re: How to improve ocr reader?

2020-03-25 Thread Essam Zaky
You need to know which to improve: the Tesseract engine or the PDF generation. So compare the text files from ABBYY and Tesseract. If the results differ greatly, you need to improve the image quality or improve the LSTM model; if the Tesseract result is good, you need to enhance the PDF generation module. On
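The comparison step suggested in this message could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the two OCR outputs are already available as strings, and the helper names `text_similarity` and `differing_words` are made up for this sketch, not part of Tesseract or ABBYY.

```python
import difflib


def text_similarity(reference, candidate):
    """Return a 0..1 similarity ratio between two OCR text outputs."""
    return difflib.SequenceMatcher(None, reference, candidate).ratio()


def differing_words(reference, candidate):
    """List (reference_words, candidate_words) pairs where words were replaced."""
    ref_words, cand_words = reference.split(), candidate.split()
    matcher = difflib.SequenceMatcher(None, ref_words, cand_words)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            pairs.append((" ".join(ref_words[i1:i2]), " ".join(cand_words[j1:j2])))
    return pairs
```

A ratio close to 1 with only a few replaced words (such as "the" read as "thè") would suggest the recognition itself is fine and the problem lies elsewhere, for example in PDF generation.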

[tesseract-ocr] Re: How to prepare fonts folder to train from scratch

2020-03-25 Thread Essam Zaky
Thanks @shreeshrii. Would you answer the questions based on your experience? Also, is it possible to get help from Ray? On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote: > > Hi Dears , > > I would like to build *.traindata from scratch specially for English and >

[tesseract-ocr] How to prepare fonts folder to train from scratch

2020-03-24 Thread Essam Zaky
Hi Dears, I would like to build *.traineddata from scratch, specifically for English and Arabic. Let's take English as an example. My question is: how do I prepare the fonts folder? I read the https://github.com/tesseract-ocr/tesseract/blob/master/src/training/language-specific.sh file and I found the

[tesseract-ocr] Re: What is the difference between script *.traineddata and normal *.traineddata models

2020-03-20 Thread Essam Zaky
Thanks @Shreeshrii. So the following commands recognize Arabic/English text: tesseract AE.jpg AE1 -l ara+eng and tesseract AE.jpg AE2 -l script/Arabic On Thursday, March 19, 2020 at 6:42:19 PM UTC+2, Essam Zaky wrote: > > Hi Dears > > What is the difference between script *.traineddat
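For anyone scripting the two invocations above, here is a small Python sketch. It assumes the tesseract binary is on PATH and that the ara, eng and script/Arabic models are installed; the function names are invented for this example.

```python
import subprocess


def tesseract_command(image, output_base, languages):
    """Build the argument list; several languages are joined with '+' for -l."""
    return ["tesseract", image, output_base, "-l", "+".join(languages)]


def run_ocr(image, output_base, languages):
    """Run tesseract; this fails if the binary or the models are missing."""
    subprocess.run(tesseract_command(image, output_base, languages), check=True)
```

With this, `tesseract_command("AE.jpg", "AE1", ["ara", "eng"])` corresponds to the first command and `tesseract_command("AE.jpg", "AE2", ["script/Arabic"])` to the second.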

[tesseract-ocr] What is the difference between script *.traineddata and normal *.traineddata models

2020-03-19 Thread Essam Zaky
Hi Dears, what is the difference between script *.traineddata and normal *.traineddata models? For example, there are the script model Arabic.traineddata and ara.traineddata. When should each be used? Best Regards, Essam

Re: [tesseract-ocr] Re: how to use tesseract to detect table?

2020-02-25 Thread Essam Zaky
Would you download the article you described and attach it here? The Medium site requires paid registration. On Tuesday, February 25, 2020 at 6:24 PM, hux _0 <hucker.mar...@gmail.com> wrote: > Check this out, its a guide how to do it >

[tesseract-ocr] Build tesseract for windows from source error

2020-02-02 Thread Essam Zaky
I cannot build from source. I downloaded the SW client and saved it at "D:\Essam\Software\SW", then added it to Path. I can run SW from the command line and see the SW information as follows: D:\Tutorial\Git\tesseract\build>sw --version sw.client.sw version 1.0.0 git revision

[tesseract-ocr] Re: Simulating text selection in tiff image

2019-11-23 Thread Essam Zaky
Sorry, the code for my controls was released in a commercial product, so I am not allowed to share it, but you can find an open-source control and enhance it to display the image and text. Here is an open-source C# control to display the image: https://github.com/cyotek/Cyotek.Windows.Forms.ImageBox and you can use

[tesseract-ocr] Re: Simulating text selection in tiff image

2019-11-22 Thread Essam Zaky
I did something similar. I created two user controls using COM; you can use C# custom controls. I used the first control to display the text returned from OCR and the other to display the image. When I select text in the text control, I highlight the corresponding text

[tesseract-ocr] Re: Simulating text selection in tiff image

2019-11-20 Thread Essam Zaky
This is not a Tesseract issue; it's a programming issue. If you are on Windows, you can build a custom user control and do any action you need. On Friday, November 15, 2019 at 12:56:22 PM UTC+2, varadharajan venkatesan wrote: > > In my C# application, user will select any particular text in tiff >

[tesseract-ocr] Re: Tools required to build ,debug and trace tesseract code on linux

2019-11-20 Thread Essam Zaky
Thanks Shree. Is there any debugger with a GUI? I have installed an Ubuntu 19 VM. On Wednesday, November 20, 2019 at 9:16:49 AM UTC+2, Essam Zaky wrote: > > Dears sorry for this basic question > I'm new in Linux world > now i need to build ,debug , and trace tesseract code and see how it'

[tesseract-ocr] Re: Tools required to build ,debug and trace tesseract code on linux

2019-11-20 Thread Essam Zaky
Thanks Shree. The link describes the build process, but which IDE can be used to debug and trace the code? On Windows I use Visual Studio; what about Linux? On Wednesday, November 20, 2019 at 9:16:49 AM UTC+2, Essam Zaky wrote: > > Dears sorry for this basic question > I'm new in Li

[tesseract-ocr] Re: Error when build C++ Visual Studio 2017 project after include tesseract 4.0 library

2019-11-19 Thread Essam Zaky
Hi, try ignoring the multiple-definition error as follows: use the /FORCE:MULTIPLE linker option in your project. Here are the steps to set this linker option in the Visual Studio development environment

[tesseract-ocr] Tools required to build ,debug and trace tesseract code on linux

2019-11-19 Thread Essam Zaky
Dears, sorry for this basic question. I'm new to the Linux world. I now need to build, debug, and trace the Tesseract code and see how it works step by step on Linux. What tools are required to build Tesseract from source on Linux? Also, what tools are required to debug and trace the Tesseract code in

Re: [tesseract-ocr] Build from source failed to recognize arabic

2017-04-08 Thread Essam Zaky
For the sample images I used, the accuracy for English is good, but for Arabic the Cube engine is still better than the current LSTM. On Thursday, April 6, 2017 at 9:25:53 PM UTC+2, peiman F. wrote: > > what is accuracy of result for you!? >

Re: [tesseract-ocr] Build from source failed to recognize arabic

2017-04-06 Thread Essam Zaky
Hi @.peiman, thanks for the reply. I found the problem: I had installed an old v4 build from the DanBolomBerg site, and TESSDATA_PREFIX was referring to the old version with Cube. I have now updated TESSDATA_PREFIX in the system environment to point to the newly downloaded data, and it's working. Thanks again. On Thursday, 6
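A quick way to catch this kind of stale TESSDATA_PREFIX is to check which traineddata file the variable actually leads to. The Python sketch below is an illustration only; `locate_traineddata` is a made-up helper, and it assumes (as far as I know) that, depending on the Tesseract version, the prefix is either the tessdata directory itself or its parent.

```python
import os


def locate_traineddata(lang, prefix=None):
    """Return the path of <lang>.traineddata reachable from the prefix, or None."""
    if prefix is None:
        prefix = os.environ.get("TESSDATA_PREFIX", "")
    if not prefix:
        return None
    # Assumption: the prefix may be the tessdata directory or its parent,
    # so both layouts are checked.
    candidates = [
        os.path.join(prefix, lang + ".traineddata"),
        os.path.join(prefix, "tessdata", lang + ".traineddata"),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

Calling `locate_traineddata("ara")` before running OCR shows whether the Arabic model is picked up from the newly downloaded data or from an older install.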

[tesseract-ocr] Build from source failed to recognize arabic

2017-04-06 Thread Essam Zaky
Hi dears, I built Tesseract and the training tools from source for Windows with VS2015. Recognizing an English page succeeds, but recognizing an Arabic page fails. C:\Users\emz\tesseract\build\bin\Debug>tesseract eurotext.tif eurotext -l eng Tesseract Open Source OCR Engine

[tesseract-ocr] Re: Build from source for Visual studio and windows

2017-02-23 Thread Essam Zaky
2017 at 7:07:03 PM UTC+3, Essam Zaky wrote: >> >> Hi Egor >> Should i remove storage and tesseract folders? >> my last failed trial was to build leptonica 1.7.4 and tesseract 3.05 for >> Visual studio 2010 >> >> On Thursday, February 23, 2017 at 3:04:40 PM

[tesseract-ocr] Re: Build from source for Visual studio and windows

2017-02-23 Thread Essam Zaky
Hi Egor, should I remove the storage and tesseract folders? My last failed attempt was to build leptonica 1.7.4 and tesseract 3.05 for Visual Studio 2010. On Thursday, February 23, 2017 at 3:04:40 PM UTC+2, Egor Pugin wrote: > > Please, check again. > I did some fixes, in general it should work now. >

[tesseract-ocr] Re: Build from source for Visual studio and windows

2017-02-11 Thread Essam Zaky
these binaries > https://www.dropbox.com/s/obiqvrt4m53pmoz/tesseract-4.0.0-alpha.zip?dl=1 > > On Saturday, February 11, 2017 at 10:15:07 PM UTC+3, Essam Zaky wrote: >> >> >> I had downloaded the following version >> >> http://digi.bib.uni-mannheim.de/tesserac

[tesseract-ocr] Re: Build from source for Visual studio and windows

2017-02-11 Thread Essam Zaky
umped > over the lazy fox. The quick brown dog > jumped over the lazy fox. The quick > brown dog jumped over the lazy fox. > > Please, write: > 1. your processor > 2. your windows version > 3. visual studio version > > > > On Saturday, February 11, 2017 at 7:44

[tesseract-ocr] Re: Build from source for Visual studio and windows

2017-02-11 Thread Essam Zaky
680 (0x9330) encountered. Page 1 DotProductSSE can't be used on Android On Saturday, February 11, 2017 at 5:42:04 PM UTC+2, Egor Pugin wrote: > > That's ok. > What about your android error? Does it still exist? > > On Saturday, February 11, 2017 at 6:31:58 PM UTC+3, Essam Zaky wrote: >>

[tesseract-ocr] Re: Build from source for Visual studio and windows

2017-02-11 Thread Essam Zaky
y the way, what projects do you see in the solution? > Probably you have some build errors, so some projects left unbuilt. > Did you see any errors during the build? > > On Saturday, February 11, 2017 at 2:13:28 PM UTC+3, Essam Zaky wrote: >> >> Dear Egor >> >> In

[tesseract-ocr] Re: Build from source for Visual studio and windows

2017-02-11 Thread Essam Zaky
C+2, Essam Zaky wrote: > > Dear All > I have Windows and Visual Studio2010,2015 > Are there any tutorial to build Tesseract4.00 from source > Also are there any tutorial to do the training process in windows > > any suggestion are welcome > > thanks >

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-29 Thread Essam Zaky
> about leptonica version, ask Zdenko or Ray. > > On Sunday, January 29, 2017 at 9:50:28 PM UTC+3, Essam Zaky wrote: >> >> Thanks Egor, Shree >> >> Egor, what do you think about Shree's opinion, which says "There are recent >> changes in leptonica which cater t

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-29 Thread Essam Zaky
mail.com > > wrote: > >> Tess uses stable 1.74 leptonica, not the master branch. You don't need to >> touch anything in cppan storage. >> >> On Sunday, January 29, 2017 at 6:08:13 PM UTC+3, Essam Zaky wrote: >>> >>> Thanks Egor >>> Sorry

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-29 Thread Essam Zaky
e found near those binaries > (e.g tesseract-9fa26eb4.sln.lnk). You can open it, switch to debug and > build. > 3. For other questions, please, read > https://github.com/tesseract-ocr/tesseract/wiki/Compiling#windows > > On Sunday, January 29, 2017 at 5:38:04 PM UTC+3, Essam Za

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-29 Thread Essam Zaky
I see some bin files here: C:\Users\emz\.cppan\storage\bin\33e598b5\Release and some bin files here: C:\Users\emz. Also, where can I find the main *.sln? I would like to build the debug version of Tesseract. On Sunday, January 29, 2017 at 4:27:18 PM UTC+2, Egor Pugin wrote: > > What process? > >

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-29 Thread Essam Zaky
Hi Egor, it completed now without error or crash: 0 Errors, 140141 Warnings. How can I check that the process is working fine? On Sunday, January 29, 2017 at 1:32:08 PM UTC+2, Egor Pugin wrote: > > No, try it without removing storage. > > On Sunday, January 29, 2017 at 2:31:10 PM UTC+3, Essa

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-29 Thread Essam Zaky
>>>> I'm trying to track down that issue (crash), but still need more info. >>>>> Could you please clear the storage, re-run 'cppan --build >>>>> pvt.cppan.demo.google.tesseract-master' and attach log files >>>>> from c:\Users\u\.cppan\ >>>>> cppan.

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-18 Thread Essam Zaky
er'. Post the output here. > > On Wednesday, January 18, 2017 at 10:17:46 PM UTC+3, Essam Zaky wrote: >> >> Thanks Egor >> >> i removed >> c:\users\emz\.cppan\storage >> and ran cppan as follow >> Run-->cmd >> cppan --build pvt.cppan.demo.g

[tesseract-ocr] Re: Build from source for Visual studio and windows

2017-01-18 Thread Essam Zaky
s produce the mentioned error in red: dependency 'pvt.cppan.demo.unicode.icu.data' not found On Wednesday, January 18, 2017 at 3:32:36 PM UTC+2, Essam Zaky wrote: > > Dear All > I have Windows and Visual Studio2010,2015 > Are there any tutorial to build Tesseract4.00 from source > Also are

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-18 Thread Essam Zaky
pan or run cppan without cleaning storage? On Wednesday, January 18, 2017 at 8:03:53 PM UTC+2, Egor Pugin wrote: > > Hi, > > Try to remove directory c:\users\emz\.cppan\storage and re-run cppan again. > > On Wednesday, January 18, 2017 at 7:23:42 PM UTC+3, Essam Zaky wrote: >>

Re: [tesseract-ocr] Build from source for Visual studio and windows

2017-01-18 Thread Essam Zaky
ows > > > > ShreeDevi > > भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com > > On Wed, Jan 18, 2017 at 7:02 PM, Essam Zaky <essa...@gmail.com > > wrote: > >> Dear All >> I have Windows and Visual Studio2010,2015 >> Are there any

[tesseract-ocr] Build from source for Visual studio and windows

2017-01-18 Thread Essam Zaky
Dear All, I have Windows and Visual Studio 2010 and 2015. Is there any tutorial for building Tesseract 4.00 from source? Also, is there any tutorial for doing the training process on Windows? Any suggestions are welcome. Thanks

Re: [tesseract-ocr] Why doesTesseract ignore all texts in gray color?

2017-01-12 Thread Essam Zaky
Hi, it's an image processing problem. You can use OpenCV to find the text. Here are some ideas: - Use SWT to find text - Use a color histogram: quantize the histogram, take the most frequent color as the background, and convert all other colors to black. Now all text will be black and you can pass it to Tesseract
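The color-histogram idea can be sketched in plain Python as below. This is a rough illustration rather than the OpenCV code the message has in mind: it assumes the image is already loaded as rows of (R, G, B) tuples, and the function names and the default of 4 bins per channel are choices made for this example.

```python
from collections import Counter


def quantize(pixel, levels=4):
    """Map each 0-255 channel of an RGB pixel to one of `levels` coarse bins."""
    step = 256 // levels
    return tuple(min(channel // step, levels - 1) for channel in pixel)


def binarize_by_background(image, levels=4):
    """Return a grayscale image: background bin white (255), all else black (0)."""
    # Quantized color histogram over every pixel in the image.
    histogram = Counter(quantize(p, levels) for row in image for p in row)
    # The most frequent quantized color is taken to be the background.
    background_bin, _ = histogram.most_common(1)[0]
    return [
        [255 if quantize(p, levels) == background_bin else 0 for p in row]
        for row in image
    ]
```

Because every non-background bin is turned black, light gray text ends up as dark as black text, which is the point of the suggestion. It only works when the background really is the dominant color and the text falls into a different bin.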