Re: [tesseract-ocr] Using Tesseract as an OCR solution for blind people

2024-05-02 Thread Eigeldinger Simon
Hi Misti, Thanks for the info. Will have a look at that. Yes, getting a good picture as a blind person isn't all that easy. Which output format might be best for preserving the most formatting, headings and other things? hOCR? Greetings, Simon From: tesseract-ocr@googlegroups.com On behalf of

[tesseract-ocr] Using Tesseract as an OCR solution for blind people

2024-04-30 Thread Eigeldinger Simon
of the box with tesseract? Can tesseract also recognize tables and headings? A few years ago someone would need to process the images first. Is this still the status? Greetings, Simon -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To u

[tesseract-ocr] Fine Tuning

2024-01-23 Thread Simon
Hello everybody, I just finished fine tuning according to Ray's tutorial. I did the following steps: 1. I used tesstrain.sh to create training data and the starter traineddata. The training data consists of the eng.training_text with the ± character added multiple times.

Re: [tesseract-ocr] Re: Creating Starter Traineddata

2024-01-20 Thread Simon
m 14:19:33 UTC+1: > You need to look at it in the unicode list. > > On Sat, Jan 20, 2024, 3:50 PM Simon wrote: > >> Hey thanks for the response! >> >> How exactly do I add characters to the unicharset? >> >> Typically the unicharset has to follow a spe

Re: [tesseract-ocr] Re: Creating Starter Traineddata

2024-01-20 Thread Simon
wrote on Friday, 19 January 2024 at 16:22:24 UTC+1: > Yes, you need to add them before you create the starter model. You can > edit the Latin.unicharset before you run the combine command. > > On Fri, Jan 19, 2024, 5:27 PM Simon wrote: > >> Ok somehow I had "no

[tesseract-ocr] Re: Creating Starter Traineddata

2024-01-19 Thread Simon
sed: When I try to train some new characters, do I have to add them to the Latin.unicharset before I create the starter traineddata, or do I just add these characters to the created unicharset after I have created the starter traineddata? Simon wrote on Friday, 19 January 2024 at 10:38:24 UTC+

[tesseract-ocr] Re: Creating Starter Traineddata

2024-01-19 Thread Simon
ages or something that could give me more insights on why it didn't work? Simon wrote on Thursday, 18 January 2024 at 11:11:52 UTC+1: > Hello everybody, > > I have a question regarding "Fine Tuning +- a few characters". > > In general the instructions on >

[tesseract-ocr] Creating Starter Traineddata

2024-01-18 Thread Simon
Hello everybody, I have a question regarding "Fine Tuning +- a few characters". In general the instructions on https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html#fine-tuning-for--a-few-characters say that you have to make a starter traineddata from the unicharset, but

[tesseract-ocr] Train Just a Few Layers

2024-01-09 Thread Simon
Hello everybody, currently I am trying to train just a few layers of the eng_best.traineddata file. I already created 30,000 box gt.txt and .tif files for training specifically for my problem. As I tried to follow the instructions for training tesseract 4

[tesseract-ocr] Re: jTessBoxEditor

2023-12-04 Thread Simon
I just saw that the second picture I attached should be the following. In that one you can see the .box file information. [image: GoogleGroupsQuestion2.png] Simon wrote on Sunday, 3 December 2023 at 10:38:51 UTC+1: > Hello everybody, > > is anyone familiar with jTessBoxEditor? > I

[tesseract-ocr] jTessBoxEditor

2023-12-03 Thread Simon
Hello everybody, is anyone familiar with jTessBoxEditor? I am currently generating synthetic training data. These synthetically created tif files contain numbers to be trained. My program also automatically creates .box files. But somehow the box coordinates jTessBoxEditor shows

Re: [tesseract-ocr] Re: Training from Scratch

2023-11-29 Thread Simon
: > > Hi Simon, yes, I think the instructions you can give to the segmentation > step are quite limited, mostly the PSM parameter and I suppose a few minor > ones. There is something about tables but I've never used it and yours > might be too small for this to work. Yes, you should be a

Re: [tesseract-ocr] Re: Training from Scratch

2023-11-25 Thread Simon
? Lorenzo Blz wrote on Friday, 24 November 2023 at 10:45:14 UTC+1: > Hi Simon, > if I understand correctly how tesseract works, it follows these steps: > > - it segments the image into lines of text > - it then takes each individual line and slides a small window, 1px wide I

[tesseract-ocr] Re: Training from Scratch

2023-11-23 Thread Simon
Thanks a lot! This is not possible with the tesstrain repository, right? desal...@gmail.com wrote on Thursday, 23 November 2023 at 10:28:26 UTC+1: > If the original model lacks the ∠ symbol, fine tuning is not going to add > it for you. We have all gone through that process. To introduce a

[tesseract-ocr] Re: Training Metrics

2023-11-23 Thread Simon
Alright, this might be a little bit of a dumb question, but where exactly can I see the CER? 2 Percent improvement time=56, best error was 12.49 @ 8294 At iteration 8350/1/1, Mean rms=2.701%, delta=2.491%, char train=10.385%, word train=24.4%, skip ratio=0%, New best char error =

[tesseract-ocr] Re: Training Metrics

2023-11-23 Thread Simon
Alright, this might be a little bit of a dumb question, but where exactly can I see the CER? 2 Percent improvement time=56, best error was 12.49 @ 8294 At iteration 8350/1/1, Mean rms=2.701%, delta=2.491%, char train=10.385%, word train=24.4%, skip ratio=0%, New best char error =

[tesseract-ocr] Re: I am unable to train a new font to tesseract, I am getting a deserialize failed error

2023-11-23 Thread Simon
As I learned, the list.train and list.eval files must contain the paths to the lstmf files. Also make sure that when you are using tesseract on Linux, the line endings in these files are LF and NOT the Windows standard CRLF. Maybe this will help you:
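The line-ending advice above can be checked and fixed with a few lines of Python. This is only a minimal sketch: the function name `normalize_line_endings` is made up for illustration, and it assumes the list files are small enough to read into memory.

```python
from pathlib import Path


def normalize_line_endings(path):
    """Rewrite a file in place so that CRLF line endings become LF.

    Returns True if the file was changed, False if it already used LF.
    The function name and return convention are illustrative only.
    """
    data = Path(path).read_bytes()
    fixed = data.replace(b"\r\n", b"\n")
    if fixed != data:
        Path(path).write_bytes(fixed)
    return fixed != data
```

Calling it on a list file, e.g. `normalize_line_endings("list.train")`, leaves a file that already uses LF untouched.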

[tesseract-ocr] Re: Training from Scratch

2023-11-23 Thread Simon
If I need to train new characters that are not recognized by a default model, is fine tuning in this case the right approach? One of these characters is the one for angularity: ∠ These symbols appear in technical drawings and should be recognised in those. E.g. for the scenario in the

[tesseract-ocr] Training Metrics

2023-11-22 Thread Simon
As I am training my model I came across the following metrics: E.g.: At iteration 6345/6500/6500, Mean rms=6.246%, delta=7.139%, char train=68.07%, word train=92.2%, skip ratio=0%, New best char error = 68.07 wrote checkpoint. Unfortunately I can't find any proper and detailed
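To follow these numbers over a long run, the log lines can be parsed with a small script. The sketch below assumes the line format shown in the message above and assumes that "char train" and "word train" are the character and word error rates on the training data; the function name and dictionary keys are made up for illustration.

```python
import re

# Pattern for training log lines of the form quoted in the message above.
LOG_LINE = re.compile(
    r"At iteration (?P<iteration>\d+)/\d+/\d+, .*?"
    r"char train=(?P<char>[\d.]+)%, word train=(?P<word>[\d.]+)%"
)


def parse_training_line(line):
    """Return iteration and error percentages from one log line, or None."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    return {
        "iteration": int(match.group("iteration")),
        "char_error": float(match.group("char")),
        "word_error": float(match.group("word")),
    }
```

Lines that do not match the pattern (for example checkpoint messages on their own) simply yield `None`, so the function can be mapped over a whole log file.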

[tesseract-ocr] Training from Scratch

2023-11-22 Thread Simon
As it is not properly possible to combine my traineddata from scratch with an existing one, I have decided to also train my traineddata model on numbers. Therefore I wrote a script which synthetically generates ground truth data with text2image. This script uses dozens of different fonts and

Re: [tesseract-ocr] Combine traineddata

2023-11-20 Thread Simon
ur installed 'eng' > database is doing what it's supposed to, on its own, first. > > The next sane thing to try is flipping them around, ie "eng+gdt" instead > of "gdt+eng", to see if results change and /how/, as that might give us all > a hint about what's going on

[tesseract-ocr] Combine traineddata

2023-11-20 Thread Simon
Hello everybody, right now I am working with tesseract to teach it new symbols. For this I used tif pictures with only the desired symbol in them. I trained with the tesstrain repository and about 4000 training images. At the end of the procedure I got the traineddata file for my model Common_gdt.

[tesseract-ocr] OSD sometimes incorrect

2019-10-18 Thread simon mackenzie
I am using "tesseract file1.png stdout -l osd --psm 0". With some images that are correctly oriented it reports 180 degrees. It gets it right if I rotate the images 90, 180, 270. The images are lists with numbers and names in English. Is there any way to improve performance on this?
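One way to guard against unreliable orientation reports is to act on them only when the reported confidence is high. The sketch below assumes the `--psm 0` output consists of `Key: value` lines that include `Rotate` and `Orientation confidence`; the sample text, the function names and the threshold of 2.0 are illustrative assumptions, not values taken from this thread.

```python
def parse_osd(text):
    """Turn 'Key: value' lines from an OSD report into a dictionary."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


def trusted_rotation(osd, min_confidence=2.0):
    """Return the suggested rotation, or 0 if the confidence is too low.

    min_confidence is a hypothetical threshold and needs tuning on real pages.
    """
    confidence = float(osd.get("Orientation confidence", 0))
    if confidence < min_confidence:
        return 0
    return int(osd.get("Rotate", 0))
```

With a threshold like this, a low-confidence "180 degrees" report on an upright page would be ignored rather than applied.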

Re: [tesseract-ocr] Need Help Learning Howto Train Tesseract OCR on Fraktur Fonts - MAC - VietOCR v5.5.2 and Tesseract 4.1.0

2019-10-02 Thread Akos Simon
/tessdata/issues?utf8=%E2%9C%93=is%3Aissue+is%3Aopen+Fraktur > > Zdenko > > > On Wed, 2. 10. 2019 at 11:58 Akos Simon > > wrote: > >> training tesseract >> >> Tesseract it is an OCR TEXT recognition software that can be trained. >> I have gott

Re: [tesseract-ocr] Need Help Learning Howto Train Tesseract OCR on Fraktur Fonts - MAC - VietOCR v5.5.2 and Tesseract 4.1.0

2019-10-02 Thread Akos Simon
confused here, hopefully, this will change with your help ? .. ;) Thanks, Zdenko !! On Wednesday, October 2, 2019 at 7:38:08 AM UTC+2, zdenop wrote: > > Why do you think training will help you? What other options have you tried? > > Zdenko > > > On Wed, 2. 10. 2019 at 7:26 Ak

[tesseract-ocr] Need Help Learning Howto Train Tesseract OCR on Fraktur Fonts - MAC - VietOCR v5.5.2 and Tesseract 4.1.0

2019-10-01 Thread Akos Simon
I am looking for Fraktur font recognition with Tesseract OCR. I installed VietOCR v5.5.2 and Tesseract 4.1.0 on my Mac, and now I am trying to find help on how to train it better, as there are too many OCR errors. How would I go about training the software? Can anyone

Re: [tesseract-ocr] Tesseract Windows binaries on Appveyor

2019-05-29 Thread Eigeldinger Simon
Thanks for the fix. Greetings, Simon Kind regards, Simon Eigeldinger, IT, Nebengebäude 1, OG1, Stadt Hohenems, Kaiser-Franz-Josef-Straße 4, 6845 Hohenems T: +43 5576 7101-1143 | E: simon.eigeldin...@hohenems.at | www.hohenems.at This message and any

[tesseract-ocr] Re: using tesseract4 works fine but with oem 0 "couldn't load any languages"

2018-07-14 Thread simon mackenzie
I found answer is to set -l osd.

[tesseract-ocr] using tesseract4 works fine but with oem 0 "couldn't load any languages"

2018-07-14 Thread simon mackenzie
I am using tesseract4 and all is working fine with English. However, tesseract4 cannot detect page orientation, so I want to use tesseract3 for this. I thought I just had to do tesseract --oem 0 but now it says "couldn't load any languages". Is there a way to use tesseract3 whilst tesseract4

Re: [tesseract-ocr] Re: Automagic Orientation Detection with the new LSTM models?

2018-03-06 Thread Simon Eigeldinger
Hi, as a blind person I wonder how I might use some of those tools, because you can't see whether the pic is good or bad. Actually we might need something that does that automatically. Any ideas on that? Greetings and thanks, Simon On 06.03.2018 at 02:42, Michael Smith wrote: I just do some

Re: [tesseract-ocr] tesseract data files

2018-03-04 Thread Simon Eigeldinger
Hm. I guess i just ship all 3 of them. *lol* and add the text of the wiki to the readme. Greetings, Simon On 04.03.2018 at 18:43, ShreeDevi Kumar wrote: The traineddata files in tessdata_best are larger in size and OCR takes more time. They are supposedly slightly more accurate

Re: [tesseract-ocr] tesseract data files

2018-03-04 Thread Simon Eigeldinger
Hi ShreeDevi, I have scrapped the cygwin builds. i am now using the builds i get from the appveyor builds, which just need me to repackage the resulting stuff. so tessdata_best isn't for better accuracy, like the wiki says? greetings, Simon On 03.03.2018 at 05:12, ShreeDevi Kumar wrote: Hi

[tesseract-ocr] tesseract data files

2018-03-02 Thread Simon Eigeldinger
. is that 3rd set still usable, or should that one not be used anymore? on the wiki https://github.com/tesseract-ocr/tesseract/wiki/Data-Files it's still listed as usable. Any suggestions? Greetings and thanks, Simon --- This email has been checked for viruses by Avast antivirus software. https

Re: [tesseract-ocr] russian-old?

2017-10-18 Thread 'Simon Eigeldinger' via tesseract-ocr
I guess i have to correct myself. german fraktur is in the tessdata repo. On 18.10.2017 at 21:34, 'Simon Eigeldinger' via tesseract-ocr wrote: Hi Yury, Maybe the same happened to it like the german fraktur data. they seem to have not been updated for a long time and they have been removed

Re: [tesseract-ocr] russian-old?

2017-10-18 Thread 'Simon Eigeldinger' via tesseract-ocr
Hi Yury, Maybe the same happened to it like the german fraktur data. they seem to have not been updated for a long time and they have been removed from the main repos. Greetings, Simon On 18.10.2017 at 19:15, Yury Tarasievich wrote: Hi guys, I may be wrong but the Russian tessdata does

Re: [tesseract-ocr] install Tesseract on window

2017-10-12 Thread 'Simon Eigeldinger' via tesseract-ocr
Hi, I have packaged a new version of tesseract with various tessdata files, though at the moment i am still looking for the best place to upload it. Greetings, Simon On 11.10.2017 at 21:02, Dung Tran wrote: Hello, I am a newbie and I tried to install Tesseract on my window machine. I

Re: [tesseract-ocr] new tessdata repos on github

2017-09-17 Thread 'Simon Eigeldinger' via tesseract-ocr
Hi, Thanks for the info. Greetings, Simon On 17.09.2017 at 19:16, ShreeDevi Kumar wrote: Simon, There is a significant difference in speed. Depending on the language, the difference in accuracy may be minimal or more. You should compare both for a representative sample to see which

Re: [tesseract-ocr] new tessdata repos on github

2017-09-17 Thread 'Simon Eigeldinger' via tesseract-ocr
Hi ShreeDevi, Thanks for the info. So it seems that blind people who need the best accuracy should use tessdata_best. Greetings, Simon On 17.09.2017 at 16:52, ShreeDevi Kumar wrote: Please see https://github.com/tesseract-ocr/tesseract/issues/995#issuecomment-329667239 ShreeDevi

[tesseract-ocr] new tessdata repos on github

2017-09-17 Thread 'Simon Eigeldinger' via tesseract-ocr
. and there is the tessdata repo. what is that doing now in the future? Greetings and thanks for helping, Simon

[tesseract-ocr] unrecognised characters dropped from wordbox rather than replaced with spaces

2017-09-01 Thread simon mackenzie
I have some text "Morgan, L 220". There is a horizontal line crossing it out. Due to the crossing out it is difficult for OCR to identify word boundaries. Therefore I am combining the content of a whole row and then parsing it myself, e.g. treating a big space as a separator.
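The idea of treating a big space as a separator can be sketched as follows. This is not the poster's code: it assumes each recognised word comes with the left and right x coordinates of its box, and both the function name and the gap threshold are made up for illustration.

```python
def split_row_into_fields(words, gap_threshold):
    """Group the words of one row into fields separated by large gaps.

    words is a list of (text, left, right) tuples for a single text row;
    gap_threshold is the horizontal gap in pixels above which a new field
    starts (a hypothetical value that depends on the scan resolution).
    """
    fields = []
    current = []
    prev_right = None
    for text, left, right in sorted(words, key=lambda w: w[1]):
        if prev_right is not None and left - prev_right > gap_threshold:
            fields.append(" ".join(current))
            current = []
        current.append(text)
        prev_right = right
    if current:
        fields.append(" ".join(current))
    return fields
```

For a row such as "Morgan, L 220", the small gap between "Morgan," and "L" keeps them in one field, while the wide gap before "220" starts a new one.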

[tesseract-ocr] strange hack that improves OCR on numbers

2017-08-27 Thread simon mackenzie
I have columns like this 356 Smith 23123 Jones12 123 Jacks 19124 Barnes 10 Wordboxes are correctly identified for all names/numbers for some pages. However on other pages there are numerous missing boxes for columns of numbers especially the last

Re: [tesseract-ocr] thanks for tesseract daily builds

2017-02-07 Thread Simon Eigeldinger
when we do that every commit that happens to the tesseract repo, especially when more per day happen. i wonder if its more interesting to build it once a day. would that be a good idea? greetings and thanks for adding the builds, simon On 07.02.2017 at 15:49, Egor Pugin wrote: @egorpugin

[tesseract-ocr] thanks for tesseract daily builds

2017-02-06 Thread Simon Eigeldinger
together and sometimes you kind of miss the last line, which seems not to be included in the pdf as text but is in the txt file if you create it. is it also possible to compile the training tools as well? greetings and thanks, Simon

[tesseract-ocr] AppVeyor: add downloadable builds

2016-12-21 Thread Simon Eigeldinger
the artifacts for win32 and win64. can someone post some infos about how to use them? Greetings and thanks a lot, Simon

Re: [tesseract-ocr] Re: opening multi page pdfs using tesseract

2016-09-16 Thread Simon Eigeldinger
, Simon Eigeldinger wrote: Hi all, can i use tesseract to open multipage pdfs directly? we have multi function printers which produce pdfs with images which can be run through ocr. how can i accomplish that with tesseract? do i need a second program for that? greetings, simon

[tesseract-ocr] opening multi page pdfs using tesseract

2016-09-15 Thread Simon Eigeldinger
Hi all, can i use tesseract to open multipage pdfs directly? we have multi function printers which produce pdfs with images which can be run through ocr. how can i accomplish that with tesseract? do i need a second program for that? greetings, simon

Re: [tesseract-ocr] do i get a performance boost when i compile tesseract as a 64 bit program?

2016-05-15 Thread Simon Eigeldinger
for tesseract so i guess i will build my own builds then which i will share with people. i guess i would build 32 and 64 bit versions if i can with one install. greetings, simon On 15.05.2016 at 13:10, Marco Atzeri wrote: On 15/05/2016 12:33, Simon Eigeldinger wrote: Hi all, i am thinking

[tesseract-ocr] do i get a performance boost when i compile tesseract as a 64 bit program?

2016-05-15 Thread Simon Eigeldinger
. do i get a performance boost when i compile tesseract with 64 bit? i also don't know if i can install cygwin 32 and 64 bit on the same system or if i just need cygwin 64 bit to also compile 32 bit programs. greetings, simon -- Simon Eigeldinger Follow me on Twitter: http://www.twitter.com

Re: [tesseract-ocr] how to compile tesseract for windows on a linux machine?

2016-05-14 Thread Simon Eigeldinger
Hi, I have tesseract running on my linux box. i want to compile a windows version under linux. greetings, simon On 14.05.2016 at 18:09, ShreeDevi Kumar wrote: There is an archlinux distribution for tesseract - see https://www.archlinux.org/packages/community/i686/tesseract/ ShreeDevi

[tesseract-ocr] how to compile tesseract for windows on a linux machine?

2016-05-11 Thread Simon Eigeldinger
have? Which dependencies do i need? I never have done that and would be grateful for some hand holding. I just compiled stuff on cygwin. Thanks and greetings, Simon

[tesseract-ocr] how to compile tesseract for windows on a linux machine?

2016-05-07 Thread Simon Eigeldinger
and greetings, Simon

Re: [tesseract-ocr] how to compile tesseract on msys2/mingw?

2016-03-05 Thread Simon Eigeldinger
Thanks for the info. might have a look at this. greetings, simon On 05.03.2016 at 05:09, ShreeDevi Kumar wrote: https://github.com/Alexpux/MINGW-packages/blob/master/mingw-w64-tesseract-ocr/PKGBUILD Modify the pkgbuild to use the latest source. ShreeDevi

[tesseract-ocr] how to compile tesseract on msys2/mingw?

2016-03-04 Thread Simon Eigeldinger
builds daily builds from the source. greetings, simon

[tesseract-ocr] baseapi.h and mysql.h error compil c++

2015-12-15 Thread simon .barotte
Hi all, I am new to tesseract and I get an error when I try to compile a program that uses the tesseract API and MySQL in C++. I have this error: In file included from /usr/include/mysql/mysql.h:75:0, from ocr.cpp:4: /usr/include/mysql/my_list.h:26:3: error: conflicting

Re: [tesseract-ocr] tesseract on cygwin

2015-07-24 Thread Simon Eigeldinger
hi, i did test it 2 days ago and it seems to work. at least over here and on a windows 7 machine in the office. but i could recheck again. greetings, simon On 24.07.2015 at 08:50, zdenko podobny wrote: it is not about input, but output. pdf output is key feature of leptonica 1.71 release

Re: [tesseract-ocr] tesseract on cygwin

2015-07-24 Thread Simon Eigeldinger
which seem to contain everything but shows the warning message. i recompiled a new version on my fake website so people can play with the training tools as well. so and now i am off for 2 weeks. have a nice time while i am not around. greetings, simon On 24.07.2015 at 08:50, zdenko podobny wrote

Re: [tesseract-ocr] tesseract on cygwin: training tools seem not to build

2015-07-23 Thread Simon Eigeldinger
hi, and i just opened a ticket: https://github.com/tesseract-ocr/tesseract/issues/61 greetings, simon On 23.07.2015 at 23:23, Jim O'Regan wrote: On 23 July 2015 at 19:02, Simon Eigeldinger simon.eigeldin...@vol.at wrote: Hi all, pango_font_info.cpp:223:46: error: 'strcasestr

Re: [tesseract-ocr] displayed version number of tesseract when compiled from git

2015-07-23 Thread Simon Eigeldinger
hi, thanks for the info. so i guess i might recompile my windows builds in debug mode then? greetings, simon On 23.07.2015 at 21:11, zdenko podobny wrote: 1. Well if someone compile code from git (s)he should know what revision is using ;-) And of course git code (unreleased

Re: [tesseract-ocr] building tesseract on windows using cygwin

2015-07-21 Thread Simon Eigeldinger
.exe around 13.4 mb. german and english data files. tesseract-all-langs-win-git-20150721.exe around 351.7 mb. all the data files for tesseract which it can use at the moment. Let's see if it works. had no time currently to test but will do in the office tomorrow. greetings, simon Am

Re: [tesseract-ocr] Configure for single character recognition

2014-11-15 Thread Simon Støvring
://bhajans.ramparivar.com On Fri, Nov 14, 2014 at 7:12 PM, Simon Støvring simonst...@gmail.com wrote: Hello, I am trying to recognize single characters written with the Gotham Bold font. I have trained Tesseract by following Michael Jay Lissners guide Adding New Fonts to Tesseract 3 OCR

Re: [tesseract-ocr] Configure for single character recognition

2014-11-15 Thread Simon Støvring
The letters will always be uppercase, so capitalization is not really an issue. I can try to lay out the letters in a straight line and use the line mode. However, I need to know the location of each character, that is, which row and column it is placed on. If Tesseract fails recognizing a single

Re: [tesseract-ocr] Configure for single character recognition

2014-11-15 Thread Simon Støvring
://bhajans.ramparivar.com On Sat, Nov 15, 2014 at 3:39 PM, Simon Støvring simonst...@gmail.com wrote: I have tried with the English traineddata and got similar results. However, I had not tried recognizing the entire 'prepared-image' with psm 6 and I see that gives pretty good

[tesseract-ocr] Configure for single character recognition

2014-11-14 Thread Simon Støvring
to match correctly but generally it's just not good enough and I'd like to know if there's any way I can improve it. Should I train differently? Should I pass other configurations or should I process the images before trying to recognize the characters? Best regards, Simon B. Støvring

Re: [tesseract-ocr] Poor results with tesseract OCR'ing .tif (as compared to an on-line OCR)

2014-10-24 Thread Simon Eigeldinger
. -- Simon Eigeldinger Follow me on Twitter: http://www.twitter.com/domasofan/ E-Mail: simon.eigeldin...@vol.at MSN: simon_eigeldin...@hotmail.com ICQ: 121823966 Jabber: domaso...@andrelouis.com --- Diese E-Mail ist frei von Viren und Malware, denn der avast! Antivirus Schutz ist aktiv. http

Re: [tesseract-ocr] Poor results with tesseract OCR'ing .tif (as compared to an on-line OCR)

2014-10-24 Thread Simon Eigeldinger
hi, is there a guideline on what to do with poor quality pics? i am blind so i have no clue what sighted people do with those. *smile* and it seems tesseract can't do much about pic quality. maybe imagemagick might be a good choice for fixing things? greetings, simon On 24.10.2014 at 23:02

[tesseract-ocr] how to automagically convert images that are best for tesseract?

2014-10-24 Thread Simon Eigeldinger
? thanks. greetings, simon

[tesseract-ocr] thanks to robert

2014-10-24 Thread Simon Eigeldinger
btw forgot to say thanks to robert melton for telling me about this script or at least for googling for that. wonder if we can get something awesome out of things. greetings, simon

Re: [tesseract-ocr] Re: PDF output not searchable within SumatraPDF

2014-10-15 Thread Simon Eigeldinger
. greetings, simon On 15.10.2014 at 18:06, Chris Cameron wrote: All the files I mention can be found here: https://www.dropbox.com/sh/v5w4zl0c2z1wra1/AACxjmomYL4o-iQEhBrLvNgHa Incidentally, I now see that Chrome's PDF viewer is also unable to search the PDF. Thanks, Chris

[tesseract-ocr] PDFs still broken?

2014-10-08 Thread Simon Eigeldinger
format is 4; unreadable Error during processing. tested with the eurotext.tif file from the testing directory on a windows system. compiled with cygwin. https://dl.dropboxusercontent.com/u/1598766/tesseract-error.7z greetings, simon

[tesseract-ocr] issues with pdf

2014-10-06 Thread Simon Eigeldinger
as well. greetings, simon

Re: [tesseract-ocr] Re: compiling tesseract on cygwin

2014-10-01 Thread Simon Eigeldinger
# build and install tesseract make install ---script end--- back in 2012 i used to run make -j 4 before make install. after make install i ran make training, but the training tools seem to have some compilation issues. greetings, simon On 01.10.2014 at 16:32, Wes

Re: compiling tesseract under cygwin now with more details

2012-02-20 Thread Simon Eigeldinger
hi, maybe it gets hardcoded when you use --prefix with the configure script. but i guess that's the case with every program you use this with. greetings, simon On 19.02.2012 at 14:22, zdenko podobny wrote: Hi, I am not aware about any hardcoded path in tesseract excluding one variable: configure set

Re: compiling tesseract under cygwin now with more details

2012-02-20 Thread Simon Eigeldinger
hi, well the last time i was able to do that successfully was 31 december 2011, with 644 i guess. i might try again with 678. greetings, simon On 19.02.2012 at 18:09, Sriranga(78yrsold) wrote: Zdenko, Just now I svn upated upto r-676 in Linux (since I don't know

Re: compiling tesseract under cygwin now with more details

2012-02-19 Thread Simon Eigeldinger
Hi Zdenko, last time i tried to compile it it was in end of december 2011. there it worked that way. maybe the code can't be compiled right now. in that time i compiled SVN R 644 and last time i tried 675. But thanks. maybe i just need to wait a little bit. Thanks, Greetings, Simon Am

Re: compiling tesseract under cygwin now with more details

2012-02-19 Thread Simon Eigeldinger
i specify. can i also specify something so that it's not so hard-linked to paths? because when i give the binaries to another person they have to install them in the same place. greetings, simon On 18.02.2012 at 08:44, zdenko podobny wrote: I am not a cygwin user so just some ideas: - cygwin

Compiling Tesseract under Cygwin

2011-07-01 Thread Simon Eigeldinger
Hello, I want to compile Tesseract from SVN under cygwin. Can someone tell me how to do that? Greetings, Simon