On 14.08.2010 00:17, Jimmy O'Regan wrote:
> On 13 August 2010 21:54, zdenko podobny <[email protected]> wrote:
>
>> Hello,
>> I would like to announce new version 1.01 of pyTesseractTrainer - successor
>> of tesseractTrainer.py. Version 1.00 is identical with tesseractTrainer.py.
>> Features:
>>
>> visual editor of box file
>> layout of symbols from box file reflects symbols on image
>> possibility to define bold, italic, and underline font
>> deleting, joining, splitting of symbols/boxes
>> easy and exact way of adjusting boxes
>> support for opening different image formats (tiff, png, jpeg, bmp, gif)
>> multi-platform support (tested on Linux 64 bit and Windows XP)
>>
>> Bugfixes (in 1.01):
>>
>> unicode support
>>
> Ooh. No mean feat, 'cause Python sucks at Unicode :)
>
>> opening of tesseract v3.00 box file (but save supports only v2.0x box file)
>> identify/imagick is not needed anymore
>> corrected error that blocked opening a file on Windows
>> solved issues regarding training symbols @ and $ (used also to identify bold
>> and italic font)
>> workaround for missing Numeric support in PyGTK
>>
>> Because AFAIK nobody reacted to Catalin's e-mail, I offered him to create a project
>> to collect patches and possibly to solve known issues. Because of my low
>> time resources the project is still looking for an owner/contributors. Warmly
>>
> I would recommend creating a project somewhere that offers distributed
> VCS support, that way you don't have the 'owner goes missing, no-one
> can commit' problem.
>
> As it's written in Python, Launchpad is probably the best place. The
> Ubuntu folks are big fans of Python, and it'll probably be relatively
> easy to recruit.
>
> On a related note, for anyone who likes Bazaar, there's a mirror of
> Tesseract's code on Launchpad. I'm not quite up to speed on bzr, but
> if someone sends me a link to a branch, I'll (figure out how to :)
> merge it to SVN.
>
>> welcomed are experts for Python (multi-platform) GUI (GTK/QT/wx...)
>> because of performance issues - on Windows XP (2GB memory) the script crashes or
>> freezes when opening a file with a lot of boxes/symbols (e.g.
>> eng.arial.g4.tif), on Mandriva Linux 2010.1 64 bit (6GB memory) it takes 15 minutes
>> to open & display!
>>
> Ouch! I guess there's a lot of copying of image regions going on when
> all you really want is a reference. What's the graphics library? PIL?
>

The script depends only on Python and PyGTK (no PIL; it does not even import cairo :-) ). For now I just wanted to solve some issues for happy tesseractTrainer.py users, so there are no UI changes or additional features at the moment.
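
To illustrate the "opens v3.00 box files, saves only v2.0x" point from the announcement, here is a minimal sketch. It is not the actual pyTesseractTrainer code. It assumes the usual layout of one symbol per line as "symbol left bottom right top", with v3.00 adding a page number as a sixth field. The function names parse_box_line and to_v2_line are made up for this example, and it ignores the @ and $ markers for bold/italic.

```python
def parse_box_line(line):
    """Parse one box-file line into (symbol, left, bottom, right, top, page).

    Assumed layouts (not taken from pyTesseractTrainer itself):
      v2.0x: symbol left bottom right top
      v3.00: symbol left bottom right top page
    """
    parts = line.split()
    if len(parts) == 5:
        page = 0  # v2.0x lines carry no page number
    elif len(parts) == 6:
        page = int(parts[5])
    else:
        raise ValueError("unexpected box line: %r" % line)
    symbol = parts[0]
    left, bottom, right, top = (int(value) for value in parts[1:5])
    return symbol, left, bottom, right, top, page


def to_v2_line(box):
    """Format a parsed box as a v2.0x line, dropping the page number."""
    symbol, left, bottom, right, top, _page = box
    return "%s %d %d %d %d" % (symbol, left, bottom, right, top)
```

Reading either format and writing back with to_v2_line drops the page field, which is roughly what "save supports only v2.0x" means for a single-page image.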
Zd.

