On 14.08.2010 00:17, Jimmy O'Regan wrote:
> On 13 August 2010 21:54, zdenko podobny <[email protected]> wrote:
>   
>> Hello,
>> I would like to announce the new version 1.01 of pyTesseractTrainer, the successor
>> of tesseractTrainer.py. Version 1.00 is identical to tesseractTrainer.py.
>> Features:
>>
>> visual editor for box files
>> layout of symbols from the box file reflects the symbols on the image
>> possibility to define bold, italic, and underline font
>> deleting, joining, splitting of symbols/boxes
>> easy and exact way of adjusting boxes
>> support for opening different image formats (tiff, png, jpeg, bmp, gif)
>> multi-platform support (tested on Linux 64 bit and Windows XP)
>>
>> Bugfixes (in 1.01):
>>
>> unicode support
>>     
> Ooh. No mean feat, 'cause Python sucks at Unicode :)
>
>   
>> opening of tesseract v3.00 box files (but saving supports only the v2.0x box file format)
>> identify/imagick is not needed anymore
>> corrected an error that blocked opening files on Windows
>> solved issues regarding training symbols @ and $ (used also to identify bold
>> and italic font)
>> workaround for missing Numeric support in PyGTK
>>
>> Because AFAIK nobody reacted to Catalin's e-mail, I offered him to create a project
>> to collect patches and possibly solve known issues. Because of my limited
>> time, the project is still looking for an owner/contributors. Warmly
>>     
> I would recommend creating a project somewhere that offers distributed
> VCS support, that way you don't have the 'owner goes missing, no-one
> can commit' problem.
>
> As it's written in Python, Launchpad is probably the best place. The
> Ubuntu folks are big fans of Python, and it'll probably be relatively
> easy to recruit.
>
> On a related note, for anyone who likes Bazaar, there's a mirror of
> Tesseract's code on Launchpad. I'm not quite up to speed on bzr, but
> if someone sends me a link to a branch, I'll (figure out how to :)
> merge it to SVN.
>
>   
>> welcomed are experts in Python (multi-platform) GUI (GTK/QT/wx...)
>> because of performance issues: on Windows XP (2GB memory) the script crashes or
>> freezes while opening a file with a lot of boxes/symbols (e.g.
>> eng.arial.g4.tif); on Mandriva Linux 2010.1 64 bit (6GB memory) it takes
>> 15 minutes to open and display it!
>>     
> Ouch! I guess there's a lot of copying of image regions going on when
> all you really want is a reference. What's the graphics library? PIL?
>
>   
The script depends on python & pygtk only (no PIL; it does not even import
cairo :-) ).
For now I only wanted to solve some issues for happy tesseractTrainer.py
users, so there are no UI changes or additional features at the moment.
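
Regarding the box file item in the bugfix list above, here is a minimal sketch of what "open v3.00, save v2.0x" can look like. It is not the actual pyTesseractTrainer code. It assumes a v2.0x line is "symbol left bottom right top" and a v3.00 line carries one extra trailing page number; the function names parse_box_line and to_v2_line are made up for illustration.

```python
def parse_box_line(line):
    """Parse one box file line.

    Assumption: v2.0x lines have 5 fields (symbol + 4 coordinates),
    v3.00 lines have 6 fields (the same plus a page number).
    Returns (symbol, left, bottom, right, top, page); page is 0 for v2.0x.
    """
    fields = line.split()
    if len(fields) == 5:
        page = 0
    elif len(fields) == 6:
        page = int(fields[5])
    else:
        raise ValueError("unexpected box line: %r" % line)
    left, bottom, right, top = (int(value) for value in fields[1:5])
    return fields[0], left, bottom, right, top, page


def to_v2_line(box):
    """Format a parsed box as a v2.0x line, dropping the page number."""
    symbol, left, bottom, right, top, _page = box
    return "%s %d %d %d %d" % (symbol, left, bottom, right, top)


if __name__ == "__main__":
    # A v3.00-style line is read and written back in v2.0x style.
    print(to_v2_line(parse_box_line("a 10 20 30 40 0")))
```

Dropping the page number is only harmless for single-page images; for a multi-page tiff the page information would be lost on save.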

Zd.
