On 13 August 2010 21:54, zdenko podobny <[email protected]> wrote:
> Hello,
> I would like to announce the new version 1.01 of pyTesseractTrainer, the successor
> of tesseractTrainer.py. Version 1.00 is identical to tesseractTrainer.py.
> Features:
>
> visual editor for box files
> layout of symbols from the box file reflects the symbols on the image
> ability to mark a font as bold, italic, or underlined
> deleting, joining, and splitting of symbols/boxes
> easy and exact way of adjusting boxes
> support for opening different image formats (tiff, png, jpeg, bmp, gif)
> multi-platform support (tested on Linux 64 bit and Windows XP)
>
> Bugfixes (in 1.01):
>
> unicode support

Ooh. No mean feat, 'cause Python sucks at Unicode :)

> opening of tesseract v3.00 box files (but saving supports only v2.0x box files)
> identify/imagick is not needed anymore
> corrected an error that blocked opening files on Windows
> solved issues with training the symbols @ and $ (also used to identify bold
> and italic fonts)
> workaround for missing Numeric support in PyGTK
>
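
For anyone following along who hasn't looked at the two box file
layouts: as far as I know, a v2.0x line is "symbol left bottom right
top" and v3.00 appends a page number. Below is a minimal sketch of
reading either and writing the v2.0x form. It is not taken from
pyTesseractTrainer; the names Box, parse_box_line and format_v2 are
made up for illustration, and it assumes Python 3 and that the symbol
itself contains no whitespace.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """One box file entry (hypothetical type, for illustration only)."""
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int = 0  # only present in v3.00 files; defaults to 0 for v2.0x


def parse_box_line(line: str) -> Box:
    """Parse a v2.0x (5 fields) or v3.00 (6 fields) box line."""
    parts = line.split()
    if len(parts) not in (5, 6):
        raise ValueError("expected 5 or 6 fields, got %d" % len(parts))
    numbers = [int(p) for p in parts[1:]]
    return Box(parts[0], *numbers)


def format_v2(box: Box) -> str:
    """Write a box in the v2.0x layout, dropping the page number."""
    return "%s %d %d %d %d" % (
        box.symbol, box.left, box.bottom, box.right, box.top)
```

Reading a v3.00 file and saving it this way loses the page number,
which only matters for multi-page images.
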
> Because AFAIK nobody reacted to Catalin's e-mail, I offered him to create a project
> to collect patches and possibly solve known issues. Because of my limited
> time, the project is still looking for an owner/contributors. Warmly

I would recommend creating a project somewhere that offers distributed
VCS support; that way you don't have the 'owner goes missing, no-one
can commit' problem.

As it's written in Python, Launchpad is probably the best place. The
Ubuntu folks are big fans of Python, and it'll probably be relatively
easy to recruit.

On a related note, for anyone who likes Bazaar, there's a mirror of
Tesseract's code on Launchpad. I'm not quite up to speed on bzr, but
if someone sends me a link to a branch, I'll (figure out how to :)
merge it to SVN.

> welcomed are experts in Python (multi-platform) GUIs (GTK/Qt/wx...)
> because of performance issues - on Windows XP (2GB memory) the script crashes or
> freezes while opening a file with a lot of boxes/symbols (e.g.
> eng.arial.g4.tif); on Mandriva Linux 2010.1 64 bit (6GB memory) it takes 15 minutes to
> open and display it!

Ouch! I guess there's a lot of copying of image regions going on when
all you really want is a reference. What's the graphics library? PIL?
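
To show what I mean by reference vs. copy, here is a rough sketch using
only the standard library. I don't know whether this is the actual
bottleneck in the script; it just illustrates the difference. It
assumes an 8-bit grayscale image stored row by row in a bytearray, with
the box given as (left, top, right, bottom) counted from the top-left
corner. The function region_rows is hypothetical, not part of any
imaging library.

```python
def region_rows(pixels, width, box):
    """Return one zero-copy view per row of the rectangle.

    pixels must be a memoryview over the image buffer; slicing a
    memoryview yields another view onto the same bytes, not a copy.
    """
    left, top, right, bottom = box
    return [pixels[y * width + left:y * width + right]
            for y in range(top, bottom)]


# A tiny 8x4 "image" whose pixel values are simply 0..31.
width, height = 8, 4
image = bytearray(range(width * height))
view = memoryview(image)

rows = region_rows(view, width, (2, 1, 5, 3))

# The rows share storage with the image: a change made to the image
# afterwards is visible through the views.
image[10] = 255
shared = rows[0][0] == 255
```

With thousands of boxes per page, holding coordinates or views like
this keeps memory use proportional to the number of boxes rather than
to the total pixel area of all the cropped copies.
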

> BR,
>
> Zd.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
>



-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.
