On Sunday, February 24, 2013 11:53:52 AM UTC-5, zdenop wrote:
>
> On Sun, Feb 24, 2013 at 12:20 AM, Nick White <[email protected]> wrote:
>
>> On Fri, Feb 22, 2013 at 03:20:49PM +0000, Nick White wrote:
>> > On Sun, Jun 03, 2012 at 10:27:23PM +0100, zdenko podobny wrote:
>> > > it looks like it is ASCII only oriented (at least in report non-ASCII
>> > > are malformed...), ftk has only binary distribution, so no possible
>> > > fix can be expected...
>> > >
>> > > BTW: tools are at new place:
>> > > http://code.google.com/p/isri-ocr-evaluation-tools ; report can be
>> > > found at stephenvrice.com/images/AT-1995.pdf
>> >
>> > I finally got around to working with these tools a bit. It seems
>> > that they do process unicode correctly (though I haven't tested
>> > combined characters, and suspect that may not work). You're correct
>> > the reports don't seem to output unicode properly, but that's
>> > probably easily fixed.
>>
>> Right, I created a workaround to enable at least the 'accuracy' tool
>> (which is the really important one) to work fine with UTF-8. It's a
>> script called utf8toolwrap.sh; if you're interested, check it out;
>> it's attached to this issue:
>> https://code.google.com/p/isri-ocr-evaluation-tools/issues/detail?id=2
>>
>> It makes the 'accuracy' tool actually very useful; it shows how
>> common various misrecognitions are - very useful for potential
>> unicharambigs rules :)
>>
>> Nick
>>
>> P.S. It requires a Linux-ish environment, and the tools asc2uni and
>> uni2asc from the isri toolkit to be available on the PATH.
>>
> Hi
>
> thanks for caring about this... Maybe it would make sense to make a
> fork of these tools ;-) Just in case there will be nobody who will
> react on your patches. And we case some time with applying several patches
> from issues ;-)
>
I'd suggest cloning the SVN repository to GitHub using svn2git. Creating an SVN fork just has the potential to move the bottleneck to someone else. If it's on GitHub, anyone can fork it and work on improving things.

Tom

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
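
For anyone who wants to try the svn2git route, here is a rough Python sketch of how the conversion could be scripted. Treat it as an illustration under assumptions, not a tested migration: the SVN URL follows the usual Google Code pattern and is not confirmed anywhere in this thread, the helper names `build_svn2git_command` and `convert` are made up for this example, and it only uses the `--rootistrunk` and `--authors` options of svn2git.

```python
import shutil
import subprocess

# Assumption: the usual Google Code SVN URL pattern; not confirmed in this thread.
SVN_URL = "http://isri-ocr-evaluation-tools.googlecode.com/svn/"


def build_svn2git_command(svn_url, authors_file=None, root_is_trunk=False):
    """Return the svn2git argument list for converting svn_url (hypothetical helper)."""
    cmd = ["svn2git", svn_url]
    if root_is_trunk:
        # Use this when the repository has no trunk/branches/tags layout.
        cmd.append("--rootistrunk")
    if authors_file:
        # Optional file mapping SVN user names to Git author identities.
        cmd += ["--authors", authors_file]
    return cmd


def convert(svn_url, workdir, **options):
    """Run svn2git inside workdir, which should be an empty directory."""
    if shutil.which("svn2git") is None:
        raise RuntimeError("svn2git not found on PATH")
    subprocess.run(build_svn2git_command(svn_url, **options), cwd=workdir, check=True)


if __name__ == "__main__":
    # Only print the command; call convert() yourself once the URL is verified.
    print(" ".join(build_svn2git_command(SVN_URL)))
```

Once the conversion has finished, the resulting local Git repository can be pushed to a new, empty GitHub repository in the usual way.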

