Hi friends,
 
The OCR code has now been tweaked and tested to work on both WinXP and Win9x.
It should work on Unix as well.
 
Here is a summary:
 
1. Put ocrad 0.16 somewhere on your PATH
 
2. Change the following in ImageStripper.py
 
                ocr = os.popen("ocrad -s %s -c %s -x %s < %s 2>ocrerr.txt" %
                               (scale, charset, orf, pnmfile))
 
into this
 
                ocr_cmd = ur'ocrad -s %s -c %s "%s"' % (scale, charset, pnmfile)
 
                # os.popen3() returns [stdin, stdout, stderr]
                ocr = os.popen3( ocr_cmd )[1]
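
For anyone on a newer Python, where os.popen3() is no longer available, the same idea (run the command without shell redirection and read only its standard output) can be sketched with the subprocess module. This is only an illustration and not part of the patch: the helper name run_and_read_stdout is made up, and the ocrad arguments in the usage comment simply mirror the command above.

```python
import subprocess

def run_and_read_stdout(args):
    """Run a command given as an argument list and return its stdout as text.

    Hypothetical helper for illustration only. stderr is captured and
    discarded, so no ocrerr.txt file or shell redirection is needed.
    """
    proc = subprocess.Popen(args,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, _err = proc.communicate()
    # Decode leniently; OCR output may contain odd bytes.
    return out.decode("utf-8", "replace")

# Usage sketch (assumes ocrad is on the PATH and pnmfile exists):
# ctext = run_and_read_stdout(["ocrad", "-s", str(scale), "-c", charset, pnmfile])
```

Passing the arguments as a list also avoids having to quote a file name that contains spaces, which is what the "%s" quoting in the command string above is for.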
 
3. Change this
 
        if os.path.exists(program) and is_executable(program):
 
into this
 
        if os.path.exists(program + ".exe") or ( os.path.exists(program) and is_executable(program) ):
 
Because os.path.exists() simply returns False for a missing file, and the condition is evaluated left to right with short-circuiting, this check does not produce a fatal error even if the file is not found.
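
To see how that check behaves, here is a small standalone sketch. The function program_available is a made-up name, and is_executable below is only a stand-in (assumed to behave like the helper of the same name in ImageStripper.py), not a copy of it.

```python
import os

def is_executable(path):
    # Stand-in for the helper in ImageStripper.py (assumed behaviour).
    return os.access(path, os.X_OK)

def program_available(program):
    """True if program.exe exists, or program exists and is executable."""
    # os.path.exists() returns False for a missing file instead of raising,
    # and is_executable() is only called when the plain path exists.
    return os.path.exists(program + ".exe") or (
        os.path.exists(program) and is_executable(program))
```

On Windows the first test matches ocrad.exe; on Unix it is simply False and the original test decides.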
 
4. Change this
 
                for line in open(orf):
                    if line.startswith("lines"):
                        nlines = int(line.split()[1])
                        if nlines:
                            ctokens.add("image-text-lines:%d" %
                                        int(log2(nlines)))
 
into this

                nlines = ctext.count('\n')
                if nlines:
                    ctokens.add("image-text-lines:%d" %
                                nlines )
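
As a quick illustration of what the new counting does, here is a standalone sketch. The function name image_text_lines_token is hypothetical; in the real code the token is added to ctokens directly.

```python
def image_text_lines_token(ctext):
    """Build the image-text-lines token from OCR text, or None if no lines."""
    # Counts newline characters, so a final line without a trailing
    # newline is not included in the count.
    nlines = ctext.count('\n')
    if nlines:
        return "image-text-lines:%d" % nlines
    return None

print(image_text_lines_token("first line\nsecond line\n"))
```

Note that this uses the raw line count, whereas the code it replaces bucketed the count with int(log2(nlines)).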
 
5. Finally, I suggest you change the default scale from 1 to 2, as in this line
 
        scale = options["Tokenizer", "ocrad_scale"] or 2
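
The "or 2" fallback only kicks in when the option is unset or zero. A tiny sketch, using a plain dict keyed by a tuple as a stand-in for the real options object (that dict, and the name pick_scale, are assumptions for illustration):

```python
def pick_scale(options):
    """Return the configured ocrad scale, falling back to 2 when it is falsy."""
    return options["Tokenizer", "ocrad_scale"] or 2

print(pick_scale({("Tokenizer", "ocrad_scale"): 0}))  # falls back to 2
print(pick_scale({("Tokenizer", "ocrad_scale"): 3}))  # keeps 3
```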
 
 
Compile and enjoy.
 
 
Happy coding :)
 
Vibe
_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
