Hi,
I wrote this small script to speed up the startup of training with ocrd-train
<https://github.com/OCR-D/ocrd-train>.
It generates the box files for all the images given on the command line (it
works only for single-line images).
It is a simple adaptation of generate_line_box.py from ocrd-train. I
have used it once, and it seems to work fine.
Currently, box and lstmf generation with OCR-D is very slow because a new
process is started for each image.
I execute this script before calling the makefile.
I do the "shell expansion" in Python so that it can handle a very long list
of files, which could otherwise exceed the shell's argument length limit.
So you need to call it this way:
python generate_all_line_boxes.py -i 'data/train/*.tif'
with single quotes around the pattern to prevent shell expansion.
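
To illustrate why the pattern is passed quoted, here is a minimal sketch of expanding a glob pattern inside Python instead of in the shell. The helper name `expand_pattern` is hypothetical and only used for this illustration; the script itself calls `glob.glob` directly.

```python
import glob
import os
import tempfile


def expand_pattern(pattern):
    """Expand a glob pattern inside Python and return the matches, sorted.

    The pattern arrives as a single command-line argument, so the number of
    matching files is not limited by the shell's argument list.
    """
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Build a throwaway directory with a few empty files to try the pattern on.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.tif", "b.tif", "a.gt.txt"):
            open(os.path.join(tmp, name), "w").close()
        matches = expand_pattern(os.path.join(tmp, "*.tif"))
        print([os.path.basename(m) for m in matches])
```

Without the quotes, the shell would expand the pattern first and the script would receive only a list of file names rather than the pattern.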
BTW, it would be nice to have the same thing for the lstmf files.
Bye
Lorenzo
#!/usr/bin/env python
import argparse
import glob
import io
import os
import sys
import unicodedata

from PIL import Image

#
# command line arguments
#
arg_parser = argparse.ArgumentParser(
    description='Creates tesseract box files for given (line) image text pairs')

# Glob pattern for the image files (NOTE: use quotes in the command line to
# prevent shell expansion)
arg_parser.add_argument('-i', '--images', metavar='PATTERN',
                        help='Glob pattern for the image files', required=True)

args = arg_parser.parse_args()

#
# main
#
files = glob.glob(args.images)
for image_name in files:
    # print("Processing:", image_name)

    # load image (only the size is needed)
    with open(image_name, "rb") as f:
        width, height = Image.open(f).size

    # derive the other file names from the image name without its extension,
    # so that a non-.tif image is never overwritten by its own box file
    base_name = os.path.splitext(image_name)[0]

    # load gt
    gt_txt_name = base_name + ".gt.txt"
    with io.open(gt_txt_name, "r", encoding='utf-8') as f:
        lines = f.read().strip().split('\n')

    box_name = base_name + ".box"
    with io.open(box_name, "w", encoding='utf-8') as f:
        for line in lines:
            if len(line) == 0:
                # warn on stderr instead of writing the warning into the box file
                sys.stderr.write("WARNING: line is empty in %s\n" % gt_txt_name)
                continue

            for i in range(1, len(line)):
                char = line[i]
                prev_char = line[i-1]
                if unicodedata.combining(char):
                    # keep a combining mark together with its base character
                    f.write(u"%s %d %d %d %d 0\n" % ((prev_char + char), 0, 0, width, height))
                elif not unicodedata.combining(prev_char):
                    f.write(u"%s %d %d %d %d 0\n" % (prev_char, 0, 0, width, height))
            if not unicodedata.combining(line[-1]):
                f.write(u"%s %d %d %d %d 0\n" % (line[-1], 0, 0, width, height))
            # a tab entry marks the end of the text line
            f.write(u"%s %d %d %d %d 0\n" % ("\t", width, height, width+1, height+1))
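
The per-character logic of the script can be tried out without any image or file. The following is a minimal sketch that mirrors the loop above; `box_lines` is a hypothetical helper name that does not exist in the script or in ocrd-train, and the width and height values are made up for the example.

```python
import unicodedata


def box_lines(line, width, height):
    """Return the box-file entries for one line of ground-truth text.

    Every entry spans the whole line image (0, 0, width, height). A combining
    mark is written together with the character before it, and a final tab
    entry marks the end of the line. Hypothetical helper, for illustration only.
    """
    if not line:
        return []
    entries = []
    for i in range(1, len(line)):
        char = line[i]
        prev_char = line[i - 1]
        if unicodedata.combining(char):
            # combining mark: emit it joined to the previous character
            entries.append("%s %d %d %d %d 0" % (prev_char + char, 0, 0, width, height))
        elif not unicodedata.combining(prev_char):
            # plain character followed by another plain character
            entries.append("%s %d %d %d %d 0" % (prev_char, 0, 0, width, height))
    if not unicodedata.combining(line[-1]):
        entries.append("%s %d %d %d %d 0" % (line[-1], 0, 0, width, height))
    entries.append("%s %d %d %d %d 0" % ("\t", width, height, width + 1, height + 1))
    return entries


if __name__ == "__main__":
    # "e" followed by U+0301 (combining acute accent), then "a"
    for entry in box_lines("e\u0301a", 10, 5):
        print(repr(entry))
```

Like the script, this pairs each combining mark only with the character directly before it, so a base character with several stacked marks is not merged into a single entry.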