This only occurs when converting the hOCR output into plain text.  When I
just run ocroscript rec-tess, everything appears correctly and the text
stays on the correct lines.

ocroscript rec-tess image.png > output.htm

That he did not transmute that thought into
action is known to-day the world over. That,
instead, he went ruggedly forward overcoming all
obstacles, to hew out a new career, greater by

But for some reason, when I then convert the hOCR to text with the
following command, a line break is inserted before the end of each
line, so the last 1 - 2 words of each line end up on a new line of
their own.

ocroscript hocr-to-text output.htm > output.txt

That he did not transmute that thought
into
action is known to-day the world over.
That,
instead, he went ruggedly forward overcoming
all
obstacles, to hew out a new career, greater
by


Why is this occurring and is there any way to prevent it?
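
A possible workaround, sketched here without explaining why hocr-to-text
behaves this way: extract the text from the hOCR file directly and emit
exactly one output line per recognized line.  This assumes the rec-tess
output marks each line with an element whose class is "ocr_line", as the
hOCR format describes; that has not been checked against output.htm.
HocrLineExtractor and hocr_to_text are hypothetical names, not part of
ocropus.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class HocrLineExtractor(HTMLParser):
    """Collect the text of every element whose class includes 'ocr_line'.

    Assumption: the hOCR file marks lines with class="ocr_line".
    """

    def __init__(self):
        super().__init__()
        self.lines = []    # finished lines of text
        self._depth = 0    # nesting depth inside the current ocr_line element
        self._parts = []   # text fragments of the current line

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1
        elif "ocr_line" in classes:
            self._depth = 1
            self._parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse any whitespace (including stray newlines) to single spaces.
            self.lines.append(" ".join("".join(self._parts).split()))

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)


def hocr_to_text(hocr):
    """Return plain text with one output line per ocr_line element."""
    parser = HocrLineExtractor()
    parser.feed(hocr)
    parser.close()
    return "\n".join(parser.lines)


# Usage (file names taken from the commands above):
#   with open("output.htm", encoding="utf-8") as f:
#       text = hocr_to_text(f.read())
```

Any whitespace inside a line element, including a stray newline, is
collapsed to a single space, so line breaks in the result come only from
the line elements themselves.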