It works wonderfully. Could you explain the regex in a bit more detail? I
can't work out what the first regex is matching against.
Thanks again for sharing your wonderful solution!
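
To make the question concrete, here is a minimal sketch that splits the quoted function into its two steps so each pattern can be tried on its own. The class name LineBreakDemo, the helper trimLineEdges, and the sample text are made up for illustration and are not part of VietOCR; only the two patterns come from the quoted function. As far as standard Java regex semantics go, the first pattern matches runs of tabs and spaces that sit right after a newline (or the start of the text) or right before a newline (or the end of the text), and the second matches a newline that has an ordinary character on both sides.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of VietOCR.
class LineBreakDemo {
    // Step 1: tabs/spaces at the start or at the end of a line.
    // Without MULTILINE, ^ and $ only match at the ends of the whole text,
    // so the \n alternatives are what handle the inner lines.
    private static final Pattern EDGE_WHITESPACE =
            Pattern.compile("(?<=\n|^)[\t ]+|[\t ]+(?=$|\n)");

    // Step 2: a newline with a non-line-terminator character on each side,
    // i.e. a break between two non-empty lines. Blank lines are left alone.
    private static final Pattern SINGLE_BREAK =
            Pattern.compile("(?<=.)\n(?=.)");

    // Illustrative helper: applies only the first pattern.
    static String trimLineEdges(String text) {
        return EDGE_WHITESPACE.matcher(text).replaceAll("");
    }

    // Same two replacements as the quoted function, applied in order.
    static String removeLineBreaks(String text) {
        return SINGLE_BREAK.matcher(trimLineEdges(text)).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Made-up sample: the "blank" line holds a stray space, which
        // step 1 removes so that step 2 sees a true empty line.
        String ocr = "Chapter One\n \nA royal-red Ford \n  Crew rolled\nof Albany.";
        System.out.println(removeLineBreaks(ocr));
    }
}
```

Run as is, main should print the heading, an empty line, and then the paragraph on a single line. Two things to watch for: a hyphenated break such as "Super-\nCrew" comes out as "Super- Crew", because step 2 always inserts a space, and paragraphs separated by only a single '\n' get merged, because nothing distinguishes that break from an ordinary line wrap.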
On Saturday, August 9, 2014 9:32:00 AM UTC+8, Quan Nguyen wrote:
>
> It uses a pair of Regex replacements. Here is the Java function that
> does it:
>
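> /**
>  * Removes line breaks within paragraphs.
>  *
>  * @param text the text to clean up
>  * @return the text with single line breaks replaced by spaces
>  */
> public static String removeLineBreaks(String text) {
>     return text
>             // strip tabs and spaces at the start and end of every line
>             .replaceAll("(?<=\n|^)[\t ]+|[\t ]+(?=$|\n)", "")
>             // join lines that are separated by a single line break
>             .replaceAll("(?<=.)\n(?=.)", " ");
> }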
>
>
>
> On Friday, August 8, 2014 10:49:27 AM UTC-5, Bruce wrote:
>>
>> Aww. I tried removing the '\n' line breaks manually, but in some
>> articles a paragraph break is itself just a single '\n', so if I remove
>> those too with a find/replace I lose the paragraph breaks. How did
>> VietOCR solve this issue?
>>
>> On Thursday, August 7, 2014 7:23:27 AM UTC+8, Quan Nguyen wrote:
>>>
>>> I'm afraid not. You can use any programming editor that supports Regex
>>> find/replace to do it for you, or use a tool such as VietOCR
>>> <http://vietocr.sf.net> to remove line breaks from the output text.
>>>
>>> On Wednesday, August 6, 2014 10:51:34 AM UTC-5, Bruce wrote:
>>>>
>>>> For example with the image attached, I get the output:
>>>>
>>>> - Chapter One
>>>> -
>>>> - A royal-red Ford F—150 Super-
>>>> - Crew rolled through the streets
>>>> - of Albany, Georgia. The pickup’s
>>>> - driver brimmed with optimism, so
>>>> - much that he couldn’t possibly
>>>> - foresee the battles about to hit
>>>> - his hometown.
>>>> -
>>>> - Life here is going to be good,
>>>> - thirty—seven—year—old Nathan
>>>> - Hayes told himself. After eight
>>>> - years in Atlanta, Nathan had
>>>> - come home to Albany, three
>>>> - hours south, with his wife and
>>>>
>>>> Is there a way to get the output below instead, without the line
>>>> breaks inside each paragraph?
>>>>
>>>> - Chapter One
>>>> -
>>>> - A royal-red Ford F—150 Super-Crew rolled through the streets of Albany, Georgia. The pickup’s driver brimmed with optimism, so much that he couldn’t possibly foresee the battles about to hit his hometown.
>>>> -
>>>> - Life here is going to be good, thirty—seven—year—old Nathan Hayes told himself. After eight years in Atlanta, Nathan had come home to Albany, three hours south, with his wife and
>>>>
>>>> Thanks in advance!
>>>>
>>>