It works wonderfully. Can you explain the regex in more detail? I can't 
work out what the first regex pattern is matching.

Thanks again for sharing your wonderful solution!!
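
For anyone reading along, here is a rough sketch of how the two replaceAll calls in the quoted function can be read. The patterns are copied from the function below; the class name LineBreakDemo, the helper names trimLineEdges and joinWrappedLines, and the shortened sample text are my own, made up for illustration, and are not part of VietOCR.

```java
public class LineBreakDemo {

    // First pattern: delete runs of tabs/spaces that sit either
    //   - right after a newline or at the very start of the text, or
    //   - right before a newline or at the very end of the text.
    // A line holding only spaces therefore becomes an empty line.
    static String trimLineEdges(String text) {
        return text.replaceAll("(?<=\n|^)[\t ]+|[\t ]+(?=$|\n)", "");
    }

    // Second pattern: replace a newline with a space only when there is a
    // non-newline character on both sides of it. By default '.' does not
    // match '\n', so the two newlines around an empty line are left alone.
    static String joinWrappedLines(String text) {
        return text.replaceAll("(?<=.)\n(?=.)", " ");
    }

    // Same two steps, in the same order as the quoted function.
    static String removeLineBreaks(String text) {
        return joinWrappedLines(trimLineEdges(text));
    }

    public static void main(String[] args) {
        // Hypothetical sample, shortened from the OCR output in this thread.
        String ocr = "Chapter One\n\n"
                + "  driver brimmed with optimism, so  \n"
                + "much that he could not\n"
                + " \n"
                + "Life here is going to be good,\n"
                + "Nathan Hayes told himself.";
        System.out.println(removeLineBreaks(ocr));
    }
}
```

So the first pattern only cleans up whitespace at line edges, and the second one does the actual joining. Note that a paragraph break made of a single '\n' looks the same as a wrapped line to the second pattern, so it would be joined as well; only breaks with an empty (or whitespace-only) line in between survive.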

On Saturday, August 9, 2014 9:32:00 AM UTC+8, Quan Nguyen wrote:
>
> It uses a regular expression. The following is the Java function 
> it uses:
>
>     /**
>      * Removes line breaks within paragraphs.
>      * @param text the text to process
>      * @return the text with single line breaks replaced by spaces
>      */
>     public static String removeLineBreaks(String text) {
>         return text.replaceAll("(?<=\n|^)[\t ]+|[\t ]+(?=$|\n)", "")
>                 .replaceAll("(?<=.)\n(?=.)", " ");
>     }
>
>
>
> On Friday, August 8, 2014 10:49:27 AM UTC-5, Bruce wrote:
>>
>> Aww.. I tried removing the '\n' line breaks manually, but in some 
>> articles a paragraph break is also just a single '\n' line break, so 
>> if I remove those too with find/replace I lose the paragraph breaks. How 
>> did VietOCR solve this issue?
>>
>> On Thursday, August 7, 2014 7:23:27 AM UTC+8, Quan Nguyen wrote:
>>>
>>> I'm afraid not. You can use any programming editor that supports Regex 
>>> find/replace to do it for you, or use a tool such as VietOCR 
>>> <http://vietocr.sf.net> to remove line breaks from the output text.
>>>
>>> On Wednesday, August 6, 2014 10:51:34 AM UTC-5, Bruce wrote:
>>>>
>>>> For example with the image attached, I get the output:
>>>>
>>>>    - Chapter One
>>>>    - 
>>>>    - A royal-red Ford F—150 Super-
>>>>    - Crew rolled through the streets
>>>>    - of Albany, Georgia. The pickup’s
>>>>    - driver brimmed with optimism, so
>>>>    - much that he couldn’t possibly
>>>>    - foresee the battles about to hit
>>>>    - his hometown.
>>>>    - 
>>>>    - Life here is going to be good,
>>>>    - thirty—seven—year—old Nathan
>>>>    - Hayes told himself. After eight
>>>>    - years in Atlanta, Nathan had
>>>>    - come home to Albany, three
>>>>    - hours south, with his wife and
>>>>
>>>> Is there a way to get output like the following, without the line breaks 
>>>> within a paragraph?
>>>>
>>>>    - Chapter One
>>>>    - 
>>>>    - A royal-red Ford F—150 Super-Crew rolled through the streets of Albany, Georgia. The pickup’s driver brimmed with optimism, so much that he couldn’t possibly foresee the battles about to hit his hometown.
>>>>    - 
>>>>    - Life here is going to be good, thirty—seven—year—old Nathan Hayes told himself. After eight years in Atlanta, Nathan had come home to Albany, three hours south, with his wife and
>>>>    
>>>> Thanks in advance!
>>>>
>>>
