Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-26 Thread adarsh
Thanks a lot, shree.

On Monday, February 26, 2018 at 2:04:04 PM UTC+5:30, shree wrote:
>
> try
>
> -c page_separator="\n"
>
> or the code for CRLF
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/2b71683b-5b16-4b85-bda7-c3de26bbae86%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-26 Thread ShreeDevi Kumar
Try

-c page_separator="\n"

or the code for CRLF
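
One way to sidestep shell quoting is to launch Tesseract from a script and pass a real newline character in the argument. The sketch below is an illustration under assumptions, not something confirmed in this thread: build_command, run_tesseract, page.png and out are hypothetical names, and it assumes a tesseract binary on the PATH that accepts -c page_separator=... as used above. Whether Tesseract itself expands the two-character sequence \n is not established here, so the code passes the actual newline (or CRLF) character instead.

```python
import subprocess


def build_command(image, output_base, separator="\n"):
    """Return an argv list that sets page_separator to `separator`.

    The list is meant to be run without a shell, so the separator
    reaches Tesseract exactly as written here (no quoting or escaping).
    """
    return ["tesseract", image, output_base, "-c", f"page_separator={separator}"]


def run_tesseract(image, output_base, separator="\n"):
    """Run the command; requires a tesseract binary on the PATH (assumption)."""
    subprocess.run(build_command(image, output_base, separator), check=True)


# Hypothetical usage:
#   run_tesseract("page.png", "out")           # newline between pages
#   run_tesseract("page.png", "out", "\r\n")   # CRLF between pages
```

Passing separator="" would correspond to the empty page_separator mentioned elsewhere in this thread.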



Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-25 Thread adarsh
Can you please suggest a way to print a newline instead of FF? I am able to 
print any character other than form feed by using the
-c page_separator="Hello" option, but I don't know how to print a newline.

Thanks in advance. 

Regards
Adarsh

On Friday, February 23, 2018 at 6:03:32 PM UTC+5:30, shree wrote:
>
> Probably FF.
>
> Tesseract adds a page break (normally form feed) by default.
>
> It is still possible to suppress page breaks by setting an empty
> page_separator.
>
>
> ShreeDevi
> 
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Fri, Feb 23, 2018 at 12:29 PM, wrote:
>
>> Is there any way to remove the End of page symbol that appears in the 
>> image? It looks like a box with some 000c written at the end. 
>>
>> Regards
>> Adarsh
>>
>>
>> On  Thursday, February 22, 2018 at 4:22:21 PM UTC+5:30, shree wrote:
>>>
>>> What --psm are you using?
>>>
>>> Tesseract might be treating the last portion as a different column.
>>>
>>> Try PSM 4 or 6.
>>>
>>> On 22-Feb-2018 3:48 PM,  wrote:
>>>>
>>>> The issue I am facing is that when I scan a file that has column data
>>>> separated by "|" (OR), then, in a single line, Tesseract prints the
>>>> last column's data after the last line of the file.
>>>> I'll be attaching the image for your reference. I hope to receive some
>>>> help soon. The output image shows the discrepancy on the last line.
>>>>
>>>> Can anyone suggest a solution? @shree, much help needed.



Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-23 Thread ShreeDevi Kumar
Probably FF.

Tesseract adds a page break (normally form feed) by default.

It is still possible to suppress page breaks by setting an empty
page_separator.
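
The box labelled 000c described in the question is probably how a text viewer renders the form feed character (U+000C). If re-running OCR with an empty page_separator is not convenient, existing output can also be cleaned up afterwards. A minimal sketch, assuming the OCR output has already been read into a string; replace_page_breaks is a hypothetical helper, not part of Tesseract:

```python
FORM_FEED = "\x0c"  # U+000C, the default page separator described above


def replace_page_breaks(text, replacement=""):
    """Replace every form feed in OCR output with `replacement`.

    The default removes the character; pass "\n" to get a newline instead.
    """
    return text.replace(FORM_FEED, replacement)
```

Calling replace_page_breaks(text, "\n") covers the case where a newline is wanted in place of the page break.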


ShreeDevi

भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Fri, Feb 23, 2018 at 12:29 PM,  wrote:

> Is there any way to remove the End of page symbol that appears in the
> image? It looks like a box with some 000c written at the end.
>
> Regards
> Adarsh
>
>
> On  Thursday, February 22, 2018 at 4:22:21 PM UTC+5:30, shree wrote:
>>
>> What --psm are you using?
>>
>> Tesseract might be treating the last portion as a different column.
>>
>> Try PSM 4 or 6.
>>
>> On 22-Feb-2018 3:48 PM,  wrote:
>>
>>> The issue I am facing is that when I scan a file that has column data
>>> separated by "|" (OR), then, in a single line, Tesseract prints the
>>> last column's data after the last line of the file.
>>> I'll be attaching the image for your reference. I hope to receive some
>>> help soon. The output image shows the discrepancy on the last line.
>>>
>>> Can anyone suggest a solution? @shree, much help needed.



Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-22 Thread adarsh
Is there any way to remove the End of page symbol that appears in the 
image? It looks like a box with some 000c written at the end. 

Regards
Adarsh


On  Thursday, February 22, 2018 at 4:22:21 PM UTC+5:30, shree wrote:
>
> What --psm are you using?
>
> Tesseract might be treating the last portion as a different column.
>
> Try PSM 4 or 6.
>
> On 22-Feb-2018 3:48 PM, > wrote:
>
>> The issue I am facing is that when I scan a file that has column data
>> separated by "|" (OR), then, in a single line, Tesseract prints the
>> last column's data after the last line of the file.
>> I'll be attaching the image for your reference. I hope to receive some
>> help soon. The output image shows the discrepancy on the last line.
>>
>> Can anyone suggest a solution? @shree, much help needed.



Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-22 Thread adarsh
@shree

You are awesome. 

Your suggestion solved the problem straight away. You are awesome, man. 
I really appreciate your help. You have responded whenever I needed it. :)

Keep up the good work.

Regards
Adarsh SHUKLA

On Thursday, February 22, 2018 at 4:22:21 PM UTC+5:30, shree wrote:
>
> What --psm are you using?
>
> Tesseract might be treating the last portion as a different column.
>
> Try PSM 4 or 6.
>
> On 22-Feb-2018 3:48 PM, > wrote:
>
>> The issue I am facing is that when I scan a file that has column data
>> separated by "|" (OR), then, in a single line, Tesseract prints the
>> last column's data after the last line of the file.
>> I'll be attaching the image for your reference. I hope to receive some
>> help soon. The output image shows the discrepancy on the last line.
>>
>> Can anyone suggest a solution? @shree, much help needed.



Re: [tesseract-ocr] Tesseract is giving column data on the last line of file

2018-02-22 Thread ShreeDevi Kumar
What --psm are you using?

Tesseract might be treating the last portion as a different column.

Try PSM 4 or 6.
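
As a rough illustration of the suggestion above, the page segmentation mode can be set with --psm. Per the tesseract --help-psm listing, mode 4 assumes a single column of text of variable sizes and mode 6 assumes a single uniform block of text. The helper name psm_command and the file names are hypothetical, and the sketch only builds the command line; it does not claim which mode will suit a particular image.

```python
# Modes suggested in this thread, with descriptions from `tesseract --help-psm`.
SUGGESTED_PSM = {
    4: "Assume a single column of text of variable sizes.",
    6: "Assume a single uniform block of text.",
}


def psm_command(image, output_base, psm):
    """Return an argv list that runs Tesseract with the given --psm value."""
    return ["tesseract", image, output_base, "--psm", str(psm)]


# Hypothetical usage: print one command per suggested mode so the outputs
# can be compared side by side.
for mode, description in SUGGESTED_PSM.items():
    print(" ".join(psm_command("table.png", f"out_psm{mode}", mode)), "#", description)
```

Comparing the two output files shows whether the last column is still emitted after the rest of the text.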

On 22-Feb-2018 3:48 PM,  wrote:

> The issue I am facing is that when I scan a file that has column data
> separated by "|" (OR), then, in a single line, Tesseract prints the
> last column's data after the last line of the file.
> I'll be attaching the image for your reference. I hope to receive some
> help soon. The output image shows the discrepancy on the last line.
>
> Can anyone suggest a solution? @shree, much help needed.
