Re: [AI] OCR conversion results not accurate

2017-11-26 Thread Ajay Minocha
Hi,

I can say from experience that OpenBook works particularly well with
pages that have multiple columns. I personally don't like the way
FineReader handles those kinds of pages, so I use OpenBook for that
purpose.

HTH,
Ajay


Re: [AI] OCR conversion results not accurate

2017-11-26 Thread Leikhu Laishram Leingakpa
Dear Rahul,
I face the same problem whenever a page has two or more columns.
You could try converting to a searchable-text PDF instead of DOCX or
RTF and see whether that helps.

Leikhu Laishram


Re: [AI] OCR conversion results not accurate

2017-11-25 Thread mukesh jain
Hello,
I would also like to test them, as I have so far scanned my law books
myself.

So, if convenient, you may send a few of them to my email.


-- 
Regards,
Mukesh jain
Email:
mukesh.jai...@gmail.com
mukeshheerachandj...@ntpc.co.in
Skype: mukeshjain211
Mob: 09977165123
"Face your deficiencies and acknowledge them; but do not let them
master you. Let them teach you patience, sweetness, insight. "

Helen Keller

The list has now migrated to www.accessindia.inclusivehabitat.in

You should now post to the id: a...@accessindia.inclusivehabitat.in




Search for old postings at:
http://www.mail-archive.com/accessindia@accessindia.org.in/

To unsubscribe send a message to
accessindia-requ...@accessindia.org.in
with the subject unsubscribe.

To change your subscription to digest mode or make any other changes, please 
visit the list home page at
http://accessindia.org.in/mailman/listinfo/accessindia_accessindia.org.in


Disclaimer:
1. Contents of the mails, factual, or otherwise, reflect the thinking of the 
person sending the mail and AI in no way relates itself to its veracity;

2. AI cannot be held liable for any commission/omission based on the mails sent 
through this mailing list..


Re: [AI] OCR conversion results not accurate

2017-11-25 Thread Sameer


Dear Friend,

Kindly send a sample PDF to my email ID, sameer.la...@gmail.com

Regards
Mr. Sameer Latey
Mumbai, India


[AI] OCR conversion results not accurate

2017-11-25 Thread Rahul Bajaj
Hi Everyone,

I am sure this has been discussed multiple times on the list before,
but I'd nonetheless be grateful if someone could help me with this.
I use Fine Reader 11. More than 60% of the documents I receive at work
are inaccessible PDFs which I have to convert into Word.
While I am able to obtain reasonably accurate results, the text
appears in a badly jumbled fashion in some portions of the
document. For instance, the definitions clause of many contracts
appears in the result in such a way that all the terms appear together
and all the definitions appear together, so it becomes difficult to
figure out which definition relates to which term.
Further, some paragraphs are often incomplete. Clause numbers are often
missing. This high rate of inaccuracy makes one doubt even the
correctness of the portions that do appear properly.
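
The symptom described above (all terms together, then all definitions
together) is consistent with a reading-order problem: a two-column
definitions table being read one column at a time instead of one row at a
time. The following is only a minimal sketch of that idea in Python. The
block coordinates, the sample text, and the function names column_first and
row_first are hypothetical, and this is not a description of how Fine
Reader or any other OCR engine actually works.

```python
from typing import List, Tuple

# Each OCR "block" is (x, y, text): the left and top position of a text
# region on the page. The coordinates and the sample text are made up.
Block = Tuple[int, int, str]

blocks: List[Block] = [
    (50, 100, '"Agreement"'),
    (300, 100, "means this contract."),
    (50, 140, '"Effective Date"'),
    (300, 140, "means the date of signing."),
    (50, 180, '"Party"'),
    (300, 180, "means a signatory to this contract."),
]


def column_first(blocks: List[Block]) -> List[str]:
    """Read the whole left column, then the whole right column."""
    return [text for _, _, text in sorted(blocks, key=lambda b: (b[0], b[1]))]


def row_first(blocks: List[Block], tolerance: int = 10) -> List[str]:
    """Group blocks whose tops lie within `tolerance` of the first block in
    the current row, then read each row from left to right."""
    rows: List[List[Block]] = []
    for block in sorted(blocks, key=lambda b: b[1]):
        if rows and abs(rows[-1][0][1] - block[1]) <= tolerance:
            rows[-1].append(block)
        else:
            rows.append([block])
    return [" ".join(text for _, _, text in sorted(row)) for row in rows]


if __name__ == "__main__":
    print("Column-first (terms and definitions separated):")
    for line in column_first(blocks):
        print("  " + line)
    print("Row-first (each term next to its definition):")
    for line in row_first(blocks):
        print("  " + line)
```

Column-first order lists the three terms and then the three definitions,
which matches the jumbled output described above; row-first order keeps
each term beside its definition.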

Will migrating to the latest version of Fine Reader help with this, or
is this attributable to inherent weaknesses of the OCR process?
Further, would switching over to another OCR engine produce better
results?
If so, which OCR engine should I use, and where might I be able to get it?
I'd be happy to send a couple of sample documents to those of you
using OCR engines apart from Fine Reader which you think would work
better.



Best,
Rahul
