Re: [AI] OCR conversion results not accurate
Hi,

I can say from my experience that OpenBook works particularly well with pages that have multiple columns. I personally don't like the way FineReader handles those kinds of pages, so I use OpenBook for that purpose.

HTH,
Ajay

On 26/11/2017, Leikhu Laishram Leingakpa <laishram...@gmail.com> wrote:
> Dear Rahul
> I too face the same problem whenever there are two or more columns on
> a page. Maybe you could try converting into a searchable-text PDF
> instead of DOCX or RTF.
>
> Leikhu Laishram
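To illustrate why multi-column pages tend to come out jumbled, here is a minimal sketch in Python. It is not how FineReader or OpenBook work internally; the `Word` class, the `reading_order` function, and the `gutter_x` parameter are all hypothetical names invented for this illustration. It assumes the OCR step has already produced word boxes with coordinates and that the page has exactly two columns split at a known x position.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word box
    y: float  # top edge of the word box (grows downward)


def reading_order(words, gutter_x):
    """Return words column by column for a two-column page.

    `gutter_x` is an assumed, hand-supplied x-coordinate of the gap
    between the columns; a real engine has to detect this itself.
    """
    left = [w for w in words if w.x < gutter_x]
    right = [w for w in words if w.x >= gutter_x]
    by_position = lambda w: (w.y, w.x)
    return sorted(left, key=by_position) + sorted(right, key=by_position)


# Invented sample: two lines of text in each of two columns.
words = [
    Word("Left-1", 10, 0), Word("Right-1", 310, 0),
    Word("Left-2", 10, 20), Word("Right-2", 310, 20),
]

# Reading straight across the page mixes the two columns together.
naive = [w.text for w in sorted(words, key=lambda w: (w.y, w.x))]
# naive == ["Left-1", "Right-1", "Left-2", "Right-2"]

# Reading one column at a time keeps each column's text together.
ordered = [w.text for w in reading_order(words, gutter_x=300)]
# ordered == ["Left-1", "Left-2", "Right-1", "Right-2"]
```

Which of the two orders is correct depends on the layout: for newspaper-style columns the column-by-column order is right, while for a two-column table the row-by-row order is right, which may be part of why different engines give different results on the same page.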
Re: [AI] OCR conversion results not accurate
Dear Rahul,

I too face the same problem whenever there are two or more columns on a page. Maybe you could try converting into a searchable-text PDF instead of DOCX or RTF.

Leikhu Laishram

On 11/25/17, mukesh jain <mukesh.jai...@gmail.com> wrote:
> Hello,
> I would also like to test them, as I have so far scanned my law books myself.
>
> So, if convenient, you may send a few of them to my email.
Re: [AI] OCR conversion results not accurate
Hello,

I would also like to test them, as I have so far scanned my law books myself. So, if convenient, you may send a few of them to my email.

On 11/25/17, Sameer <sameer.la...@gmail.com> wrote:
> Dear Friend,
>
> Kindly send me a sample PDF at my email id, sameer.la...@gmail.com
>
> Regards
> Mr. Sameer Latey
> Mumbai, India

--
Regards,
Mukesh jain
Email:
mukesh.jai...@gmail.com
mukeshheerachandj...@ntpc.co.in
Skype: mukeshjain211
Mob: 09977165123
"Face your deficiencies and acknowledge them; but do not let them master you. Let them teach you patience, sweetness, insight."
Helen Keller

The list has now migrated to www.accessindia.inclusivehabitat.in
You should now post to the id: a...@accessindia.inclusivehabitat.in

Search for old postings at:
http://www.mail-archive.com/accessindia@accessindia.org.in/

To unsubscribe send a message to
accessindia-requ...@accessindia.org.in
with the subject unsubscribe.

To change your subscription to digest mode or make any other changes, please visit the list home page at
http://accessindia.org.in/mailman/listinfo/accessindia_accessindia.org.in

Disclaimer:
1. Contents of the mails, factual, or otherwise, reflect the thinking of the person sending the mail and AI in no way relates itself to its veracity;
2. AI cannot be held liable for any commission/omission based on the mails sent through this mailing list.
Re: [AI] OCR conversion results not accurate
Dear Friend,

Kindly send me a sample PDF at my email id, sameer.la...@gmail.com

Regards,
Mr. Sameer Latey
Mumbai, India

-Original Message-
From: Rahul Bajaj
Sent: 25 November, 2017 3:03 PM
To: accessindia@accessindia.org.in
Subject: [AI] OCR conversion results not accurate
[AI] OCR conversion results not accurate
Hi Everyone,

I am sure this has been discussed multiple times on the list before, but I'd nonetheless be grateful if someone could help me with this.

I use Fine Reader 11. More than 60% of the documents I receive at work are inaccessible PDFs which I have to convert into Word. While I am able to obtain reasonably accurate results, the text shows up badly jumbled in some portions of the document. For instance, the definitions clause of a lot of contracts appears in the result in such a way that all the terms appear together and all the definitions appear together, so it becomes difficult to figure out which definition relates to which term. Further, some paras are often incomplete, and clause numbers are often missing. This high rate of inaccuracy makes one doubt even the correctness of the portions that do appear properly.

Will migrating to the latest version of Fine Reader help with this, or is this attributable to inherent weaknesses of the OCR process? Further, would switching over to another OCR engine produce better results? If so, which OCR engine should I use, and where might I be able to get it? I'd be happy to send a couple of sample documents to those of you using OCR engines apart from Fine Reader which you think would work better.

Best,
Rahul
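For the definitions-clause problem described above, a rough workaround can be sketched in Python. This is only a minimal sketch under strong assumptions: that the jumbled output lists every term first and every definition afterwards, that both runs are in the same order, and that nothing was dropped. The function name `pair_terms` and the sample text are invented for illustration and do not come from any OCR product. Since the post also mentions missing and incomplete text, the result would still need checking against the original PDF.

```python
def pair_terms(lines):
    """Pair the first half of `lines` (terms) with the second half (definitions).

    Assumes the OCR output kept both runs in the same order and lost nothing;
    an odd number of lines means that assumption is already broken.
    """
    if len(lines) % 2 != 0:
        raise ValueError("expected an equal number of terms and definitions")
    half = len(lines) // 2
    return list(zip(lines[:half], lines[half:]))


# Invented sample of jumbled output: terms first, then definitions.
jumbled = [
    '"Agreement"',
    '"Business Day"',
    "means this contract.",
    "means a day on which banks are open.",
]

pairs = pair_terms(jumbled)
for term, definition in pairs:
    print(term, definition)
```

If the counts do not match, the function raises an error rather than guessing, because a silently shifted pairing would attach the wrong definition to every term that follows.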