I got similar poor results with v5.0.1.20220118:

Table of Contents
LD. [introduction ccc cccccccccccccescescescescessessessessessessessessessessesseeesseseecescseceeseeseesiscasesseseeeseesees
1.1. Purpose of this document... ccc ccc esceecsesecsesceecsesensescseceesecsesciesseecseensssenesseeenans
1.2. U.S. Electronic Submission Back ground 0.0.0.0... ccc ce ccececsesceeceetetseeeeceeenseeenecseneneeeeness
1.3. CDS cccccccccccccseccessescsecsecsecsesecsessessessesesscsecsescseseesessessessesesssecsecsessesessessessesessseesesaees
1.3.1. Operational DataModel (ODM)... ccc ccecceteseseteseseseseseseseseseseseseseseeerereeesereneees
1.3.2. Study Data Tabulation M odel (SDTM) .0....ccccccccecececeseseteseeseresesesesesesereeerereaetes
1.3.3. Analysis Dataset Model (ADaM) .o.......ccccccccccccccccccccccsccecenessceecestseceeerstsceeeensusceeeesas
# # # # # # # # # # # #
In other contexts I have gotten gibberish like this with rows of dots, or even standard ellipses (...).

Your original is very low res, which may be the issue. In my case I'm working with scanned microfilm, a less than ideal source.

Good luck!

On Friday, July 8, 2022 at 1:20:40 AM UTC-4 [email protected] wrote:

> Hi
> I have a simple image and I tried tesseract ocr 4.x (with eng language) but it didn't detect text properly.
> [image: test.bmp]
> The OCR Result is like the below. Especially page number separator "..." and page number is wrong.
> Is there any way to improve accuracy for this?
> And I feel it takes time to OCR a full page image. Is there any option to make it faster?
> Thanks
> Jason
> ---
> Table of Contents
>
> LD. [introductions eee cccecceecsesensesenecsecensesenecseceeseeeecseceeseseecnecsesesereneceesesereneceeenseseneseeneneeeen
> 1.1. Purpose of this document... ccc ccc ceecsetecsesceecseecsescsecsesescscsececneseeenesseessseenans
> 1.2. U.S. Electronic Submission Back ground 00.0.0... ccc ee ccetecseseeeceetetseeeneeeeenseseneceeeneeeenees
> 1.3. CDISC i ccceccccesecsteseststessnsessssessssssecssseecisseecnsseecisseecesseecesseecasseecesseesesseecesseeeaseseeesasess
> 1.3.1. Operational DataModel (ODM).......0ccccccececeteseeesesesesesesesesesereseseresesesereseneeeneees
> 1.3.2. Study Data Tabulation M odel (SDTM) ou... cccccceececeseeteseseeresesesesesererseereneeeees
> 1.3.3. Analysis Dataset Model (ADaM) .u.......ccccccccccccccccccsecccceceneseceecestseceecestaceeeensuseeenenaas
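For table-of-contents pages like the one quoted above, one possible workaround is to clean up the leader-dot noise after OCR instead of trying to prevent it. The following is only a rough sketch under stated assumptions, not something Tesseract provides: the names `strip_leader_noise`, `is_noise`, and `DOT_LOOKALIKES` are made up for illustration, the thresholds are guesses based on the output in this thread, and the heuristic only makes sense for table-of-contents lines (it would also cut an ordinary sentence at an ellipsis). It does not recover the page numbers, which were already lost in the OCR output.

```python
import re

# Characters that seem to show up in place of a leader dot. This is an
# assumption taken from the output in this thread ("0.0.0.0...", ".o.......", "ou...").
DOT_LOOKALIKES = set(".0oOu")


def is_noise(token: str) -> bool:
    """Return True if a token looks like a misread run of leader dots."""
    # The same character repeated three or more times, e.g. "ccc".
    if len(token) >= 3 and len(set(token)) == 1:
        return True
    # A very long "word" built from few distinct characters, e.g. "cccescesces...".
    # The 20 and 0.5 thresholds are guesses, not tuned values.
    return len(token) >= 20 and len(set(token)) / len(token) < 0.5


def strip_leader_noise(line: str) -> str:
    """Keep the title part of one table-of-contents line and drop the leader."""
    kept = []
    for token in line.split():
        if is_noise(token):
            break
        dots = re.search(r"\.{2,}", token)
        if dots:
            head = token[: dots.start()]
            # Keep real text glued to the leader ("document..."), but drop
            # fragments made only of dot lookalikes ("0.0.0.0...").
            if head and not set(head) <= DOT_LOOKALIKES:
                kept.append(head)
            break
        kept.append(token)
    return " ".join(kept)
```

Calling `strip_leader_noise` on each line of the OCR text keeps section numbers such as "1.3.1." intact, because they never contain two dots in a row, and stops at the first token that looks like leader noise. Misreads inside the title itself (for example "LD." or "M odel") are left as they are.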

