Re: [tesseract-ocr] How to get linewise/ row-wise output rather than column wise in hOCR output

2020-07-13 Thread Deepak Sen
How can I increase the accuracy? I am getting less accurate results/detections.
On Mon, Jul 13, 2020, 10:09 PM Deepak Sen wrote:
> Thanks
>
> On Mon, Jul 13, 2020, 10:08 PM Shree Devi Kumar wrote:
>> Use --psm 6
>>
>> Page segmentation mode instead of the default
>>
>> On Mon, Jul 13, 2020,

Re: [tesseract-ocr] How to get linewise/ row-wise output rather than column wise in hOCR output

2020-07-13 Thread Deepak Sen
Thanks
On Mon, Jul 13, 2020, 10:08 PM Shree Devi Kumar wrote:
> Use --psm 6
>
> Page segmentation mode instead of the default
>
> On Mon, Jul 13, 2020, 22:05 Deepak Sen wrote:
>> Hi,
>> I am using latest Tesseract version and getting the hOCR output of a
>> table where line no of (column2,

Re: [tesseract-ocr] How to get linewise/ row-wise output rather than column wise in hOCR output

2020-07-13 Thread Shree Devi Kumar
Use --psm 6 (page segmentation mode 6) instead of the default.
On Mon, Jul 13, 2020, 22:05 Deepak Sen wrote:
> Hi,
> I am using latest Tesseract version and getting the hOCR output of a table
> where line no of (column2, row1) is not line-1 so what i want is Tesseract
> first goes through all the
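For readers who drive Tesseract from a script, the suggestion above can be sketched in Python as follows. This is only an illustration, not something posted in the thread: the function names and the file names in the usage note are hypothetical, and it assumes a `tesseract` executable on the PATH. Mode 6 tells Tesseract to assume a single uniform block of text, which tends to make it read a table line by line across the full width instead of column by column.

```python
import shutil
import subprocess


def build_hocr_command(image_path, output_base, psm=6):
    """Return the argument list for a Tesseract hOCR run.

    Equivalent to: tesseract <image> <outputbase> --psm <psm> hocr
    The trailing "hocr" is the stock config name that switches output to hOCR.
    """
    return ["tesseract", image_path, output_base, "--psm", str(psm), "hocr"]


def run_hocr(image_path, output_base, psm=6):
    """Run Tesseract and return the expected output path (hypothetical helper)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_hocr_command(image_path, output_base, psm), check=True)
    # Assumption: recent releases write <outputbase>.hocr; older ones may differ.
    return output_base + ".hocr"
```

For example, `run_hocr("table.png", "table")` would invoke `tesseract table.png table --psm 6 hocr`. Whether the resulting line order matches the rows of a given table still depends on the image, so the output is worth checking.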

[tesseract-ocr] How to get linewise/ row-wise output rather than column wise in hOCR output

2020-07-13 Thread Deepak Sen
Hi, I am using the latest Tesseract version and getting the hOCR output of a table where the line number of (column 2, row 1) is not line 1. Tesseract first goes through all the rows in column 1 and then moves on to column 2, but I want it to go row by row: row 1 (all columns), then row 2 (all columns). Thanks, I hope

Re: [tesseract-ocr] How to exclude some symbols from recognizing?

2020-07-13 Thread Shree Devi Kumar
Search for whitelist / blacklist.
On Mon, Jul 13, 2020, 17:24 Владимир Калачихин wrote:
> Subj
> Numbers, for example.
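The whitelist / blacklist hint refers to Tesseract's `tessedit_char_whitelist` and `tessedit_char_blacklist` config variables, which can be set on the command line with `-c`. A minimal Python sketch for excluding digits might look like this; the function name and file names are hypothetical and not from the thread, and how well the blacklist is honoured can depend on the Tesseract version and OCR engine in use.

```python
def build_blacklist_command(image_path, output_base, excluded_chars="0123456789"):
    """Return the argument list for a Tesseract run that blacklists some characters.

    Equivalent to:
        tesseract <image> <outputbase> -c tessedit_char_blacklist=<chars>
    """
    return [
        "tesseract",
        image_path,
        output_base,
        "-c",
        f"tessedit_char_blacklist={excluded_chars}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where Tesseract is not installed.
    print(" ".join(build_blacklist_command("scan.png", "scan")))
```

The list can be passed to `subprocess.run(..., check=True)` once Tesseract is installed. Using `tessedit_char_whitelist` instead restricts recognition to the listed characters only.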

[tesseract-ocr] How to exclude some symbols from recognizing?

2020-07-13 Thread Владимир Калачихин
See subject. Numbers, for example.
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com. To view this discussion on the web visit

Re: [tesseract-ocr] Tesseract-OCR Training Arabic text & numbers

2020-07-13 Thread Eliyaz L
Thanks for the support; it saves a lot of time and effort. I tried the latest Tesseract and it is working fine with Arabic text and numbers. The only issue is with Arabic dates, so if the issue is still open, can I prepare a dataset and train a separate custom model for only numbers and dates? If

Re: [tesseract-ocr] Looking for segmentation algorithm implementations and (G)UIs

2020-07-13 Thread Shree Devi Kumar
Good collection of segmentation algorithms. Dan Bloomberg updated the segmentation algorithms in Leptonica some time back. You may want to take a look at those too. Tesseract also uses Leptonica, but older algorithms, I think.
On Sat, Jul 11, 2020 at 9:19 PM Rainer Verteidiger <