Re: [tesseract-ocr] Page segmentation and preserve_interword_space are not working

2017-07-26 Thread Prav
Thanks for the reply. The TSV output also lists the data column by column: all of column 1, then column 2, and finally column 3, one below the other. I cannot figure out how to construct a table from the TSV.

On Wednesday, July 26, 2017 at 11:26:18 PM UTC+5:30, shree wrote:
> Try 'tsv' instead of 'hocr'
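One possible way to rebuild rows from TSV output is to ignore the order in which the words are listed and group them by their coordinates instead. The following is only a sketch. It assumes the usual Tesseract TSV header (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text), with level 5 marking word entries. The names `tsv_to_rows` and `row_tolerance` are hypothetical, and the 10-pixel tolerance is a guess that would need tuning for a real image.

```python
import csv
import io


def tsv_to_rows(tsv_text, row_tolerance=10):
    """Group word entries from Tesseract TSV text into table rows.

    Assumption: words whose 'top' values differ by at most
    row_tolerance pixels belong to the same table row.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for rec in reader:
        text = (rec.get("text") or "").strip()
        # Keep only non-empty word-level entries (level 5).
        if rec.get("level") != "5" or not text:
            continue
        words.append((int(rec["top"]), int(rec["left"]), text))

    # Sort top to bottom, then cluster words with similar 'top' values.
    words.sort()
    rows = []
    for top, left, text in words:
        if rows and abs(top - rows[-1]["top"]) <= row_tolerance:
            rows[-1]["cells"].append((left, text))
        else:
            rows.append({"top": top, "cells": [(left, text)]})

    # Within each row, order the cells left to right.
    return [[text for _, text in sorted(row["cells"])] for row in rows]


if __name__ == "__main__":
    header = ["level", "page_num", "block_num", "par_num", "line_num",
              "word_num", "left", "top", "width", "height", "conf", "text"]
    # Made-up sample data in column-by-column order, as described above.
    sample = [
        ["5", "1", "1", "1", "1", "1", "10", "10", "40", "12", "95", "Name"],
        ["5", "1", "1", "1", "2", "1", "10", "40", "40", "12", "95", "Alice"],
        ["5", "1", "1", "1", "3", "1", "10", "70", "40", "12", "95", "Bob"],
        ["5", "1", "2", "1", "1", "1", "200", "12", "40", "12", "95", "Age"],
        ["5", "1", "2", "1", "2", "1", "200", "41", "40", "12", "95", "30"],
        ["5", "1", "2", "1", "3", "1", "200", "69", "40", "12", "95", "25"],
    ]
    tsv_text = "\n".join("\t".join(r) for r in [header] + sample)
    for row in tsv_to_rows(tsv_text):
        print(row)
```

This sketch treats every word as its own cell. Cells that contain several words would additionally need a horizontal gap threshold to merge neighbouring words, which is not shown here.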

Re: [tesseract-ocr] Page segmentation and preserve_interword_space are not working

2017-07-26 Thread ShreeDevi Kumar
Try 'tsv' instead of 'hocr'

ShreeDevi
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Wed, Jul 26, 2017 at 10:30 PM, Prav wrote:
> Hi,
>
> I am trying to extract tabular data. For this I am converting the
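For illustration, the suggestion above amounts to passing the `tsv` config name to the `tesseract` command where `hocr` was used before. Here is a minimal sketch in Python, assuming a Tesseract build that ships the `tsv` config and a `tesseract` executable on the PATH. The helper names and the file names are hypothetical.

```python
import subprocess


def tesseract_tsv_command(image_path, output_base):
    """Build the command line: tesseract <image> <outputbase> tsv."""
    # The trailing 'tsv' selects the TSV output config in place of 'hocr'.
    return ["tesseract", image_path, output_base, "tsv"]


def run_tesseract_tsv(image_path, output_base):
    """Run Tesseract and return the path of the TSV file it writes."""
    subprocess.run(tesseract_tsv_command(image_path, output_base), check=True)
    # Tesseract appends the extension to the output base name.
    return output_base + ".tsv"


if __name__ == "__main__":
    # Hypothetical file names; printing the command needs no Tesseract install.
    print(" ".join(tesseract_tsv_command("table.png", "table_out")))
```

Each word row of the resulting file carries bounding-box coordinates, so the table layout can be recovered from positions even when the words are listed column by column.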

[tesseract-ocr] Page segmentation and preserve_interword_space are not working

2017-07-26 Thread Prav
Hi,

I am trying to extract tabular data. To do this, I am converting the image to hOCR, but the hOCR output is not ordered correctly: it lists all the data for one column first and then the data for the next. I do not get the data arranged by row and column, so the extraction does not come out as a proper table. I