Normally, for plain text output, the other config files (hocr, pdf, tsv, unlv) should not have any effect.
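
For anyone who hits the same thing, here is a minimal Python sketch (not part of the original exchange) that lists which files in a configs directory set tessedit_pageseg_mode, so you can see what might be overriding the command line value. The function name find_psm_overrides is hypothetical, and the sketch assumes each config line has the form "name value" separated by whitespace.

```python
from pathlib import Path

# Parameter named in the thread; the config files are assumed to hold
# one "<name> <value>" pair per line.
PARAM = "tessedit_pageseg_mode"


def find_psm_overrides(configs_dir):
    """Return {config file name: value} for every file that sets PARAM."""
    overrides = {}
    for path in sorted(Path(configs_dir).iterdir()):
        if not path.is_file():
            continue
        for line in path.read_text(errors="replace").splitlines():
            parts = line.split()
            # Skip blank lines and lines that set some other parameter.
            if len(parts) >= 2 and parts[0] == PARAM:
                overrides[path.name] = parts[1]
    return overrides
```

Call it with the path to your own tessdata/configs directory (the location depends on the install) and print the returned dictionary. Any file it reports is a candidate for the edit described in the quoted message below.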


- excuse the brevity, sent from mobile

On 07-Apr-2017 2:18 AM, "Mike Hall" <hall.mi...@gmail.com> wrote:

> Yes, we are using the -psm 6 command line argument.  And it was not
> working.
>
> But I figured out the issue.
>
> Tesseract has a set of config files. Inside several of these config files
> (hocr, pdf, tsv, unlv) is the setting *tessedit_pageseg_mode*. This
> setting was set to 1 in all the config files.   Once I removed the
> *tessedit_pageseg_mode* parameter from the config files, our command line
> argument of -psm 6 worked.
>
> Alternatively, I did experiment with the config files.  When I changed the
> *tessedit_pageseg_mode* setting to 6 in all the config files and ran
> Tesseract with the -psm 6 command line argument, it also worked.
>
> Thanks
>
> On Thursday, April 6, 2017 at 1:12:18 PM UTC-5, shree wrote:
>
>> Have you tried --psm 6?
>>
>> - excuse the brevity, sent from mobile
>>
>> On 06-Apr-2017 11:06 PM, "Mike Hall" <hall....@gmail.com> wrote:
>>
>>> We have a C# .Net app that is using Tesseract to do Optical Character
>>> Recognition (OCR) on .tiff files.  I've attached a sample tiff file.
>>>
>>> We are then outputting the data to a text file.  However, Tesseract is
>>> reading the data in a vertical fashion.  In my example image, it is reading
>>> the tiff as two columns of data, and the data is being output
>>> from Tesseract like this:
>>>
>>> TYPE:
>>> DATE:
>>> Address:
>>> City:
>>> State:
>>> Owner:
>>> Owner Type:
>>> Acreage:
>>> Mortgage:
>>> 12345
>>> 2017-04-06
>>> 100 Main St.
>>> Some City
>>> Some State
>>> John Doe
>>> Primary
>>> 10.25
>>> Yes
>>>
>>> What we want is Tesseract to read the tiff file horizontally and have
>>> the output look like this:
>>>
>>> TYPE:
>>> 12345
>>> DATE:
>>> 2017-04-06
>>> Address:
>>> 100 Main St.
>>> City:
>>> Some City
>>> State:
>>> Some State
>>> Owner:
>>> John Doe
>>> Owner Type:
>>> Primary
>>> Acreage:
>>> 10.25
>>> Mortgage:
>>> Yes
>>>
>>> We've tried the various Page Segmentation options for Tesseract, but they
>>> all produce the same result.
>>> Has anyone run into this same issue? Anybody have any ideas?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To post to this group, send email to tesser...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/790b41ef-f97f-4695-b7c8-1c68bdd1cd38%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
