Hmm. What page_seg_mode are you using?
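
While we sort out the segmentation mode, here is a minimal sketch of the kind
of post-processing Python makes easy. It assumes the column names are known in
advance (the set below is copied from the header row quoted further down) and
that 3.01 writes each header cell on its own line. The function name and the
sample input are hypothetical. Note that it can only rejoin cells that are
actually present in the output; ORI and SE are absent from the 3.01 listing
and cannot be recovered this way.

```python
# Assumed column names, copied from the header row quoted in this thread.
KNOWN_HEADERS = {"Titre", "CLA", "ORI", "MET", "NUM D", "NUM F",
                 "X (pm)", "Y (pm)", "SE", "CLR", "FOR", "RES"}


def regroup_header_cells(lines, known_headers=KNOWN_HEADERS):
    """Join runs of header-cell lines into one tab-separated row.

    Blank lines are skipped; every other line is passed through unchanged
    (apart from stripping surrounding whitespace).
    """
    rows = []
    run = []  # header cells collected since the last non-header line
    for raw in lines:
        cell = raw.strip()
        if not cell:
            continue  # ignore blank lines without ending the current run
        if cell in known_headers:
            run.append(cell)
            continue
        if run:
            rows.append("\t".join(run))  # flush the pending header row
            run = []
        rows.append(cell)
    if run:
        rows.append("\t".join(run))
    return rows


if __name__ == "__main__":
    ocr_lines = ["Titre", "CLA", "MET", "NUM D",
                 "1 # Title Part1: The title is here"]
    print("\n".join(regroup_header_cells(ocr_lines)))
```

This is a workaround on the output text only; it does not change how
Tesseract segments the page.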

On Wed, Apr 25, 2012 at 3:17 AM, Pleiades <[email protected]> wrote:

> Just to be sure I was clear about what I have and what I want.
>
> In version 3.0 the output file was exactly what I am now trying to get
> from version 3.01 (each \t corresponds to one cell of my Excel sheet):
>
> Titre   CLA   ORI   MET   NUM D   NUM F   X (pm)   Y (pm)   SE   CLR
> FOR   RES (Excel picture: 12 columns for row 1)
> 1 # Title Part1: The title is here (Excel picture: 12 columns for row
> 2, where the last 11 columns are empty)
>
> With version 3.01 I have as output file:
>
> Titre
> CLA
> MET
> NUM D
> NUM F
> X (pm)
> Y (pm)
> CLR
> FOR
> RES
> 1 # Title Part1: The title is here
>
> It seems very strange that the default output format differs between
> versions 3.0 and 3.01...
> Do you have any idea that is easier than writing my own parser to do that?
>
> Thank you.
>
> On 25 avr, 05:41, Mayur Mudigonda <[email protected]> wrote:
> > The problem with black boxes is that we sometimes expect too much of
> > them. Tesseract is in the order of 10k LOC. A large amount of it is
> > there to detect and recognize characters/text/page
> > segmentation/alignment/transforms. If you dove into the code a bit
> > more, you'd be more awed than sad :)
> >
> > Sermon given, I would write the parsing in Python, which has some pretty
> > nifty tools to play with text. Without knowing more (in terms of example
> > pics, goals and ideas) it is hard to comment/advise further :)
> >
> > A little knowledge is dangerous.... :)
> >
> > Cheers,
> > M
> >
> > On Tue, Apr 24, 2012 at 8:13 AM, Pleiades <[email protected]> wrote:
> >
> > > Thank you for your answer.
> > > It's a pity... :-( It worked well with v3.0...
> > > Could you tell me in which file or function I would have to write my
> > > own parser?
> > > Thanks
> >
> > > --
> > > You received this message because you are subscribed to the Google
> > > Groups "tesseract-ocr" group.
> > > To post to this group, send email to [email protected]
> > > To unsubscribe from this group, send email to
> > > [email protected]
> > > For more options, visit this group at
> > > http://groups.google.com/group/tesseract-ocr?hl=en
> >
> > --
> >
> > URL:
> > www.cse.msu.edu/~mudigon1
> > www.blindsight.com/team
> > Elegance is not a dispensable luxury but a factor that decides between
> > success and failure.
> > Edsger Dijkstra
>



-- 

URL:
www.cse.msu.edu/~mudigon1
www.blindsight.com/team
Elegance is not a dispensable luxury but a factor that decides between
success and failure.
Edsger Dijkstra
