Dmitry,
I am extremely thankful for your valuable guidance. It works for me. I have
many things to learn from you.
With warmest regards,
-sriranga(78yrs)
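
For anyone doing the same step on Linux or another Unix-like system, the
concatenation discussed in this thread can be sketched as a small shell
function. This is only a minimal sketch: the function name concat_tr is
hypothetical, and the file names are just the examples used in this thread.
"cat" copies bytes unchanged, which corresponds to "copy /b" on Windows. As
far as I know, "copy" with "+" treats its inputs as ASCII text unless /b is
given, so adding /b is probably the safer form there.

```shell
#!/bin/sh
# Sketch: concatenate several .tr files into one, in the order given.
# concat_tr is a hypothetical helper name, not part of Tesseract.
# Usage: concat_tr OUTPUT INPUT...
concat_tr() {
    out=$1
    shift
    # Refuse to run without inputs, so an empty output is never mistaken
    # for a successful concatenation.
    if [ "$#" -eq 0 ]; then
        echo "concat_tr: no input files given" >&2
        return 1
    fi
    # Check that every input exists before writing anything.
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "concat_tr: missing input: $f" >&2
            return 1
        fi
    done
    # cat writes the inputs back to back, byte for byte.
    cat "$@" > "$out"
}
```

For example: concat_tr 1234.tr 001.tr 002.tr 003.tr 004.tr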

On Thu, Feb 17, 2011 at 1:56 PM, Dmitry Silaev <[email protected]> wrote:

> Sriranga,
>
> > It is presumed that the command line (for WinXP) should be as follows,
> > e.g. "c:\tess\copy 001.tr + 002.tr + 003.tr + 004.tr > 1234.tr" (or
> > "Multiimage.tr"), which may kindly be confirmed. Otherwise, the correct
> > command line for concatenating with the "copy" command may kindly be
> > given.
>
> This command won't do what you want. First, you don't need a path
> before "copy": it is a built-in command of the MS-DOS command
> processor, so when prefixed with a path it is treated as the name of
> an executable in the "c:\tess\" directory, which doesn't exist.
> Second, you don't need the ">", as it would redirect the informational
> output of the "copy" command (not the files' contents) to "1234.tr".
> The destination file should be given at the end of the command, after
> a space. Therefore your command line should be
>
> copy 001.tr + 002.tr + 003.tr + 004.tr 1234.tr
>
> Warm regards,
> Dmitry Silaev
>
>
>
>
>
> On Thu, Feb 17, 2011 at 9:44 AM, Sriranga(78yrsold)
> <[email protected]> wrote:
> > Dmitry,
> > Thanks for the valuable guidance. However, I could not understand how to
> > concatenate (simply "copy") all the resulting .tr files together. It is
> > presumed that the command line (for WinXP) should be as follows,
> > e.g. "c:\tess\copy 001.tr + 002.tr + 003.tr + 004.tr > 1234.tr" (or
> > "Multiimage.tr"), which may kindly be confirmed. Otherwise, the correct
> > command line for concatenating with the "copy" command may kindly be
> > given.
> > With Warmest Regards,
> > -sriranga(78yrs)
> >
> > On Wed, Feb 16, 2011 at 11:58 AM, Dmitry Silaev <[email protected]>
> > wrote:
> >>
> >> Guys,
> >>
> >> If you have more than one box/tiff pair, you can train (i.e. generate a
> >> .tr file) for each of these pairs separately.
> >>
> >> Then you can concatenate (simply "cat" or "copy") all the resulting .tr
> >> files together and then run all training tools on the single final .tr
> >> file. This relieves you from the 32-file limit.
> >>
> >> For your convenience you can craft a batch file or shell script which
> >> would train, concatenate, cluster, etc. in one run. You should analyze
> >> all errors carefully though.
> >>
> >> Warm regards,
> >> Dmitry Silaev
> >>
> >>
> >>
> >>
> >> On Wed, Feb 16, 2011 at 5:56 AM, Sriranga(78yrsold)
> >> <[email protected]> wrote:
> >>>
> >>> Dmitry,
> >>> It appears that Khem has not copied you on this message, so I am
> >>> forwarding it for your valuable guidance/comments, which may help me
> >>> in my Kannada project.
> >>> with regards,
> >>> -sriranga(78yrs)
> >>>
> >>> ---------- Forwarded message ----------
> >>> From: KHEM Sochenda <[email protected]>
> >>> Date: Wed, Feb 16, 2011 at 7:45 AM
> >>> Subject: Re: Tesseract Training
> >>> To: "Sriranga(78yrsold)" <[email protected]>
> >>>
> >>>
> >>> Dear Sriranga,
> >>>
> >>> Below are the steps I followed for the training:
> >>>
> >>> 1. I created 3 pages of training images, as you can see in the
> >>>    attachments (khm.limons1.1 is page 1, khm.limons1.2 is page 2, and
> >>>    khm.limons1.3 is page 3).
> >>> 2. I created a box file for every page (khm.limons1.1.box and so on)
> >>>    with the command line
> >>>    "tesseract khm.limons1.1.tif khm.limons1.1 batch.nochop makebox"
> >>>    for page 1,
> >>>    "tesseract khm.limons1.2.tif khm.limons1.2 batch.nochop makebox"
> >>>    for page 2, and the same for page 3.
> >>> 3. Then I edited the box files; the final result is in the
> >>>    attachments.
> >>> 4. I merged the images together into a single file
> >>>    (khm.limons1.0.tif).
> >>> 5. I merged the three box files into a single box file with page
> >>>    numbers assigned (khm.limons1.0.box).
> >>> 6. I ran the command to train the single file,
> >>>    "tesseract khm.limons1.1.tif khm.limons1.0.tif khm.limons1.0
> >>>    nobatch box.train". The result looks okay at this step. (My purpose
> >>>    in merging into one file is that I want a single font to be in
> >>>    just one .tr file.)
> >>> 7. I then ran the command "unicharset_extractor khm.limons1.0.box" to
> >>>    extract every single glyph from the box files. The result looks
> >>>    okay.
> >>> 8. Then I tried running "mftraining -U unicharset -O khm.unicharset
> >>>    khm.limons1.0.tr" and "cntraining khm.limons1.0.tr" to extract the
> >>>    features. I failed at this step.
> >>>
> >>>
> >>>
> >>> ----------------------------------------------------------------
> >>> Since I had no clue how to get the above idea to work, I omitted steps
> >>> 4 and 5 and skipped to points 6, 7, and 8 using the separate box
> >>> files, and I got the traineddata as in the attached file. Having three
> >>> separate .tr files is not what I want.
> >>>
> >>> Currently I use the obtained trained data for my temporary OCR system.
> >>> What I wish to do is add other fonts, but the number of .tr files is
> >>> limited to 32. This is my concern.
> >>>
> >>> Best Regards,
> >>>
> >>> Sochenda
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >
> >
>
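
The batch file or shell script suggested earlier in this thread (train each
box/tiff pair, concatenate the .tr files, then run the remaining tools on
the single file) might look roughly like the sketch below. It rests on
several assumptions: the Tesseract training tools are on the PATH, each page
NAME has NAME.tif and NAME.box in the current directory, and the command
lines are the ones quoted in this thread. The names run, train_all, DRY_RUN
and the ".combined.tr" suffix are hypothetical. Later steps (building the
final traineddata) are left out.

```shell
#!/bin/sh
# Sketch of a one-run training script. Hypothetical names: run, train_all,
# DRY_RUN, and the "<lang>.combined.tr" output file.

# Print a command, then execute it unless DRY_RUN=1.
# Stops the script at the first failing command.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

# Usage: train_all LANG PAGE...
train_all() {
    if [ "$#" -lt 2 ]; then
        echo "usage: train_all LANG PAGE..." >&2
        return 1
    fi
    lang=$1
    shift
    combined="$lang.combined.tr"

    # 1. Generate one .tr file per box/tiff pair.
    for page in "$@"; do
        run tesseract "$page.tif" "$page" nobatch box.train
    done

    # 2. Concatenate all .tr files into a single file.
    printf '+ concatenate page .tr files into %s\n' "$combined"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        : > "$combined"
        for page in "$@"; do
            cat "$page.tr" >> "$combined" ||
                { echo "failed to append $page.tr" >&2; exit 1; }
        done
    fi

    # 3. Turn the page list into a list of box files (page -> page.box).
    n=$#
    for page in "$@"; do
        set -- "$@" "$page.box"
    done
    shift "$n"
    run unicharset_extractor "$@"

    # 4. Run the feature training tools on the single combined .tr file.
    run mftraining -U unicharset -O "$lang.unicharset" "$combined"
    run cntraining "$combined"
}
```

For example, "DRY_RUN=1; train_all khm khm.limons1.1 khm.limons1.2
khm.limons1.3" only prints the commands, which makes it easier to check them
before a real run. The output of every tool should still be read carefully
for errors.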
