The generated box file will not contain Korean characters, because Tesseract labels the boxes using the language data it already has. Use any of the box editors mentioned on the training page; they exist for exactly this purpose. A box editor splits the provided TIFF image into blocks, draws a rectangle around each one, and assigns some value to it. Adjust the size of these rectangles in the box editor and replace each value with the Korean character that the rectangle actually contains.
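If you would rather script the relabeling than click through an editor, here is a minimal sketch in Python. It assumes the Tesseract 3.0 box file layout of one entry per line, "char left bottom right top page", separated by single spaces. The helper names, the sample coordinates, and the two Korean characters are hypothetical and only there for illustration; this is not part of Tesseract itself.

```python
from typing import Dict, List, NamedTuple


class Box(NamedTuple):
    """One box file entry (assumed layout: char left bottom right top page)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    # Split from the right so the five numeric fields are taken first and
    # whatever remains on the left is kept whole as the character label.
    char, left, bottom, right, top, page = line.rstrip("\n").rsplit(" ", 5)
    return Box(char, int(left), int(bottom), int(right), int(top), int(page))


def relabel(boxes: List[Box], corrections: Dict[int, str]) -> List[Box]:
    # corrections maps a box index to the character that box really shows;
    # boxes without a correction keep their original label.
    return [b._replace(char=corrections.get(i, b.char)) for i, b in enumerate(boxes)]


def format_box(box: Box) -> str:
    return f"{box.char} {box.left} {box.bottom} {box.right} {box.top} {box.page}"


# Hypothetical makebox output: Latin guesses for what are really Korean glyphs.
sample = ["E 10 20 40 60 0", "L 45 20 75 60 0"]
boxes = [parse_box_line(line) for line in sample]
fixed = relabel(boxes, {0: "한", 1: "글"})
lines = [format_box(b) for b in fixed]
print("\n".join(lines))
```

When writing the result back to disk, save the box file as UTF-8 so the Korean characters survive. This only fixes the labels; the rectangles themselves still need to be checked in a box editor.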
When you assign a Korean character to a rectangle, you are telling Tesseract that whenever an image contains the pattern inside that rectangle, it should be recognized as that Korean character.

On Thu, Apr 28, 2011 at 9:53 PM, Oleg Tikhonov <[email protected]> wrote:

> That's exactly where I started and got stuck. The produced box file does not
> contain any Korean characters, only Latin ones. And that is a problem.
>
> On Thu, Apr 28, 2011 at 7:08 PM, Sriranga(78yrsold) <[email protected]> wrote:
>
>> Please read the wiki on tesseract3, which details how to train a language.
>>
>> On Thu, Apr 28, 2011 at 9:33 PM, Oleg Tikhonov <[email protected]> wrote:
>>
>>> Hi guys,
>>>
>>> I've installed tesseract-ocr 3.0 on Windows 7. Everything works fine if the
>>> selected language is English.
>>> I tried to teach the system Korean. The first step was creating sample
>>> data, so I created some TIFF files with Korean text in them. Then I ran the
>>> tesseract command:
>>> tesseract [lang].[fontname].exp[num].tif [lang].[fontname].exp[num] batch.nochop makebox
>>> Opening the newly created box file, I realized that it contained only Latin
>>> characters. What's wrong? Do I perhaps have to change the system language?
>>> Please advise me how to create a training data set. Thank you in advance,
>>>
>>> Oleg
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en

--
Regards
Aravinda | ಅರವಿಂದ
http://aravindavk.in

