Hi All,

I came across the Khmer training data for Tesseract, and I have also done a 
few training runs myself to see whether it is possible.

I tried both my own trained data and the file downloaded from here, but I get 
an error:

Tesseract khmer1.tif output -l khm
read_params_file: Can't open ûl
read_params_file: Can't open khm
Tesseract Open Source OCR Engine v3.02 with Leptonica

I have moved the file khm.traineddata to 
"C:\Program Files (x86)\Tesseract-OCR\tessdata".
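
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the `ûl` in the first error line looks like what a Windows console shows when the option was typed or pasted with an en dash (U+2013) instead of a plain ASCII hyphen. In Windows-1252 the en dash is the single byte 0x96, and a console using OEM code page 437 or 850 (an assumption about this machine) displays that byte as `û`. If so, Tesseract would presumably not see a `-l` option at all and would try to open both that argument and `khm` as config files, which would fit the two `read_params_file` lines. The Python sketch below only illustrates the byte mapping and one way to normalise such dashes; the helper `normalize_dashes` is a made-up name for this example and is not part of Tesseract.

```python
# Illustration only: how an en dash typed before "l" can be displayed as "ûl".
EN_DASH = "\u2013"  # dash that word processors and web pages often substitute for "-"

# Windows-1252 encodes the en dash as the single byte 0x96 ...
raw = (EN_DASH + "l").encode("cp1252")
# ... and a console using OEM code page 437 displays byte 0x96 as "û".
# (Assumption: the console in question uses cp437; cp850 maps 0x96 the same way.)
shown = raw.decode("cp437")
print(repr(raw), shown)

# Hypothetical helper (not part of Tesseract): replace a look-alike dash at the
# start of an argument with a plain ASCII hyphen.
DASH_LOOKALIKES = "\u2010\u2011\u2012\u2013\u2014\u2212"


def normalize_dashes(args):
    fixed = []
    for arg in args:
        if arg and arg[0] in DASH_LOOKALIKES:
            arg = "-" + arg[1:]
        fixed.append(arg)
    return fixed


print(normalize_dashes(["tesseract", "khmer1.tif", "output", EN_DASH + "l", "khm"]))
```

If this turns out to be the cause, retyping the option by hand in the console as `-l khm`, rather than pasting it, should be enough; no script is needed.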

Can anyone give me a clue as to what could be wrong here?
Thanks,
Metrey

On Tuesday, August 23, 2011 4:26:07 PM UTC+7, Jane wrote:
>
> I love your first paragraph, Dmitri. Anyway, I didn't back up the training 
> images, only the trained data. It has been nearly a year since I paused work 
> on Tesseract, as my project deadline is tight. Most of the images and box 
> files were shared with Sriranga, Dmitri, and also the group. Also, I do not 
> want to share my image files with the group, as they were extracted from 
> news/gossip that may be politically harmful. Bad luck for me too that I didn't 
> back up the images. 
>
> With the traineddata, any new people who want to work on this can be sure that 
> Tesseract is usable with Khmer, and they can add more training data as they 
> wish.
>
> Thanks to all the members, especially Dmitri and Sriranga, for giving me 
> all the feedback, explanations and ideas.
>
> Sochenda
>
>
>
> On Tue, Aug 23, 2011 at 3:41 PM, Dmitri Silaev <[email protected]> wrote:
>
>> We have no right to force people to give away everything. No doubt,
>> it's better to have sources, but ask the project owners and Google why
>> they are holding back sources for the latest traineddata files, huh?
>> And this is when the whole project had been declared open source...
>>
>> But we can ask people if they really intend to hold back information
>> (they just may not realize this), that's more correct and polite.
>> Anyways, the Khmer "traineddata" file Sochenda has shared is somewhat
>> useful.
>>
>> That's my opinion.
>>
>> Warm regards,
>> Dmitri Silaev
>> www.CustomOCR.com
>>
>>
>>
>>
>>
>> On Tue, Aug 23, 2011 at 11:13 AM, zdenko podobny <[email protected]> wrote:
>> >
>> > On Tue, Aug 23, 2011 at 9:01 AM, Dmitri Silaev <[email protected]> wrote:
>> >>
>> >> He-he, IMHO is a way to seem a bit less all-knowing and all-seeing to
>> >> people. At least I use it that way )) See
>> >> http://en.wiktionary.org/wiki/IMHO
>> >>
>> >> Anyways, I've checked what you shared, thanks so much. Actually you
>> >> don't need to share the "normproto", "microfeat", etc. files as they
>> >> are generated. The main part is your source image and box files, which
>> >> you didn't share. Is this intentional? If yes, it's your right and
>> >> that's OK...
>> >>
>> > Well, it is right, but I would not say it is OK (my opinion) ;-). Because if
>> > somebody wants to improve it, he/she has to start from the beginning.
>> > I think the dan-fraktur project [1] is a good example of how to
>> > contribute language data (build script, tif, box, dictionary data files -
>> > everything is included...)
>> > Zdenko
>> > [1] https://github.com/paalberti/tesseract-dan-fraktur
>> >>
>> >> Warm regards,
>> >> Dmitri Silaev
>> >> www.CustomOCR.com
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Tue, Aug 23, 2011 at 6:20 AM, KHEM Sochenda <[email protected]> wrote:
>> >> > I don't know what IMHO is; I have never used it. However, I am sharing the links here.
>> >> >
>> >> > https://docs.google.com/leaf?id=0B9BTtR5QkyOgOTliYWMxY2YtNjJkOS00Mzg0LWI0OTctYzI1NGJhMGY1Mjk4&hl=en_US
>> >> > https://docs.google.com/leaf?id=0B9BTtR5QkyOgODFhZWU1NzAtNjc0OS00MThmLThjODItMGNlODM1ZDFjNzkx&hl=en_US
>> >> > https://docs.google.com/leaf?id=0B9BTtR5QkyOgNGY1MmI1ZTYtOTA0OS00NWFkLWI3Y2ItYWRiMWJhZDBjODQ1&hl=en_US
>> >> > https://docs.google.com/leaf?id=0B9BTtR5QkyOgNDU0NTAwMmQtNzFhZi00NGI0LTkxMjItYTRmMjZiNTkxYzQy&hl=en_US
>> >> > https://docs.google.com/leaf?id=0B9BTtR5QkyOgYjI3Zjk4YWItZjMxOC00MTAwLThiMjUtODdlZDI2N2ExMzYx&hl=en_US
>> >> > https://docs.google.com/leaf?id=0B9BTtR5QkyOgZmRhMWFkODAtZDQ5OS00OWY5LTk5ZmUtZWRlZTc0N2ExMGZi&hl=en_US
>> >> > https://docs.google.com/leaf?id=0B9BTtR5QkyOgZTE3N2RlZWMtYjRjNi00NTkyLTljZDQtOTgwNDljNmQ3ZDhi&hl=en_US
>> >> >
>> >> > On Wed, Aug 17, 2011 at 2:15 PM, zdenko podobny <[email protected]> wrote:
>> >> >>
>> >> >> IMHO the best way is to create a public repository somewhere (e.g.
>> >> >> code.google.com, github.com, sf.net ...) and send the link here. I will
>> >> >> add it to http://code.google.com/p/tesseract-ocr/wiki/AddOns.
>> >> >> Zd.
>> >> >>
>> >> >>
>> >> >> On Wed, Aug 17, 2011 at 9:11 AM, KHEM Sochenda <[email protected]> wrote:
>> >> >>>
>> >> >>> Dear Dmitri,
>> >> >>>
>> >> >>> Do you know how to upload the training dataset? I want to upload what
>> >> >>> I have done.
>> >> >>>
>> >> >>> Regards,
>> >> >>> Sochenda
>> >> >>>
>> >> >>> On Tue, Aug 9, 2011 at 3:50 PM, Dmitri Silaev <[email protected]> wrote:
>> >> >>>>
>> >> >>>> Training for Khmer is really a challenging task. You can refer to the
>> >> >>>> following thread for some clues:
>> >> >>>>
>> >> >>>> https://groups.google.com/d/topic/tesseract-ocr/TzwbS3CwhGo/discussion
>> >> >>>>
>> >> >>>> You can also contact Sochenda to ask whether he has made any progress on this.
>> >> >>>>
>> >> >>>> Warm regards,
>> >> >>>> Dmitri Silaev
>> >> >>>> www.CustomOCR.com
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> On Tue, Aug 9, 2011 at 11:56 AM, Sovila Srun <[email protected]> wrote:
>> >> >>>> > Thanks a lot, Zdenko! I have now configured it successfully.
>> >> >>>> > I have a question for you. I would like to train the system for the Khmer
>> >> >>>> > language. Do you have any advice on this? Where do I need to start?
>> >> >>>> > Oh, by the way, can you speak Russian?
>> >> >>>> > Thanks, Cheyvarman!
>> >> >>>> > Best regards!
>> >> >>>> >
>> >> >>>> > 2011/8/9 zdenko podobny <[email protected]>
>> >> >>>> >>
>> >> >>>> >> What you want to configure and what did you try?
>> >> >>>> >>
>> >> >>>> >> On Tue, Aug 9, 2011 at 6:53 AM, Cheyvarman <[email protected]> wrote:
>> >> >>>> >>>
>> >> >>>> >>> Can anyone tell me how to configure tesseract-ocr (any version) on
>> >> >>>> >>> Windows?
>> >> >>>> >>> I could not get it configured by following the instructions :(
>> >> >>>> >>> Thanks in advance
>> >> >>>> >>>
>> >> >>>> >>> --
>> >> >>>> >>> You received this message because you are subscribed to the Google
>> >> >>>> >>> Groups "tesseract-ocr" group.
>> >> >>>> >>> To post to this group, send email to [email protected]
>> >> >>>> >>> To unsubscribe from this group, send email to [email protected]
>> >> >>>> >>> For more options, visit this group at
>> >> >>>> >>> http://groups.google.com/group/tesseract-ocr?hl=en