On Fri, Nov 7, 2008 at 9:15 PM, Tien Dung <[EMAIL PROTECTED]> wrote:

> Hi Ray,
>
> Thanks for answering.
>
> The second parameter should be GARBAGE_STRING instead of GARBAGE_WERD.

Yes. I sent the answer from my G1 Android phone, so I didn't have access to
the code at the time.

>
>
>> DangAmbigs is not currently used to force replacement. It is only used to
>> test for missed dictionary words.
>>
> So does it really help to add more entries to the DangAmbigs file?
>
Currently DangAmbigs is of most use in preventing mis-adaption errors.
For instance, say the e->c substitution is missing from the file. On a
degraded page, the adaptive classifier might adapt to the (incorrect) word
'arc' when the correct word was 'are', thus training the error e->c, and you
then find hundreds of other e->c errors all over the page/document. So it is
critical to have a minimal set of substitution errors in there, but adding
more doesn't necessarily buy you anything.
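
To make the 'arc'/'are' example concrete, here is a minimal Python sketch of
the idea. It is not Tesseract's actual code or file format: the function
names, the in-memory list of (recognized, intended) pairs, and the rule "only
adapt to a dictionary word that has no dictionary alternative" are all
assumptions made for illustration.

```python
def ambiguous_alternatives(word, dictionary, ambigs):
    """Return dictionary words reachable from `word` by one listed substitution.

    `ambigs` is a hypothetical list of (recognized, intended) string pairs,
    e.g. ("c", "e") for an 'e' that was misread as 'c'.
    """
    found = set()
    for recognized, intended in ambigs:
        start = word.find(recognized)
        while start != -1:
            # Try the substitution at this one position only.
            candidate = word[:start] + intended + word[start + len(recognized):]
            if candidate != word and candidate in dictionary:
                found.add(candidate)
            start = word.find(recognized, start + 1)
    return found


def safe_to_adapt(word, dictionary, ambigs):
    """Assumed rule: adapt only to dictionary words with no listed alternative."""
    return word in dictionary and not ambiguous_alternatives(word, dictionary, ambigs)


dictionary = {"arc", "are", "cat"}
with_pair = safe_to_adapt("arc", dictionary, [("c", "e")])   # 'are' is an alternative
without_pair = safe_to_adapt("arc", dictionary, [])          # nothing flags 'arc'
```

With the ("c", "e") pair present, 'arc' is flagged because 'are' is also a
dictionary word, so it would not be used for adaption. With the pair missing,
nothing stops the sketch from accepting 'arc'.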

>
>
> I found that the current SVN revision is 193 (quite small) and has not
> changed for a few weeks.
>
> Is Tesseract in active development?

Yes. Very much so.

>
>
> When will the next release or trunk code update be?

I am hoping to get 2.04 out before the end of the year, and 3.00 out in Q1
of 2009.

>
>
> Is the small SVN revision number because the Tesseract code is hard to
> change or improve, or because you don't have enough time to take care of it?

The main obstacle at the moment is lack of time to properly integrate
patches into the main line and fix bugs.
2.04 will integrate many of the fixes/patches that people have sent over far
too long a period, as well as address as many of the minor issues as possible.
I will then merge these patches into our internal Google version, and then
push that out as 3.00 early next year. I don't want to go straight to 3.00
as there have been a lot of contributions to 2.03 to patch it up and/or make
it more portable, and I don't want them to get lost. We have gone a long way
towards completing the thread-safety project, and made some other major
improvements, with more to come, so 3.00 will be quite different to 2.03,
and that will make reuse of the patches much more difficult.

>
>
> Regards,
>
> --
> Tien Dung
> http://codemonkeycode.blogspot.com/