Hello,

  I'm looking at Tesseract to underpin a commercial project I am
prototyping.  My current plan is to interact with tesseract via the command
line from either Python or Java running on a Beagleboard (a small circuit
board with an ARM chip on it).  For this purpose I'm pretty happy with the
tesseract 3.10 prerelease, but the fact that the lead developer has locked the
project and done a runner like a subprime victim is a bit of a problem (I'm
assuming that no emails from him means that he does not even monitor the
list any more).

  From what I can see OCRopus is mainly targeted towards decoding whole
pages of text and is quite heavyweight - I need to read ID cards so
tesseract is a better prospect for me as it is lighter.  Also I don't
necessarily want to use Ubuntu and Python for this.  The prospect of having to
hack an image up into an unsigned char * fills me with dread but I guess
someone else can do that between C++ and Java/Python/whatever.
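
To make the "unsigned char *" step a little less dreadful, here is a rough
sketch of what that flattening amounts to: one byte per pixel, rows laid end
to end.  The function name to_raw_gray and the choice of 8-bit greyscale with
no row padding are my own assumptions for illustration; what a given C++
entry point actually expects would need checking against its headers.

```python
def to_raw_gray(rows):
    """Flatten rows of 0-255 grey values into one row-major byte buffer.

    Hypothetical helper: returns the buffer plus the dimensions a C-style
    API taking a raw pixel pointer would typically also need.
    """
    height = len(rows)
    width = len(rows[0]) if rows else 0
    for row in rows:
        if len(row) != width:
            raise ValueError("all rows must have the same width")
    # bytes() rejects values outside 0..255, so bad pixels fail loudly.
    data = bytes(value for row in rows for value in row)
    bytes_per_line = width  # one byte per pixel, no row padding (assumption)
    return data, width, height, bytes_per_line
```

The same buffer could be handed across a JNI or ctypes boundary by whoever
writes the C++ side; that part is not shown here.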

  I'd be perfectly willing to help out regarding Java (my main language)
focused stuff and trying to compile something to ARM as I need to do that
anyway.  However my C++ knowledge is sketchy at best.  I personally don't
care how many libraries the code depends upon as long as the instructions to
compile and include the external libraries are clear.  I'd like to be able
to call directly from Java (or Python or ... both) but if not then a command
line program which I can shell out to and capture the results from is fine for me.
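
For what it's worth, the shell-and-capture approach could look roughly like
the sketch below.  It assumes the usual "tesseract <image> <outputbase>
[-l lang]" invocation, where the text ends up in <outputbase>.txt; the
function names (build_command, ocr_image) and the "run" parameter are
hypothetical, added only so the call can be swapped out on a machine without
tesseract installed.

```python
import os
import shutil
import subprocess
import tempfile


def build_command(image_path, output_base, lang="eng", exe="tesseract"):
    """Build the argument list for one tesseract command-line run."""
    return [exe, image_path, output_base, "-l", lang]


def ocr_image(image_path, lang="eng", exe="tesseract",
              run=subprocess.check_call):
    """Run tesseract on one image and return the recognised text.

    Assumes tesseract writes its result to <output_base>.txt.
    check_call raises CalledProcessError on a non-zero exit status.
    """
    workdir = tempfile.mkdtemp()
    output_base = os.path.join(workdir, "result")
    try:
        run(build_command(image_path, output_base, lang, exe))
        with open(output_base + ".txt", encoding="utf-8") as fh:
            return fh.read()
    finally:
        # Always clean up the temporary output directory.
        shutil.rmtree(workdir, ignore_errors=True)
```

Calling ocr_image("card.tif") would then return whatever text tesseract wrote
out for that image; the Java equivalent would follow the same shape with
ProcessBuilder.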

  I'm not especially wedded to open source or tesseract - it is just that
tesseract was easy to get working and produced workable results.  Does
anyone know of another project (proprietary or open source) which can do a
similar job?

Cheers,

Neil

On 9 April 2010 07:15, Andres <[email protected]> wrote:

> Hello Pierre and people,
>
> I've been following this thread from its beginning. I think that your idea
> is worth a lot and is very valuable in this situation.
>
> How can a serious project depend on ONLY ONE guy (that Ray) who is missing?
> He hasn’t answered emails for MONTHS, is unreachable and left the
> project LOCKED, as rthomas said. If this project has to rely on only one
> person, he doesn’t seem to be in a position to do that. What reason is
> there to keep waiting for him? The description of the situation even
> reminds me of the “Jacob” character in the Lost TV series.
>
> The fork seems to be a nice idea, perhaps a new wiki with tutorials and
> better documentation would be also very helpful for increasing the activity
> of contributors.
>
> I don’t know OCRopus but here:
> http://googlesystem.blogspot.com/2007/04/open-source-ocr-software-sponsored-by.html
> they say that it’s partly based on Tesseract. I’m curious about how the
> Tesseract modifications are being managed. Does anyone know anything about
> that?
>
> Perhaps if some of you are used to writing to the OCRopus list it would be a
> good idea to ask there about the reason for the lack of support here.
>
> I mostly agree with you, Pierre, about what you said about relying on external
> libraries, but I think that there is one exception to be made, the boost
> libraries, don’t you agree?
>
> I encourage those who have been following this thread silently, like me, to
> share their opinion.
>
> Cheers,
>
> Andres
>
>
> 2010/4/8 MARTIN Pierre <[email protected]>
>
>> > Ocropus is a much better candidate for contributing, and it even uses
>> > distributed version control (Mercurial).
>> Agreed.
>>
>> > The unfortunate downside is
>> > that at least awareness of Ubuntu and Python are required. But on the
>> > other hand Ubuntu+Python is much more fun to learn than the mathematics
>> > and algorithms (in C++) behind OCR.
>> i have both, well, Python a while ago but that's obviously only a syntax
>> problem. i'm very used to high level OOP languages (ObjC mostly) and Python
>> is not that far from it. Unfortunately, this project really doesn't suit
>> my target. i'm requiring opensource (Or at least easily portable by myself
>> on each major release, which is really not the case right now), and most of
>> all closed sources when required by a commercial license.
>>
>> Also, i notice that OCRopus is heading in the same (And i think wrong)
>> direction that Tesseract is about to take: relying on various totally
>> unrelated libraries. An OCR library should rely only on the input
>> format (Which at best would be raw picture LSB / MSB 1bpp data), not a
>> leptonica-ish wrapper.
>>
>> i'm currently reading the Tesseract source code, and i'm getting quite a lot
>> of insights from doing this, so i think i'll be writing an eMail to Ray very
>> soon.
>>
>> Is anyone interested in joining my action, and eventually helping me
>> write it? (As you may have noticed, my english became poor over time :D).
>>
>> Thanks anyway for your advice, which i honestly find really valuable,
>> everyone.
>> Pierre.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected]<tesseract-ocr%[email protected]>
>> .
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en.
>>
>>



-- 

Neil Benn Msc
Director
Ziath Ltd
Phone :+44 (0)7508 107942
Website - http://www.ziath.com

