Hi Dan,
Thanks for the quick response. Yes, I thoroughly enjoyed implementing
the auto-orientation, and I've created a separate binary for this
specific step in the chain. However, it seems that image region
support was intended (at least at some stage) to be part of
ocropus, though perhaps not at the moment. For the time being I'll look at
just creating a separate binary, but +1 for integrating Leptonica, or
at least the auto-orientation and image segmentation. I'd love to hear how
others have solved this problem.

Because I'm using images from cameras, I'd really like to have FST
language model support, and thus am only using tesseract for generating
draft ground truth. On that note, I've been unable to work out
whether I can call tesseract from ocropus to generate ground truth drafts.
I'd find that a useful feature; at the moment I just plan on doing
it in a Python script. Perhaps, after the StandardPreProcessing, there
could be an option to generate ground truth from tesseract. Probably not
an original idea, but it would be neat.
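
For what it's worth, a minimal sketch of the Python script I have in mind,
assuming the `tesseract` command-line binary is on the PATH and invoked as
`tesseract IMAGE OUTPUTBASE -l LANG`. The function names and the `eng`
default are my own placeholders, not anything from ocropus:

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argv list for `tesseract IMAGE OUTPUTBASE -l LANG`."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def draft_ground_truth(image_path, lang="eng"):
    """Run tesseract on one image and return the path of the draft text.

    Hypothetical helper: tesseract itself appends ".txt" to the output
    base, so line.png produces line.txt next to the image.
    """
    image_path = Path(image_path)
    output_base = image_path.with_suffix("")  # line.png -> line
    cmd = build_tesseract_command(image_path, output_base, lang)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(str(output_base) + ".txt")
```

The drafts would still need correcting by hand before being used as ground
truth; the script only saves the initial typing.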

Cheers,

Nathan

On 23 July 2011 13:51, Dan Bloomberg <[email protected]> wrote:
> Nathan,
> I can't directly help you, but I support leptonica and you can get and
> install it from leptonica.org or code.google.com/p/leptonica.
> Why the standard install of ocropus no longer includes leptonica
> continues to baffle me, given that ocropus claims its intent is to
> aggregate the best document imaging and analysis software.
> This is further baffling because tesseract, which is the best open
> source OCR package, uses leptonica.
> Depending on what you want to do, it is likely that you can
> implement your application directly with tesseract and leptonica.
> When you download leptonica, look at the programs in the prog
> directory.  There are about 200 of them.  Many are regression tests,
> and quite a few show you how to solve typical document-related
> tasks.   There are several ways to identify and remove image
> regions, for example.  If you can't find anything sufficiently
> close, let me know and I'll give you some suggestions.
>   -- Dan
> On Fri, Jul 22, 2011 at 8:35 PM, Nathan K <[email protected]> wrote:
>>
>> Is anyone currently using TextImageSegByLeptonica or any
>> ITextImageClassification interfaces? I've been unable to get any
>> working using a recent snapshot of trunk. First I tried the standard
>> install as documented on the install transcript page, which yielded
>> something along the lines of 'component doesn't exist'. I
>> assumed this was because the standard install doesn't include Leptonica
>> anymore. Thus I tried the following, and ended up with the
>> subsequent error. I'd greatly appreciate any pointers.
>>
>> Also, I noticed that the method 'textImageProbabilities' returns an
>> 'intarray', which can't be saved using 'write_png'. What is the best
>> way to visualise the results and, in addition, to remove image regions? Is
>> there a sister method? More generally, I've got a binary image region
>> on my documents which I'm trying to remove. I assume this is the
>> correct approach. Cheers Nathan
>>
>> Command:
>> scons lept=1
>>
>> Produces:
>>
>> g++ -o ocr-layout/ocr-layout-rast.os -c
>> -DDATADIR='"/usr/local/share/ocropus"'
>> -DDEFAULT_DATA_DIR='"/usr/local/share/ocropus/models"'
>> -DDEFAULT_EXT_DIR='"/usr/local/share/ocropus/extensions"' -g -fPIC -O2
>> -Wall -Wno-sign-compare -Wno-write-strings -Wno-unknown-pragmas
>> -D__warn_unused_result__=__far__ -D_BACKWARD_BACKWARD_WARNING_H=1
>> -fopenmp -fPIC -DHAVE_LEPTONICA -DHAVE_SQLITE3 -Iocr-lineseg
>> -Iocr-commands -Iocr-leptonica -Iocr-binarize -Iocr-pfst -Iocr-utils
>> -Iocr-line -Iocr-layout -Iocr-voronoi -I/usr/local/include
>> -I/usr/local/include -I/usr/include/leptonica
>> -I/usr/local/include/leptonica -I/usr/local/include/leptonica
>> ocr-layout/ocr-layout-rast.cc
>> In file included from ocr-layout/ocr-layout-internal.h:60,
>>                 from ocr-layout/ocr-layout-rast.cc:28:
>> ocr-layout/ocr-text-image-seg.h:77: error: no unique final overrider
>> for 'virtual const char* iulib::IComponent::interface()' in
>> 'ocropus::RemoveImageRegions'
>> scons: *** [ocr-layout/ocr-layout-rast.os] Error 1
>> scons: building terminated because of errors.
>>
>>
>> --
>> Email: nathank [at] noshly.com (professional)
>> Email: its [at] madteckhead.com (personal)
>> Website: http://www.madteckhead.com
>>
>> --------------------------------------------
>> Q: Why is this email five sentences or less?
>> A: http://five.sentenc.es
>>
>> This email (including any attachments) is confidential and may be
>> privileged. If you have received it in error, please notify the sender
>> by return email and delete this message from your system. Any
>> unauthorised use or dissemination of this message in whole or in part
>> is strictly prohibited. Please note that emails are susceptible to
>> change and we will not be liable for the improper or incomplete
>> transmission of the information contained in this communication nor
>> for any delay in its receipt or damage to your system. We do not
>> guarantee that the integrity of this communication has been maintained
>> nor that this communication is free of viruses, interceptions or
>> interference.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "ocropus" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/ocropus?hl=en.
>>



-- 
Email: nathank [at] noshly.com (professional)
Email: its [at] madteckhead.com (personal)
Website: http://www.madteckhead.com


