We need to embed both the page image and the hOCR text in the PDF file so
that the PDF becomes searchable.
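
To make the requirement concrete, here is a rough sketch of the text-layer half of the job. It is not a converter, only an illustration under stated assumptions: the hOCR marks each word with a span of class "ocrx_word" whose title attribute carries "bbox x0 y0 x1 y1" in image pixels with the origin at the top left (OCR output that only has line-level boxes would need "ocr_line" instead), and word spans do not contain nested spans. The names HocrWordParser, hocr_words and to_pdf_box are made up for this example. A PDF writer would still have to draw the page image and place each word as invisible text at the returned coordinates.

```python
import re
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for each hOCR word span.

    Assumes words are <span class="ocrx_word" title="bbox ..."> elements
    with no nested <span> inside them (hypothetical simplification).
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "ocrx_word" in classes:
            m = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", attrs.get("title") or "")
            if m:
                self._bbox = tuple(int(v) for v in m.groups())
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def hocr_words(hocr):
    """Return the list of (text, bbox) pairs found in an hOCR string."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


def to_pdf_box(bbox, page_height_px, dpi):
    """Convert a pixel bbox (top-left origin) to PDF points (bottom-left origin)."""
    x0, y0, x1, y1 = bbox
    scale = 72.0 / dpi  # PDF user space is 72 units per inch by default
    return (x0 * scale, (page_height_px - y1) * scale,
            x1 * scale, (page_height_px - y0) * scale)
```

The y-axis flip in to_pdf_box is the step that is easy to get wrong: hOCR measures down from the top of the image, while PDF measures up from the bottom of the page.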


Thanks,
Raghu

On Sun, Feb 22, 2009 at 5:46 PM, Thomas Breuel <[email protected]> wrote:

>  On Sun, Feb 22, 2009 at 19:46, Raghu Udupa <[email protected]> wrote:
>
>> Thanks Faisal.
>>
>> I am planning to use OCRopus/Tesseract to OCR TIFF images.
>>
>> I was looking for a reliable hOCR-to-PDF conversion program for Linux,
>> either with a C/C++ API or one that can be called from the command line.
>>
>
> There are actually several different kinds of hOCR-to-PDF conversions
> possible; some convert the hOCR/HTML text itself to PDF, others embed the
> page image and use the hOCR info just for searching.
>
> We're going to be focusing on that as part of a new project later in the
> year.
>
> For now, we're working hard on getting the next release of OCRopus out.
>
> Tom
