Hi,
I've found this class awesome.
At the end of the file there is a TODO: "Scale the text width to fit the OCR
bbox".
I implemented this TODO.

Replace the TODO line with this:
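// Shrink the font size in 0.1pt steps until the line fits the bbox width.
boolean textScaled = false;
do {
  // Width of the line in points at the current font size.
  float lineWidth = defaultFont.getBaseFont().getWidthPoint(line, bboxHeightPt);
  if (lineWidth < bboxWidthPt) {
    textScaled = true;
  } else {
    bboxHeightPt -= 0.1f;
  }
} while (!textScaled);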

After that, I suggest replacing this line:

cb.setFontAndSize(defaultFont.getBaseFont(), Math.round(bboxHeightPt));

with this, so that the fractional font size found above is not rounded away:
cb.setFontAndSize(defaultFont.getBaseFont(), bboxHeightPt);
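
As a possible alternative to stepping down in 0.1pt increments: iText's BaseFont.getWidthPoint(text, fontSize) scales linearly with the font size, so the fitting size can probably be computed in one step. The following is only a sketch under that assumption. FontFit, WidthMeasurer and fitFontSize are hypothetical names introduced for illustration and are not part of HocrToPdf.java or iText; the interface stands in for defaultFont.getBaseFont() so the snippet compiles without the library.

```java
final class FontFit {

    /** Hypothetical stand-in for BaseFont.getWidthPoint(String, float). */
    @FunctionalInterface
    interface WidthMeasurer {
        float widthPoint(String text, float fontSize);
    }

    /**
     * Returns a font size no larger than bboxHeightPt at which the line
     * is at most bboxWidthPt wide, assuming the measured width is
     * proportional to the font size. Callers should skip boxes with a
     * non-positive width or height.
     */
    static float fitFontSize(WidthMeasurer measurer, String line,
                             float bboxWidthPt, float bboxHeightPt) {
        // Width of the line when set at the full bbox height.
        float widthAtFullHeight = measurer.widthPoint(line, bboxHeightPt);
        if (widthAtFullHeight <= bboxWidthPt) {
            // Already fits: keep the bbox height as the font size.
            return bboxHeightPt;
        }
        // Too wide: shrink by the ratio of available to required width.
        return bboxHeightPt * bboxWidthPt / widthAtFullHeight;
    }
}
```

In the converter this would presumably be called as FontFit.fitFontSize(defaultFont.getBaseFont()::getWidthPoint, line, bboxWidthPt, bboxHeightPt), with the result passed to cb.setFontAndSize. Unlike the loop, it allows the text to fill the bbox width exactly rather than stopping just below it.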

Ciao! (I'm Italian! :-p)

       Federico Tarantino

On Sunday, 16 December 2007 at 23:58:32 UTC+1, Florian Hackenberger wrote:
>
> Hi! 
>
> I've hacked up a VERY basic hOCR to PDF converter in Java using iText 
> and jericho if anyone is interested. It reads all tags with bbox 
> properties and places the contained text into a box on a layer. The 
> original image is read from the ocr_page tag and added above the text. 
> The current shortcomings (to be solved within the next few weeks) are: 
>  * Does not handle multiple pages 
>  * Scaling the fonts to match the bounding boxes is not implemented 
>  * Only uses tags of the ocr_line class that have the bbox property 
> (to be solved later) 
>
> Please tell me if you like it and whether I should package it 
> properly. Please bear in mind that the file was hacked up in about 5 
> hours, so don't expect well structured code. The result is sort of a 
> proof of concept. Patches are welcome. 
>
> The java file can be found here (it needs the jericho and iText2 
> libraries in order to compile): 
> http://www.acoveo.com/acoveo/files/HocrToPdf.java 
>
> Cheers, 
>             Florian Hackenberger 
>
> -- 
> DI Florian Hackenberger 
> [email protected]

-- 
You received this message because you are subscribed to the Google Groups 
"ocropus" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/ocropus/-/8VlmO3bJOWEJ.