Re: TextPosition returns some characters with wrong case

Kévin Sailly Thu, 13 Oct 2011 12:29:37 -0700

No, you cannot use the font height for this. If the embedded font does not
match the standard Unicode mapping (check that with a font editor), you can
try to build a table that maps the "special" font's Unicode values to the
standard ones.
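
A minimal sketch of such a mapping table, in plain Java with no PDFBox calls.
The class GlyphCaseRemap, its methods, and the three table entries are
hypothetical: the entries are only guessed from the "bebek RANGe ROVeR "
example quoted below, and the real ones would have to come from inspecting the
embedded font. The sketch also assumes the affected font contains no genuine
lowercase b, e, or k, so it should only be applied to text set in that font.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: corrects characters that a font's
// nonstandard Unicode mapping reports with the wrong case.
class GlyphCaseRemap {

    // Assumed table for the "BEBEK RANGE ROVER" example; the real entries
    // would have to be determined by inspecting the embedded font.
    static Map<Character, Character> exampleTable() {
        Map<Character, Character> table = new HashMap<>();
        table.put('b', 'B');
        table.put('e', 'E');
        table.put('k', 'K');
        return table;
    }

    // Replaces every character that has an entry in the table and leaves
    // all other characters unchanged.
    static String remap(String extracted, Map<Character, Character> table) {
        StringBuilder fixed = new StringBuilder(extracted.length());
        for (int i = 0; i < extracted.length(); i++) {
            char c = extracted.charAt(i);
            char mapped = table.getOrDefault(c, c);
            fixed.append(mapped);
        }
        return fixed.toString();
    }

    public static void main(String[] args) {
        // Prints "BEBEK RANGE ROVER " (with the trailing space).
        System.out.println(remap("bebek RANGe ROVeR ", exampleTable()));
    }
}
```

In a text stripper, the same lookup could be applied to each character as it
is collected, but only for characters that come from the affected font.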


That's just an idea; I am just a user...

regards,
kévin

2011/10/13 Yavuz Nuzumlalı <[email protected]>

> The font used in the PDF file is "Kingfisher-Heavy". Is it one of the
> non-matching fonts?
>
> Can I use character height values to correct this problem?
>
> For example, if I can get the height of each character in the PDF file, I
> can compare it with the heights of neighboring characters and then convert
> a lowercase character to uppercase using some logic. Does PDFBox provide an
> interface to get height values for TextPosition objects or characters?
>
> On Wed, Oct 12, 2011 at 8:29 PM, Kévin Sailly <[email protected]> wrote:
>
> > Hello,
> >
> > It may be a font problem. Does the font embedded in the PDF file match the
> > standard font mapping to Unicode?
> >
> > Regards,
> > Kévin
> >
> > 2011/10/12 Yavuz Nuzumlalı <[email protected]>
> >
> > > Hi,
> > >
> > > When I use TextPosition to get the text in a PDF file, it sometimes
> > > returns a character with its case changed.
> > >
> > > For example, the text in the PDF is like this:
> > >
> > > "BEBEK RANGE ROVER "
> > >
> > > And PDFBox returns the text like this:
> > >
> > > "bebek RANGe ROVeR "
> > >
> > > I'm using the processTextPosition() method to get the text. What could
> > > be the problem? I can't figure out how to solve it.
> > >
> > > Thanks.
> > >
> >
>
