Hi Tilman,
thank you for your reply.

> It's more complicated because form XObjects, patterns, annotations,
> softmasks (and maybe more) can also have fonts. I also doubt that you
> can detect CJK fonts this way.
>
I see... I have a nasty PDF file that causes an OutOfMemoryError when it is
processed by another library, and I concluded that this is (somehow)
caused by the text and the fonts it uses...

I saw the RemoveAllText example, and it may be what I need. I modified it so
that, instead of removing text, it does nothing, and the new PDF file seems
to have the "corruption" removed...

One last question: how could I modify the RemoveAllText example to remove
all images from the PDF file?

Thanks.

Jorge
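
A minimal sketch of the token-filtering idea behind the image question above,
offered as an illustration rather than as the RemoveAllText code itself. It
uses a simplified, hypothetical model of a parsed content stream: `Op` stands
in for an operator token, plain strings stand in for operands, and
`imageNames` is an assumed set of XObject names known to be images. In PDFBox
the tokens would come from PDFStreamParser and be written back with
ContentStreamWriter, as the example does for text operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified, hypothetical model of a parsed content stream; not the PDFBox API. */
class ImageTokenFilter
{
    /** Stand-in for a content stream operator such as "Do", "BI" or "Tj". */
    static final class Op
    {
        final String name;

        Op(String name)
        {
            this.name = name;
        }

        @Override
        public String toString()
        {
            return name;
        }
    }

    /**
     * Copies the tokens, dropping "/Name Do" pairs whose name is an image
     * XObject and dropping inline images (modelled as a single "BI" token).
     */
    static List<Object> withoutImages(List<Object> tokens, Set<String> imageNames)
    {
        List<Object> out = new ArrayList<Object>();
        for (Object token : tokens)
        {
            if (token instanceof Op)
            {
                String op = ((Op) token).name;
                if ("BI".equals(op))
                {
                    continue; // inline image: skip the token
                }
                if ("Do".equals(op) && !out.isEmpty()
                        && imageNames.contains(out.get(out.size() - 1)))
                {
                    out.remove(out.size() - 1); // drop the name operand too
                    continue;
                }
            }
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args)
    {
        List<Object> tokens = Arrays.<Object>asList(
                new Op("q"), "/Im0", new Op("Do"), new Op("Q"),
                "/Fm0", new Op("Do"), new Op("BI"), "(hi)", new Op("Tj"));
        Set<String> images = new HashSet<String>(Arrays.asList("/Im0"));
        System.out.println(withoutImages(tokens, images));
    }
}
```

Form XObjects (here `/Fm0`) are kept on purpose, since they may contain text
or vector graphics; images nested inside them would need the same filter
applied to the form's own content stream.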



On Thu, 6 May 2021 at 1:07, Tilman Hausherr (<[email protected]>)
wrote:

> On 05.05.2021 at 18:39, jorgeeflorez wrote:
> > Hi,
> > I would like to know what would be the best way to detect whether a PDF
> > file has CID fonts. As far as I understand, these fonts are used in Asian
> > texts (Japanese, Chinese, Korean, etc.). I have the following code:
> >
> >          PDDocument doc = PDDocument.load(myFile);
> >          for (int i = 0; i < doc.getNumberOfPages(); ++i)
> >          {
> >              PDPage page = doc.getPage(i);
> >              PDResources res = page.getResources();
> >              for (COSName fontName : res.getFontNames())
> >              {
> >                  PDFont font = res.getFont(fontName);
> >                  COSName subType = font.getCOSObject().getCOSName(COSName.SUBTYPE);
> >                  System.out.println("CID? " + COSName.TYPE0.equals(subType));
> >                  System.out.println("font instanceof PDType0Font? " + (font instanceof PDType0Font));
> >              }
> >          }
> > Would this be the right way to do it?
>
>
> It's more complicated because form XObjects, patterns, annotations,
> softmasks (and maybe more) can also have fonts. I also doubt that you
> can detect CJK fonts this way.
>
> Re removing the text, see the RemoveAllText example in the source code
> download. IIRC this one only does the page content stream.
>
> Tilman
>
>
> >
> > I need to detect this and try to create a pdf file from the original, but
> > without the text.
> >
> > Any indication is appreciated.
> >
> > Regards,
> >
> > Jorge
> >
>
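
Tilman's point that fonts can also sit in nested resources can be sketched as
a recursive walk. This is only an illustration under stated assumptions: the
resource dictionaries are modelled as plain nested maps (a hypothetical
stand-in, not the PDFBox API), only the /Font, /XObject and /Pattern entries
are followed, and annotations and soft masks are not covered. In PDFBox the
same walk would go through PDResources and the resources of each form XObject
and tiling pattern.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical map-based model of PDF resource dictionaries; not the PDFBox API. */
class Type0FontFinder
{
    /** Returns true if any font reachable from the resources has Subtype "Type0". */
    static boolean hasType0Font(Map<String, Object> resources)
    {
        // Identity-based set so that shared or cyclic resources are visited once.
        Set<Object> visited =
                Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        return scan(resources, visited);
    }

    private static boolean scan(Map<String, Object> resources, Set<Object> visited)
    {
        if (resources == null || !visited.add(resources))
        {
            return false;
        }
        // Fonts declared directly in this resource dictionary.
        for (Object font : asDict(resources.get("Font")).values())
        {
            if ("Type0".equals(asDict(font).get("Subtype")))
            {
                return true;
            }
        }
        // Form XObjects and patterns may carry their own resources.
        for (String category : new String[] { "XObject", "Pattern" })
        {
            for (Object child : asDict(resources.get(category)).values())
            {
                if (scan(asDict(asDict(child).get("Resources")), visited))
                {
                    return true;
                }
            }
        }
        return false;
    }

    /** Treats anything that is not a map (including null) as an empty dictionary. */
    @SuppressWarnings("unchecked")
    private static Map<String, Object> asDict(Object value)
    {
        if (value instanceof Map)
        {
            return (Map<String, Object>) value;
        }
        return Collections.<String, Object>emptyMap();
    }
}
```

The visited set matters because resource dictionaries in real files can be
shared between pages or refer back to themselves; without it the recursion
could repeat work or never terminate.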
