Hello,
I have been asked to process a large number of PDFs and, for reasons I can't go
into, I need to separate the text from the graphics. I know I can create
separate PDFs from the originals (using a variety of tools) but I prefer not
to, mainly for speed reasons.
So I thought it might be possible to use OCGs (aka Layers) for this, parsing
the PDPageContentStream into two buckets, one for text and the other for
graphics.
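To make the idea concrete, here is a rough sketch of the bucketing I have in
mind. It is not PDFBox code: it works on an already-decoded content stream held
as a String, and the class ContentBuckets, its methods and the property names
oc1/oc2 are made up for illustration. It assumes all text sits between BT and
ET, that there are no inline images, and that no string literal contains
whitespace runs or a bare BT/ET token, so a real version would need a proper
tokenizer. It also ignores the fact that graphics state set outside BT ... ET
(colour, transforms) still affects the text.

```java
import java.util.ArrayList;
import java.util.List;

public class ContentBuckets {

    /** Holds the two buckets of content stream tokens. */
    public static final class Split {
        public final List<String> text = new ArrayList<>();
        public final List<String> graphics = new ArrayList<>();
    }

    /**
     * Naive split: every token from BT through ET goes to the text bucket,
     * everything else goes to the graphics bucket.
     * Assumption: whitespace tokenization is good enough for the input.
     */
    public static Split split(String decodedContent) {
        Split out = new Split();
        boolean inText = false;
        for (String tok : decodedContent.trim().split("\\s+")) {
            if (tok.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            if (tok.equals("BT")) {
                inText = true;
            }
            (inText ? out.text : out.graphics).add(tok);
            if (tok.equals("ET")) {
                inText = false;
            }
        }
        return out;
    }

    /**
     * Wraps a bucket in an optional-content marked-content sequence:
     * /OC /name BDC ... EMC. The name is hypothetical and must match an
     * entry in the page's /Properties resource dictionary.
     */
    public static String wrap(String propertyName, List<String> tokens) {
        return "/OC /" + propertyName + " BDC\n"
                + String.join(" ", tokens) + "\nEMC\n";
    }

    public static void main(String[] args) {
        String content = "q 1 0 0 1 10 10 cm 0 0 50 50 re f Q "
                + "BT /F1 12 Tf 72 700 Td (Hello) Tj ET";
        Split s = split(content);
        System.out.print(wrap("oc1", s.graphics)); // graphics layer
        System.out.print(wrap("oc2", s.text));     // text layer
    }
}
```

As far as I understand the spec, oc1 and oc2 would still have to be mapped to
OCG dictionaries under /Properties in the page resources, and those OCGs
registered in the catalog's /OCProperties, before a viewer shows them as
layers.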
If this is feasible, does anyone know of any sample code that might be relevant
that I could use to kick start things?
Thanks in advance.
PDFDev/