Using PrintTextColors.java or this approach (https://stackoverflow.com/questions/21430341/identifying-the-text-based-on-the-output-in-pdf-using-pdfbox/21453780#21453780) works for me to get just the color information for the text.
I also need the marked-content ID (MCID) of the text that is currently being processed or written, so that I can check for underlines and other dictionary information. Up until now I have just used PDFMarkedContentExtractor to get all the text with its corresponding MCID.

I've seen the marked-content operators that I'd like to add to this stripper, but I couldn't get it working yet (I'm new to Java). Are there any similar examples for MCIDs that I could use as guidance?

On Tue, 22 Nov 2022 at 04:32, Tilman Hausherr <thaush...@t-online.de> wrote:

> Please try the PrintTextColors.java example from the source code
> download, in the examples subproject.
>
> Trying to interpret the PDF operators + parameters would become a huge
> effort, which already exists.
>
> Tilman