Hi,
Text extraction is available from the PDFTextStripper class. Its subclass
PDFText2HTML (in pdfbox-tools) can create HTML. All the rest you'll have to
write yourself.
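To give an idea of the "write yourself" part, here is a rough, unofficial
sketch of a search() and a word split on top of text that has already been
extracted. It assumes you collected one String per page, for example with
PDFTextStripper by calling setStartPage(n) and setEndPage(n) before
getText(document). The class PageTextSearch and its methods are hypothetical
names made up for this example; they are not part of PDFBox, and the sketch
gives character offsets only, no page coordinates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, NOT part of PDFBox: works on per-page text that was
// extracted beforehand (assumed to come from PDFTextStripper.getText).
class PageTextSearch {

    /** One match: 1-based page number and character offset in that page's text. */
    static final class Hit {
        final int page;
        final int offset;

        Hit(int page, int offset) {
            this.page = page;
            this.offset = offset;
        }

        @Override
        public String toString() {
            return "page " + page + " @" + offset;
        }
    }

    /** Case-insensitive, non-overlapping substring search over all pages. */
    static List<Hit> search(List<String> pageTexts, String term) {
        List<Hit> hits = new ArrayList<>();
        if (term == null || term.isEmpty()) {
            return hits;
        }
        for (int i = 0; i < pageTexts.size(); i++) {
            String text = pageTexts.get(i);
            int pos = 0;
            while (pos + term.length() <= text.length()) {
                // regionMatches keeps offsets valid for the original text
                if (text.regionMatches(true, pos, term, 0, term.length())) {
                    hits.add(new Hit(i + 1, pos));
                    pos += term.length();
                } else {
                    pos++;
                }
            }
        }
        return hits;
    }

    /** Splits page text on whitespace; a crude stand-in for a word list. */
    static List<String> words(String pageText) {
        String trimmed = pageText.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        // Stand-in strings; in a real viewer these would be the extracted pages.
        List<String> pages = Arrays.asList("Hello PDF", "pdf and PDF");
        System.out.println(search(pages, "pdf"));
        System.out.println(words(pages.get(1)));
    }
}
```

For highlighting in a viewer you would also need glyph positions, which this
sketch does not cover.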
Tilman
On 06.12.2021 at 13:52, shah manon wrote:
To organize books and articles, I need a lightweight PDF viewer in JavaFX with
copy, highlight, image snapshot, sticky note, and search features. By googling I
came across PDFBox and MuPDF. MuPDF has a TextPage class, which is amazing, but
MuPDF is written in C and its Java binding exposes only a subset of the
original API.
As I am very new to PDFBox, can anybody tell me how to get the functionality
of extractText(), extractTEXT(), extractBLOCKS(), extractWORDS(),
extractHTML(), extractXHTML(), extractXML(), extractDICT(), extractJSON(),
extractRAWDICT(), extractRAWJSON(), and search() using PDFBox, please?
Nadvi.