Hi,

We're using PDFBox 3.0.0-beta1 to extract text from PDFs. This produces lots of
warnings about missing Unicode mappings. Is there a programmatic way to
suppress those messages, or would it be better to configure the logging to do
that?

If it's better to configure logging, I would try to configure the logging level 
for PDSimpleFont, PDType0Font, PDFont and GlyphList. Are those all relevant 
loggers or are there any more?
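
For illustration, here is a minimal sketch of the logging approach. It assumes that Commons Logging ends up delegating to java.util.logging (i.e. no other logging backend is on the classpath) and that the logger names are the fully qualified class names of the four classes above. Both points are assumptions on my part, and the class and method names in the sketch are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietPdfBoxFontLogging {

    // Assumption: each logger is named after the fully qualified class name.
    static final String[] LOGGER_NAMES = {
        "org.apache.pdfbox.pdmodel.font.PDSimpleFont",
        "org.apache.pdfbox.pdmodel.font.PDType0Font",
        "org.apache.pdfbox.pdmodel.font.PDFont",
        "org.apache.pdfbox.pdmodel.font.encoding.GlyphList"
    };

    // java.util.logging keeps only weak references to loggers, so we hold
    // strong references here to stop the configured levels from being lost
    // if a logger is garbage collected.
    private static final List<Logger> PINNED = new ArrayList<>();

    // Sets the given level on all four loggers and returns them.
    public static List<Logger> raiseThreshold(Level level) {
        List<Logger> changed = new ArrayList<>();
        for (String name : LOGGER_NAMES) {
            Logger logger = Logger.getLogger(name);
            logger.setLevel(level);
            PINNED.add(logger);
            changed.add(logger);
        }
        return changed;
    }

    public static void main(String[] args) {
        // Only SEVERE messages from these loggers pass; warnings are dropped.
        raiseThreshold(Level.SEVERE);
        // ... run the text extraction here ...
    }
}
```

This would have to run once before the first PDF is processed. With a different backend (Log4j, Logback via a bridge), the equivalent setting would go into that backend's configuration instead.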

For GlyphList, the most common warning is "Not a number in Unicode character 
name: unionsq". I also saw a warning "Not a number in Unicode character name: 
users" but only for one PDF.


Kind regards
Erik Brangs

-- 
Erik Brangs
Deutsche Nationalbibliothek
Informationstechnik
Adickesallee 1
60322 Frankfurt am Main
Telefon: +49 69 1525-1792
Telefax: +49 69 1525-1799
mailto:e.bra...@dnb.de
http://www.dnb.de
