[ https://issues.apache.org/jira/browse/PDFBOX-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898511#comment-17898511 ]
Tilman Hausherr commented on PDFBOX-5901:
-----------------------------------------

Extraction does work. Here's what I get for page 24 if I comment out the logging:
{noformat}
2024营销趋势洞察24
监控和管理您的品牌形象。
您想实时发现新的消费趋势吗?
我们最先进的视觉人工智能技术可帮助营销人员搜索和监控图像:
. 标志和名人检测
. 人脸检测(年龄、性别、密度)
. 场景和物体检测
. 情绪检测
. 模因检测
. 物体字符识别
. 图像聚类
另外,您可以在我们的网红营销平台 Klear 中上传图像,以发现创建
类似内容的网红。
了解更多
{noformat}
The real problem is that the logging makes it very slow. I'll look for a way to reduce this.

> there is an issue with font mapping or rendering
> ------------------------------------------------
>
>                 Key: PDFBOX-5901
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5901
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.31
>            Reporter: ltzzZ
>            Priority: Major
>         Attachments: PDFBOX-5901-p24.pdf, image-2024-11-15-12-38-12-100.png,
> image-2024-11-15-12-38-36-179.png, image-2024-11-15-12-39-22-585.png
>
>
> When I try to extract the text content of a PDF file, the log keeps looping
> through font rendering or mapping warnings and I can't get the content of the
> file. How can I fix this problem?
>
> My code:
> !image-2024-11-15-12-38-36-179.png!
> Problem:
> !image-2024-11-15-12-39-22-585.png!
> Sometimes the CPU usage is also abnormal:
> !image-2024-11-15-12-38-12-100.png!
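
Regarding the slowdown caused by logging that the comment above describes: until the logging is reduced in PDFBox itself, a caller could raise the log threshold on their side. The following is only a minimal sketch, not the fix planned for this issue. It assumes that PDFBox 2.0.x logs through Apache Commons Logging and that no other backend (such as Log4j) is on the classpath, so messages end up in java.util.logging. The class name QuietPdfBoxLogging and the method raiseThreshold are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietPdfBoxLogging {
    // java.util.logging only holds loggers weakly, so keep strong references;
    // otherwise a configured level can be lost after garbage collection.
    private static final List<Logger> PINNED = new ArrayList<>();

    /** Hypothetical helper: sets the given level on each named logger and pins it. */
    public static List<Logger> raiseThreshold(Level level, String... names) {
        List<Logger> result = new ArrayList<>();
        for (String name : names) {
            Logger logger = Logger.getLogger(name);
            logger.setLevel(level);
            PINNED.add(logger);
            result.add(logger);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: the font warnings come from loggers under these two packages.
        raiseThreshold(Level.SEVERE, "org.apache.pdfbox", "org.apache.fontbox");

        // Text extraction (for example with PDFTextStripper) would run here.

        // Child loggers without their own level inherit the parent's threshold.
        Logger child = Logger.getLogger("org.apache.pdfbox.pdmodel.font");
        System.out.println(child.isLoggable(Level.WARNING)); // prints "false"
        System.out.println(child.isLoggable(Level.SEVERE));  // prints "true"
    }
}
```

If the application uses a different logging backend, the same idea applies, but the threshold would have to be set in that backend's configuration instead.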