[jira] [Commented] (PDFBOX-5796) PDFBox cannot extract vector text from a PDF

2024-04-03 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/PDFBOX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833653#comment-17833653 ] Tilman Hausherr commented on PDFBOX-5796: There's no flag. When I look at such PDFs, it's a

[jira] [Commented] (PDFBOX-5796) PDFBox cannot extract vector text from a PDF

2024-04-03 Thread Samved Chandrakant Divekar (Jira)
[ https://issues.apache.org/jira/browse/PDFBOX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833651#comment-17833651 ] Samved Chandrakant Divekar commented on PDFBOX-5796: [~tilman] I can perform OCR, my

[jira] [Commented] (PDFBOX-5796) PDFBox cannot extract vector text from a PDF

2024-04-03 Thread Maruan Sahyoun (Jira)
[ https://issues.apache.org/jira/browse/PDFBOX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833461#comment-17833461 ] Maruan Sahyoun commented on PDFBOX-5796: Adobe's "Recognize Text" function is indeed doing OCR.

[jira] [Commented] (PDFBOX-5796) PDFBox cannot extract vector text from a PDF

2024-04-03 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/PDFBOX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833456#comment-17833456 ] Tilman Hausherr commented on PDFBOX-5796: Maybe Adobe is using OCR? PDFBox doesn't have a