Hi Dan,
I answered there. Summary: There's nothing you can do. That ticket can be closed.
Tilman

On 06.03.2026 at 20:25, Dan S wrote:

Hi, I am an Apache NiFi developer, and we have a user reporting an issue with the use of Tika in our ExtractDocumentText processor. The user has noticed that a particular symbol is not parsed correctly and is instead converted into either a ? (question mark) or a " (double quote). Please see NIFI-10218 <https://issues.apache.org/jira/browse/NIFI-10218> for more details.

Please advise whether there is anything we can do on our side to extract this text properly, or whether this is a known limitation of parsing PDF documents.

Thank you!
