Drop the BOM when extracting plain text
---------------------------------------

                 Key: TIKA-240
                 URL: https://issues.apache.org/jira/browse/TIKA-240
             Project: Tika
          Issue Type: Bug
          Components: parser
            Reporter: Jukka Zitting
            Priority: Minor

Plain text files sometimes have a byte order mark (BOM) at the beginning of the file to better indicate the character encoding used in the file. It looks like Tika currently outputs the BOM as a part of the extracted text, which is not desirable.
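
To illustrate the problem, here is a minimal sketch in Java. It is not Tika's actual parser code or the eventual fix; the class name BomStripper and the method stripBom are hypothetical names introduced only for this example. It assumes the text has already been decoded, in which case a BOM shows up as a leading U+FEFF character. Java's standard UTF-8 decoder keeps that character, which is one way it can end up in extracted text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Tika.
public class BomStripper {
    // A decoded byte order mark appears as U+FEFF at the start of the text.
    private static final char BOM = '\uFEFF';

    // Returns the text without a leading BOM; other input is returned unchanged.
    public static String stripBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // UTF-8 BOM (EF BB BF) followed by the two ASCII characters "hi".
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        String decoded = new String(raw, StandardCharsets.UTF_8);
        System.out.println(decoded.length());           // prints 3: the BOM survives decoding
        System.out.println(stripBom(decoded).length()); // prints 2: the BOM has been dropped
    }
}
```

Only a single leading U+FEFF is removed, so the same character appearing later in the text is left alone.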