[ https://issues.apache.org/jira/browse/TIKA-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-240.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4
         Assignee: Jukka Zitting

Fixed in revision 789134.

> Drop the BOM when extracting plain text
> ---------------------------------------
>
>                 Key: TIKA-240
>                 URL: https://issues.apache.org/jira/browse/TIKA-240
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.4
>
>
> Plain text files sometimes have a byte order mark (BOM) at the beginning of
> the file to better indicate the character encoding used in the file. It looks
> like Tika currently outputs the BOM as a part of the extracted text, which is
> not desirable.
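
For illustration only, the behaviour the issue asks for can be sketched as below. This is not the change made in revision 789134, whose contents are not shown in this message. The class name BomStripper and both method names are hypothetical and are not part of Tika's API. The sketch assumes the input has already been decoded to characters, so a leading BOM appears as the single character U+FEFF.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

// Hypothetical helper for illustration; not part of Tika.
final class BomStripper {

    // After decoding, a byte order mark shows up as U+FEFF.
    private static final int BOM = 0xFEFF;

    private BomStripper() {}

    /**
     * Returns a reader over the same character stream, positioned after a
     * leading U+FEFF if one is present. Only the first character is inspected.
     */
    static Reader skipBom(Reader in) throws IOException {
        PushbackReader reader = new PushbackReader(in, 1);
        int first = reader.read();
        // Push the character back unless it is the BOM or end of stream.
        if (first != -1 && first != BOM) {
            reader.unread(first);
        }
        return reader;
    }

    /** Same idea for text that is already held in a String. */
    static String strip(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }
}
```

With this sketch, BomStripper.strip("\uFEFFhello") yields "hello", and text without a leading BOM is returned unchanged. A U+FEFF that occurs later in the text is left alone, since only a mark at the very beginning of the file is at issue here.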