[ https://issues.apache.org/jira/browse/TIKA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-392.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.7
         Assignee: Jukka Zitting

Fixed in revision 927044 by explicitly adding extra whitespace between subsequent text runs.

> RTF parser smashes words together in subsequent table cells
> -----------------------------------------------------------
>
>                 Key: TIKA-392
>                 URL: https://issues.apache.org/jira/browse/TIKA-392
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.7
>
>
> I have an RTF document with the following snippet of content (it's an export
> of a private phone book so I can't share the full document):
> {\rtlch\fcs1 \af0\afs24 \ltrch\fcs0
> \f0\fs24\lang2055\langfe2055\langfenp2055\insrsid9461491\charrsid9461491 Fax
> / Phone Station\cell Fax / Phone #\cell }
> The extracted text is:
> Fax / Phone StationFax / Phone
> Note how the cell boundary between "Station" and "Fax" is lost.
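
For illustration only, here is a minimal Python sketch of the idea behind the fix: treat each \cell control word as the end of a text run and join the runs with whitespace. This is not the Tika code from revision 927044 (Tika is written in Java, and the commit is not reproduced here). The names CONTROL_WORD, text_runs and extract_text are hypothetical, and the tokenizer handles only the plain control words, braces and text that appear in the snippet above, not RTF in general.

```python
import re

# Hypothetical toy tokenizer: a control word is a backslash, lowercase
# letters, an optional numeric parameter, and an optional single space
# that acts as a delimiter rather than as text.
CONTROL_WORD = re.compile(r"\\([a-z]+)(-?\d+)? ?")


def text_runs(fragment):
    """Return the text runs in an RTF fragment, split at each \\cell."""
    runs, current, pos = [], [], 0
    while pos < len(fragment):
        match = CONTROL_WORD.match(fragment, pos)
        if match:
            if match.group(1) == "cell":
                # A cell boundary closes the current text run.
                runs.append("".join(current))
                current = []
            pos = match.end()
        elif fragment[pos] in "{}":
            pos += 1  # group braces carry no text
        else:
            current.append(fragment[pos])
            pos += 1
    runs.append("".join(current))
    # Drop empty runs, e.g. the one after the final \cell.
    return [run.strip() for run in runs if run.strip()]


def extract_text(fragment, separator=" "):
    """Join the text runs; a non-empty separator keeps cells apart."""
    return separator.join(text_runs(fragment))


if __name__ == "__main__":
    snippet = (
        r"{\rtlch\fcs1 \af0\afs24 \ltrch\fcs0 "
        r"\f0\fs24\lang2055\langfe2055\langfenp2055"
        r"\insrsid9461491\charrsid9461491 "
        r"Fax / Phone Station\cell Fax / Phone #\cell }"
    )
    print(extract_text(snippet, separator=""))  # runs are smashed together
    print(extract_text(snippet))                # runs are separated by a space
```

With an empty separator the sketch fuses "Station" and "Fax" in the same way as the reported output; with the default single space the cell boundary survives in the extracted text.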