Has anyone encountered problems with this text filter? I am testing text extraction on quite a large document (6 MB worth of Thinking In Java by captain Bruce Eckel), and searching was not producing the expected results. I took the Reader object generated by the MsWordTextFilter, converted it into a String, and wrote it to a file. Inspection shows that most of the document has been omitted. The missing part is in the middle of the file, and there is no particularly unusual content marking the start of the missing section. I've tested larger docs that work fine, so it's a bit of a mystery.
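For anyone wanting to repeat the check, here is a minimal sketch of dumping a Reader to a file. It uses only java.io and java.nio; the class name ReaderDump and the dump method are made up for illustration, and a StringReader stands in for the Reader returned by the filter, since how that Reader is obtained is not shown here. The loop reads until read() returns -1, so a short read cannot be mistaken for the end of the text.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReaderDump {

    // Copies everything the reader yields into the target file (UTF-8)
    // and returns the number of characters copied.
    static long dump(Reader reader, Path target) throws IOException {
        long total = 0;
        char[] buffer = new char[8192];
        try (Reader in = reader;
             Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            int n;
            // read() may return fewer chars than the buffer holds;
            // only -1 signals the end of the stream.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the Reader produced by the text filter (assumption).
        Reader extracted = new StringReader("stand-in for the extracted text");
        Path target = Files.createTempFile("extracted", ".txt");
        long chars = dump(extracted, target);
        System.out.println(chars + " chars written to " + target);
    }
}
```

Comparing the returned character count against the expected size of the document text gives a quick indication of how much was dropped.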
Cheers, Thomas
