Created https://issues.apache.org/jira/browse/TIKA-696 to track the issue.
I can't see the watermark after saving the doc in the .docx format and reopening it, so I have attached the .doc example.

Thanks

Julien

On 23 August 2011 14:06, Nick Burch <[email protected]> wrote:

> On Tue, 23 Aug 2011, Julien Nioche wrote:
>
>> We definitely don't get them in Tika. See docs attached (saved with
>> OpenOffice)
>
> It's probably worth putting these sample files on a Tika issue so they
> don't get lost, and can be used in a future unit test
>
> The next thing to check is probably to unzip the .docx file, and see where
> the watermark text lives. If it's in the main document part then it should
> be fairly easy to get for Tika. If it's in a different part, then a little
> bit of support will likely be needed on the POI side to allow easier access
> to it
>
> Nick

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
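
For anyone who wants to try the "unzip the .docx and see where the watermark text lives" check, here is a rough sketch in Python. It is only an illustration, not something from Tika or POI: the function name `parts_containing` is made up, and it does a plain substring search over the raw XML of each part, so it will miss text that Word has split across several runs.

```python
import zipfile


def parts_containing(docx_path, needle):
    """Return the names of the package parts whose raw XML contains `needle`.

    `docx_path` may be a filesystem path or a binary file-like object.
    A .docx file is a ZIP package; the main body normally lives in
    word/document.xml, with headers, footers, etc. in separate parts.
    """
    matches = []
    with zipfile.ZipFile(docx_path) as package:
        for name in package.namelist():
            # Only look at XML parts and relationship files, skip media etc.
            if not name.endswith((".xml", ".rels")):
                continue
            xml = package.read(name).decode("utf-8", errors="replace")
            if needle in xml:
                matches.append(name)
    return matches
```

Calling `parts_containing("sample.docx", "DRAFT")` (both the file name and the watermark text are placeholders) returns a list of part names. If the only hit is `word/document.xml`, the text is in the main document part; a hit in some other part would suggest the extra POI support mentioned above.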
