Ard,
> > By coincidence I discovered that the xml file contains
> > leading binary characters (ff fe) and that it as a whole is
> > seen as binary by my text editor. So perhaps this is causing
> > the duplicate results.

I came across this link:

http://www.25hoursaday.com/weblog/2005/10/18/TheMythOfTheOfficeXMLBinaryKey.aspx

It mentions the ff fe bytes (which indicate little-endian byte order) that I see at the beginning of my document. The xml files start with the declaration

<?xml version="1.0" encoding="utf-16"?>

which specifies the encoding.

> > I'll try to get them removed and see whether the issue is resolved.

> > If you could do a test with this, it would give me some pointers indeed...

When I manually overwrite a document (leaving out the two bytes and also the encoding declaration), the index is 'repaired' and a search finds only one hit. It looks like the leading bytes and the encoding are causing the unexpected search results.

--Æde

********************************************
Hippocms-dev: Hippo CMS development public mailinglist
Searchable archives can be found at:
MarkMail: http://hippocms-dev.markmail.org
Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
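For reference, the manual repair described above (dropping the two leading bytes and the utf-16 declaration) could be scripted roughly as follows. This is only a sketch in Python, not anything taken from Hippo CMS: the function name utf16_xml_to_utf8 is made up for illustration, and it assumes the files really are UTF-16 with a byte order mark and that re-saving them as UTF-8 without a BOM is acceptable. Note that a BOM plus encoding="utf-16" is valid XML, so this only works around the indexing behaviour; it does not explain it.

```python
import codecs
import re

# Matches the encoding="..." part of an XML declaration at the start of the text.
_DECL_ENCODING = re.compile(r'^(<\?xml[^>]*?)\s+encoding=(["\'])[^"\']*\2')


def utf16_xml_to_utf8(data: bytes) -> bytes:
    """Hypothetical helper: re-encode BOM-prefixed UTF-16 XML as BOM-less UTF-8.

    Input that does not start with a UTF-16 byte order mark is returned unchanged.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM, picks the byte order, and drops the BOM.
        text = data.decode("utf-16")
        # Rewrite the declared encoding so it matches the new bytes.
        text = _DECL_ENCODING.sub(r'\1 encoding="UTF-8"', text, count=1)
        return text.encode("utf-8")
    return data


if __name__ == "__main__":
    # Build a small in-memory document that starts with ff fe, like the files above.
    sample = codecs.BOM_UTF16_LE + (
        '<?xml version="1.0" encoding="utf-16"?><doc>test</doc>'.encode("utf-16-le")
    )
    print(sample[:2].hex())           # the two leading bytes
    print(utf16_xml_to_utf8(sample))  # converted document
```

Whether the converted files then index correctly would still need to be checked against the repository; the single-hit result above was only observed for one manually overwritten document.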
