Making a filter that processes "non plain text" files like the ones you mentioned sounds good. If I understand it correctly, it would be called when an attachment is added, extract searchable text from the file, and hand that text off to Lucene for indexing, right? Please also consider a unit test for it.
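To make the idea concrete, here is a minimal sketch of what such a filter could look like. The names AttachmentTextExtractor and PlainTextExtractor are hypothetical and not part of JSPWiki; the sketch assumes UTF-8 content and leaves out the actual hand-off to Lucene.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical interface (not an existing JSPWiki API): turns an attachment
// into plain text that the search provider could then index.
interface AttachmentTextExtractor {
    // Returns true if this extractor can handle the given attachment name.
    boolean accepts(String fileName);

    // Reads the attachment stream and returns the text to be indexed.
    String extractText(InputStream in) throws IOException;
}

// Trivial implementation for files that are already plain text.
// Assumption: the content is UTF-8 encoded.
class PlainTextExtractor implements AttachmentTextExtractor {
    @Override
    public boolean accepts(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".txt");
    }

    @Override
    public String extractText(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        // Copy the whole stream into memory, then decode it as UTF-8.
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

A filter for .pdf or .odt would implement the same two methods, and a unit test can feed it a small in-memory stream and check the returned text.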
Adding a few more file types for pure text files is a good quick win, starting with:
.mm .htm .xhtml .java .c .cpp .php .asm .sh .properties .kml .gpx .loc

Does anyone else have opinions or suggestions?

regards,
Harry

2011/1/13 Rolf Schumacher <[email protected]>

> ok, Harry, thank you for the link.
>
> My suggestions, please correct:
>
> - hard-coding of file types does not seem to me to be a problem: anything
>   shall be searched
> - the list is too short, important types such as .doc, .odt, .pdf, .ppt,
>   .odp are missing
> - am I right here?: If I can provide a filter that makes text out of these
>   files it should not be that tough to add them
> - we may be better off if we have an attribute with each attachment telling
>   its MIME type as far as detectable at attachment time, that way we are not
>   as dependent on correct file extensions
>
> - a quick suggestion: please add .mm as another xml type. The freemind
>   plugin is of great value.
>
> kind regards
>
> Rolf
>
> On 11.01.2011 18:42, Harry Metske wrote:
>
>> Rolf,
>>
>> see the source
>>
>> https://github.com/apache/jspwiki/blob/jspwiki_2_8_5/src/com/ecyrd/jspwiki/search/LuceneSearchProvider.java#L328
>>
>> as you can see, currently the filetypes are hardcoded to just 4 types.
>> We could make this a configurable option, patches are welcome.
>>
>> You say "comments given to an Attachment", I assume you mean Change Notes
>> entered while uploading an attachment (or saving a normal Wiki Page).
>> That is a bit more work I think.
>> I am a complete Lucene novice, but looking at the code it looks like we
>> could add another field (we already index the page author and page name)
>> for the Change Note.
>>
>> regards,
>> Harry
>>
>> 2011/1/10 Rolf Schumacher <[email protected]>
>>
>>> I am using JSPWiki 2.8.4
>>>
>>> Is it possible to extend a search to attachments to some mime types, e.g.
>>> pdf?
>>>
>>> Is it possible to extend a search to the comments given to an attachment?
>>>
>>> kind regards
>>>
>>> Rolf
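
For the configurable file-type list discussed in this thread, a rough sketch could look like the following. The class SearchableSuffixes is hypothetical and not taken from LuceneSearchProvider; how the configured string would be read from the wiki properties is left open here.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not existing JSPWiki code): holds the set of file
// suffixes whose attachments should be indexed as plain text.
class SearchableSuffixes {
    private final Set<String> suffixes = new HashSet<String>();

    // Accepts a comma- or whitespace-separated list such as ".txt, .xml .mm".
    SearchableSuffixes(String configured) {
        for (String suffix : configured.split("[,\\s]+")) {
            if (!suffix.isEmpty()) {
                suffixes.add(suffix.toLowerCase(Locale.ROOT));
            }
        }
    }

    // Compares the part of the name from the last dot onwards, ignoring case.
    boolean isSearchable(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return false; // no suffix at all
        }
        return suffixes.contains(fileName.substring(dot).toLowerCase(Locale.ROOT));
    }
}
```

With a list like ".mm, .htm .java", a call such as isSearchable("Map.MM") would match, while a name without a suffix or with an unlisted one would not.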
