Making a filter that processes "non plain text" files like the ones you
mentioned sounds good.
If I understand correctly, it would be called when an attachment is added,
turn the file into searchable text, and hand that text off to Lucene for
indexing, right?
Please also consider writing a unit test for it.
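
To make the idea concrete, here is a minimal sketch of what such a filter
contract could look like. All names in it (AttachmentTextFilter,
PlainTextFilter, toSearchableText) are hypothetical and not part of JSPWiki;
the Lucene hand-off is left out on purpose, the sketch only produces the text
that would be indexed. It assumes Java 9 or later for readAllBytes().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class AttachmentTextExtraction {

    /** Hypothetical contract: turns one kind of attachment into searchable text. */
    public interface AttachmentTextFilter {
        /** Returns true if this filter knows how to handle the given file name. */
        boolean accepts(String fileName);

        /** Reads the attachment content and returns the text to be indexed. */
        String extractText(InputStream content) throws IOException;
    }

    /**
     * Trivial filter for files that are already text. A filter for .pdf or .odt
     * would implement the same interface on top of a parsing library.
     */
    public static class PlainTextFilter implements AttachmentTextFilter {
        @Override
        public boolean accepts(String fileName) {
            return fileName.toLowerCase(Locale.ROOT).endsWith(".txt");
        }

        @Override
        public String extractText(InputStream content) throws IOException {
            // Assumes UTF-8 content; a real filter may need charset detection.
            return new String(content.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /**
     * Asks each filter in order and returns the text from the first one that
     * accepts the file, or an empty Optional if no filter applies.
     */
    public static Optional<String> toSearchableText(
            List<AttachmentTextFilter> filters, String fileName, InputStream content)
            throws IOException {
        for (AttachmentTextFilter filter : filters) {
            if (filter.accepts(fileName)) {
                return Optional.of(filter.extractText(content));
            }
        }
        return Optional.empty();
    }
}
```

Because the filters only deal with a file name and a stream, each one can be
unit tested with an in-memory stream, without a running wiki or a Lucene
index.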

Adding a few more file types for pure text files is a good quick win,
starting with .mm .htm .xhtml .java .c .cpp .php .asm .sh .properties .kml
.gpx .loc
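
If the list were made configurable rather than hard-coded, it could be read
from a property along these lines. This is only a sketch: the class
SearchableSuffixes and the property name are made up for illustration, and
the defaults are simply the extensions listed above.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public final class SearchableSuffixes {

    /** Hypothetical property name, not an existing JSPWiki setting. */
    public static final String PROPERTY = "example.search.attachmentSuffixes";

    /** Defaults taken from the list of pure text file types proposed above. */
    public static final String DEFAULTS =
        ".mm .htm .xhtml .java .c .cpp .php .asm .sh .properties .kml .gpx .loc";

    private SearchableSuffixes() {
    }

    /** Parses a space or comma separated suffix list into a normalized set. */
    public static Set<String> load(Properties props) {
        String raw = props.getProperty(PROPERTY, DEFAULTS);
        return Arrays.stream(raw.split("[\\s,]+"))
                .filter(s -> !s.isEmpty())
                .map(s -> s.toLowerCase(Locale.ROOT))
                .map(s -> s.startsWith(".") ? s : "." + s) // accept "pdf" and ".pdf"
                .collect(Collectors.toCollection(TreeSet::new));
    }

    /** Case-insensitive check of the file name's last extension against the set. */
    public static boolean isSearchable(String fileName, Set<String> suffixes) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return false; // no extension at all
        }
        return suffixes.contains(fileName.substring(dot).toLowerCase(Locale.ROOT));
    }
}
```

With no property set, load(new Properties()) yields the defaults, so
isSearchable("Map.MM", suffixes) is true and isSearchable("report.pdf",
suffixes) is false until a filter for that type exists and the suffix is
added to the property.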

Does anyone else have opinions or suggestions?

regards,
Harry

2011/1/13 Rolf Schumacher <[email protected]>

> ok, Harry, thank you for the link.
>
> My suggestions, please correct:
>
> - hard-coding of file types seems to me as not a problem: anything shall be
> searched
> - the list is too short, important types such as .doc, .odt, .pdf, .ppt,
> .odp are missing
> - am I right here?: If I can provide a filter that makes text out of these
> files it should not be as tough to add them
> - we may be better off if we have an attribute with each attachment telling
> its MIME type as far as detectable at attachment time, that way we are not
> as much dependent on correct file extensions
>
> - a quick suggestion: please add .mm as another xml type. The freemind
> plugin is of great value.
>
> kind regards
>
>
> Rolf
>
>
>
> On 11.01.2011 18:42, Harry Metske wrote:
>
>> Rolf,
>>
>> see the source
>>
>> https://github.com/apache/jspwiki/blob/jspwiki_2_8_5/src/com/ecyrd/jspwiki/search/LuceneSearchProvider.java#L328
>>
>>
>> as you can see, currently the filetypes are hardcoded to just 4 types.
>> We could make this a configurable option, patches are welcome.
>>
>> You say "comments given to an Attachment", I assume you mean Change Notes
>> entered while uploading an attachment (or saving a normal Wiki Page).
>> That is a bit more work I think.
>> I am a complete Lucene null, but looking at the code it looks like we
>> could add another field (we already index the page author and page name)
>> for the Change Note.
>>
>> regards,
>> Harry
>>
>>
>> 2011/1/10 Rolf Schumacher<[email protected]>
>>
>>
>>
>>> I am using JSPWiki 2.8.4
>>>
>>> Is it possible to extend a search to attachments to some mime types, e.g.
>>> pdf?
>>>
>>> Is it possible to extend a search to the comments given to an attachment?
>>>
>>> kind regards
>>>
>>> Rolf
>>>
>>>
>>>
>>
>>
>
