Re: Adding additional metadata

2010-01-11 Thread Erlend Garåsen
I managed to "hack" HtmlParser by modifying the class HTMLMetaProcessor. Now I'm able to parse my metadata. I agree with you. I will write my own plugin later. At the moment I'm only interested in finding out whether it is possible to start using Solr/Nutch instead of paying A LOT for a Fast/Ul

Re: Adding additional metadata

2010-01-11 Thread Andrzej Bialecki
On 2010-01-11 13:18, Erlend Garåsen wrote: First of all: I didn't know about the list archive, so sorry for not searching that resource before I sent a new post. MilleBii wrote: For lastModified just enable the index|query-more plugins it will do the job for you. Unfortunately not. Our pages

Re: Adding additional metadata

2010-01-11 Thread Erlend Garåsen
First of all: I didn't know about the list archive, so sorry for not searching that resource before I sent a new post. MilleBii wrote: For lastModified just enable the index|query-more plugins it will do the job for you. Unfortunately not. Our pages include Dublin Core metadata which has a

Re: Adding additional metadata

2010-01-08 Thread J.G.Konrad
Something like this may work for your filter. I have not tested this but maybe it will give you a better idea of what you need to do for the author data. This is based on nutch-1.0 so I'm not sure if this would work for the trunk version. public class AuthorFilter implements HtmlParseFilter { p
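
For readers following the filter idea above, the DOM lookup it relies on can be sketched with JDK classes only. This is not the AuthorFilter from the post (which is cut off here) and it does not call any Nutch API; the class name MetaTagExtractor, its two helper methods, and the meta name "DC.creator" are assumptions made for illustration. In a real HtmlParseFilter the starting node would be the DOM fragment that Nutch passes to the filter. The meta name is compared case-insensitively, but the attribute keys "name" and "content" are assumed to be lowercase.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch: finds <meta name="..." content="..."> in a DOM tree.
final class MetaTagExtractor {
    private MetaTagExtractor() {}

    /**
     * Depth-first search for the first meta element whose name attribute
     * matches metaName (ignoring case). Returns its content attribute,
     * or null if no such element exists below the given node.
     */
    static String findMetaContent(Node node, String metaName) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "meta".equalsIgnoreCase(node.getNodeName())) {
            Element meta = (Element) node;
            // getAttribute returns "" when the attribute is absent
            if (metaName.equalsIgnoreCase(meta.getAttribute("name"))) {
                return meta.getAttribute("content");
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            String found = findMetaContent(children.item(i), metaName);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    /** Parses a well-formed XHTML string into a DOM; only used to try the lookup out. */
    static Document parseXhtml(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }
}
```

A filter built on this would call findMetaContent(doc, "DC.creator") and, if the result is not null, store it in the parse metadata so that an indexing filter can add it as a field later.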

Re: Adding additional metadata

2010-01-08 Thread MilleBii
For lastModified, just enable the index|query-more plugins; they will do the job for you. For other metadata, search the mailing list; it's explained there many times how to do it. 2010/1/8, Erlend Garåsen : > > Hello, > > I have tried to add additional metadata by changing the code in > HtmlParser.java and MoreInd

Adding additional metadata

2010-01-08 Thread Erlend Garåsen
Hello, I have tried to add additional metadata by changing the code in HtmlParser.java and MoreIndexingFilter.java without any luck. Do I really have to do something which is mentioned on the following wiki in order to fetch the content of the metadata, i.e. write my own parser, filter and a