How do I unsubscribe from this list? Does anyone know?

On Fri, May 21, 2010 at 1:44 PM, Claus Daldorph Nielsen <[email protected]> wrote:
> I have checked the discussion and in nutch-site.xml I have added
>  <property>
>    <name>metatags.names</name>
>    <value>title;keywords</value>
>  </property>
>
>  <property>
>    <name>query.basic.title.boost</name>
>    <value>2.0</value>
>  </property>
>
>  <property>
>    <name>query.basic.keywords.boost</name>
>    <value>2.0</value>
>  </property>
>
>
> I have also included 'parse-metatags' in plugin.includes.
>
>
>
> Claus Daldorph Nielsen
>
> Theilgaard Mortensen a/s
> Niels Hemmingsens gade 9
> 1153 København K
>
> Tlf: 33448555
>
>
>
> Julien Nioche <[email protected]>
> 21-05-2010 13:33
> Please respond to
> [email protected]
>
>
> To
> [email protected]
> cc
>
> Subject
> Re: Parse and index meta tags in Nutch 1.0
>
>
>
>
>
>
> Have you checked the discussion in
> http://lucene.472066.n3.nabble.com/description-and-keywords-td690681.html?
> What have you modified in nutch-site.xml?
>
> j.
>
> On 21 May 2010 12:15, Claus Daldorph Nielsen <[email protected]> wrote:
>
>> Julien,
>>
>> Thanks, it looks much like what I need. I have applied the patch, added
>> the lines to nutch-site.xml, and then rebuilt the Nutch project. But I
>> still don't see any metatags in my index. Do you have any suggestions as to
>> what I might be doing wrong? Perhaps some configuration that I missed?
>>
>>
>>
>> Claus Daldorph Nielsen
>>
>> Theilgaard Mortensen a/s
>> Niels Hemmingsens gade 9
>> 1153 København K
>>
>> Tlf: 33448555
>>
>>
>>
>> Julien Nioche <[email protected]>
>> 21-05-2010 09:39
>> Please respond to
>> [email protected]
>>
>>
>> To
>> [email protected]
>> cc
>>
>> Subject
>> Re: Parse and index meta tags in Nutch 1.0
>>
>>
>>
>>
>>
>>
>> Claus,
>>
>> See https://issues.apache.org/jira/browse/NUTCH-809 and a related
>> discussion on
>> http://lucene.472066.n3.nabble.com/description-and-keywords-td690681.html
>>
>> Julien
>>
>> --
>> DigitalPebble Ltd
>> http://www.digitalpebble.com
>>
>> On 21 May 2010 08:26, Claus Daldorph Nielsen <[email protected]> wrote:
>>
>> > Hi,
>> >
>> > I am new to Nutch and am trying to get it to index meta tags from HTML
>> > pages and store them for searching in Solr. The tags are of this form:
>> > <meta name="TITLE" content="Some title" />
>> > <meta name="KEYWORDS" content="Forum, help, build, stuff" />
>> >
>> > I would like to store the tags as two different fields in the index. I
>> > have tried the example explaining how to create a plugin, but the example
>> > is for Nutch 0.9 and only helps me get started.
>> >
>> > I think that I should look at:
>> >
>> > $NUTCH_HOME/src/plugin/parse-html/src/java/org/apache/nutch/parse/html/HtmlParser.java
>> >
>> > and find the line:
>> > HTMLMetaProcessor.getMetaTags(metaTags, root, base);
>> >
>> > But I'm not sure how to go on from here. Any help would be appreciated,
>> > and you are welcome to inform me if you know of an existing plugin that
>> > will index the meta tags.
>> >
>> >
>> >
>> > Claus Daldorph Nielsen
>> >
>> > Theilgaard Mortensen a/s
>>
>>
>
>
> --
> DigitalPebble Ltd
> http://www.digitalpebble.com
>
>



-- 
Karol Rybak
