[
https://issues.apache.org/jira/browse/NUTCH-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234340#comment-13234340
]
Hudson commented on NUTCH-809:
------------------------------
Integrated in nutch-trunk-maven #206 (See
[https://builds.apache.org/job/nutch-trunk-maven/206/])
NUTCH-809 Parse-metatags plugin (jnioche) (Revision 1303371)
Result = SUCCESS
jnioche :
Files :
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/conf/nutch-default.xml
* /nutch/trunk/src/plugin/build.xml
* /nutch/trunk/src/plugin/parse-metatags
* /nutch/trunk/src/plugin/parse-metatags/README.txt
* /nutch/trunk/src/plugin/parse-metatags/build.xml
* /nutch/trunk/src/plugin/parse-metatags/ivy.xml
* /nutch/trunk/src/plugin/parse-metatags/plugin.xml
* /nutch/trunk/src/plugin/parse-metatags/sample
* /nutch/trunk/src/plugin/parse-metatags/sample/testMetatags.html
* /nutch/trunk/src/plugin/parse-metatags/src
* /nutch/trunk/src/plugin/parse-metatags/src/java
* /nutch/trunk/src/plugin/parse-metatags/src/java/org
* /nutch/trunk/src/plugin/parse-metatags/src/java/org/apache
* /nutch/trunk/src/plugin/parse-metatags/src/java/org/apache/nutch
* /nutch/trunk/src/plugin/parse-metatags/src/java/org/apache/nutch/parse
* /nutch/trunk/src/plugin/parse-metatags/src/java/org/apache/nutch/parse/MetaTagsParser.java
* /nutch/trunk/src/plugin/parse-metatags/src/test
* /nutch/trunk/src/plugin/parse-metatags/src/test/org
* /nutch/trunk/src/plugin/parse-metatags/src/test/org/apache
* /nutch/trunk/src/plugin/parse-metatags/src/test/org/apache/nutch
* /nutch/trunk/src/plugin/parse-metatags/src/test/org/apache/nutch/parse
* /nutch/trunk/src/plugin/parse-metatags/src/test/org/apache/nutch/parse/html
* /nutch/trunk/src/plugin/parse-metatags/src/test/org/apache/nutch/parse/html/TestMetatagParser.java
> Parse-metatags plugin
> ---------------------
>
> Key: NUTCH-809
> URL: https://issues.apache.org/jira/browse/NUTCH-809
> Project: Nutch
> Issue Type: New Feature
> Components: parser
> Affects Versions: 1.4, nutchgora
> Reporter: Julien Nioche
> Assignee: Julien Nioche
> Fix For: 1.5
>
> Attachments: NUTCH-809-trunk.patch, NUTCH-809.patch,
> NUTCH-809_metatags_1.3.patch, metatags-plugin+tutorial.zip
>
>
> h2. Parse-metatags plugin
> The parse-metatags plugin consists of an HTMLParserFilter that takes a list
> of metatag names as a parameter, with '*' as the default value. The names are
> separated by ';'.
> To extract the values of the description and keywords metatags, specify the
> following in nutch-site.xml:
> {code:xml}
> <property>
>   <name>metatags.names</name>
>   <value>description;keywords</value>
> </property>
> {code}
> The MetatagIndexer uses the output of the parsing step above to create two
> fields, 'keywords' and 'description'. Note that 'keywords' is multivalued.
> The query-basic plugin is used to include these fields in the search, e.g. in
> nutch-site.xml:
> {code:xml}
> <property>
>   <name>query.basic.description.boost</name>
>   <value>2.0</value>
> </property>
> <property>
>   <name>query.basic.keywords.boost</name>
>   <value>2.0</value>
> </property>
> {code}
> This code has been developed by DigitalPebble Ltd and offered to the
> community by ANT.com
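
For readers who want to see how a ';'-separated name list with a '*' default might behave, here is a minimal sketch. It is not the code from the committed MetaTagsParser.java. The class name MetatagNameFilter, its methods, and the case-insensitive matching are all assumptions made for illustration, as is the reading of '*' as "accept every metatag".

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical helper (not part of Nutch): interprets a value such as
 * "description;keywords" in the style of the metatags.names property.
 */
public class MetatagNameFilter {
    private final Set<String> names = new HashSet<>();
    private final boolean matchAll;

    public MetatagNameFilter(String configured) {
        // Assumption: a missing or blank value falls back to the default '*'.
        String value = (configured == null || configured.trim().isEmpty())
                ? "*" : configured;
        boolean all = false;
        for (String part : value.split(";")) {
            // Assumption: names are compared case-insensitively.
            String name = part.trim().toLowerCase(Locale.ROOT);
            if (name.isEmpty()) {
                continue; // ignore empty entries such as a trailing ';'
            }
            if (name.equals("*")) {
                all = true; // assumption: '*' means every metatag is accepted
            } else {
                names.add(name);
            }
        }
        this.matchAll = all;
    }

    /** Returns true if a metatag with this name should be kept. */
    public boolean accepts(String metatagName) {
        return matchAll
                || names.contains(metatagName.toLowerCase(Locale.ROOT));
    }

    /** Keeps only the accepted entries, preserving their original order. */
    public Map<String, String> select(Map<String, String> metatags) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : metatags.entrySet()) {
            if (accepts(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }
}
```

With the value "description;keywords", such a filter would keep those two metatags and drop any others found on the page, while the default '*' would keep everything.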