Looks like this is NOT in fact working. I have a webpage that has this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Asset Control and Behavior Branch</title>
<meta name="keywords" content="Computational and Information Sciences, CISD, Tokarcik, research, data fusion, knowledge management, battlespace weather, environmental effects, computational science and engineering, battlefield communications and networks ">
<meta name="description" content="This page explains the CISD mission and hosts the biographies of the CISD Director and Deputy Director.">

This is in nutch-site.xml:

parse-(html|tika|metatags)

The page is:

https://snip/inside/directorates/cisd/asset.cfm

Solr schema.xml is:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>

with

<field name="metatag.description" type="text_general" stored="true" indexed="true" default="none" />
<field name="metatag.keywords" type="text_general" stored="true" indexed="true" default="none" />
<field name="metatag.date" type="text_general" stored="true" indexed="true" default="none" />

and the Solr result is:

"title": "Asset Control and Behavior Branch",
"metatag.date": "none",
"metatag.description": "none",
"metatag.keywords": "none"

So all three metatag fields come back with the schema default ("none") instead of the values from the page.

Kris

----- Original Message -----
From: "KRIS MUSSHORN" <[email protected]>
To: [email protected]
Sent: Wednesday, September 7, 2016 9:24:36 AM
Subject: indexing metatags with Nutch 1.12

Looks like it's working correctly this morning using protocol-http and metatags. I didn't do anything to cause it to work.

