[ https://issues.apache.org/jira/browse/NUTCH-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Nagel updated NUTCH-2546:
-----------------------------------
    Fix Version/s: 1.17

> parse-(metatags|html) plugin - "meta property" not extracted only "meta name"
> -----------------------------------------------------------------------------
>
>                 Key: NUTCH-2546
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2546
>             Project: Nutch
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.15
>            Reporter: Irinel
>            Priority: Major
>             Fix For: 1.17
>
>
> The parse-(metatags|html) plugin extracts meta tags like "<meta name=",
> but tags like "<meta property=" are not processed.
> HTML e.g.:
> <meta property="og:title" content="Content in this property..."/> - not extracted
> <meta name="description" content="Content in this meta..."/> - OK
>
> When using the parse-tika plugin for parsing, meta property fields are processed.
> <name>plugin.includes</name>
> <value>parse-(*html*|tika|metatags)...</value>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
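
The behaviour requested in the issue (treating "meta property" the same way as "meta name") can be illustrated with a minimal sketch. This is not Nutch code and does not reflect how the parse-html or parse-metatags plugins are implemented; it uses Python's standard html.parser, and the names MetaTagCollector and extract_metatags are hypothetical, chosen only for this illustration. Lowercasing the keys is likewise an assumption of the sketch.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <meta> content keyed by either the name or the property attribute."""

    def __init__(self):
        super().__init__()
        self.metatags = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here by default.
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Prefer "name"; fall back to "property" (e.g. Open Graph tags).
        key = attributes.get("name") or attributes.get("property")
        content = attributes.get("content")
        if key and content is not None:
            # Assumption of this sketch: keys are stored lowercased.
            self.metatags[key.lower()] = content


def extract_metatags(html):
    """Return a dict mapping each meta name/property to its content value."""
    parser = MetaTagCollector()
    parser.feed(html)
    parser.close()
    return parser.metatags


# The two tags quoted in the issue description.
sample = (
    '<meta property="og:title" content="Content in this property..."/>'
    '<meta name="description" content="Content in this meta..."/>'
)
print(extract_metatags(sample))
```

With this approach both tags from the report end up in the result, whereas a parser that only inspects the name attribute would return the description entry alone.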