Repository: any23
Updated Branches:
  refs/heads/master 54c461269 -> 7f08621b2

Support attribute content on all fields.

<sometag content="something" /> should be considered, regardless of whether
`content` is a valid attribute of `sometag`.

The specification for microdata[1] details that an element's content
attribute should be considered before its text content.

Any23 doesn't currently do this; it only considers `content` for `meta`
tags, which are the only HTML tags that are supposed to have a `content`
attribute, but not all sites follow the HTML specifications.

Update the microdata parser to read `content` from any element that has it.

[1] https://www.w3.org/TR/microdata/#values

Signed-off-by: Ian Duffy <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/any23/repo
Commit: http://git-wip-us.apache.org/repos/asf/any23/commit/28a68b53
Tree: http://git-wip-us.apache.org/repos/asf/any23/tree/28a68b53
Diff: http://git-wip-us.apache.org/repos/asf/any23/diff/28a68b53

Branch: refs/heads/master
Commit: 28a68b535285f9d084725728f758272a3eda21be
Parents: dfcccad
Author: Ian Duffy <[email protected]>
Authored: Wed Nov 8 13:59:42 2017 +0000
Committer: Ian Duffy <[email protected]>
Committed: Wed Nov 8 14:04:44 2017 +0000

----------------------------------------------------------------------
 .../java/org/apache/any23/extractor/microdata/MicrodataParser.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/any23/blob/28a68b53/core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java
----------------------------------------------------------------------
diff --git a/core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java b/core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java
index 8ee1cc6..147fd18 100644
--- a/core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java
+++ b/core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java
@@ -309,7 +309,7 @@ public class MicrodataParser {
         if(itemPropValue != null) return itemPropValue;
         final String nodeName = node.getNodeName().toLowerCase();
-        if ("meta".equals(nodeName)) {
+        if (DomUtils.hasAttribute(node, "content")) {
             return new ItemPropValue(DomUtils.readAttribute(node, "content"), ItemPropValue.Type.Plain);
         }
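
For illustration only, the behaviour this change aims for can be sketched with
the JDK's own DOM classes. This is not Any23 code: the class
ContentAttributeSketch and the helpers hasAttribute, propertyValue and
parseRoot are hypothetical names standing in for the DomUtils calls in the
diff, and the sketch ignores the other element-specific rules the real parser
applies. It only shows the precedence described in the commit message: a
`content` attribute, when present on any element, wins over the text content.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class ContentAttributeSketch {

    // Hypothetical stand-in for a DOM helper: true if the node is an
    // element that carries the named attribute.
    static boolean hasAttribute(Node node, String name) {
        return node instanceof Element && ((Element) node).hasAttribute(name);
    }

    // Returns the `content` attribute when the element has one, regardless
    // of the tag name; otherwise falls back to the element's text content.
    static String propertyValue(Node node) {
        if (hasAttribute(node, "content")) {
            return ((Element) node).getAttribute("content");
        }
        return node.getTextContent();
    }

    // Parses a small well-formed snippet and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element withContent =
                parseRoot("<span itemprop=\"price\" content=\"9.99\">about ten euro</span>");
        Element withoutContent =
                parseRoot("<span itemprop=\"name\">Widget</span>");

        System.out.println(propertyValue(withContent));    // prints "9.99"
        System.out.println(propertyValue(withoutContent)); // prints "Widget"
    }
}
```

With the previous `"meta".equals(nodeName)` check, the first snippet would not
have taken the `content` branch, because the attribute sits on a `span`
rather than a `meta` element.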
