[ https://issues.apache.org/jira/browse/ANY23-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387623#comment-14387623 ]
ASF GitHub Bot commented on ANY23-247:
--------------------------------------

Github user ansell commented on a diff in the pull request:

    https://github.com/apache/any23/pull/17#discussion_r27442717

    --- Diff: core/src/main/java/org/apache/any23/validator/rule/MissingItemscopeAttributeValueRule.java ---
    @@ -0,0 +1,52 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.any23.validator.rule;
    +
    +import org.apache.any23.validator.DOMDocument;
    +import org.apache.any23.validator.Fix;
    +import org.apache.any23.validator.Rule;
    +import org.apache.any23.validator.RuleContext;
    +
    +/**
    + * This fixes missing attribute values for the 'itemscope' attribute,
    + * which was be associated with <div> nodes.
    + * Typically when such a snippet of XHTML is fed through the
    + * {@link org.apache.any23.extractor.rdfa.RDFa11Extractor}, and
    + * subsequently to Sesame's {@link org.semarglproject.sesame.rdf.rdfa.SesameRDFaParser},
    + * it will result in the following behavior.
    + * <pre>
    + * {@code
    + * [Fatal Error] :23:15: Attribute name "itemscope" associated with an element type "div" must be followed by the ' = ' character.
    + * }
    + * </pre>
    + * This Fix is an effort to mitigate against that happening.
    + *
    + */
    +public class MissingItemscopeAttributeValueRule implements Fix {
    --- End diff --

    It may be done using a classpath scan. I will look into it further.

> FIX Attribute name "itemscope" associated with an element type "html" must be
> followed by the ' = ' character.
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: ANY23-247
>                 URL: https://issues.apache.org/jira/browse/ANY23-247
>             Project: Apache Any23
>          Issue Type: Improvement
>   Affects Versions: 1.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.3
>
>
> In the following markup
> {code}
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml"
> xmlns:og="http://opengraphprotocol.org/schema/"
> xmlns:fb="http://www.facebook.com/2008/fbml" version="HTML+RDFa 1.0"
> xml:lang="en" itemscope itemtype="http://schema.org/Product">
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> <meta http-equiv="X-UA-Compatible" content="IE=edge" />
> <meta name="generator" content="ToolTwist" />
> ...
> {code}
> Because *itemscope* is not followed by any value, we get the
> following error in our web server logs
> {code}
> [Fatal Error] :2:185: Attribute name "itemscope" associated with an element
> type "html" must be followed by the ' = ' character.
> {code}
> Although the markup semantics are incorrect, Any23 should simply check
> whether the itemscope value is null and, if so, add *=""*.
> There is a precedent for us doing something like this before; I just can't
> find the ticket right now!
> The code we need to add should live in either
> core/src/main/java/org/apache/any23/extractor/microdata/ItemScope.java
> or
> core/src/main/java/org/apache/any23/extractor/microdata/MicrodataParser.java

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
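
For illustration only, here is a minimal, self-contained sketch of the "add *=""* when itemscope has no value" idea from the issue description. It is not the code from the pull request and does not use the Any23 validator API: the class name `ItemscopeNormalizer` and the method `addEmptyItemscopeValue` are hypothetical, and the regex approach is an assumption chosen because the strict parser rejects the markup before any DOM exists. It only looks inside start tags, and it would be fooled by the word "itemscope" appearing inside a quoted attribute value or by a `>` inside an attribute value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of Any23): gives a bare "itemscope"
 * attribute an empty value so that a strict XML parser accepts it.
 */
final class ItemscopeNormalizer {

    // A start tag: '<', a letter, then anything up to the next '>'.
    // Closing tags, comments and the DOCTYPE do not start with a letter.
    private static final Pattern START_TAG = Pattern.compile("<[A-Za-z][^<>]*>");

    // Whitespace + "itemscope" that is a whole attribute name and is
    // NOT followed by '=' (optionally after whitespace).
    private static final Pattern BARE_ITEMSCOPE = Pattern.compile(
            "(\\sitemscope)(?![\\w:.-])(?!\\s*=)", Pattern.CASE_INSENSITIVE);

    private ItemscopeNormalizer() {
    }

    /** Returns the markup with every bare itemscope rewritten as itemscope="". */
    static String addEmptyItemscopeValue(String markup) {
        Matcher tags = START_TAG.matcher(markup);
        StringBuffer out = new StringBuffer();
        while (tags.find()) {
            // Rewrite only within the matched start tag, never in text content.
            String fixedTag = BARE_ITEMSCOPE.matcher(tags.group()).replaceAll("$1=\"\"");
            tags.appendReplacement(out, Matcher.quoteReplacement(fixedTag));
        }
        tags.appendTail(out);
        return out.toString();
    }
}
```

Applied to the `<html ... itemscope itemtype="http://schema.org/Product">` tag quoted in the issue, this sketch would rewrite the attribute as `itemscope=""` and leave the rest of the tag alone; an attribute that already has a value is not touched.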